Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module system #1020

Open
wants to merge 105 commits into
base: dev
Choose a base branch
from
Open

module system #1020

wants to merge 105 commits into from

Conversation

gensofubi
Copy link
Contributor

@gensofubi gensofubi commented May 19, 2023

Haz3l module system.
Current progress:
Type Module defined (with detailed information as name-type pair), keyword module and operator . implemented,

Type annotation for modules implemented. Use a pair of curly parentheses around a tuple of patterns. Typically the tuple should contain only TypeAnn patterns, but Var/Tag will also be accepted and treated as hole-typed members.

Type Inconsistency warnings includes both the pattern (which define the member) and the definition expression.
For member name inconsistency, accessing members shown in module's typeann but not defined will be treated as a emptyhole (supposing users are to implement later), accessing members defined but not shown in module's typeann will be treated as free variables.

P~M0}3R$CFZ%9@ORYXX$NUH

Basic Type member implemented. Type member in module type annotation implemented.
For constructors defined by Sum types in module, either access directly by dot access, or access under analytic mode (dot access not necessary).

8``TU` P}~5IH75ADL9NYMP

Provided warnings for using type members that are not defined, and inconsistency for type members. Created documentations for module-related keywords.

Some current rules of typing and casting:
Module type consistency: module types are consistent only if they share the same member name list, and the types of every member are consistent. Therefore, the ground type of a module is one where all members have hole types.

For module creation, the definition part will always be packaged as the exact same module type of the annotation as we mentioned amove. So nothing to worry about casting.

Casts will be added on module aliases. Cast calculation will happen on dot operators.

0c7316442d59814e378104002eb3efb8

@cyrus- cyrus- changed the title Haz3l module module system May 20, 2023
@cyrus- cyrus- added the in-development for PRs that remain in development label Jun 10, 2023
@cyrus-
Copy link
Member

cyrus- commented Jun 10, 2023

@gensofubi as you make progress on this, can you update the PR description documenting what is now possible

@gensofubi gensofubi changed the base branch from dev to adt-defs-after June 10, 2023 20:10
@cyrus-
Copy link
Member

cyrus- commented Mar 30, 2024

The types being reported for member variables of modules seem to be inconsistent.

In the following, the type is N.N (expecting just N as long as the alias N = N.N is in scope):

module N = 
  type N = String in 
  let z : N = "Abc"  in   
in 
N.z

In the following, where the module is nested deeper, however, the type is String (I would expect either M.N.N or M.N):

module M =
  module N = 
    type N = String in 
    let z : N = "Abc"  in   
  in 
in 
M.N.z

@gensofubi
Copy link
Contributor Author

@cyrus- These issues have been addressed

@Negabinary
Copy link
Contributor

I finally understand some amount of casting, so I found a few casting bugs - they're pretty similar to some bugs I found recently in dev so they won't be too hard to fix but casting's always annoying...

image

Problem # 1: This one is because of the definition of a ground type. Specifically ground types should have the property that if two ground types are consistent then they are equal.

e.g. Int & Int are consistent and equal (fine)
String & Int are consistent and not equal (fine)
[?] & [?] are consistent and equal
[Int] & [?] are consistent, but [Int] is not ground so this is fine

If you have a basic type like int, string, then it is ground, but if you have a more complex type like list, functions, sums or modules, then only the version that only has holes should be ground

i.e. [?], ? -> ?, +X(?) + Y, {T1: ?, T2:?} should be ground
but [Int], String -> ?, +X(Int) + Y(?), {T1:Int, T2:?} should not be ground

you can think of the ground type as representing just the top-level type - i.e. we know it's a module but we don't know what kind of module it is.

Problem # 2: once you fix problem # 1, I think you'll find that this example doesn't show an error, but it does get stuck later. This is because of the "canonical forms" thingy in the appendix of the 2019 paper. I'll probably try and write it out in the codebase because a lot of us are tripping up over this, and it's kinda hidden at the bottom of that paper.

It would be nice if we could say "if it has type module, and it is a value, then it is a Module(...)"; unfortunately it's not quite that simple with casts...

"If it has type Int, and it is a value, then it is an Int(...)" is true. "If it has type [...], and it is a value, then it is a List(...)" is unfortunately not true. This is because the list might have some casts from [X] to [Y] outside it. When you try and open up a list you have to apply a cast from X to Y for every element before you can look inside. This is what I tried to do with the unbox_list function in Transition.re

"If it has type {...}, and it is a value, then it is a Module(...)" is also not going to be true. Once you fix problem # 1, the casts around module values will stop disappearing, and so now we'll have to deal with them. In particular if you have a module {x=4}:<{x:Int} => {x:?} > then when you access M.x, you need to add a cast to it and return x<Int=>?>

I know this stuff is complicated, and I barely understand it myself, so if you want to meet at some point to discuss I'd be happy to.

code for reproducing example:

module T = 
  let x = 5 in 
in
module S : {x: } = T in
module U : {x:String} = S in

U.x ++ "hello"

should evaluate to

FailedCast(5, Int, String) ++ "hello"

on an unrelated note:

image

idk whether the desired behaviour is to evaluate this or to get stuck with some sort of error message, but this evaluation is just hanging atm

module T = 
  let x = 5 in  
in
let s :   = T in
s

@cyrus-
Copy link
Member

cyrus- commented Apr 12, 2024

marking as draft until above casting issues are resolved

@cyrus- cyrus- marked this pull request as draft April 12, 2024 19:06
@cyrus-
Copy link
Member

cyrus- commented May 1, 2024

@gensofubi can you merge with dev when you get a chance

@cyrus-
Copy link
Member

cyrus- commented May 1, 2024

Getting a strange run-time cast failure on this example:
image

@cyrus- cyrus- marked this pull request as ready for review May 8, 2024 23:42
@Negabinary
Copy link
Contributor

Back with another type bug sorry!!!

An important property of type consistency is that it's a symmetric relationship. (A ~ B ⇔ B ~ A) and unfortunately that doesn't hold for modules at the moment. In particular {x:Int} is consistent with {?} but not the other way around.

At first it seems fine because whenever we need a {?} we can always use a {x:Int}, but the problem comes when you look at the type that a function takes. (This problem is more generally called covariance and contravariance https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science); but we in Hazel avoid it entirely by making consistency symmetric)

let f : {x:Int} -> {x:Int} = fun M -> M in
let g : {?} -> {x:Int} = f in

This type checks fine because {x:Int} is consistent with {?}; but if you look at what the function g now does, it allows us to use any module as an {x:Int}.

Here's an example where I've tricked the type checker into letting us use {y=4} as an {x:Int} using this g function:

image

let f : {x:Int} -> {x:Int} = fun M -> M in
let g : { } -> {x:Int} = f in

module M =
  let y = 4 in 
in

g(M).x

@Negabinary
Copy link
Contributor

Looks good now! I wasn't able to find any more casting bugs - not letting users define rho types feels like a good practical workaround for things like {x:?, …}->{x:?, y:?}->{y:?, …} for now.

@cyrus-
Copy link
Member

cyrus- commented May 30, 2024

@gensofubi reviewing now, but there is a merge conflict when you get a chance

@cyrus-
Copy link
Member

cyrus- commented Jul 25, 2024

@gensofubi I'm a bit confused about what is the difference between a let binding with a module signature in the type position and a module binding. The error message in this screenshot is quite confusing:

image

@disconcision
Copy link
Member

@cyrus- syntax notes / questions:

  1. Is the only reason for a separate 'module' definition form to avoid ambiguity with a let which uses a nullary constructor as its pattern?
  2. What is your view for what a 'hazel program' should be in the world tree model? i.e. if i just want to start writing some code without regard for interface, what is my top-level? am i implicitly (or explicitly) wrapped in a module (with some placeholder name)? or implictly in some kind of block form?
  3. remind me about your position on the existing pseudo-top-level definition forms (let .. in and and type .. in). do we want to separately retain (both) these? if new scratch pads as per 2 are module-wrapped, then it seems at least that we don't strictly need a pseudo-definition type form any more. (unless we're keen on retaining the ability to define types inside functions, AND are not also replacing function bodies with some kind of curly-braces form)
  4. As per the last point, it feels to me like you can basically also use curly braces to create pseudo-top-level definition contexts which would obviate the need for separate let..in, but at the cost of maybe creating some pseudo-ambiguous edge cases between module and function definitions (pursuant on above selections).

For concrete illustration here's a version of the contents of a scratch pad which:

  1. retains separate module def form
  2. is implicitly module-wrapped, or implicitly block-wrapped... doesn't seem to make a difference in this model
  3. removes existing let..in and type..in forms
  4. introduces a curly-braces block form which is interpreted as a block unless it's the body of a module definition form. semantics are: returns the value of the trailing expression, if any, or null if there's a trailing definition form.
module M : {
  type T = Int;
  let x : T;
  let y: T
} = {
  type T = String;
  let x = "yo"
};
type R = String;
let f: Int -> R = fun x -> {
  let q = string_of_int(x);
  "q: " ++ q
};
f(4)

in the above module blocks and non-module blocks are both semi-colon separated lists of convex forms, including prefix-shaped definition forms (ie the semicolon is not part of let/type) taking a right unidelimited child. i've opted against having an optional trailing semicolon, but this seems possible if-desired.

the above is almost achievable using the current syntax model... the only thing not trivially doable is the let in the module signature. if that's not possible, we could temporarily replace it with val/var/whatever.

Anyway I know the above is a bit different than we talked about, but I'm not totally clear on what your exact preferred model is, so this is intended as starting point for discussion. My goals are to (within reason) stem keyword proliferation and avoid having incidentally-different ways of doing the same thing. But I'm also ~ fine with keeping let...in and using var/val/whatever for the block-level let.

I'm curious if there's a reasonable way of finessing block versus module forms in some way which obviates the need for a separate module definition keyword, assuming some other solution were found for the constructor issue above. like perhaps curly braces is a module IFF there trailing item is a definition. Perhaps not a good idea though, would have to guard against weird error message when you accidentally do this when defining a function etc.

@disconcision
Copy link
Member

disconcision commented Aug 18, 2024

Screenshot 2024-08-17 at 10 05 42 PM

after messing around a bit i think it'd be possible (with several days of work) to get syntax in the screenshot to work (without the sort errors currently displayed). of the four issues described in the comments i know how to reasonably fix three of them, and the other seems like it must be straightforwardly possible, though not immediately obvious to me.

this uses "typ" and "val" and as such retains the old "type" and "let" without change. "typ" needs a better name if we're going to keep these as alternates and retain the existing forms; i still think it's a better idea to ditch the old ones and let typ be type and val be let.

iirc you were talking about the option of making the forms like ["let", "=", ";"] as opposed to the infix ";". I experimented with this; there doesn't seem to be any good way unless we make a subsystem where multi-delimiter forms can change to other multi-delimiter forms. i.e. we need module M: {typ T=Bool|} to be grow a semicolon if we want to add more definitions etc. seems like a bit more work than the other option with advantages unclear to me at the syntax level. as an alternative we could perhaps pretend that things are like this for maketerm parsing and indication decoration purposes (as we pretend for list lits and cases).

@Negabinary Negabinary mentioned this pull request Aug 23, 2024
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Basic module system
4 participants