Parse error when std.import'ed script contains executable statements #24

Open
aghast opened this issue May 24, 2022 · 3 comments

@aghast
Contributor

aghast commented May 24, 2022

Consider two files:

importer.hsh

let mcve = std.import("mcve.hsh")

mcve.hsh

std.print("Hello")
[
	[ 1, 2],
	[ 3, 4],
]

Configured thus, when I invoke hush importer.hsh I get these errors:

$ hush importer.hsh
Error: /home/aghast/Code/aghast/hush-larn/mcve.hsh (line 3, column 8) - unexpected ',', expected ']'
Error: /home/aghast/Code/aghast/hush-larn/mcve.hsh (line 4, column 8) - unexpected ',', expected expression
Error: /home/aghast/Code/aghast/hush-larn/mcve.hsh (line 5, column 0) - unexpected ']', expected expression
Panic in importer.hsh (line 1, column 21): failed to import module (/home/aghast/Code/aghast/hush-larn/mcve.hsh)

But if I comment out the print statement, there is no problem.

My expectation is that the inner script (mcve.hsh) is executed immediately, and that the last expression evaluated provides the return value of std.import(). But what appears to be happening is some kind of different parse mode(?) where the syntax is slightly different. I'm not sure if this is a bug (IMO: yes!), a documentation failure, or something else.

@gahag
Collaborator

gahag commented May 25, 2022

Thanks for reporting! The issue is actually caused by another matter, which can surely be a source of confusion. Let me format your code differently, and you'll see what is going on:

Original source with some whitespace removed:

std.print("Hello")[ [ 1, 2], [ 3, 4], ]

Which is equivalent to:

let var = std.print("Hello")
var[ [ 1, 2], [ 3, 4], ]

As the language has no semicolons, this gets parsed as the indexing operator, not as an array literal. This is very similar to what happens in Lua, and it certainly seems weird at first glance. This was known from the start when I was designing the language, and in my experience with Lua you will very seldom want to write a statement that causes the ambiguity, such as [ [ 1, 2], [ 3, 4], ]. But I see your point: you want to use it as the return value for the module. Currently, you can work around it as:

std.print("Hello")
let ret = [
	[ 1, 2],
	[ 3, 4],
]
ret

And it should work. By the way, there is no alternative parser mode for modules, and the issue is reproducible with a single Hush script. Also note that the issue won't happen if you're returning a dictionary, as the @[ ] syntax prevents the ambiguity.
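
To make the parse described above concrete, here is a toy sketch in Python. It is not Hush's actual parser: the Parser class, its tiny grammar (let, calls, array literals, postfix indexing) and the error wording are assumptions made up for illustration. The only property it shares with the explanation above is that whitespace, including newlines, never ends a statement, so a `[` after a call is taken as an index.

```python
import re

# Toy tokenizer: whitespace, including newlines, is simply skipped, so a line
# break can never end a statement on its own.
TOKEN = re.compile(r'\d+|[A-Za-z_][\w.]*|"[^"]*"|[\[\](),=]')

ORIGINAL = 'std.print("Hello")\n[\n\t[ 1, 2],\n\t[ 3, 4],\n]\n'
WORKAROUND = 'std.print("Hello")\nlet ret = [\n\t[ 1, 2],\n\t[ 3, 4],\n]\nret\n'


class Parser:
    """Hypothetical mini-grammar, for illustration only."""

    def __init__(self, src):
        self.toks = TOKEN.findall(src)
        self.pos = 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"unexpected {tok!r}, expected {expected!r}")
        self.pos += 1
        return tok

    def program(self):
        stmts = []
        while self.peek() is not None:
            stmts.append(self.statement())
        return stmts

    def statement(self):
        if self.peek() == "let":
            self.take("let")
            name = self.take()
            self.take("=")
            return ("let", name, self.expr())
        return self.expr()

    def expr(self):
        node = self.primary()
        # Postfix operators bind to whatever came before, even across a newline.
        while self.peek() in ("(", "["):
            if self.take() == "(":
                node = ("call", node, self.items(")"))
            else:
                index = self.expr()  # indexing takes exactly one expression
                self.take("]")
                node = ("index", node, index)
        return node

    def primary(self):
        tok = self.take()
        if tok == "[":
            return ("array", self.items("]"))
        if tok in ("]", ")", ",", "=", "("):
            raise SyntaxError(f"unexpected {tok!r}, expected expression")
        return tok

    def items(self, close):
        out = []
        while self.peek() != close:
            out.append(self.expr())
            if self.peek() != ",":
                break
            self.take(",")  # a trailing comma is allowed
        self.take(close)
        return out


if __name__ == "__main__":
    try:
        Parser(ORIGINAL).program()
    except SyntaxError as err:
        print("original:", err)
    print("workaround:", len(Parser(WORKAROUND).program()), "statements")
```

With this sketch, the original script fails at the first comma after `[ 1, 2]` with "unexpected ',', expected ']'", because the outer `[` was read as an index on the result of the call. The workaround parses as three separate statements, since `let` cannot continue the previous expression.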

We should most likely add a semicolon symbol to allow explicit separation of statements, so that the user can use it in cases like this. Please, let me know what you think.

@aghast
Contributor Author

aghast commented May 26, 2022

Adding a semicolon seems like it would work. But you've tried to keep semicolons out of the language, or at least out of the non-command-block parts.

With the current model, there's no way to DWIW without adding a throwaway symbol. But if the newline had a higher precedence than [, you could get the indexing behavior by moving the [ up to the previous line or just using an open paren at the start of the atom:

 (std.print("x")
  [1, 2])

Which is weird, because [] generally has super-high precedence vs other operators. So I think this is a "what do you want your language to look like" question. If you feel strongly about no newlines, then make the rule that if it looks like the start of an expression, you have to have a continuation character or open brackets or something to advance to the next line.

Or go one step further, and dictate that some kind of expression-stack entry is required to continue: open brackets, binary operators, whatever. This would bias the language to use more parens and break expressions the (IMO) wrong way:

    f = some_long_name(
        args)

    g = a +
        b

    h = x \
        + y
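
One possible reading of the "something must be open to continue" rule can be sketched as a statement splitter in Python. This is an illustration only, not a proposal for Hush's real implementation: split_statements, the CONTINUERS set, and the choice of `\` as the continuation character are all assumptions. A newline ends the statement unless a bracket is still open, the line ends in a binary operator, or the line ends in an explicit continuation marker.

```python
import re

# Unlike a whitespace-insensitive tokenizer, this one keeps newlines as tokens.
TOKEN = re.compile(r'\n|\\|\d+|[A-Za-z_][\w.]*|"[^"]*"|[\[\](),=+]')
OPEN, CLOSE = {"(", "["}, {")", "]"}
CONTINUERS = {"+", "="}  # assumed set of operators that need a right operand


def split_statements(src):
    """Group tokens into statements using the newline rule sketched above."""
    statements, current, depth = [], [], 0
    for tok in TOKEN.findall(src):
        if tok == "\n":
            if current and current[-1] == "\\":
                current.pop()  # explicit continuation: drop the marker
            elif current and depth == 0 and current[-1] not in CONTINUERS:
                statements.append(current)  # nothing is open: statement ends
                current = []
            continue
        if tok in OPEN:
            depth += 1
        elif tok in CLOSE:
            depth -= 1
        current.append(tok)
    if current:
        statements.append(current)
    return statements


if __name__ == "__main__":
    script = 'std.print("Hello")\n[\n\t[ 1, 2],\n\t[ 3, 4],\n]\n'
    for stmt in split_statements(script):
        print(" ".join(stmt))
```

Under this rule the script from the original report splits into two statements (the call, then the array literal), while the parenthesized form `(std.print("x")` / `[1, 2])` stays a single statement because the paren is still open at the line break.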

@gahag
Collaborator

gahag commented May 26, 2022

Yeah, I'm not very fond of newlines having syntactical meaning. I believe optional semicolons would be the most aligned with what we already have, and the least disturbing approach.
