Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal (3 options) for concise inline function syntax #5

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

michaelhkay
Copy link
Member

Proposes various options for simplified syntax for declaring inline
functions, and makes recommendations

Proposes various options for simplified syntax for declaring inline
functions, and makes recommendations
@rhdunn
Copy link

rhdunn commented Oct 16, 2018

The text for this is a copy of #3.

(Entire document replaced; previous version was the wrong document)
@michaelhkay
Copy link
Member Author

Hopefully corrected now. I'm still getting used to this way of working.

@ChristianGruen
Copy link
Member

I would clearly be in favor of having options of this proposal accepted. My thoughts on…

Option 3

If we started from scratch, this would probably be the clear winner: Users can name their variables as they like, and the syntax is well-established.

I expected that the necessity of the unbounded look-ahead would make it a clear no-go. It’s good to have your assessment; maybe I’ll give it a try and gain some experience.

Option 2

One criticism on this approach could be that it is not as general-purpose as some users may believe at first sight. If the last parameter of a function is not referenced, the code cannot be refactored:

Example:

declare function local:inc-filter($numbers, $filter) {
  for $n in $numbers
  let $i := $n + 1
  where $filter($i)
  return $i
};
local:inc-filter((0 to 3), function($n) { boolean($n mod 2) }),
local:inc-filter((0 to 3), function($n) { true() })

Rewritten function calls:

local:inc-filter((0 to 3), -> boolean($1 mod 2),
local:inc-filter((0 to 3), function($n) { true() }) (: no rewriting possible :)

On the other hand, in some cases, this may indicate to a user that there is a better way of writing the original code.

Apart from this restriction, I really like the option: It’s even more concise than Option 3, and…

Option 1

…I would appreciate if we could use the same syntax for focus functions.

@adamretter
Copy link
Member

adamretter commented Oct 21, 2018

In general I support the idea of a simpler syntax for inline functions.

I am still chewing through the detail of the proposal but I have a comments and questions:

Focus Functions

  1. Personally I don't like the fn{ EXPR } approach, only because of the existing use of the fn prefix association with the namespace for functions in the standard library. I worry it could lead to user confusion. However something like f{ EXPR } would be fine for me.

  2. I really like -> EXPR. Very neat :-)

  3. When I read the syntax -> EXPR, the narrator in my head reads it as "apply". Perhaps that is a better name than "Focus Functions"?

  4. Are there any rules for empty sequences? Is an empty sequence on the LHS passed to the function on the RHS? If not, then this actually seems to me to be the classic function programming map operator as implemented in Lisp, Haskell, Scala, etc. If so, we should maybe consider if what we actually want here is a map operator, if not perhaps we additionally want a map operator.

  5. Like other constructs that rely on the context item, it breaks down if you want to do more complex things like joins. It's a simple syntax for simple cases.

    How about if we also implicitly bound the context item to some variable or symbol ($$), this would allow the context item to be used in both the implicit manner and explicitly through $$. Otherwise a form which allows it to be explicitly bound like $x -> ($x + 1), looks like your Arrow Syntax with Declared Parameters proposal anyway.

Short Inline Functions

  1. I am wondering in terms of both parsing and static analysis about the process of establishing the function signature of the inline function? As there is no parameter list, it seems that I would need to either:
    1. parse the body of the inline function, which could be substantial.
    2. examine each pass-by-reference or application of the inline function and infer the parameters from that.
      Either way seems laborious. @michaelhkay I guess there is an obvious simpler approach which I have missed?

Syntax with Declared Parameters

  1. Likely I haven't had enough coffee yet, but why do we need unbounded lookahead here? Is it because we are using the same syntax as a sequence constructor for the parameter list?

  2. So this seems like the obvious syntax to me. I like it very much, but that is probably because I am familiar with it.

I did a quick look at the languages I use the most (and a couple that I respect) to see how their syntax represents such things:

Java

  1. parameter -> expression
  2. (parameters) -> expression
  3. (parameters) -> {
        body
    }
    

Scala

  1. parameter => expression
  2. (parameters) => expression
  3. (parameters) => {
        body
    }
    

C++

  1. [captures](parameters) -> returnType {
        body
    }
    

Slightly different to others! You have to specify any captures, these are the variables that are captured and used inside the closure. I suspect this offers better support for strong static analysis during compilation. The returnType may be omitted.

Haskell

  1. \parameters -> expression

A nice piece of trivia, the \ was chosen as it is meant to remind users of the greek lambda character λ.

Lisp

  1. lambda (parameters) (body)

Rust

  1. |parameters| expression
  2. |parameters| {body}

I understand that you can also place the characters -> after the |parameters|.

@michaelhkay
Copy link
Member Author

michaelhkay commented Oct 23, 2018

Thanks for the detailed comments. Some observations:

  • reading -> as "apply" seems like a misreading. It's all about declaring the function, not about applying it. Some of your other comments, e.g. regarding empty sequences, also seem to be about function application rather than declaration.
  • Inferring the signature of a "short inline function". I don't think this is particularly difficult. You read "->" so you know you're expecting an expression in which $N parameter references are meaningful; you parse the expression with that knowledge, perhaps using a variant of the standard code for resolving variable references; when you've parsed that expression, you know what parameter references are present; you find the maximum value of $N, and then you have a function signature of function($_1 as item()*, $_2 as item()*, .... $_N as item()*) as item()*.
  • My inclination is that any syntax (like ($x, $y) -> ($x+$y)) that involves declaring the argument names probably isn't a big enough improvement on what we have today to be worth including. If we did it, this is probably the syntax I would go for, despite the problems of arbitrary look-ahead in the parsing. But the lookahead could cause problems for some parser technologies. (I've observed that syntax-directed editors for Java struggle with it too.)

While forming these ideas I did consider mechanisms from other languages (though my survey wasn't as wide as yours). The idea of "focus functions" is influenced by the ability to write _+1 as a function in Scala, which seems neat in the case of single-argument functions, though I really don't like the idea of _+_ as a 2-argument function (I've never seen a satisfactory explanation of the rules). I think $1+$2 is much clearer for that case.

Another thing I looked at is dropping the "->" prefix, so we recognize $1+1 and $1+$2 as functions merely because of the presence of the parameter references $1 and $2. But I don't think that works, how do we know whether ($1, $1) is a function that returns its first argument repeated twice, rather than an expression that returns a sequence of two functions? The only way to do this seems to be context-sensitivity, whereby we recognize an expression as a function by virtue of the fact that it appears in a context where a function is required. That's very alien to the XQuery/XPath tradition so I didn't pursue it further.

@adamretter
Copy link
Member

adamretter commented Oct 24, 2018

reading -> as "apply" seems like a misreading. It's all about declaring the function, not about applying it. Some of your other comments, e.g. regarding empty sequences, also seem to be about function application rather than declaration.

Hmm, I can see your point here. I don't think I am communicating my ideas very well on this issue.

Given the example:

filter(//employee, ->@salary gt 20000)

I can see that the thing on the RHS is going to be evaluated once for each thing on the LHS. Perhaps with is a better verb than apply.

It also reminds very much of the Simple Map Operator... How bad would a syntax like filter(//employee, !@salary gt 20000) be? ...runs and ducks ;-)

@benibela
Copy link

benibela commented Jan 12, 2019

Another language is

Kotlin

{ parameters -> expression }

{ expression }

If the parameters are omitted, it uses an implicit default parameter it -> , which we probably would use . for. And you can write the lambda after the function, i.e., function({...}), function() {...} or function {...} is the same.

That syntax would work well in XQuery (unless {...} would become an abbreviation for map {...}. The map prefix is really pointless). The above example would be filter(//employee, { @salary gt 20000 }) or filter(//employee, { foo -> foo/@salary gt 20000 }) or perhaps //employee => filter { foo -> foo/@salary gt 20000 }

@adamretter
Copy link
Member

Lately I have been writing a lot of nested anonymous functions for a research paper.

I realised that there may be a shortcoming with the "Short Inline Functions" form. How would we handle nested inline functions?

An example which is a function that both takes a function and returns a function. I wonder what this would look like when rewritten as "Short Inline Functions":

declare function other:something($f as function(xs:string, xs:integer) as function(xs:QName, xs:QName) as xs:string) as xs:integer external;

other:something(function($a, $b) {
  function($c, $d) {
     "place holder"
  }
});

@michaelhkay
Copy link
Member Author

michaelhkay commented Jan 16, 2019 via email

@ChristianGruen
Copy link
Member

I still like the 2 proposed options a lot.

Option 3 would surely be the most general one. It seems we can handle it in our implementation, but I don’t know if that will be true for everyone else who might be interested in supporting this feature in the future?

@adamretter
Copy link
Member

@michaelhkay Okay understood, and I agree. I wasn't trying to criticise, rather I was trying to understand if there was a clever way it could be achieved with the "Short Inline Functions" form, that I hadn't understood... I guess not.

@adamretter
Copy link
Member

@ChristianGruen I would certainly be interested in whatever options we can agree on :-)

@rhdunn
Copy link

rhdunn commented Jan 16, 2019

I like the idea of using the Kotlin-style syntax as a variant of option 3, replacing it with .. This resolves the issue of unbounded lookup, as the concise function is initiated by a {. This is also unambiguous in the current syntax as {...} is only allowed in direct element content, and other uses require a type indicator (such as array { ... }).

Simple case (both equivalent):

{ 1 }
{ -> 1 }

Implicit context item -- single parameter function:

sort(employee, { @salary })
{ . * 2 }
{ -> . * 2 }

Explicit (named) context item -- single parameter function:

sort(employee, { $e -> @salary })
{ $n -> $n * 2 }

Multi-parameter function:

{ $k, $v -> $k }
{ ($k, $v) -> $k }
{ $k as xs:string, $v -> $k }

Combined with sequence, map, and array decomposition:

{ ${key, value} -> $k }
{ $[key, value] -> $k }
{ $(key, value) -> $k }

Multiple parameters combined with sequence, map, and array decomposition:

{ $x, ${key, value} -> $k }
{ ($x, $(key, value)) -> $k }

@michaelhkay
Copy link
Member Author

michaelhkay commented Jan 16, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants