Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC/Final Parameters #18

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

DavePearce
Copy link
Member

@DavePearce DavePearce commented Oct 17, 2017

@DavePearce
Copy link
Member Author

DavePearce commented Oct 17, 2017

Right, so have implemented the checking algorithm. It looks like almost all of the benchmarks are affected in some way (examples below). This makes the decision pretty difficult.

008_scc

function addEdge(Digraph g, nat from, nat to) -> Digraph:
     // First, ensure enough capacity
     int max = math::max(from,to)
     g = resize(g,max+1)
     // Second, add the actual edge
     g[from] = array::append(g[from],to)
     // Done
     return g

009_lz77

function match(byte[] data, nat offset, nat end) -> (int length)
 // Position to search from within sliding window
 requires (end - offset) <= 255
 // Returned match size cannot exceed sliding window
 ensures 0 <= length && length <= 255:
     //
     nat pos = end
     u8 len = 0
     //
     while offset < pos && pos < |data| && data[offset] == data[pos] && len < 255:
         //
         offset = offset + 1
         pos = pos + 1
         len = len + 1
     //
     return len

010_sort

function sort(int[] items, int start, int end) -> sortedList:
     if (start+1) < end:
         int pivot = (end + start) / 2
         int[] lhs = sort(items,start,pivot)
         int[] rhs = sort(items,pivot,end)
         int l = start
         int r = pivot
         int i = start
         while i < end && l < pivot && r < end where l >= 0 && r >= 0 && i >= 0:
             if lhs[l] <= rhs[r]:
                 items[i] = lhs[l] 
                 l=l+1
             else:
                 items[i] = rhs[r] 
                 r=r+1
             i=i+1
 ...

011_codejam

function parseJobs(nat pos, ascii::string input) -> (Job[] jobs, nat npos):
     //
     int|null nitems
     //
     pos = parser::skipWhiteSpace(pos,input)
     nitems,pos = parser::parseInt(pos,input)
     if nitems is nat:
         return parseNumJobs(nitems,pos,input)
     else:
         return [EMPTY_JOB;0],pos

@utting
Copy link

utting commented Oct 18, 2017 via email

@DavePearce
Copy link
Member Author

Thanks Mark --- that's a vote for "do it" then! Anyone else? @zmthy I think you're on the "go for it" side as well?

@zmthy
Copy link

zmthy commented Oct 18, 2017

Yeah, it's clearly the right thing to do. I initially thought that it was strange that you couldn't update the attributes of the parameters as well but that's more a consequence of your use of Java array syntax for immutable lists.

I'm not sure about the change to references in postconditions. The ' character is typically used to represent the 'next' state rather than the 'old' state.

@utting
Copy link

utting commented Oct 18, 2017 via email

@DavePearce
Copy link
Member Author

Hey Folks,

Ok, well, that seems pretty clear then. I just need to talk with Lindsay about whether or not he's happy for me to roll it out now.

Regarding the \old notation, hmmm that's interesting Mark. The good news is that this RFC does not need to make the decision on that. It's just informing me for a later RFC to introduce some way to express "oldness". Food for thought!

@DavePearce
Copy link
Member Author

@klapaukh thoughts?

@bambinella
Copy link

bambinella commented Oct 19, 2017

I've always thought that read-only should be the default for parameters. I've discussed this with C-style programmers before and one cultural objection is that it can sometimes be convenient to normalize the parameter before using it (e.g. turning it into something canonical). But I don't think that is a very good argument against it…

e.g. they want to do:

vector = normalize(vector)
…
dosomething(vector)

Whereas creating a new variable could lead to mistakes in larger functions:

nvector = normalize(vector)
…
dosomething(vector)

But I guess it can be fixed later on with an annotation like:

nvector = normalize(vector)
@forget(vector)
…
dosomething(vector)  // error: using a symbol name that is marked as unusable

@DavePearce
Copy link
Member Author

Right, I have now updated an initial batch of the failing test cases to work with this RFC. However, I have to confess, that final parameters make things ugly at times. Also, I realised the benefits compared with an explicit final modifier are not as great. Overall, I am less enthusiastic that I was before. See here:

Whiley/WhileyCompiler@8b254e0

Some example transformations:

Good

function shr(byte b, int i) -> byte:
    byte tmp = b
    while i > 0:
        tmp = tmp << 1
        i = i - 1
    return tmp

becomes:

function shr(byte b, int n) -> byte:
    int i = n
    byte tmp = b
    while i > 0:
        tmp = tmp << 1
        i = i - 1
    return tmp

I like that. Likewise, this:

function sort2(int x, int y) -> (int u, int v)
ensures u == y && v == x:
    x, y = swap(x, y)
    return x,y

becomes this

function sort2(int x, int y) -> (int u, int v)
ensures u == y && v == x:
    u, v = swap(x, y)
    return u,v

Ugly

This, on the other hand, I don't like:

function shuffle(Deck deck, Random rand, nat count) -> (Deck ndeck, Random nrand):
    //
    nat i = 0
    int from
    int to
    //
    while i < count:
       from,rand = next(rand)
       to,rand = next(rand)
       deck = swap(deck,from,to)
       i = i + 1
    //
    return deck,rand

which becomes:

function shuffle(Deck deck, Random rand, nat count) -> (Deck ndeck, Random nrand):
    //
    nrand = rand
    ndeck = deck
    //
    nat i = 0
    int from
    int to
    //
    while i < count:
       from,nrand = next(nrand)
       to,nrand = next(nrand)
       ndeck = swap(ndeck,from,to)
       i = i + 1
    //
    return ndeck,nrand

There is a general pattern where you end up assigning the parameter to the result at the beginning, which is just ugly. Another example of this:

function inc(byte b) -> (byte r):
    //
    int i = 0
    byte mask = 1b
    while i < 8 && ((b&mask) == mask):
        i = i + 1
        b = b & ~mask        
        mask = mask << 1
    //
    return b | mask

becomes:

function inc(byte b) -> (byte r):
    //
    byte tmp = b
    int i = 0
    byte mask = 1b
    while i < 8 && ((tmp&mask) == mask):
        i = i + 1
        tmp = tmp & ~mask        
        mask = mask << 1
    //
    return tmp | mask

Hmmmmmmm, thoughts needed.

Cloning

Regarding the clone optimisation I referred to in the RFC, the problem is that you now cannot distinguish between the case of a mutable versus immutable parameter. That is, when the final keyword is given you can assume it's immutable. Otherwise, that it's mutable. But, with final parameters, you lose these benefits and must always assume they are mutable.

This distinction may sound insignificant, but in the end its not. Consider these two implementations:

function set(int[] xs, int i, int v) -> (int[] ys):
   xs[i] = v
   return xs


function get(final int[] xs, int v) -> (int y):
   return xs[i]

The question is: what is the calling convention here? If we have a final parameter then the convention is that ownership of the given data structure is held by the caller (i.e. we are borrowing it). But, if the parameter is non-final, then the convention is that ownership is passed to the caller.

The issue of ownership affects what the caller will do in different situations. Consider this call:

int r = f(xs)
...

Now, either xs is live after the statement or not. If it is live then, in the case that f() may mutate xs we need to clone it. If its not live then, even if f() may mutate it we can still pass a reference. This decision can only be taken at the call-site.

What's a good solution here? Well, yes, we could introduce a mut keyword to make parameters mutable. This would then allow the function in question to communicate information to the call-site. But, I'm definitely not going down that route!!!

The other option is simply to say that this:

function f(final int x) -> (int r):
    assume P
    BODY
    assert Q[E/r]
    return E

Only holds when x is final

@klapaukh
Copy link

Also a thought on immutable parameters. If you are worried about student confusion then I think the shallow nature of the immutability still allows for the same confusion. Because you might still want to say something about *x in the precondition, where x is a parameter, but *x is still mutable, even with this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants