RFC/Final Parameters #18

DavePearce · 2017-10-17T02:44:28Z

Rendered: https://github.com/DavePearce/RFCs/blob/f891303bed73ea61171a4818be0ff7aeecea5461/text/0000-final-parameter.md

@bambinella @zmthy @utting: interested in your thoughts on this.

DavePearce · 2017-10-17T21:37:22Z

Right, so have implemented the checking algorithm. It looks like almost all of the benchmarks are affected in some way (examples below). This makes the decision pretty difficult.

008_scc

function addEdge(Digraph g, nat from, nat to) -> Digraph:
     // First, ensure enough capacity
     int max = math::max(from,to)
     g = resize(g,max+1)
     // Second, add the actual edge
     g[from] = array::append(g[from],to)
     // Done
     return g

009_lz77

function match(byte[] data, nat offset, nat end) -> (int length)
 // Position to search from within sliding window
 requires (end - offset) <= 255
 // Returned match size cannot exceed sliding window
 ensures 0 <= length && length <= 255:
     //
     nat pos = end
     u8 len = 0
     //
     while offset < pos && pos < |data| && data[offset] == data[pos] && len < 255:
         //
         offset = offset + 1
         pos = pos + 1
         len = len + 1
     //
     return len

010_sort

function sort(int[] items, int start, int end) -> sortedList:
     if (start+1) < end:
         int pivot = (end + start) / 2
         int[] lhs = sort(items,start,pivot)
         int[] rhs = sort(items,pivot,end)
         int l = start
         int r = pivot
         int i = start
         while i < end && l < pivot && r < end where l >= 0 && r >= 0 && i >= 0:
             if lhs[l] <= rhs[r]:
                 items[i] = lhs[l] 
                 l=l+1
             else:
                 items[i] = rhs[r] 
                 r=r+1
             i=i+1
 ...

011_codejam

function parseJobs(nat pos, ascii::string input) -> (Job[] jobs, nat npos):
     //
     int|null nitems
     //
     pos = parser::skipWhiteSpace(pos,input)
     nitems,pos = parser::parseInt(pos,input)
     if nitems is nat:
         return parseNumJobs(nitems,pos,input)
     else:
         return [EMPTY_JOB;0],pos

utting · 2017-10-18T01:20:02Z

David I think this change is an excellent idea - it is nice that it simplifies the semantics a bit, and it will also simplify the Boogie translator slightly (Boogie does not allow assignments to parameters, so I had to work around that by copying those parameters). And well done on getting the Dutch National Flag to verify! I must try that with Boogie and see how it goes... Cheers Mark PS. My translator log messages reckon that there are 68 occurrences of parameters that are mutated in tests/valid/*.whiley. $ grep -i input OUT54.txt MUTATED INPUT x : arr MUTATED INPUT xs : int[] MUTATED INPUT b : byte MUTATED INPUT b : byte MUTATED INPUT i : int MUTATED INPUT b : byte MUTATED INPUT i : int MUTATED INPUT b : byte MUTATED INPUT i : int MUTATED INPUT b : byte MUTATED INPUT i : int MUTATED INPUT tree : BTree MUTATED INPUT tree : BTree MUTATED INPUT tree : BTree MUTATED INPUT xs : int[] MUTATED INPUT items : int[] MUTATED INPUT xs : nat[] MUTATED INPUT list : nat[] MUTATED INPUT x : int MUTATED INPUT x : int MUTATED INPUT x : int MUTATED INPUT x : int MUTATED INPUT y : int MUTATED INPUT v : char MUTATED INPUT state : BufferState MUTATED INPUT v : char MUTATED INPUT state : BufferState MUTATED INPUT p : &a:int MUTATED INPUT q : &a:int MUTATED INPUT x : &a:int MUTATED INPUT y : &b:int MUTATED INPUT b : byte MUTATED INPUT a : int[] MUTATED INPUT xs : int[][] MUTATED INPUT ls : int[][] MUTATED INPUT list : nint[][] MUTATED INPUT list : nint[][] MUTATED INPUT _this : Sum MUTATED INPUT _this : Sum MUTATED INPUT _this : Ptype MUTATED INPUT _this : Ptype MUTATED INPUT _this : pState MUTATED INPUT _this : &Queue MUTATED INPUT _this : MyProc MUTATED INPUT _this : MyProc2 MUTATED INPUT _this : &MyProc2 MUTATED INPUT _this : &Queue MUTATED INPUT _this : Ptype MUTATED INPUT x : rec MUTATED INPUT r : rec2 MUTATED INPUT x_ptr : &int MUTATED INPUT y_ptr : &int MUTATED INPUT x : &&int MUTATED INPUT x : &int MUTATED INPUT str : int[] MUTATED INPUT e : expr MUTATED INPUT x : (int|null) MUTATED INPUT l : list MUTATED INPUT v1 : int[] MUTATED INPUT list : LinkedList MUTATED INPUT src : nat[] MUTATED INPUT offset : nat MUTATED INPUT l : LinkedList MUTATED INPUT items : int[] MUTATED INPUT xs : int[] MUTATED INPUT items : int[] MUTATED INPUT n : int MUTATED INPUT n : int

…

________________________________ Dr Mark Utting Senior Lecturer, ICT School of Business | Faculty of Arts, Business and Law University of the Sunshine Coast Tel: +61 7 5459 4495 | Email: [email protected]

________________________________ From: David Pearce <[email protected]> Sent: 18 October 2017 07:37:23 To: Whiley/RFCs Cc: Mark Utting; Mention Subject: Re: [Whiley/RFCs] RFC/Final Parameters (#18) Right, so have implemented the checking algorithm. It looks like almost all of the benchmarks are affected in some way (examples below). In fact, the Dutch National Flag benchmark was affected, and I correct it in the version I sent around. Code Examples: 008_scc function addEdge(Digraph g, nat from, nat to) -> Digraph: // First, ensure enough capacity int max = math::max(from,to) g = resize(g,max+1) // Second, add the actual edge g[from] = array::append(g[from],to) // Done return g 009_lz77 function match(byte[] data, nat offset, nat end) -> (int length) // Position to search from within sliding window requires (end - offset) <= 255 // Returned match size cannot exceed sliding window ensures 0 <= length && length <= 255: // nat pos = end u8 len = 0 // while offset < pos && pos < |data| && data[offset] == data[pos] && len < 255: // offset = offset + 1 pos = pos + 1 len = len + 1 // return len 010_sort function sort(int[] items, int start, int end) -> sortedList: if (start+1) < end: int pivot = (end + start) / 2 int[] lhs = sort(items,start,pivot) int[] rhs = sort(items,pivot,end) int l = start int r = pivot int i = start while i < end && l < pivot && r < end where l >= 0 && r >= 0 && i >= 0: if lhs[l] <= rhs[r]: items[i] = lhs[l] l=l+1 else: items[i] = rhs[r] r=r+1 i=i+1> ... ============================================================================ 011_codejam ============================================================================> function parseJobs(nat pos, ascii::string input) -> (Job[] jobs, nat npos): // int|null nitems // pos = parser::skipWhiteSpace(pos,input) nitems,pos = parser::parseInt(pos,input) if nitems is nat: return parseNumJobs(nitems,pos,input) else: return [EMPTY_JOB;0],pos — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub<#18 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AJcDjqCcEuMZnCSUajoSQz7OASgKLQF9ks5stR4TgaJpZM4P7gL0>. USC, Locked Bag 4, Maroochydore DC, Queensland, 4558 Australia. CRICOS Provider No: 01595D Please consider the environment before printing this email. This email is confidential. If received in error, please delete it from your system.

DavePearce · 2017-10-18T20:13:40Z

Thanks Mark --- that's a vote for "do it" then! Anyone else? @zmthy I think you're on the "go for it" side as well?

zmthy · 2017-10-18T22:58:35Z

Yeah, it's clearly the right thing to do. I initially thought that it was strange that you couldn't update the attributes of the parameters as well but that's more a consequence of your use of Java array syntax for immutable lists.

I'm not sure about the change to references in postconditions. The ' character is typically used to represent the 'next' state rather than the 'old' state.

utting · 2017-10-18T23:38:40Z

David and everyone I agree with Timothy that changing ' to mean the *initial* state seems unfortunate. Primed variables are used for the final state in Z and UTP, and giving them the opposite meaning is likely to increase confusion when people start using Whiley. Is there an alternative character (like ^) that could be used for the initial state? Or perhaps even the more explicit *\old{x}? Cheers Mark

…

________________________________ Dr Mark Utting Senior Lecturer, ICT School of Business | Faculty of Arts, Business and Law University of the Sunshine Coast Tel: +61 7 5459 4495 | Email: [email protected]

________________________________ From: Timothy Jones <[email protected]> Sent: 19 October 2017 08:58:36 To: Whiley/RFCs Cc: Mark Utting; Mention Subject: Re: [Whiley/RFCs] RFC/Final Parameters (#18) Yeah, it's clearly the right thing to do. I initially thought that it was strange that you couldn't update the attributes of the parameters as well but that's more a consequence of your use of Java array syntax for immutable lists. I'm not sure about the change to references in postconditions. The ' character is typically used to represent the 'next' state rather than the 'old' state. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub<#18 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AJcDjoivrHR-mrU6FTAkZ1zbHqQTBQLmks5stoKcgaJpZM4P7gL0>. USC, Locked Bag 4, Maroochydore DC, Queensland, 4558 Australia. CRICOS Provider No: 01595D Please consider the environment before printing this email. This email is confidential. If received in error, please delete it from your system.

DavePearce · 2017-10-19T00:28:19Z

Hey Folks,

Ok, well, that seems pretty clear then. I just need to talk with Lindsay about whether or not he's happy for me to roll it out now.

Regarding the \old notation, hmmm that's interesting Mark. The good news is that this RFC does not need to make the decision on that. It's just informing me for a later RFC to introduce some way to express "oldness". Food for thought!

DavePearce · 2017-10-19T01:11:13Z

@klapaukh thoughts?

bambinella · 2017-10-19T08:52:17Z

I've always thought that read-only should be the default for parameters. I've discussed this with C-style programmers before and one cultural objection is that it can sometimes be convenient to normalize the parameter before using it (e.g. turning it into something canonical). But I don't think that is a very good argument against it…

e.g. they want to do:

vector = normalize(vector)
…
dosomething(vector)

Whereas creating a new variable could lead to mistakes in larger functions:

nvector = normalize(vector)
…
dosomething(vector)

But I guess it can be fixed later on with an annotation like:

nvector = normalize(vector)
@forget(vector)
…
dosomething(vector)  // error: using a symbol name that is marked as unusable

DavePearce · 2017-10-19T22:11:19Z

Right, I have now updated an initial batch of the failing test cases to work with this RFC. However, I have to confess, that final parameters make things ugly at times. Also, I realised the benefits compared with an explicit final modifier are not as great. Overall, I am less enthusiastic that I was before. See here:

Whiley/WhileyCompiler@8b254e0

Some example transformations:

Good

function shr(byte b, int i) -> byte:
    byte tmp = b
    while i > 0:
        tmp = tmp << 1
        i = i - 1
    return tmp

becomes:

function shr(byte b, int n) -> byte:
    int i = n
    byte tmp = b
    while i > 0:
        tmp = tmp << 1
        i = i - 1
    return tmp

I like that. Likewise, this:

function sort2(int x, int y) -> (int u, int v)
ensures u == y && v == x:
    x, y = swap(x, y)
    return x,y

becomes this

function sort2(int x, int y) -> (int u, int v)
ensures u == y && v == x:
    u, v = swap(x, y)
    return u,v

Ugly

This, on the other hand, I don't like:

function shuffle(Deck deck, Random rand, nat count) -> (Deck ndeck, Random nrand):
    //
    nat i = 0
    int from
    int to
    //
    while i < count:
       from,rand = next(rand)
       to,rand = next(rand)
       deck = swap(deck,from,to)
       i = i + 1
    //
    return deck,rand

which becomes:

function shuffle(Deck deck, Random rand, nat count) -> (Deck ndeck, Random nrand):
    //
    nrand = rand
    ndeck = deck
    //
    nat i = 0
    int from
    int to
    //
    while i < count:
       from,nrand = next(nrand)
       to,nrand = next(nrand)
       ndeck = swap(ndeck,from,to)
       i = i + 1
    //
    return ndeck,nrand

There is a general pattern where you end up assigning the parameter to the result at the beginning, which is just ugly. Another example of this:

function inc(byte b) -> (byte r):
    //
    int i = 0
    byte mask = 1b
    while i < 8 && ((b&mask) == mask):
        i = i + 1
        b = b & ~mask        
        mask = mask << 1
    //
    return b | mask

becomes:

function inc(byte b) -> (byte r):
    //
    byte tmp = b
    int i = 0
    byte mask = 1b
    while i < 8 && ((tmp&mask) == mask):
        i = i + 1
        tmp = tmp & ~mask        
        mask = mask << 1
    //
    return tmp | mask

Hmmmmmmm, thoughts needed.

Cloning

Regarding the clone optimisation I referred to in the RFC, the problem is that you now cannot distinguish between the case of a mutable versus immutable parameter. That is, when the final keyword is given you can assume it's immutable. Otherwise, that it's mutable. But, with final parameters, you lose these benefits and must always assume they are mutable.

This distinction may sound insignificant, but in the end its not. Consider these two implementations:

function set(int[] xs, int i, int v) -> (int[] ys):
   xs[i] = v
   return xs


function get(final int[] xs, int v) -> (int y):
   return xs[i]

The question is: what is the calling convention here? If we have a final parameter then the convention is that ownership of the given data structure is held by the caller (i.e. we are borrowing it). But, if the parameter is non-final, then the convention is that ownership is passed to the caller.

The issue of ownership affects what the caller will do in different situations. Consider this call:

int r = f(xs)
...

Now, either xs is live after the statement or not. If it is live then, in the case that f() may mutate xs we need to clone it. If its not live then, even if f() may mutate it we can still pass a reference. This decision can only be taken at the call-site.

What's a good solution here? Well, yes, we could introduce a mut keyword to make parameters mutable. This would then allow the function in question to communicate information to the call-site. But, I'm definitely not going down that route!!!

The other option is simply to say that this:

function f(final int x) -> (int r):
    assume P
    BODY
    assert Q[E/r]
    return E

Only holds when x is final

klapaukh · 2017-10-19T22:18:32Z

Also a thought on immutable parameters. If you are worried about student confusion then I think the shallow nature of the immutability still allows for the same confusion. Because you might still want to say something about *x in the precondition, where x is a parameter, but *x is still mutable, even with this.

Initial draft

f891303

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC/Final Parameters #18

RFC/Final Parameters #18

DavePearce commented Oct 17, 2017 •

edited

Loading

DavePearce commented Oct 17, 2017 •

edited

Loading

utting commented Oct 18, 2017 via email

DavePearce commented Oct 18, 2017

zmthy commented Oct 18, 2017

utting commented Oct 18, 2017 via email

DavePearce commented Oct 19, 2017

DavePearce commented Oct 19, 2017

bambinella commented Oct 19, 2017 •

edited

Loading

DavePearce commented Oct 19, 2017

klapaukh commented Oct 19, 2017

RFC/Final Parameters #18

Are you sure you want to change the base?

RFC/Final Parameters #18

Conversation

DavePearce commented Oct 17, 2017 • edited Loading

DavePearce commented Oct 17, 2017 • edited Loading

008_scc

009_lz77

010_sort

011_codejam

utting commented Oct 18, 2017 via email

DavePearce commented Oct 18, 2017

zmthy commented Oct 18, 2017

utting commented Oct 18, 2017 via email

DavePearce commented Oct 19, 2017

DavePearce commented Oct 19, 2017

bambinella commented Oct 19, 2017 • edited Loading

DavePearce commented Oct 19, 2017

Good

Ugly

Cloning

klapaukh commented Oct 19, 2017

DavePearce commented Oct 17, 2017 •

edited

Loading

DavePearce commented Oct 17, 2017 •

edited

Loading

bambinella commented Oct 19, 2017 •

edited

Loading