Auto witgen #2071

chriseth · 2024-11-11T16:41:13Z

No description provided.

chriseth · 2024-11-13T12:16:28Z

This can now generate the following code for the binary machine:

//Known inputs: main_binary::operation_id, main_binary::A, main_binary::B
let main_binary::sel[1]_u3 = 0
let main_binary::sel[1]_u2 = 0
let main_binary::sel[1]_u1 = 0
let main_binary::sel[1]_d0 = 0
let main_binary::sel[1]_d1 = 0
let main_binary::sel[1]_d2 = 0
let main_binary::sel[1]_d3 = 0
let main_binary::sel[2]_u3 = 0
let main_binary::sel[2]_u2 = 0
let main_binary::sel[2]_u1 = 0
let main_binary::sel[2]_d0 = 0
let main_binary::sel[2]_d1 = 0
let main_binary::sel[2]_d2 = 0
let main_binary::sel[2]_d3 = 0
let main_binary::operation_id_u1 = main_binary::operation_id_d0
let main_binary::A_byte_u1 = (main_binary::A_d0 & 0xff000000) / 16777216;
let main_binary::A_u1 = main_binary::A_d0 & 0xffffff;
let main_binary::B_byte_u1 = (main_binary::B_d0 & 0xff000000) / 16777216;
let main_binary::B_u1 = main_binary::B_d0 & 0xffffff;
let mut lookup_20_3: T = 0.into();
fixed_lookup_machine.process_lookup_direct((20, vec![LookupCell::Input(&(main_binary::operation_id_d0)), LookupCell::Input(&(main_binary::A_byte_u1)), LookupCell::Input(&(main_binary::B_byte_u1)), LookupCell::Output(&mut lookup_20_3)]))
let main_binary::C_byte_u1 = lookup_20_3
let main_binary::sel[0]_d0 = 1
let main_binary::operation_id_u2 = main_binary::operation_id_u1
let main_binary::A_byte_u2 = (main_binary::A_u1 & 0xff0000) / 65536;
let main_binary::A_u2 = main_binary::A_u1 & 0xffff;
let main_binary::B_byte_u2 = (main_binary::B_u1 & 0xff0000) / 65536;
let main_binary::B_u2 = main_binary::B_u1 & 0xffff;
let mut lookup_20_3: T = 0.into();
fixed_lookup_machine.process_lookup_direct((20, vec![LookupCell::Input(&(main_binary::operation_id_u1)), LookupCell::Input(&(main_binary::A_byte_u2)), LookupCell::Input(&(main_binary::B_byte_u2)), LookupCell::Output(&mut lookup_20_3)]))
let main_binary::C_byte_u2 = lookup_20_3
let main_binary::operation_id_u3 = main_binary::operation_id_u2
let main_binary::A_byte_u3 = (main_binary::A_u2 & 0xff00) / 256;
let main_binary::A_u3 = main_binary::A_u2 & 0xff;
let main_binary::B_byte_u3 = (main_binary::B_u2 & 0xff00) / 256;
let main_binary::B_u3 = main_binary::B_u2 & 0xff;
let mut lookup_20_3: T = 0.into();
fixed_lookup_machine.process_lookup_direct((20, vec![LookupCell::Input(&(main_binary::operation_id_u2)), LookupCell::Input(&(main_binary::A_byte_u3)), LookupCell::Input(&(main_binary::B_byte_u3)), LookupCell::Output(&mut lookup_20_3)]))
let main_binary::C_byte_u3 = lookup_20_3
let main_binary::A_byte_u4 = main_binary::A_u3
let main_binary::B_byte_u4 = main_binary::B_u3
let mut lookup_20_3: T = 0.into();
fixed_lookup_machine.process_lookup_direct((20, vec![LookupCell::Input(&(main_binary::operation_id_u3)), LookupCell::Input(&(main_binary::A_byte_u4)), LookupCell::Input(&(main_binary::B_byte_u4)), LookupCell::Output(&mut lookup_20_3)]))
let main_binary::C_byte_u4 = lookup_20_3
let main_binary::sel[0]_u3 = 0
let main_binary::sel[0]_u2 = 0
let main_binary::sel[0]_u1 = 0
let main_binary::sel[0]_d0 = 0
let main_binary::sel[0]_d1 = 0
let main_binary::sel[0]_d2 = 0
let main_binary::sel[0]_d3 = 0
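
For readers following the listing above: the byte extraction is a chain of mask-and-divide steps, peeling off the most significant remaining byte of A and B at each row offset. A minimal sketch of that arithmetic follows, assuming plain `u32` values instead of field elements; the function name `split_bytes` is made up for illustration and is not part of the generated code.

```rust
/// Splits a 32-bit value into its four bytes, most significant first,
/// using the same masks and divisors as the generated code above.
/// Assumption: `u32` stands in for the field element type.
pub fn split_bytes(a: u32) -> [u32; 4] {
    // First step: top byte, then keep the lower 24 bits.
    let byte1 = (a & 0xff00_0000) / 16_777_216; // divide by 2^24
    let rest1 = a & 0x00ff_ffff;
    // Second step: next byte, then keep the lower 16 bits.
    let byte2 = (rest1 & 0x00ff_0000) / 65_536; // divide by 2^16
    let rest2 = rest1 & 0xffff;
    // Third step: next byte, then keep the lowest 8 bits.
    let byte3 = (rest2 & 0xff00) / 256; // divide by 2^8
    let rest3 = rest2 & 0xff;
    // Last step: the remainder is already a single byte.
    let byte4 = rest3;
    [byte1, byte2, byte3, byte4]
}
```

Each of the four bytes is then fed, together with the operation id, into one fixed lookup, which is why the listing contains four `process_lookup_direct` calls per block.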

chriseth · 2024-11-21T15:45:18Z

I ran an extended version of binary_large_test.asm that loops indefinitely, with the degree of the main machine extended to 2**20. The binary machine is JIT-computed. The compilation time is included in the statistics below.

 == Witgen profile (12469602 events)
   43.7% (   12.4s): Main Machine
   25.9% (    7.3s): Secondary machine 0: main_binary (BlockMachine)
   25.0% (    7.1s): FixedLookup
    5.5% (    1.5s): witgen (outer code)
    0.0% ( 470.0ns): range constraint multiplicity witgen
  ---------------------------
    ==> Total: 28.293288704s

Used rows in binary: 8161892
Used rows in main: 4194304

Witgen takes 800 ns per lookup into the binary machine (this includes the time spent in the fixed lookup machine).

The same .asm file run on current main:

 == Witgen profile (36955286 events)
   61.4% (   39.0s): Secondary machine 0: main_binary (BlockMachine)
   20.3% (   12.9s): Main Machine
   15.9% (   10.1s): FixedLookup
    2.5% (    1.6s): witgen (outer code)
    0.0% ( 412.0ns): range constraint multiplicity witgen
  ---------------------------
    ==> Total: 63.642381077s

chriseth · 2024-11-21T15:58:05Z

executor/src/witgen/jit/jit_processor.rs

+    }
+}
+
+fn process_lookup<'b, 'd, T: FieldElement, Q: QueryCallback<T>>(


Make this extern "C"?

chriseth · 2024-11-21T15:58:34Z

executor/src/witgen/jit/jit_processor.rs

+        // TODO this applies a shift. Maybe we could do it much earlier?
+        latch_row: usize,
+        mutable_state: &'b mut MutableState<'a, 'b, T, Q>,
+        process_lookup: fn(&'b mut MutableState<'a, 'b, T, Q>, u64, Vec<LookupCell<'c, T>>) -> bool,


Change LookupCell to use repr(C).
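
A rough sketch of what the two review notes on jit_processor.rs could look like together, assuming the callback is reached through a raw pointer plus length from the generated code. `State` is a hypothetical stand-in for `MutableState`, the `LookupCell` definition is reconstructed from the variant names in the generated code, and the toy lookup body is invented purely so the example does something observable.

```rust
/// Assumed shape of `LookupCell`, with a C-compatible layout.
#[repr(C)]
pub enum LookupCell<'a, T> {
    Input(&'a T),
    Output(&'a mut T),
}

/// Hypothetical stand-in for `MutableState`.
#[repr(C)]
pub struct State {
    pub calls: u64,
}

/// Callback using the C calling convention, so that separately compiled
/// code can call it through a plain function pointer.
pub extern "C" fn process_lookup(
    state: *mut State,
    identity_id: u64,
    cells: *mut LookupCell<'_, u64>,
    len: usize,
) -> bool {
    // SAFETY: the caller must pass a valid, exclusive `state` pointer and a
    // `cells` pointer to `len` initialized elements.
    let (state, cells) =
        unsafe { (&mut *state, std::slice::from_raw_parts_mut(cells, len)) };
    state.calls += 1;
    // Toy "lookup" (not the real logic): sum the inputs, add the identity id.
    let sum: u64 = cells
        .iter()
        .filter_map(|c| match c {
            LookupCell::Input(v) => Some(**v),
            LookupCell::Output(_) => None,
        })
        .sum();
    for cell in cells.iter_mut() {
        if let LookupCell::Output(out) = cell {
            **out = identity_id + sum;
        }
    }
    true
}
```

With `repr(C)` the enum has a defined tag-plus-payload layout, which is what lets a pointer to an array of cells cross the boundary without relying on Rust's unspecified default layout.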

chriseth · 2024-11-22T16:59:12Z

Now I got it down to roughly 600 ns. I think we can save another 100 ns by improving how the known bits are stored (padded to a u32 for each new row).
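
One possible reading of the padded layout, sketched under the assumption that the known flags are kept as one bit per cell and that each row starts on a fresh `u32` word. The type `KnownBits` and its methods are hypothetical and not taken from the PR.

```rust
/// Bitmap of "known" cells in which every row starts on a new u32 word.
/// Assumption: one bit per (row, column) cell.
pub struct KnownBits {
    words_per_row: usize,
    data: Vec<u32>,
}

impl KnownBits {
    pub fn new(columns: usize, rows: usize) -> Self {
        // Round the column count up to a whole number of 32-bit words.
        let words_per_row = (columns + 31) / 32;
        Self {
            words_per_row,
            data: vec![0; words_per_row * rows],
        }
    }

    /// Returns the word index and bit mask for a cell.
    fn locate(&self, row: usize, col: usize) -> (usize, u32) {
        (row * self.words_per_row + col / 32, 1u32 << (col % 32))
    }

    pub fn set_known(&mut self, row: usize, col: usize) {
        let (word, mask) = self.locate(row, col);
        self.data[word] |= mask;
    }

    pub fn is_known(&self, row: usize, col: usize) -> bool {
        let (word, mask) = self.locate(row, col);
        self.data[word] & mask != 0
    }

    /// Number of u32 words allocated in total.
    pub fn word_count(&self) -> usize {
        self.data.len()
    }
}
```

The padding wastes a few bits per row, but the word index for a cell then depends only on a multiplication and a shift, with no bit offset carried over from the previous row.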

chriseth · 2024-11-22T19:39:43Z

Yep, now we have 540ns per block

chriseth · 2024-11-23T13:21:27Z

We should change process_lookup_direct not to take a Vec, because then we need to allocate. Instead, we should take a mutable ref to a slice and create it on the stack.

chriseth · 2024-11-25T21:04:07Z

Using mutable refs to slices instead of vectors for process_lookup_direct reduced the time for block witgen (including the 4 fixed lookups) to 490 ns.
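
A minimal sketch of the slice-based calling pattern, assuming a simplified signature: the real `process_lookup_direct` is a method on the fixed lookup machine, and the `LookupCell` definition and the toy bitwise-AND body here are illustrative assumptions only.

```rust
/// Assumed shape of `LookupCell`, following the generated code.
pub enum LookupCell<'a, T> {
    Input(&'a T),
    Output(&'a mut T),
}

/// Takes the cells as a mutable slice, so the caller can build them in an
/// array on the stack instead of allocating a `Vec` for every lookup.
/// The body is a toy: it writes the bitwise AND of all inputs into every
/// output cell and reports whether any output was written.
pub fn process_lookup_direct(_identity_id: u64, cells: &mut [LookupCell<'_, u64>]) -> bool {
    let result = cells.iter().fold(u64::MAX, |acc, c| match c {
        LookupCell::Input(v) => acc & **v,
        LookupCell::Output(_) => acc,
    });
    let mut wrote_output = false;
    for cell in cells.iter_mut() {
        if let LookupCell::Output(out) = cell {
            **out = result;
            wrote_output = true;
        }
    }
    wrote_output
}

/// Caller side: the array literal lives on the stack for the duration of
/// the call, so no heap allocation is needed.
pub fn and_via_lookup(a: u64, b: u64) -> u64 {
    let mut c = 0u64;
    process_lookup_direct(
        20,
        &mut [
            LookupCell::Input(&a),
            LookupCell::Input(&b),
            LookupCell::Output(&mut c),
        ],
    );
    c
}
```

Since the generated code performs four such lookups per block, removing one heap allocation per lookup is consistent with the drop from 540 ns to 490 ns reported above.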

chriseth force-pushed the jit-witgen branch from 1a86dfa to 5cadc2a Compare November 16, 2024 18:44

chriseth changed the base branch from main to call_jit_from_block November 16, 2024 18:45

chriseth added 2 commits November 16, 2024 18:55

Auto witgen.

8464bce

integrate

7790924

chriseth force-pushed the jit-witgen branch from 5cadc2a to 7790924 Compare November 16, 2024 19:40

chriseth and others added 11 commits November 16, 2024 20:58

questionable glue code.

d99d780

try compile

4233f1d

use get set

574eda0

missing file.

cb917f6

Some glue code.

65abd05

some ops.

5aeb270

FFI trickery

0a94bab

Some more hacky glue.

df6b6c4

hacketyhack

afe9656

fixetifix

e8d4743

fix grow

b812c9b

Set known inside generated code.

71c513a

Pad known to rows

6085c0c

comment out prints.

e3089a3

chriseth and others added 4 commits November 23, 2024 19:36

Prepare for dynamic machines.

f4968e3

Codegen for dynamic machines.

9f5d77c

Some TODOs.

24d837c

Avoid allocation for direct lookup.

1b4c45e

Merge branch 'jit-witgen' of ssh://github.com/powdr-labs/powdr into jit-witgen

3742eac
