Script to convert proto graphs to Beam #230
Comments
Is it correct to understand that the input to Apache Beam is the group of templates found in whitenoise-core/validator-rust/prototypes/components?
Yes! Each JSON file needs to have an equivalent Beam representation. To be DP, there must be no two neighboring datasets (any two datasets that differ by one row) where the runtime succeeds on one dataset and fails on the other. If a runtime does not implement a component, then it will fail on every dataset, which is great, because it means runtimes (like Beam) do not need to implement every component. You can also ignore components that have no concrete implementation, like min, max, dp_xxx, to_xxx.
The overall signature is: given a computation graph, privacy definition, and release, return a release. For the overall function (let's call it distribute_release), I can help with the portion that traverses the graph and calls into the lower-level functions (next comment).
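To make the traversal part concrete, here is a minimal sketch in plain Python. Everything about the data layout is an assumption for illustration: the graph is a dict from node id to a component with a `name`, `arguments` (argument name to source node id), and optional `options`; the release is a dict from node id to value; `IMPLEMENTATIONS` is a hypothetical registry with two toy components. It is not the actual whitenoise-core or Beam API, and the privacy definition is accepted but unused.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: component name -> implementation.
# A real Beam runtime would register functions that take PCollections.
IMPLEMENTATIONS: Dict[str, Callable[..., Any]] = {
    "add": lambda left, right: left + right,
    "negate": lambda data: -data,
}


def distribute_release(graph, privacy_definition, release):
    """Evaluate every node of an acyclic graph that is not already released.

    Returns a new release dict; the input release is left unchanged.
    """
    release = dict(release)

    def evaluate(node_id):
        # Already released (e.g. a literal or a previous run): reuse it.
        if node_id in release:
            return release[node_id]

        component = graph[node_id]
        implementation = IMPLEMENTATIONS.get(component["name"])
        if implementation is None:
            # Fails regardless of the data, so it fails on every dataset.
            raise NotImplementedError(
                f"component {component['name']!r} is not implemented"
            )

        # Resolve each argument by evaluating the node it points at.
        arguments = {
            name: evaluate(source)
            for name, source in component["arguments"].items()
        }
        release[node_id] = implementation(
            **arguments, **component.get("options", {})
        )
        return release[node_id]

    for node_id in graph:
        evaluate(node_id)
    return release
```

For example, with a release that already holds node 0, the two downstream nodes are filled in:

```python
graph = {
    1: {"name": "add", "arguments": {"left": 0, "right": 0}},
    2: {"name": "negate", "arguments": {"data": 1}},
}
distribute_release(graph, None, {0: 3})
```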
From my initial glance at Beam, you might expect each Beam component you implement to take in a PCollection for each argument, and an elementary Python type for each option. Each component implementation will need similar arguments to the Rust runtime:
The privacy definition can generally be ignored for now. The only relevant information inside the privacy definition is the user preferences to force constant time, constant memory, etc. We haven't matured to support that yet. Let me know if you see issues with this proposed code structure!
Using the analysis notebook as a starting point, I'm going to pose an example here. Under "attempt 4 - succeeds!" there is an example using dp_mean and dp_variance.
More likely, translate materialize -> cast -> clamp -> impute -> resize -> (mean, variance) into a Beam pipeline. I'm happy to add custom data loading components if Beam doesn't make the same assumptions materialize does.
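As a rough picture of the shape of that chain, here is a plain-Python stand-in. It is not a Beam pipeline and not the whitenoise-core semantics: lists stand in for PCollections, materialize is skipped (the rows are passed in directly), and impute and resize use a constant fill value as a deterministic simplification that I am assuming for illustration; the real components may behave differently. The final mean and variance are the plain, non-private statistics, with no noise added.

```python
import statistics
from typing import List, Optional


def cast_float(rows: List[str]) -> List[Optional[float]]:
    """Parse each string as a float; unparseable entries become None."""
    out = []
    for row in rows:
        try:
            out.append(float(row))
        except ValueError:
            out.append(None)
    return out


def clamp(data, lower, upper):
    """Clip non-missing values into [lower, upper]; keep None as-is."""
    return [None if x is None else min(max(x, lower), upper) for x in data]


def impute(data, value):
    """Replace missing entries with a constant (assumed simplification)."""
    return [value if x is None else x for x in data]


def resize(data, n, value):
    """Truncate or pad with a constant so the result has exactly n rows."""
    return (data + [value] * n)[:n]


def run_chain(rows, lower, upper, n):
    """cast -> clamp -> impute -> resize, then fan out to (mean, variance)."""
    fill = (lower + upper) / 2  # hypothetical fill value: midpoint of bounds
    data = resize(impute(clamp(cast_float(rows), lower, upper), fill), n, fill)
    # Both aggregations read the same preprocessed data.
    return statistics.mean(data), statistics.variance(data)
```

In Beam, each of these steps would presumably become a transform over a PCollection, with the two aggregations branching off the resized collection.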
This could just be written in Python, although I would be interested in learning how the Beam communications layer works first.