-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Caphash optimization #45
base: master
Are you sure you want to change the base?
Conversation
@@ -15,10 +15,10 @@ bcs_transformation_parameters<FieldT, MT_root_hash> default_bcs_params( | |||
params.hash_enum = hash_type; | |||
/* TODO: Push setting leaf hash into internal BCS code. Currently 2 is fine, as leaf size is internally unused. */ | |||
const size_t leaf_size = 2; | |||
params.leafhasher_ = get_leafhash<FieldT, MT_root_hash>(hash_type, security_parameter, leaf_size); | |||
params.leafhasher_ = get_leafhash<MT_root_hash, FieldT>(hash_type, security_parameter, leaf_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this just a bug before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I switched the order to be consistent with similar functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was here before was right I believe https://github.com/scipr-lab/libiop/blob/master/libiop/bcs/hashing/hash_enum.hpp#L33-L34
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still believe the template ordering before was right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, you switched the template ordering of the internal command. Currently there are inconsistent orderings throughout, I believe the old <FieldT, MT_root_hash>
was consistently used everywhere. We should consistently be using one or the other, I don't really have a preference for which one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched all instances of the leaf hash functions to be the same as the hashchain. If you insist, I can change it back, but it is currently consistent as well.
f5fb79d
to
c04d91c
Compare
params.hashchain_ = | ||
get_hashchain<FieldT, MT_root_hash>(hash_type, security_parameter); | ||
params.cap_hasher = get_cap_hash<MT_root_hash, FieldT>(hash_type, security_parameter); | ||
params.hashchain_ = get_hashchain<FieldT, MT_root_hash>(hash_type, security_parameter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed above, we are switching <FieldT, MT_root_hash>
to <MT_root_hash, FieldT>
in leafhasher
(now cap_hasher
here).
Do we want to apply the same refactoring to hashchain
? We probably should do that in a separate PR for ease of testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'd prefer to keep it the same for this PR, unless it causes merge conflicts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's acceptable, I'd like to keep them both the current order for now. We can refactor both at the same time in a future PR. This PR is already quite big, and it's been a long time since I wrote this so I don't remember all the context
@@ -103,6 +126,8 @@ class merkle_tree { | |||
|
|||
hash_digest_type get_root() const; | |||
|
|||
/* These two functions do not currently work if the given positions aren't sorted or | |||
have duplicates, AND the tree is set to be zero knowledge. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this one a new limitation that is caused by the present PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't remember. It might be new due to the cap hashes. If so, then the issue will not be present if you don't use the new cap hash optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I highly recommend explicitly sorting the positions within this function and lift such limitation, as it is unnatural.
libiop/bcs/merkle_tree.tcc
Outdated
@@ -200,32 +211,39 @@ std::vector<std::vector<FieldT>> merkle_tree<FieldT, hash_digest_type>::serializ | |||
template<typename FieldT, typename hash_digest_type> | |||
void merkle_tree<FieldT, hash_digest_type>::compute_inner_nodes() | |||
{ | |||
// TODO: Better document this function, its hashing layer by layer. | |||
std::size_t n = (this->num_leaves_ - 1) / 2; | |||
/* n is the first index of the layer we're about to compute. It starts at the bottom layer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bottom layer or the layer above the bottom layer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right it's the bottom inner node layer
std::vector<pos_and_digest_t> S; | ||
S.reserve(positions.size()); | ||
|
||
std::vector<hash_digest_type> leaf_hashes; | ||
if (this->make_zk_) { | ||
for (auto &leaf : leaf_contents) | ||
{ | ||
/* FIXME: This code is currently incorrect if the given list of positions is not | ||
sorted or has duplicates. This could be fixed if both positions and leaf_contents | ||
are sorted before the leaf hashes are calculated, which would require refactoring. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a problem before this PR, or is it a new one? (My impression is that the way we add ZK is reasonably adding this limitation. Is the original okay when the positions are not sorted?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ same as above. If possible, explicitly sort the positions and lift this limitation.
@@ -474,54 +515,79 @@ bool merkle_tree<FieldT, hash_digest_type>::validate_set_membership_proof( | |||
std::swap(S, new_S); | |||
} | |||
|
|||
// Add the cap, including the root's direct children and the root. | |||
// The only elements should be the cap (not including the root). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the cap include the root, or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I think the cap doesn't include the root. It's been such a long time since I wrote this code so I can only guess.
throw std::invalid_argument("Poseidon only supported for 128 bit soundness."); | ||
} | ||
poseidon_params<FieldT> params = get_poseidon_parameters<FieldT>(hash_enum); | ||
/* security parameter is -1 b/c */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify "-1" and "b/c"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea what it means, it was copied from elsewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then this line should be removed.
/* We explicitly place this on heap with no destructor, | ||
as this reference has to live after the function terminates */ | ||
std::shared_ptr<algebraic_vector_hash<FieldT>> hash_class = | ||
std::make_shared<algebraic_vector_hash<FieldT>>(permutation, security_parameter - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why security_parameter - 1
instead of security_parameter
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea, it was also copied from elsewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you refer to the source?
} | ||
|
||
/** Constructs a merkle tree with leaf size 2. Generates and verifies membership proofs for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the convention for ** or * in libiop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if there's an existing convention, the one I use is /** if it's documentation that the user might want to see on hover, and /* if it's a multiline comment that's only useful to see in the location where it's added (// if it's a single line comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or rather, that's the philosophy I have now. Looks like I didn't stick to it at the time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considered this one resolved.
I did a pass of the code, and most are smaller points. Let me know when you want me to reach out to the repo's owner to coordinate merging this PR. |
Cap hashing was implemented for merkle trees. The algebraic and non-algebraic cap hashes were both implemented in
hash_enum.tcc
, as well as the dummy algebraic hash. See issue #41.A bug was fixed that merkle tree membership proof generation and verification didn't work when the input queries weren't sorted and unique. They now work when the tree is not zero knowledge, but when the tree is set to be zero knowledge, the code still fails.
The hash count method was updated to count each two-to-one hash as 2 units and the cap hash as 1 unit per input.
More tests were added, to test cap hashing, and to test large trees with randomly generated query leaves. The code no not test unsorted non-unique queries since that still fails for zero knowledge trees.
No snark protocols currently take advantage of the cap hash; they still have cap size set to 2.
Changes are complete, but not ready to merge because I started working based off of PR #43, so that should preferably be merged first.Ready for review