You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
DISCLAIMER: this issue may contain inaccurate or false statements. Its sole purpose is to investigate feasibility of possible attack ideas for n-RLN. It is WIP.
Introduction
In the Shamir's Secret Sharing (SSS) polynomial of n-RLN, all (secret) coefficients are computed from the same secret a0 (and other public values) using the Poseidon hash function.
In Poseidon, each round consists of an non-linear SBOX followed by a linear mix which can be modeled in a simplified manner as
$$x \rightarrow k_{m,i}\cdot(x+k_{r,i})^p$$
with $p=3,5$ and $k_m, k_r$ derived from round constants (in practice few other caveats exist, but we keep its formulation simple for ease of exposition).
It follows that by iterating the round function, we obtain the sequence of transformations
These iterations can be modeled with a set of multivariate polynomials. By working over the multivariate polynomial ring defined over $\mathbb{F}_p$ with variables $x_0,\ldots,x_n$, we define the polynomials
It follows that the ideal generated by $I \cup \{x_0 - \bar{v}\}$ contains the polynomial $x_n - Hash(\bar{v})$.
The intuition behind this is that by adding the polynomial $x_0 - \bar{v}$ to $I$, we're assigning the value $\bar{v}$ to $x_0$ which iteratively defines all values of the variables $x_i$ in $I$ up to $x_n$.
Attacking 2-RLN
In the following, we take as toy example 2-RLN, that is Shamir's secret shares come from a degree-2 polynomial
$$y = \bar{a}_0 + \bar{a}_1\cdot x + \bar{a}_2\cdot x^2$$
where $\bar{a}_0$ represents the identity secret and $\bar{a}_1 = Hash(\bar{a}_0)$ and $\bar{a}_2 = Hash(\bar{a}_1)$ (in practice the hash is computed together with other public values, but we omit this for now).
The relation between the values $\bar{a}_0, \bar{a}_1, \bar{a}_2$ can be then modeled with the two sets of polynomials
over the polynomial ring $\mathbb{F}_p[a_0,a_1,a_2,x_{1,1},\ldots,x_{1,n},x_{2,1},\ldots,x_{2,n}]$.
From what said before, we have that the ideal generated by $I$ and the polynomial $a_0 - \bar{a}_0$, will contain the polynomials $a_1 - \bar{a}_1$ and $a_2 - \bar{a}_2$, since assigning a value to $a_0$ will iteratively assign a value to $a_1$ and $a_2$ according to the relations defined by $I$.
Note that $a_0$ denotes a polynomial variable, while $\bar{a}_0$ is a value, that is an element of the underlying field $\mathbb{F}_p$.
Modeling Shares
In order to recover the secret value $\bar{a}_0$ (and $\bar{a}_1,\bar{a}_2$), in 2-RLN we need 3 shares, i.e. 3 random $(x,y)$ pairs coming from evaluating
$$y = \bar{a}_0 + \bar{a}_1\cdot x + \bar{a}_2\cdot x^2$$
We assume we only know two random shares $(\bar{x}_1,\bar{y}_1)$, $(\bar{x}_2,\bar{y}_2)$ and we model their values using the polynomial variables $a_0,a_1,a_2$ as
If we evaluate the polynomials in $S$ is the values $\bar{a}_0,\bar{a}_1,\bar{a}_2$ that were originally used to compute the shares, all of them will be equal to 0.
In other words, $(\bar{a}_0,\bar{a}_1,\bar{a}_2)$ is a solution to $S$.
Is there enough information?
Can we recover the secret value $\bar{a}_0$ by just knowing 2 shares out of 3?
The theory behind Shamir's secret sharing ensures that it would be impossible (that is, all values from $\mathbb{F}$ are likely possible for $\bar{a}_0$), but it assumes the polynomial coefficients to be generated randomly, hence they are independent from each other.
In n-RLN, instead, such coefficients all depend on $\bar{a}_0$ through a non-linear relation given by the employed hash function.
In other words, the unique solution to the polynomial system $\mathcal{P} = I_1 \cup I_2 \cup S$ (i.e. assignments to all variables so that all polynomials evaluate to 0) is defined exclusively by assigning the correct value $\bar{a_0}$ to the variable $a_0$.
This means that the unique solution to the system $\mathcal{P}$, i.e. a polynomial system in $2n+3$ variables, depends on a single variable assignment, that is $a_0$.
Informally, this means that the total entropy of the solution is just $\log{p}$ bits rather than $3\cdot \log{p}$, as it would be if $\bar{a}_0, \bar{a}_1, \bar{a}_2$ were computed randomly.
While the polynomials in $I_1$ and $I_2$ constrain only the coefficients without of the SSS polynomial, without revealing any bits on their actual assignments, the share polynomials in $S$, instead, do.
Indeed, since 3 shares would suffice to recover 3 independent coefficients in text-book Shamir's Secret Sharing, this means that each share reveals $\approx \log p$ of information about the secret coefficients all together.
The intuition behind the attack is that since the solution to the system $\mathcal{P}$ has just $\log p$ bits of entropy, then 2 random shares would provide -on average- enough information (in principle $\approx 2\log p$ when coefficients are independent) in order to be able to recover $\bar{a}_0$.
The way we attempt to recover $\bar{a}_0$, is by searching a solution to $\mathcal{P}$.
Finding a solution
The standard method to compute a solution to a multivariate polynomial system is by computing Gröbner basis.
We skip the technical details of such method, but we only mention that the ideal generated by $\mathcal{P}$ has minimum-degree generators of the form
where $\bar{a}_0,\ldots,\bar{x}_{2,n}$ are the assignments for $a_0,\ldots,x_{2,n}$ so that all polynomials in $\mathcal{P}$ evaluate to 0.
Such set $B$ results to be a Groebner basis for $\mathcal{P}$, hence finding it will automatically provide a solution to the system, including recovering the secret value $\bar{a}_0$.
There exists few valid algorithms to compute Groebner basis (Buchberger, F4, F5, etc.), however their computational complexity is hard to estimate and there is no guarantee that even simple systems could be solved quickly.
Attacking 2-RLN: PoC implementation
With the aim to make the above ideas a bit more concrete and ease discussion, we provide a Sage PoC implementation of the Groebner basis approach over a reduced-round version of an hash function that mocks Poseidon hash.
The following script can be tested online on Sagecell. The prime $p$ employed is the order of the curve BN254.
The script will return a base $B$ for $\mathcal{P}$ consisting of degree-1 polynomials (i.e. assignments to each variable) that will solve the whole system $\mathcal{P} = \{r_{1,1},r_{1,2},r_{1,3},r_{1,4},r_{2,1},r_{2,2},r_{2,3},r_{2,4}, share_1, share_2\}$.
Food for thought (take carefully!)
On local installations of Sage, the warning "falling back to very slow implementation" is raised when computing the above Groebner basis. For such reason, the number of hashing rounds is kept low in the above PoC, but with better implementations it might be possible to target an higher number of rounds.
Faugère's F4 and F5 algorithms are presently the most efficient algorithms for computing Gröbner bases, and allow to compute routinely Gröbner bases consisting of several hundreds of polynomials, having each several hundreds of terms and coefficients of several hundreds of digits.
In the case of Poseidon, we have ~64 rounds for a full hash: this corresponds to a total of $2\cdot 64 + 2$ polynomials in $2\cdot 64 + 3$ (dependent) variables of degree at most 5 and coefficients of roughly ~80 digits each.
So it might be indeed possible to attack n-RLN for any $n>1$ with just 2 shares (or slightly more, depending on $n$) by using an efficient implementation of the F4/F5 algorithm.
All the above seems to suggest that we should not lower or we should be extremely careful in lowering the number of rounds of Poseidon hash in order to reduce ZK circuit constraints and gas cost, as was originally suggested in Rln-Relay: Lowering Poseidon round waku-org/nwaku#724.
In principle the attack should work also for 1-RLN, i.e. standard RLN. But in tests we were not able to fully recover $\bar{a}_0$, even in a hash reduced-round scenario. The reason probably resides in the fact that 1 share does not suffice (roughly speaking, it doesn't provide enough entropy) to feasibly find a solution to the system of equations. Nevertheless, there might be cases where the attack might work, considering also the relevance of carefully tuning the parameters of the Groebner basis finding algorithm (not done during tests).
The text was updated successfully, but these errors were encountered:
DISCLAIMER: this issue may contain inaccurate or false statements. Its sole purpose is to investigate feasibility of possible attack ideas for n-RLN. It is WIP.
Introduction
In the Shamir's Secret Sharing (SSS) polynomial of n-RLN, all (secret) coefficients are computed from the same secret
a0
(and other public values) using the Poseidon hash function.In Poseidon, each round consists of an non-linear SBOX followed by a linear mix which can be modeled in a simplified manner as
with$p=3,5$ and $k_m, k_r$ derived from round constants (in practice few other caveats exist, but we keep its formulation simple for ease of exposition).
It follows that by iterating the round function, we obtain the sequence of transformations
where$x_n$ corresponds to our final hash.
These iterations can be modeled with a set of multivariate polynomials. By working over the multivariate polynomial ring defined over$\mathbb{F}_p$ with variables $x_0,\ldots,x_n$ , we define the polynomials
It follows that the ideal generated by$I \cup \{x_0 - \bar{v}\}$ contains the polynomial $x_n - Hash(\bar{v})$ .$x_0 - \bar{v}$ to $I$ , we're assigning the value $\bar{v}$ to $x_0$ which iteratively defines all values of the variables $x_i$ in $I$ up to $x_n$ .
The intuition behind this is that by adding the polynomial
Attacking 2-RLN
In the following, we take as toy example 2-RLN, that is Shamir's secret shares come from a degree-2 polynomial
where$\bar{a}_0$ represents the identity secret and $\bar{a}_1 = Hash(\bar{a}_0)$ and $\bar{a}_2 = Hash(\bar{a}_1)$ (in practice the hash is computed together with other public values, but we omit this for now).
The relation between the values$\bar{a}_0, \bar{a}_1, \bar{a}_2$ can be then modeled with the two sets of polynomials
over the polynomial ring$\mathbb{F}_p[a_0,a_1,a_2,x_{1,1},\ldots,x_{1,n},x_{2,1},\ldots,x_{2,n}]$ .
From what said before, we have that the ideal generated by$I$ and the polynomial $a_0 - \bar{a}_0$ , will contain the polynomials $a_1 - \bar{a}_1$ and $a_2 - \bar{a}_2$ , since assigning a value to $a_0$ will iteratively assign a value to $a_1$ and $a_2$ according to the relations defined by $I$ .
Note that$a_0$ denotes a polynomial variable, while $\bar{a}_0$ is a value, that is an element of the underlying field $\mathbb{F}_p$ .
Modeling Shares
In order to recover the secret value$\bar{a}_0$ (and $\bar{a}_1,\bar{a}_2$ ), in 2-RLN we need 3 shares, i.e. 3 random $(x,y)$ pairs coming from evaluating
We assume we only know two random shares$(\bar{x}_1,\bar{y}_1)$ , $(\bar{x}_2,\bar{y}_2)$ and we model their values using the polynomial variables $a_0,a_1,a_2$ as
If we evaluate the polynomials in$S$ is the values $\bar{a}_0,\bar{a}_1,\bar{a}_2$ that were originally used to compute the shares, all of them will be equal to 0.
In other words,$(\bar{a}_0,\bar{a}_1,\bar{a}_2)$ is a solution to $S$ .
Is there enough information?
Can we recover the secret value$\bar{a}_0$ by just knowing 2 shares out of 3?$\mathbb{F}$ are likely possible for $\bar{a}_0$ ), but it assumes the polynomial coefficients to be generated randomly, hence they are independent from each other.
The theory behind Shamir's secret sharing ensures that it would be impossible (that is, all values from
In n-RLN, instead, such coefficients all depend on$\bar{a}_0$ through a non-linear relation given by the employed hash function.$\mathcal{P} = I_1 \cup I_2 \cup S$ (i.e. assignments to all variables so that all polynomials evaluate to 0) is defined exclusively by assigning the correct value $\bar{a_0}$ to the variable $a_0$ .
In other words, the unique solution to the polynomial system
This means that the unique solution to the system$\mathcal{P}$ , i.e. a polynomial system in $2n+3$ variables, depends on a single variable assignment, that is $a_0$ .
Informally, this means that the total entropy of the solution is just$\log{p}$ bits rather than $3\cdot \log{p}$ , as it would be if $\bar{a}_0, \bar{a}_1, \bar{a}_2$ were computed randomly.
While the polynomials in$I_1$ and $I_2$ constrain only the coefficients without of the SSS polynomial, without revealing any bits on their actual assignments, the share polynomials in $S$ , instead, do.
Indeed, since 3 shares would suffice to recover 3 independent coefficients in text-book Shamir's Secret Sharing, this means that each share reveals$\approx \log p$ of information about the secret coefficients all together.
The intuition behind the attack is that since the solution to the system$\mathcal{P}$ has just $\log p$ bits of entropy, then 2 random shares would provide -on average- enough information (in principle $\approx 2\log p$ when coefficients are independent) in order to be able to recover $\bar{a}_0$ .
The way we attempt to recover$\bar{a}_0$ , is by searching a solution to $\mathcal{P}$ .
Finding a solution
The standard method to compute a solution to a multivariate polynomial system is by computing Gröbner basis.
We skip the technical details of such method, but we only mention that the ideal generated by$\mathcal{P}$ has minimum-degree generators of the form
$$B = \begin{cases}a_0 - \bar{a}0\\a_1 - \bar{a}1\\\ldots\\x{2,n}-\bar{x}{2,n}\end{cases}$$
where$\bar{a}_0,\ldots,\bar{x}_{2,n}$ are the assignments for $a_0,\ldots,x_{2,n}$ so that all polynomials in $\mathcal{P}$ evaluate to 0.
Such set$B$ results to be a Groebner basis for $\mathcal{P}$ , hence finding it will automatically provide a solution to the system, including recovering the secret value $\bar{a}_0$ .
There exists few valid algorithms to compute Groebner basis (Buchberger, F4, F5, etc.), however their computational complexity is hard to estimate and there is no guarantee that even simple systems could be solved quickly.
Attacking 2-RLN: PoC implementation
With the aim to make the above ideas a bit more concrete and ease discussion, we provide a Sage PoC implementation of the Groebner basis approach over a reduced-round version of an hash function that mocks Poseidon hash.
The following script can be tested online on Sagecell. The prime$p$ employed is the order of the curve BN254.
The script will return a base$B$ for $\mathcal{P}$ consisting of degree-1 polynomials (i.e. assignments to each variable) that will solve the whole system $\mathcal{P} = \{r_{1,1},r_{1,2},r_{1,3},r_{1,4},r_{2,1},r_{2,2},r_{2,3},r_{2,4}, share_1, share_2\}$ .
Food for thought (take carefully!)
On local installations of Sage, the warning "falling back to very slow implementation" is raised when computing the above Groebner basis. For such reason, the number of hashing rounds is kept low in the above PoC, but with better implementations it might be possible to target an higher number of rounds.
By quoting Buchberger's algorithm Wikipedia page:
In the case of Poseidon, we have ~64 rounds for a full hash: this corresponds to a total of$2\cdot 64 + 2$ polynomials in $2\cdot 64 + 3$ (dependent) variables of degree at most 5 and coefficients of roughly ~80 digits each.
So it might be indeed possible to attack n-RLN for any$n>1$ with just 2 shares (or slightly more, depending on $n$ ) by using an efficient implementation of the F4/F5 algorithm.
All the above seems to suggest that we should not lower or we should be extremely careful in lowering the number of rounds of Poseidon hash in order to reduce ZK circuit constraints and gas cost, as was originally suggested in Rln-Relay: Lowering Poseidon round waku-org/nwaku#724.
In principle the attack should work also for 1-RLN, i.e. standard RLN. But in tests we were not able to fully recover$\bar{a}_0$ , even in a hash reduced-round scenario. The reason probably resides in the fact that 1 share does not suffice (roughly speaking, it doesn't provide enough entropy) to feasibly find a solution to the system of equations. Nevertheless, there might be cases where the attack might work, considering also the relevance of carefully tuning the parameters of the Groebner basis finding algorithm (not done during tests).
The text was updated successfully, but these errors were encountered: