Implement initial ReLU/sign function via polynomial approximation #658
Also cf. https://openreview.net/pdf?id=Hq16Jk2bVlp and https://eprint.iacr.org/2021/1688, which use these approximations.
The paper linked above (2020/834) does not release source code, but there is a reference implementation in Lattigo: https://github.com/tuneinsight/lattigo/blob/4cce9a48c1daaa2dd122921822f5ad70cd444156/he/hefloat/minimax_composite_polynomial.go#L124
The paper https://eprint.iacr.org/2019/1234 is a precursor to https://eprint.iacr.org/2020/834, and also seems to explain more of the motivation behind the composite polynomials.
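As a rough illustration of the composite idea (a sketch of my own, not code from either paper): a low-degree odd polynomial such as f(x) = (3x - x^3)/2 maps [-1, 1] into itself and pushes values toward ±1, so composing it with itself a few times gives an increasingly sharp approximation to sgn on [-1, 1].

```python
def f(x):
    # a low-degree odd polynomial that contracts toward +/-1 on [-1, 1];
    # iterating it sharpens the approximation to sgn (illustrative choice)
    return (3 * x - x**3) / 2

def composite_sign(x, depth=20):
    # evaluate the depth-fold composition f(f(...f(x)...))
    for _ in range(depth):
        x = f(x)
    return x
```

Each composition cubes the effective degree while the multiplicative depth grows only linearly in the number of compositions, which is the point of the composite approach over a single high-degree polynomial.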
An example of generating a well-fitting polynomial using lolremez: samhocevar/lolremez#28 (comment). Another tool: https://github.com/pychebfun/pychebfun
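For quick experiments without lolremez, NumPy's Chebyshev class can produce a near-minimax fit of a smooth function (Chebyshev interpolation of a smooth function is within a small factor of the true minimax error); the target function and degree below are arbitrary choices for illustration.

```python
import numpy as np
from numpy.polynomial.chebyshev import Chebyshev

# interpolate exp at the Chebyshev points of degree 10 on [-1, 1];
# for smooth functions this is close to the minimax polynomial
p = Chebyshev.interpolate(np.exp, 10, domain=[-1, 1])

# estimate the sup-norm error on a dense grid
xs = np.linspace(-1, 1, 10001)
max_err = np.max(np.abs(p(xs) - np.exp(xs)))
```

This won't handle the multi-interval sign-function case (where the true Remez machinery is needed), but it is a cheap way to sanity-check degrees and error targets.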
Outline sketch:
Various improvements based on more recent research that would be worth splitting into separate tickets.
Thanks to Seonhong Min for sending me https://eprint.iacr.org/2018/462, which shows that BFV achieves polynomial approximations via a fixed-point approximation, not a floating-point one. There is also some added complexity in that evaluating a fixed-point polynomial approximation requires a rounding step as well, though not always; see Sec. 2.5.
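A plaintext sketch of that rounding step (my own illustration, not code from the paper): with every value encoded at a fixed scale Δ, each multiplication in a Horner evaluation produces scale Δ², so a divide-and-round step is needed after each multiply to bring the scale back down.

```python
DELTA = 2 ** 16  # fixed-point scale (illustrative choice)

def fixed_point_horner(coeffs, x):
    """Evaluate c0 + c1*x + ... + cn*x**n in fixed point at scale DELTA.

    Each multiplication by the encoded input yields scale DELTA**2, so a
    rounding/rescale step follows every multiply -- the step a homomorphic
    fixed-point evaluation must also account for (or sometimes elide).
    """
    xi = round(x * DELTA)            # encode the input at scale DELTA
    acc = round(coeffs[-1] * DELTA)  # encode the leading coefficient
    for c in reversed(coeffs[:-1]):
        acc = round(acc * xi / DELTA)  # rescale: DELTA**2 -> DELTA
        acc += round(c * DELTA)
    return acc / DELTA               # decode
```

The final answer differs from exact evaluation by roughly a few multiples of 1/Δ, which is the accuracy/parameter trade-off the fixed-point view makes explicit.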
I also had an interest in this issue a while back, so I know a paper worth sharing: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10155408
Oh, I didn't know there was an implementation in Lattigo! In that case, these hardcoded coefficients might not be necessary.
After starting an implementation in #665 (with a fixed approximation polynomial) and discussing in the HEIR meeting today, I have a few kinks to work out:
These folks do something slightly different, which is more holistic with respect to NN training: https://github.com/EfficientFHE/SmartPAF They pick small-degree polynomial approximations and then do a variety of network fine-tuning to adapt the network to the replaced operations. This seems out of scope for the compiler, since it would require training data to be included.
I added an upstream RFC for the polynomial approximation pass: https://discourse.llvm.org/t/rfc-a-polynomial-approximation-pass/79301
@JianmingTONG has also worked on the https://github.com/EfficientFHE/SmartPAF system for approximating activation functions.
The one in Tune Insight's Lattigo is unstable and highly likely to fail when approximating over multiple intervals. The issue comes from the method used to find the roots of the error function, which can miss roots. I fixed it this summer because I needed it for the iDASH challenge; the fix is in my personal fork of Lattigo.
@Pro7ech Since I wrote that comment I've been doing a lot of studying on this topic, including writing a basic Remez solver that had issues with finding roots. I believe that using the so-called barycentric form of the Remez algorithm would avoid these issues (though I have not yet followed through with this plan to confirm it). Did you come to the same conclusion? Or did you resolve it via a different method?
I haven't looked into the barycentric Remez, but I've managed to make it work stably (although quite slowly for large degrees) even when the number of discrete intervals is in the dozens. I directly look for the extrema of the error function; this is easier and more stable than looking for the roots of the derivative of the error function. Because of the alternating condition, the minimum and maximum number of extrema can be bounded, and knowing how many points we are looking for helps a lot. The code is commented if you want to take a look.
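A minimal version of that extrema search (a sketch of the dense-grid idea, not Pro7ech's actual code): sample the error function on a fine grid and record points where the discrete slope changes sign, keeping the interval endpoints since alternation points can sit there too.

```python
import numpy as np

def find_extrema(err, a, b, n=10001):
    # dense-grid scan for local extrema of an error function on [a, b];
    # endpoints are included because Remez alternation points may lie there
    x = np.linspace(a, b, n)
    e = err(x)
    idx = [0]
    for i in range(1, n - 1):
        # the discrete slope changes sign (or vanishes) at a local extremum
        if (e[i] - e[i - 1]) * (e[i + 1] - e[i]) <= 0:
            idx.append(i)
    idx.append(n - 1)
    idx = np.array(idx)
    return x[idx], e[idx]
```

Knowing the minimum and maximum number of extrema from the alternation condition then lets the caller validate the scan's output (and locally refine each extremum if more precision is needed).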
Given that

ReLU(x) = x * (0.5 + 0.5 * sgn(x)),

this reduces to approximating the sign function, and this paper appears to have the state of the art: https://eprint.iacr.org/2020/834

Also note:

max(u, v) = ((u + v) + (u - v) * sign(u - v)) / 2
min(u, v) = -max(-u, -v) = ((u + v) - (v - u) * sign(v - u)) / 2
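These identities are easy to sanity-check numerically (a quick verification script, nothing FHE-specific):

```python
import math
import random

def sgn(x):
    # exact sign function: -1, 0, or 1
    return (x > 0) - (x < 0)

def relu_via_sign(x):
    # ReLU(x) = x * (0.5 + 0.5 * sgn(x))
    return x * (0.5 + 0.5 * sgn(x))

def max_via_sign(u, v):
    # max(u, v) = ((u + v) + (u - v) * sign(u - v)) / 2
    return ((u + v) + (u - v) * sgn(u - v)) / 2

def min_via_sign(u, v):
    # min(u, v) = ((u + v) - (v - u) * sign(v - u)) / 2
    return ((u + v) - (v - u) * sgn(v - u)) / 2

random.seed(0)
for _ in range(1000):
    x = random.uniform(-10, 10)
    u, v = random.uniform(-10, 10), random.uniform(-10, 10)
    assert math.isclose(relu_via_sign(x), max(x, 0.0), abs_tol=1e-12)
    assert math.isclose(max_via_sign(u, v), max(u, v), abs_tol=1e-12)
    assert math.isclose(min_via_sign(u, v), min(u, v), abs_tol=1e-12)
```

In the homomorphic setting, sgn above would be replaced by the polynomial approximation of sign, so the error of ReLU/max/min inherits the error of that approximation scaled by the input magnitude.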