feat: use crate with MSM and turn on portable #154

Merged 3 commits on Oct 22, 2024

@matthewkeil matthewkeil commented Oct 15, 2024

Attempt to fix the illegal instruction error on Celeron CPUs.

Perf results without and with the portable feature. Each figure is the average of 4 runs with and 4 runs without, to be sure.

Overall it is a bit faster on my Mac, and the results below show perf on a Linux host. Many benchmarks are faster and a couple are slower, but the differences are small. The one that did jump out at me is aggregating signatures, which is 5% slower. However, it's 1% faster aggregating with randomness.

I think this is a smart change for compatibility, and I'm guessing the perf difference will not be enough to warrant two separate distributions of Lodestar.

                                                           without portable    with portable
                                                           ================    ================
PublicKey
    ✓ PublicKey serialization                                  1.3572 us/op      1.3216 us/op
    ✓ PublicKey deserialize                                   11.832  us/op     11.4223 us/op
    ✓ PublicKey deserialize and validate - 1 keys             44.2532 us/op     42.2470 us/op
    ✓ PublicKey deserialize and validate - 100 keys            4.4171 ms/op      4.1351 ms/op
    ✓ PublicKey deserialize and validate - 10000 keys        433.5016 ms/op    425.3430 ms/op

SecretKey
    ✓ SecretKey.fromKeygen                                     1.3330 us/op      1.2390 us/op
    ✓ SecretKey serialization                                  1.1740 us/op      1.0820 us/op
    ✓ SecretKey deserialization                                1.0962 us/op      0.9990 us/op
    ✓ SecretKey.toPublicKey                                   72.0770 us/op     71.0530 us/op
    ✓ SecretKey.sign                                         268.6522 us/op    263.9860 us/op

Signature
    ✓ Signature serialization                                  1.2705 us/op      1.2056 us/op
    ✓ Signature deserialize                                   23.1830 us/op     21.9480 us/op
    ✓ Signatures deserialize and validate - 1 sets            60.2580 us/op     60.2580 us/op
    ✓ Signatures deserialize and validate - 100 sets           6.1925 ms/op      6.1075 ms/op
    ✓ Signatures deserialize and validate - 10000 sets       649.4870 ms/op    628.6073 ms/op

functions
    aggregatePublicKeys
        ✓ aggregatePublicKeys - 1 sets                       882.0000 ns/op    860.0000 ns/op
        ✓ aggregatePublicKeys - 8 sets                         6.1750 us/op      6.3503 us/op
        ✓ aggregatePublicKeys - 32 sets                       17.5630 us/op     16.9923 us/op
        ✓ aggregatePublicKeys - 128 sets                      61.4280 us/op     60.0560 us/op
        ✓ aggregatePublicKeys - 256 sets                     119.2770 us/op    119.1583 us/op
    aggregateSignatures
        ✓ aggregateSignatures - 1 sets                       953.0000 ns/op    898.0000 ns/op
        ✓ aggregateSignatures - 8 sets                        11.6060 us/op     10.6923 us/op
        ✓ aggregateSignatures - 32 sets                       36.6710 us/op     35.3333 us/op
        ✓ aggregateSignatures - 128 sets                     136.8770 us/op    135.4700 us/op
        ✓ aggregateSignatures - 256 sets                     279.5440 us/op    271.8913 us/op
    aggregateWithRandomness
        ✓ aggregateWithRandomness - 1 sets                   179.3080 us/op    179.4840 us/op
        ✓ aggregateWithRandomness - 16 sets                    1.2897 ms/op      1.2738 ms/op
        ✓ aggregateWithRandomness - 128 sets                   7.8154 ms/op      7.9330 ms/op
        ✓ aggregateWithRandomness - 256 sets                  15.6922 ms/op     15.5924 ms/op
        ✓ aggregateWithRandomness - 512 sets                  31.6398 ms/op     31.5240 ms/op
        ✓ aggregateWithRandomness - 1024 sets                 62.2136 ms/op     62.3204 ms/op
    aggregateVerify
        ✓ aggregateVerify - 1 sets                           603.3200 us/op    603.0067 us/op
        ✓ aggregateVerify - 8 sets                           717.1630 us/op    728.4093 us/op
        ✓ aggregateVerify - 32 sets                            1.3742 ms/op      1.3936 ms/op
        ✓ aggregateVerify - 128 sets                           4.0280 ms/op      4.0536 ms/op
        ✓ aggregateVerify - 256 sets                           6.7707 ms/op      6.9581 ms/op
    verifyMultipleAggregateSignatures
        ✓ verifyMultipleAggregateSignatures - 1 sets         870.4150 us/op    871.2270 us/op
        ✓ verifyMultipleAggregateSignatures - 8 sets         979.8300 us/op    970.4206 us/op
        ✓ verifyMultipleAggregateSignatures - 32 sets          1.8953 ms/op      1.8857 ms/op
        ✓ verifyMultipleAggregateSignatures - 128 sets         5.1984 ms/op      5.0069 ms/op
        ✓ verifyMultipleAggregateSignatures - 256 sets         9.4786 ms/op      9.3287 ms/op
    verifyMultipleAggregateSignatures same message
        ✓ Same message - 1 sets                              668.8890 us/op    667.0863 us/op
        ✓ Same message - 8 sets                                1.1127 ms/op      1.0993 ms/op
        ✓ Same message - 32 sets                               2.6022 ms/op      2.4810 ms/op
        ✓ Same message - 128 sets                              8.5372 ms/op      8.3003 ms/op
        ✓ Same message - 256 sets                             16.6121 ms/op     16.6121 ms/op
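
As a rough aid for reading the table above, the relative change between the two columns can be computed per row. This is only a sketch: the helper names `toNanoseconds` and `percentChange` are hypothetical and not part of the benchmark tooling, and the parser assumes readings of the form `<number> <ns|us|ms>/op` as printed above.

```typescript
// Units used in the benchmark output, expressed in nanoseconds.
const NS_PER_UNIT: Record<string, number> = { ns: 1, us: 1_000, ms: 1_000_000 };

// Parse a reading such as "11.832 us/op" into nanoseconds per operation.
// (Hypothetical helper; assumes the "<number> <unit>/op" shape shown in the table.)
function toNanoseconds(reading: string): number {
  const match = /^\s*([\d.]+)\s*(ns|us|ms)\/op\s*$/.exec(reading);
  if (match === null) {
    throw new Error(`Unrecognized benchmark reading: ${reading}`);
  }
  return parseFloat(match[1]) * NS_PER_UNIT[match[2]];
}

// Relative change in percent; a negative value means "with portable" was faster.
function percentChange(without: string, withPortable: string): number {
  const before = toNanoseconds(without);
  const after = toNanoseconds(withPortable);
  return ((after - before) / before) * 100;
}

// Two rows copied from the table above.
const rows: Array<[string, string, string]> = [
  ["PublicKey deserialize", "11.832 us/op", "11.4223 us/op"],
  ["aggregatePublicKeys - 8 sets", "6.1750 us/op", "6.3503 us/op"],
];

for (const [name, without, withPortable] of rows) {
  console.log(`${name}: ${percentChange(without, withPortable).toFixed(2)}%`);
}
```

For these two rows the sketch prints -3.46% and 2.84%, i.e. one row slightly faster and one slightly slower with portable enabled. Converting to nanoseconds first keeps rows that mix `ns/op` and `us/op` comparable.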

@matthewkeil matthewkeil requested a review from a team as a code owner October 15, 2024 07:46
@matthewkeil
Member Author

Perf results from a Hetzner Linux host

                                                                 without portable  with portable
                                                                 ================  ================
  PublicKey
    ✓ PublicKey serialization                                    3.142000 us/op    3.257000 us/op
    ✓ PublicKey deserialize                                      18.46600 us/op    19.09900 us/op
    ✓ PublicKey deserialize and validate - 1 keys                67.42900 us/op    84.52600 us/op
    ✓ PublicKey deserialize and validate - 100 keys              6.570816 ms/op    8.101886 ms/op
    ✓ PublicKey deserialize and validate - 10000 keys            694.4406 ms/op    754.5784 ms/op

  SecretKey
    ✓ SecretKey.fromKeygen                                       3.149000 us/op    2.844000 us/op
    ✓ SecretKey serialization                                    2.480000 us/op    2.577000 us/op
    ✓ SecretKey deserialization                                  2.563000 us/op    2.456000 us/op
    ✓ SecretKey.toPublicKey                                      120.5220 us/op    115.8130 us/op
    ✓ SecretKey.sign                                             414.8980 us/op    471.9400 us/op

  Signature
    ✓ Signature serialization                                    3.159000 us/op    3.510000 us/op
    ✓ Signature deserialize                                      33.20300 us/op    34.34500 us/op
    ✓ Signatures deserialize and validate - 1 sets               94.89100 us/op    97.31100 us/op
    ✓ Signatures deserialize and validate - 100 sets             10.07905 ms/op    9.350226 ms/op
    ✓ Signatures deserialize and validate - 10000 sets           986.1137 ms/op    981.0709 ms/op

  functions
    aggregatePublicKeys
      ✓ aggregatePublicKeys - 1 sets                             2.332000 us/op    2.575000 us/op
      ✓ aggregatePublicKeys - 8 sets                             12.61600 us/op    12.52400 us/op
      ✓ aggregatePublicKeys - 32 sets                            30.97300 us/op    28.65100 us/op
      ✓ aggregatePublicKeys - 128 sets                           115.9720 us/op    105.5620 us/op
      ✓ aggregatePublicKeys - 256 sets                           204.6460 us/op    209.7040 us/op
    aggregateSignatures
      ✓ aggregateSignatures - 1 sets                             2.695000 us/op    2.711000 us/op
      ✓ aggregateSignatures - 8 sets                             18.34300 us/op    21.04700 us/op
      ✓ aggregateSignatures - 32 sets                            60.18400 us/op    69.99900 us/op
      ✓ aggregateSignatures - 128 sets                           231.2630 us/op    248.3100 us/op
      ✓ aggregateSignatures - 256 sets                           462.5010 us/op    485.4710 us/op
    aggregateWithRandomness
      ✓ aggregateWithRandomness - 1 sets                         234.0890 us/op    276.7920 us/op
      ✓ aggregateWithRandomness - 16 sets                        2.381737 ms/op    3.680782 ms/op
      ✓ aggregateWithRandomness - 128 sets                       15.50784 ms/op    17.89131 ms/op
      ✓ aggregateWithRandomness - 256 sets                       32.66754 ms/op    30.74395 ms/op
      ✓ aggregateWithRandomness - 512 sets                       61.95717 ms/op    57.96183 ms/op
      ✓ aggregateWithRandomness - 1024 sets                      112.9946 ms/op    109.9115 ms/op
    aggregateVerify
      ✓ aggregateVerify - 1 sets                                 1.253683 ms/op    1.389746 ms/op
      ✓ aggregateVerify - 8 sets                                 3.898257 ms/op    3.371304 ms/op
      ✓ aggregateVerify - 32 sets                                8.667316 ms/op    8.427201 ms/op
      ✓ aggregateVerify - 128 sets                               24.15127 ms/op    23.48762 ms/op
      ✓ aggregateVerify - 256 sets                               40.40848 ms/op    40.25433 ms/op
    verifyMultipleAggregateSignatures
      ✓ verifyMultipleAggregateSignatures - 1 sets               1.514626 ms/op    1.428105 ms/op
      ✓ verifyMultipleAggregateSignatures - 8 sets               4.652070 ms/op    4.788486 ms/op
      ✓ verifyMultipleAggregateSignatures - 32 sets              9.948129 ms/op    10.31866 ms/op
      ✓ verifyMultipleAggregateSignatures - 128 sets             30.71151 ms/op    29.48134 ms/op
      ✓ verifyMultipleAggregateSignatures - 256 sets             48.85230 ms/op    47.52506 ms/op
    verifyMultipleAggregateSignatures same message
      ✓ Same message - 1 sets                                    1.434673 ms/op    1.482760 ms/op
      ✓ Same message - 8 sets                                    2.101413 ms/op    1.997373 ms/op
      ✓ Same message - 32 sets                                   4.335625 ms/op    4.336472 ms/op
      ✓ Same message - 128 sets                                  13.95897 ms/op    15.11741 ms/op
      ✓ Same message - 256 sets                                  25.63198 ms/op    26.69647 ms/op
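
One possible way to condense a group of rows from the table above into a single figure is the geometric mean of the per-row ratios (with portable divided by without). This is a sketch of a common benchmarking convention, not something the PR itself uses; `geometricMeanRatio` is a hypothetical helper, and the timings are the `aggregateSignatures` rows above, all in us/op.

```typescript
// Geometric mean of per-row ratios (with portable / without portable).
// A value above 1 means the portable build was slower on average for the group.
function geometricMeanRatio(pairs: Array<[number, number]>): number {
  if (pairs.length === 0) {
    throw new Error("Need at least one pair of timings");
  }
  // Sum logarithms instead of multiplying ratios to stay numerically stable.
  const logSum = pairs.reduce(
    (sum, [without, withPortable]) => sum + Math.log(withPortable / without),
    0,
  );
  return Math.exp(logSum / pairs.length);
}

// aggregateSignatures rows from the table above: [without, with], in us/op.
const aggregateSignatures: Array<[number, number]> = [
  [2.695, 2.711],
  [18.343, 21.047],
  [60.184, 69.999],
  [231.263, 248.31],
  [462.501, 485.471],
];

const ratio = geometricMeanRatio(aggregateSignatures);
console.log(`aggregateSignatures: ${((ratio - 1) * 100).toFixed(1)}% change`);
```

The geometric mean is used here rather than an arithmetic mean so that a row that is twice as slow and a row that is twice as fast cancel out. These figures come from single table entries, so small differences may well be noise.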

@wemeetagain
Member

Can you deploy this to a feature group? I would like to see what it looks like under a real-world load first.

@matthewkeil
Member Author

matthewkeil commented Oct 17, 2024

> Can you deploy this to a feature group? I would like to see what it looks like under a real-world load first.

Results look good on feat3. Linked this issue to the branch in Lodestar. Once we are comfy that this fix is safe and this gets published, I will update that PR to pull the published version.

ChainSafe/lodestar#7164 (comment)

@philknows
Member

Correct me if I'm wrong, but by enabling portable, we would have a portable version and a non-portable version, both published as part of our release process? That way most (non-Celeron) users could avoid the slight performance hit by running the non-portable Docker image/binary/etc.

@matthewkeil
Member Author

> Correct me if I'm wrong, but by enabling portable, we would have a portable version and a non-portable version, both published as part of our release process? That way most (non-Celeron) users could avoid the slight performance hit by running the non-portable Docker image/binary/etc.

In the worst case that is correct. But so far the metrics look almost identical, and the perf times for the library are almost identical, so we may not need to publish two and can default to the portable version.

Member

@wemeetagain wemeetagain left a comment

Metrics look 👍
We don't need to maintain separate versions for portable/non-portable.

@matthewkeil matthewkeil merged commit b86329d into master Oct 22, 2024
21 checks passed
3 participants