Use scalar emulation of gather instruction for arg methods #65

Merged
merged 3 commits into from
Sep 20, 2023

Conversation

@r-devulap
Contributor Author

Benchmarks after the microcode patch update:

Benchmark                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
[avx512 vs. avx512]argsort/smallrandom_128/int64_t                 -0.2333         -0.2333          1389          1065          1389          1065
[avx512 vs. avx512]argsort/smallrandom_256/int64_t                 -0.2741         -0.2741          3277          2379          3277          2379
[avx512 vs. avx512]argsort/smallrandom_512/int64_t                 -0.2405         -0.2405          7211          5477          7211          5477
[avx512 vs. avx512]argsort/smallrandom_1k/int64_t                  -0.2739         -0.2738         15966         11594         15966         11594
[avx512 vs. avx512]argsort/random_5k/int64_t                       -0.2901         -0.2900         97785         69422         97780         69420
[avx512 vs. avx512]argsort/random_100k/int64_t                     -0.2623         -0.2622       2946188       2173486       2945979       2173433
[avx512 vs. avx512]argsort/random_1m/int64_t                       -0.2214         -0.2215      46900755      36515654      46898992      36512849
[avx512 vs. avx512]argsort/random_10m/int64_t                      -0.1597         -0.1597    1182871389     993944429    1182788376     993869928
[avx512 vs. avx512]argsort/sorted_10k/int64_t                      -0.2904         -0.2904        200094        141988        200081        141979
[avx512 vs. avx512]argsort/constant_10k/int64_t                    -0.3595         -0.3595         15979         10234         15979         10234
[avx512 vs. avx512]argsort/reverse_10k/int64_t                     -0.2918         -0.2917        201492        142705        201483        142702
[avx512 vs. avx512]argsort/smallrandom_128/uint64_t                -0.2311         -0.2311          1387          1066          1387          1066
[avx512 vs. avx512]argsort/smallrandom_256/uint64_t                -0.2708         -0.2708          3267          2382          3266          2382
[avx512 vs. avx512]argsort/smallrandom_512/uint64_t                -0.2391         -0.2391          7198          5477          7198          5477
[avx512 vs. avx512]argsort/smallrandom_1k/uint64_t                 -0.2783         -0.2783         15929         11496         15929         11495
[avx512 vs. avx512]argsort/random_5k/uint64_t                      -0.2861         -0.2862         97630         69693         97627         69691
[avx512 vs. avx512]argsort/random_100k/uint64_t                    -0.2552         -0.2552       2927185       2180138       2927006       2180088
[avx512 vs. avx512]argsort/random_1m/uint64_t                      -0.2212         -0.2213      47069244      36655556      47066284      36650976
[avx512 vs. avx512]argsort/random_10m/uint64_t                     -0.1640         -0.1640    1191513563     996068992    1191419581     996034361
[avx512 vs. avx512]argsort/sorted_10k/uint64_t                     -0.2836         -0.2836        198996        142567        198986        142559
[avx512 vs. avx512]argsort/constant_10k/uint64_t                   -0.3585         -0.3585         15992         10259         15991         10259
[avx512 vs. avx512]argsort/reverse_10k/uint64_t                    -0.2872         -0.2872        200852        143176        200844        143169
[avx512 vs. avx512]argsort/smallrandom_128/double                  -0.2655         -0.2655          1647          1210          1647          1210
[avx512 vs. avx512]argsort/smallrandom_256/double                  -0.2959         -0.2960          3213          2262          3213          2262
[avx512 vs. avx512]argsort/smallrandom_512/double                  -0.2828         -0.2828          8295          5949          8295          5949
[avx512 vs. avx512]argsort/smallrandom_1k/double                   -0.2865         -0.2865         17546         12519         17546         12519
[avx512 vs. avx512]argsort/random_5k/double                        -0.3103         -0.3103         98034         67615         98032         67612
[avx512 vs. avx512]argsort/random_100k/double                      -0.2605         -0.2605       3028509       2239495       3028438       2239478
[avx512 vs. avx512]argsort/random_1m/double                        -0.2097         -0.2097      47315768      37392398      47308646      37387150
[avx512 vs. avx512]argsort/random_10m/double                       -0.1285         -0.1284    1202360786    1047896012    1202202986    1047846981
[avx512 vs. avx512]argsort/sorted_10k/double                       -0.3122         -0.3122        205226        141162        205218        141157
[avx512 vs. avx512]argsort/constant_10k/double                     -0.3176         -0.3176         17845         12177         17845         12177
[avx512 vs. avx512]argsort/reverse_10k/double                      -0.3137         -0.3136        207158        142182        207146        142175
[avx512 vs. avx512]argsort/smallrandom_128/int32_t                 -0.2645         -0.2644          1301           957          1300           957
[avx512 vs. avx512]argsort/smallrandom_256/int32_t                 -0.3029         -0.3029          3188          2223          3188          2222
[avx512 vs. avx512]argsort/smallrandom_512/int32_t                 -0.3222         -0.3222          5827          3949          5827          3949
[avx512 vs. avx512]argsort/smallrandom_1k/int32_t                  -0.3308         -0.3308         14910          9978         14909          9978
[avx512 vs. avx512]argsort/random_5k/int32_t                       -0.3428         -0.3428         84583         55586         84582         55585
[avx512 vs. avx512]argsort/random_100k/int32_t                     -0.2967         -0.2966       2631532       1850835       2631371       1850786
[avx512 vs. avx512]argsort/random_1m/int32_t                       -0.2198         -0.2198      37253495      29065827      37251665      29065191
[avx512 vs. avx512]argsort/random_10m/int32_t                      -0.1622         -0.1622     957843854     802474608     957723829     802334533
[avx512 vs. avx512]argsort/sorted_10k/int32_t                      -0.3394         -0.3394        174916        115541        174904        115536
[avx512 vs. avx512]argsort/constant_10k/int32_t                    -0.3714         -0.3713         15668          9850         15668          9849
[avx512 vs. avx512]argsort/reverse_10k/int32_t                     -0.3415         -0.3415        177100        116617        177091        116614
[avx512 vs. avx512]argsort/smallrandom_128/uint32_t                -0.2650         -0.2650          1300           955          1300           955
[avx512 vs. avx512]argsort/smallrandom_256/uint32_t                -0.3066         -0.3066          3205          2222          3205          2222
[avx512 vs. avx512]argsort/smallrandom_512/uint32_t                -0.3247         -0.3246          5843          3946          5843          3946
[avx512 vs. avx512]argsort/smallrandom_1k/uint32_t                 -0.3343         -0.3343         14956          9957         14956          9956
[avx512 vs. avx512]argsort/random_5k/uint32_t                      -0.3466         -0.3466         84807         55411         84805         55411
[avx512 vs. avx512]argsort/random_100k/uint32_t                    -0.3012         -0.3012       2636747       1842537       2636505       1842483
[avx512 vs. avx512]argsort/random_1m/uint32_t                      -0.2246         -0.2245      37393832      28993477      37387944      28992980
[avx512 vs. avx512]argsort/random_10m/uint32_t                     -0.1624         -0.1624     957185639     801762773     957126837     801693463
[avx512 vs. avx512]argsort/sorted_10k/uint32_t                     -0.3428         -0.3428        174678        114799        174672        114798
[avx512 vs. avx512]argsort/constant_10k/uint32_t                   -0.3706         -0.3706         15667          9861         15666          9860
[avx512 vs. avx512]argsort/reverse_10k/uint32_t                    -0.3469         -0.3469        177611        115997        177602        115994
[avx512 vs. avx512]argsort/smallrandom_128/float                   -0.2607         -0.2606          1378          1018          1377          1018
[avx512 vs. avx512]argsort/smallrandom_256/float                   -0.2723         -0.2723          3286          2391          3286          2391
[avx512 vs. avx512]argsort/smallrandom_512/float                   -0.2932         -0.2932          6689          4728          6689          4728
[avx512 vs. avx512]argsort/smallrandom_1k/float                    -0.3017         -0.3017         16988         11863         16988         11863
[avx512 vs. avx512]argsort/random_5k/float                         -0.3156         -0.3156         94636         64770         94632         64769
[avx512 vs. avx512]argsort/random_100k/float                       -0.2890         -0.2891       2925335       2079775       2925217       2079636
[avx512 vs. avx512]argsort/random_1m/float                         -0.2202         -0.2201      39791664      31027804      39785129      31027170
[avx512 vs. avx512]argsort/random_10m/float                        -0.1582         -0.1582     982846827     827323238     982744913     827238848
[avx512 vs. avx512]argsort/sorted_10k/float                        -0.3140         -0.3140        196822        135020        196813        135017
[avx512 vs. avx512]argsort/constant_10k/float                      -0.3486         -0.3486         17496         11397         17495         11397
[avx512 vs. avx512]argsort/reverse_10k/float                       -0.3188         -0.3188        199944        136199        199932        136196
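
For readers checking the table, the first two numeric columns appear to be the relative change between the old and new measurements, i.e. (new - old) / old, so a negative value means the new run was faster. The sketch below reproduces that arithmetic; the function name `relative_change` is hypothetical and is not taken from the PR or from any benchmark tool.

```cpp
#include <cmath>

// Hypothetical helper (not part of the PR): relative change between an old
// and a new measurement. Negative results mean the new value is smaller,
// which for a timing means the new run was faster.
inline double relative_change(double old_value, double new_value)
{
    return (new_value - old_value) / old_value;
}
```

Applied to the first row (old 1389, new 1065), this gives (1065 - 1389) / 1389, roughly -0.2333, which matches the value in the table.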
