first cut at fixing a crc32 issue reported by a user #308

Open
wants to merge 5 commits into base: develop

isildur-g

This might fix a problem a user reported, where the build uses the system library's _mm_crc32_u64 instead of the SIMDe version. Opening a PR to test it in CI.
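
For context, a portable fallback for the SSE4.2 CRC32 intrinsics can be sketched in plain C. This is not taken from this PR or from SIMDe, and the names `sw_crc32c_u8` and `sw_crc32c_u64` are hypothetical. The sketch assumes the semantics of the hardware `crc32` instruction: the CRC-32C (Castagnoli) polynomial in reflected form, 0x82F63B78, with no initial or final inversion applied by the instruction itself.

```c
#include <stdint.h>

/* Hypothetical portable stand-in for _mm_crc32_u8: folds one byte into a
 * running CRC-32C value, bit by bit, using the reflected polynomial. */
static uint32_t sw_crc32c_u8(uint32_t crc, uint8_t v) {
    crc ^= v;
    for (int bit = 0; bit < 8; bit++) {
        /* mask is all ones when the low bit is set, zero otherwise. */
        uint32_t mask = (uint32_t)0 - (crc & 1u);
        crc = (crc >> 1) ^ (0x82F63B78u & mask);
    }
    return crc;
}

/* Hypothetical stand-in for _mm_crc32_u64: feeds the eight bytes of v,
 * least significant byte first. The result fits in the low 32 bits. */
static uint64_t sw_crc32c_u64(uint64_t crc, uint64_t v) {
    uint32_t c = (uint32_t)crc;
    for (int i = 0; i < 8; i++) {
        c = sw_crc32c_u8(c, (uint8_t)(v >> (8 * i)));
    }
    return (uint64_t)c;
}
```

With the conventional initial value 0xFFFFFFFF and a final inversion, feeding the ASCII string "123456789" through `sw_crc32c_u8` yields the standard CRC-32C check value 0xE3069283. A fallback of this shape never emits the SSE4.2 instruction, so it cannot raise SIGILL on a CPU that lacks it.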

@Unit193

Unit193 commented Jul 27, 2024

When testing with this PR, I seem to get

Program terminated with signal SIGILL, Illegal instruction.
#0  0x00007f6fce2a9f78 in hs_alloc_scratch (db=<optimized out>, scratch=0x7f6fce3e6bc8) at ./src/scratch.c:367
367     ./src/scratch.c: No such file or directory.
(gdb) bt full
#0  0x00007f6fce2a9f78 in hs_alloc_scratch (db=<optimized out>, scratch=0x7f6fce3e6bc8) at ./src/scratch.c:367
        rv = <optimized out>
        rose = 0x560141f522c0
        resize = 1
        proto = 0x560141ecdbc0
        proto_tmp = 0x560141ecdb90
        proto_ret = 0
        som_store_count = 0
        queueCount = 7
        bStateSize = <optimized out>
        fullStateSize = 108

@isildur-g
Author

Hello,
I can't reproduce this problem, either on a VM with extensions enabled only up to SSE2, or on pretty old real hardware (Intel Core 5, the oldest x86-64 machine I have running here). Are you sure your machine actually supports SSE2? What CPU/model is it? Could you also paste the exact cmake options, env vars, etc. you used to build it?

@markos

markos commented Jul 30, 2024

@Unit193 ^

@Unit193

Unit193 commented Jul 30, 2024

Well it claims to, but it's pretty dang old too.

Handle 0x0400, DMI type 4, 40 bytes
Processor Information
        Socket Designation: CPU1
        Type: Central Processor
        Family: Xeon
        Manufacturer: Intel
        ID: 64 0F 00 00 FF FB EB BF
        Signature: Type 0, Family 15, Model 6, Stepping 4
        Flags:
                FPU (Floating-point unit on-chip)
                VME (Virtual mode extension)
                DE (Debugging extension)
                PSE (Page size extension)
                TSC (Time stamp counter)
                MSR (Model specific registers)
                PAE (Physical address extension)
                MCE (Machine check exception)
                CX8 (CMPXCHG8 instruction supported)
                APIC (On-chip APIC hardware supported)
                SEP (Fast system call)
                MTRR (Memory type range registers)
                PGE (Page global enable)
                MCA (Machine check architecture)
                CMOV (Conditional move instruction supported)
                PAT (Page attribute table)
                PSE-36 (36-bit page size extension)
                CLFSH (CLFLUSH instruction supported)
                DS (Debug store)
                ACPI (ACPI supported)
                MMX (MMX technology supported)
                FXSR (FXSAVE and FXSTOR instructions supported)
                SSE (Streaming SIMD extensions)
                SSE2 (Streaming SIMD extensions 2)
                SS (Self-snoop)
                HTT (Multi-threading)
                TM (Thermal monitor supported)
                PBE (Pending break enabled)
        Version: Intel(R) Xeon(TM) CPU 3.00GHz
        Voltage: 1.4 V
        External Clock: 667 MHz
        Max Speed: 3600 MHz
        Current Speed: 2333 MHz
        Status: Populated, Enabled
        Upgrade: Socket LGA771
        L1 Cache Handle: 0x0700
        L2 Cache Handle: 0x0701
        L3 Cache Handle: 0x0702
        Serial Number: Not Specified
        Asset Tag: Not Specified
        Part Number: Not Specified
        Core Count: 2
        Core Enabled: 2
        Thread Count: 4
        Characteristics:
                64-bit capable

Generally speaking, something like -DBUILD_AVX2=off -DBUILD_AVX512=off -DBUILD_AVX512VBMI=off -DFAT_RUNTIME=off -DBUILD_SSE2_SIMDE=on is used.
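
The `ID` field in the dmidecode output above can be decoded by hand, which helps confirm what the listing does and does not say. The sketch below assumes the usual layout of that field on x86: the first four bytes are CPUID leaf 1 EAX (the signature) and the last four are CPUID leaf 1 EDX (the feature flags), both little-endian. The helper names are hypothetical. Under that assumption the `Flags` list only covers EDX bits; SSE3, SSSE3, and SSE4.2 are reported in ECX, so their absence from the list is not by itself evidence either way.

```c
#include <stdint.h>

/* Assemble a little-endian 32-bit word from four bytes. */
static uint32_t le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct cpu_sig { unsigned family, model, stepping; };

/* Decode CPUID.1:EAX into display family, model, and stepping. */
static struct cpu_sig decode_signature(uint32_t eax) {
    unsigned stepping   = eax & 0xF;
    unsigned model      = (eax >> 4) & 0xF;
    unsigned family     = (eax >> 8) & 0xF;
    unsigned ext_model  = (eax >> 16) & 0xF;
    unsigned ext_family = (eax >> 20) & 0xFF;
    struct cpu_sig s;
    s.stepping = stepping;
    /* The extended fields only apply for certain base family values. */
    s.family = (family == 0xF) ? family + ext_family : family;
    s.model  = (family == 0x6 || family == 0xF) ? (ext_model << 4) + model
                                                : model;
    return s;
}

/* CPUID.1:EDX bit 26 is the SSE2 feature flag. */
static int has_sse2(uint32_t edx) { return (int)((edx >> 26) & 1u); }
```

Applied to the bytes `64 0F 00 00 FF FB EB BF`, this gives family 15, model 6, stepping 4 with the SSE2 bit set, matching the Signature and Flags lines that dmidecode printed.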
