Illegal hardware instruction (core dumped) #297

Open
kaes1a opened this issue Dec 19, 2024 · 8 comments
Comments

@kaes1a

kaes1a commented Dec 19, 2024

Hi, when I compile the latest pktgen against DPDK 24.11.0 LTS and run it with
sudo ./usr/local/bin/pktgen -l 1-3 -n 1 -- -T -P -m "2.0,3.1"
I get the following messages:

sudo ./usr/local/bin/pktgen -l 1-3 -n 1 -- -T -P -m "2.0,3.1"

*** Copyright(c) <2010-2024>, Intel Corporation. All rights reserved.
*** Pktgen created by: Keith Wiles -- >>> Powered by <<<

EAL: Detected CPU lcores: 4
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: VFIO support initialized
EAL: Using IOMMU type 1 (Type 1)
[1] 13374 illegal hardware instruction sudo ./usr/local/bin/pktgen -l 1-3 -n 1 -- -T -P -m "2.0,3.1"

I then checked the kernel messages:

dmesg:
[18548.263895] pcieport 0000:00:1c.4: Enabling MPC IRBNCE
[18548.263905] pcieport 0000:00:1c.4: Intel PCH root port ACS workaround enabled
[18548.276493] pcieport 0000:00:1c.5: Enabling MPC IRBNCE
[18548.276500] pcieport 0000:00:1c.5: Intel PCH root port ACS workaround enabled
[18548.482901] pcieport 0000:00:1c.4: Enabling MPC IRBNCE
[18548.482912] pcieport 0000:00:1c.4: Intel PCH root port ACS workaround enabled
[18548.764898] pcieport 0000:00:1c.5: Enabling MPC IRBNCE
[18548.764906] pcieport 0000:00:1c.5: Intel PCH root port ACS workaround enabled
[18549.033074] traps: pktgen[13376] trap invalid opcode ip:5a1ea017a192 sp:7ffd02106850 error:0 in pktgen[5a1ea013d000+43000]

So what is the problem, and how can I fix it?
Thanks!
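
The `traps:` line in the kernel log above already narrows things down a little: it gives the faulting instruction pointer (`ip:`) and, in brackets, what appears to be the start address and size of the mapping that contains it. Subtracting the two gives an offset inside that mapping. The sketch below does this in C; `trap_offset` is a hypothetical helper written for this note, not part of pktgen or DPDK, and it assumes the exact line format shown in this report.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of pktgen or DPDK): extract the faulting
 * instruction pointer and the "[start+size]" pair from a "traps:" line and
 * compute the offset of the instruction inside that mapping.
 * Returns 0 on success, -1 if the line does not match the assumed format.
 */
int trap_offset(const char *line, uint64_t *offset)
{
    uint64_t ip, start, size;
    const char *ip_field = strstr(line, " ip:");
    const char *bracket = strrchr(line, '[');

    if (ip_field == NULL || bracket == NULL)
        return -1;
    if (sscanf(ip_field, " ip:%" SCNx64, &ip) != 1)
        return -1;
    if (sscanf(bracket, "[%" SCNx64 "+%" SCNx64, &start, &size) != 2)
        return -1;
    /* The instruction pointer must fall inside the reported mapping. */
    if (ip < start || ip - start >= size)
        return -1;

    *offset = ip - start;
    return 0;
}

#ifdef DEMO_MAIN
int main(void)
{
    const char *line =
        "traps: pktgen[13376] trap invalid opcode ip:5a1ea017a192 "
        "sp:7ffd02106850 error:0 in pktgen[5a1ea013d000+43000]";
    uint64_t off;

    if (trap_offset(line, &off) == 0)
        printf("offset in mapping: 0x%" PRIx64 "\n", off);
    return 0;
}
#endif
```

For the line in this report the result is 0x3d192. That offset is relative to the mapping, not necessarily to the start of the ELF file, so it only locates the instruction once the mapping's own file offset is known. A debug build run under GDB remains the more direct route.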

The build log is here:

>>> Ninja build in '/home/ubuntu/Pktgen-DPDK/builddir' buildtype=release
meson setup -Dbuildtype=release -Denable_lua=false /home/ubuntu/Pktgen-DPDK/builddir
The Meson build system
Version: 1.3.2
Source dir: /home/ubuntu/Pktgen-DPDK
Build dir: /home/ubuntu/Pktgen-DPDK/builddir
Build type: native build
Program cat found: YES (/usr/bin/cat)
Project name: pktgen
Project version: 24.10.3
C compiler for the host machine: cc (gcc 13.3.0 "cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0")
C linker for the host machine: cc ld.bfd 2.42
Host machine cpu family: x86_64
Host machine cpu: x86_64
Compiler for C supports arguments -mavx: YES
Compiler for C supports arguments -mavx2: YES
Compiler for C supports arguments -Wno-pedantic: YES
Compiler for C supports arguments -Wno-format-truncation: YES
Found pkg-config: YES (/usr/bin/pkg-config) 1.8.1
Did not find CMake 'cmake'
Found CMake: NO
Run-time dependency libfgen found: NO (tried pkgconfig and cmake)
Run-time dependency libdpdk found: YES 24.11.0
Message: prefix: /usr/local libdir: lib/x86_64-linux-gnu
Message: DPDK lib path: /usr/local/lib/x86_64-linux-gnu
Library rte_net_bond found: YES
Program python3 found: YES (/usr/bin/python3)
Library rte_net_i40e found: YES
Library rte_net_ixgbe found: YES
Library rte_net_ice found: YES
Library rte_bus_vdev found: YES
Run-time dependency threads found: YES
Run-time dependency numa found: YES 2.0.18
Run-time dependency pcap found: YES 1.10.4
Library dl found: YES
Library m found: YES
Library bsd found: YES
Program doxygen found: NO
Program sphinx-build found: NO
Build targets in project: 9

pktgen 24.10.3

  User defined options
    buildtype : release
    enable_lua: false

Found ninja-1.11.1 at /usr/bin/ninja
ninja: Entering directory `/home/ubuntu/Pktgen-DPDK/builddir'
[64/64] Linking target app/pktgen
>>> Ninja install to '/home/ubuntu/Pktgen-DPDK/usr/local'
ninja: Entering directory `/home/ubuntu/Pktgen-DPDK/builddir'
[0/1] Installing files.
Installing app/pktgen to /home/ubuntu/Pktgen-DPDK/usr/local/bin

@KeithWiles
Collaborator

The above error means an instruction was executed that is invalid for the CPU you are running on. It also looks like the code failed in DPDK routines, but I am not positive.

Which CPU are you running on? It could be that the code was built for one CPU and executed on another. Without an exact location it will be hard to debug.
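
One way to answer the CPU question from the machine itself is to look at the `flags` line of `/proc/cpuinfo` on Linux. The sketch below checks for a flag such as `avx` or `avx2` as a whole word, so that `avx` is not mistaken for `avx2`. Both function names are hypothetical helpers written for this note, and the code assumes the x86 Linux layout of `/proc/cpuinfo`.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `flag` occurs in `line` as a whole whitespace-separated word. */
int line_has_flag(const char *line, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = line;

    if (len == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        char end = p[len];
        int ends = (end == '\0' || end == ' ' || end == '\t' || end == '\n');

        if (starts && ends)
            return 1;
        p += 1; /* keep searching after a partial match such as "avx2" */
    }
    return 0;
}

/*
 * Check the first "flags" line of /proc/cpuinfo (Linux on x86 assumed).
 * Returns 1 if present, 0 if absent, -1 if the file or line is unavailable.
 */
int cpu_has_flag(const char *flag)
{
    char line[8192];
    int found = -1;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) == 0) {
            found = line_has_flag(line, flag);
            break;
        }
    }
    fclose(f);
    return found;
}

#ifdef DEMO_MAIN
int main(void)
{
    printf("avx:  %d\n", cpu_has_flag("avx"));
    printf("avx2: %d\n", cpu_has_flag("avx2"));
    return 0;
}
#endif
```

Comparing this result with the instruction-set options the build enabled shows whether the binary can contain instructions the host does not implement.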

@kaes1a
Author

kaes1a commented Dec 20, 2024

this is my machine info:

OS: Ubuntu 24.04.1 LTS x86_64 
Kernel: 6.8.0-51-generic
Terminal: /dev/pts/0
CPU: Intel i5-2520M (4) @ 2.500GHz
GPU: Intel 2nd Generation Core Processor Family
Memory: 2364MiB / 3808MiB

00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 05)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5)
00:1c.1 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 2 (rev b5)
00:1c.2 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 3 (rev b5)
00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 4 (rev b5)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b5)
00:1c.5 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 6 (rev b5)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation HM65 Express Chipset LPC Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family 6 port Mobile SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 05)
01:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
02:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
03:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
04:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
05:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
06:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)


AnonHugePages:         0 kB
ShmemHugePages:     6144 kB
FileHugePages:         0 kB
HugePages_Total:    1024
HugePages_Free:     1023
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         2097152 kB

In this case, I first compiled DPDK 24.11.0 and then pktgen. I made sure that both were compiled and executed exclusively on this machine. When I ran dpdk-testpmd, it functioned normally. While testing the ports with testpmd, I got the following statistics:

testpmd> show port stats all

  ######################## NIC statistics for port 0  ########################
  RX-packets: 14910417   RX-missed: 0          RX-bytes:  954267776
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 14923614   TX-errors: 0          TX-bytes:  955112248

  Throughput (since last show)
  Rx-pps:      1195303          Rx-bps:    611994648
  Tx-pps:      1196172          Tx-bps:    612440648
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 14924269   RX-missed: 0          RX-bytes:  955154780
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 14911123   TX-errors: 0          TX-bytes:  954313708

  Throughput (since last show)
  Rx-pps:      1196143          Rx-bps:    612438640
  Tx-pps:      1195249          Tx-bps:    611975384
  ############################################################################

So this at least shows that DPDK itself has no issues on this machine. What steps can I take to track down the error?
Thanks!

@KeithWiles
Collaborator

It looks like you will need to either add some printfs to the pktgen code to determine the exact location, or build pktgen with debug enabled and run it under GDB.

make clean debug
sudo gdb -args ./usr/local/bin/pktgen -l 1-3 -n 1 -- -T -P -m "2.0,3.1"

I think this is the command to run GDB, but I cannot check at the moment. GDB should stop at the error, and then we need to determine where in Pktgen it is failing.

I have not tested pktgen on an i5 processor and do not have one to test on.
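
If GDB is not available, a lighter variant of the printf approach is to install a SIGILL handler that reports the faulting address before exiting. This is only a sketch: `format_hex`, `on_sigill` and `install_sigill_reporter` are hypothetical names, nothing here is taken from pktgen, and it relies on POSIX `sigaction` with `SA_SIGINFO`, where `si_addr` holds the address of the faulting instruction for SIGILL.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/*
 * Write `value` as "0x..." hex into `buf` (needs 2*sizeof(uintptr_t)+3 bytes).
 * Returns the string length. printf is avoided because it is not
 * async-signal-safe.
 */
size_t format_hex(uintptr_t value, char *buf)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof(uintptr_t)];
    size_t n = 0, len = 0;

    do {
        tmp[n++] = digits[value & 0xf];
        value >>= 4;
    } while (value != 0);

    buf[len++] = '0';
    buf[len++] = 'x';
    while (n > 0)
        buf[len++] = tmp[--n];
    buf[len] = '\0';
    return len;
}

/* Handler: print the faulting instruction address to stderr, then exit. */
static void on_sigill(int sig, siginfo_t *info, void *ctx)
{
    static const char msg[] = "SIGILL at instruction address ";
    char buf[2 * sizeof(uintptr_t) + 3];
    size_t len = format_hex((uintptr_t)info->si_addr, buf);
    ssize_t r;

    (void)sig;
    (void)ctx;
    r = write(STDERR_FILENO, msg, sizeof msg - 1);
    r = write(STDERR_FILENO, buf, len);
    r = write(STDERR_FILENO, "\n", 1);
    (void)r;
    _exit(1);
}

/* Install the handler; returns 0 on success, -1 on failure (see errno). */
int install_sigill_reporter(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigill;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGILL, &sa, NULL);
}
```

Calling `install_sigill_reporter()` early in `main` would be enough for a quick test. The printed value is a runtime virtual address, so it still has to be related to the binary's load address before it can be matched to a source line, which GDB does automatically.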

@kaes1a
Author

kaes1a commented Dec 23, 2024

When I build in debug mode with make clean debug and then run it under gdb, I get this output:

(gdb) run
Starting program: /home/ubuntu/Pktgen-DPDK/usr/local/bin/pktgen -l 1-3 -n 1 -- -T -P -m 2.0,3.1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libcap.so.2

*** Copyright(c) <2010-2024>, Intel Corporation. All rights reserved.
*** Pktgen  created by: Keith Wiles -- >>> Powered by <<<

[Detaching after vfork from child process 28756]
[Detaching after vfork from child process 28759]
[Detaching after vfork from child process 28761]
[Detaching after vfork from child process 28763]
EAL: Detected CPU lcores: 4
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libmlx5.so.1
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libmlx4.so.1
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libmana.so.1
[New Thread 0x7ffff46006c0 (LWP 28766)]
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
[New Thread 0x7ffff3c006c0 (LWP 28767)]
EAL: Selected IOVA mode 'VA'
EAL: VFIO support initialized
[New Thread 0x7ffff32006c0 (LWP 28768)]
[New Thread 0x7ffff28006c0 (LWP 28769)]
EAL: Using IOMMU type 1 (Type 1)
[New Thread 0x7ffff1e006c0 (LWP 28770)]
Invalid mapping format '2.0,3.1'
!ERROR!: error or too many mapping entries
Initialize Port 0 ...

Thread 1 "pktgen" received signal SIGILL, Illegal instruction.
0x000055555559c160 in _mm256_loadu_si256 (__P=0x555555602bc0 <default_port_conf>) at /usr/lib/gcc/x86_64-linux-gnu/13/include/avxintrin.h:929
929	  return *__P;
(gdb)

@KeithWiles
Collaborator

KeithWiles commented Dec 23, 2024

The invalid mapping format error seems odd; change -m 2.0,3.1 to -m 2.0 -m 3.1. As for the illegal instruction, it is failing on an AVX instruction that uses a 256-bit register. It seems the processor does not support this instruction. Can you issue the where command in gdb to see what the stack looks like? Also use the p __P gdb command to see what the value of __P is.

@kaes1a
Author

kaes1a commented Dec 24, 2024

OK, thanks! Here is the debug log:

(gdb) run
Starting program: /home/ubuntu/Pktgen-DPDK/usr/local/bin/pktgen -l 1-3 -n 1 -- -T -P -m 2.0 -m 3.1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libcap.so.2

*** Copyright(c) <2010-2024>, Intel Corporation. All rights reserved.
*** Pktgen  created by: Keith Wiles -- >>> Powered by <<<

[Detaching after vfork from child process 33205]
[Detaching after vfork from child process 33208]
[Detaching after vfork from child process 33210]
[Detaching after vfork from child process 33212]
EAL: Detected CPU lcores: 4
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libmlx5.so.1
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libmlx4.so.1
warning: could not find '.gnu_debugaltlink' file for /lib/x86_64-linux-gnu/libmana.so.1
[New Thread 0x7ffff46006c0 (LWP 33214)]
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
[New Thread 0x7ffff3c006c0 (LWP 33215)]
EAL: Selected IOVA mode 'VA'
EAL: VFIO support initialized
[New Thread 0x7ffff32006c0 (LWP 33216)]
[New Thread 0x7ffff28006c0 (LWP 33217)]
EAL: Using IOMMU type 1 (Type 1)
[New Thread 0x7ffff1e006c0 (LWP 33218)]
  Create: 'RX-L2/P0/S0     ' - Memory used (MBUFs 16,384 x size  2,176) =   34,817 KB @ 0x1002c3dc0
  Create: 'TX-L2/P0/S0     ' - Memory used (MBUFs 16,384 x size  2,176) =   34,817 KB @ 0x102afdf00
  Create: 'SP-L2/P0/S0     ' - Memory used (MBUFs  1,024 x size  2,176) =    2,177 KB @ 0x102cf8c40
  Create: 'RX-L3/P1/S0     ' - Memory used (MBUFs 16,384 x size  2,176) =   34,817 KB @ 0x1052acf80
  Create: 'TX-L3/P1/S0     ' - Memory used (MBUFs 16,384 x size  2,176) =   34,817 KB @ 0x107cfdf00
  Create: 'SP-L3/P1/S0     ' - Memory used (MBUFs  1,024 x size  2,176) =    2,177 KB @ 0x107ef8c40
                                                      Total memory used =  143,622 KB
Initialize Port 0 ...

Thread 1 "pktgen" received signal SIGILL, Illegal instruction.
0x000055555559c160 in _mm256_loadu_si256 (__P=0x555555602bc0 <default_port_conf>) at /usr/lib/gcc/x86_64-linux-gnu/13/include/avxintrin.h:929
929	  return *__P;
(gdb) where
#0  0x000055555559c160 in _mm256_loadu_si256 (__P=0x555555602bc0 <default_port_conf>) at /usr/lib/gcc/x86_64-linux-gnu/13/include/avxintrin.h:929
#1  rte_mov32 (src=0x555555602bc0 <default_port_conf> "", dst=0x7fffffffcf30 "") at /usr/local/include/rte_memcpy.h:127
#2  rte_memcpy_generic (n=2264, src=0x555555602bc0 <default_port_conf>, dst=0x7fffffffcf30) at /usr/local/include/rte_memcpy.h:453
#3  rte_memcpy (n=2280, src=0x555555602bc0 <default_port_conf>, dst=0x7fffffffcf30) at /usr/local/include/rte_memcpy.h:757
#4  initialize_port_info (pid=0) at ../app/pktgen-port-cfg.c:138
#5  0x000055555559d604 in pktgen_config_ports () at ../app/pktgen-port-cfg.c:315
#6  0x0000555555598a77 in main (argc=7, argv=0x7fffffffe440) at ../app/pktgen-main.c:464
(gdb) p __P
$1 = (const __m256i_u *) 0x555555602bc0 <default_port_conf>
(gdb)

@KeithWiles
Collaborator

This instruction is executed in the rte_memcpy() routine, which uses AVX instructions to copy memory quickly. To me this means that DPDK is not compiled correctly for this machine and is using instructions that are not present on it. I think you stated this was an i5 processor, which may not be suitable for DPDK.
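
One detail that may be worth checking: the build log shows the compiler accepting both -mavx and -mavx2, which tests the compiler rather than the host CPU. A Sandy Bridge part such as the i5-2520M is documented as supporting AVX but not AVX2, so code generated with AVX2 enabled could fault on it even where plain AVX works. The sketch below shows the general idea of guarding a 256-bit copy with a runtime check. It is not DPDK's rte_memcpy implementation; `copy32` and `copy32_avx` are hypothetical names, and it assumes GCC or Clang on x86-64, falling back to `memcpy` elsewhere.

```c
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>

/*
 * Hypothetical helper: copy 32 bytes with one 256-bit AVX load and store.
 * The target attribute enables AVX for this function only, so the rest of
 * the file can be built without -mavx.
 */
__attribute__((target("avx")))
static void copy32_avx(uint8_t *dst, const uint8_t *src)
{
    __m256i v = _mm256_loadu_si256((const __m256i *)src);
    _mm256_storeu_si256((__m256i *)dst, v);
}
#endif

/* Copy exactly 32 bytes; use the AVX path only if the running CPU has AVX. */
void copy32(uint8_t *dst, const uint8_t *src)
{
#if defined(__x86_64__) && defined(__GNUC__)
    if (__builtin_cpu_supports("avx")) {
        copy32_avx(dst, src);
        return;
    }
#endif
    memcpy(dst, src, 32); /* portable fallback */
}
```

With a check like this the decision is made on the machine that runs the code, whereas a compile-time option fixes the instruction set when the binary is built.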

@kaes1a
Author

kaes1a commented Dec 25, 2024

Haha, I think so too. This CPU is already older than old. I'm testing DPDK on this device just to evaluate its performance and verify whether it can still be used in a production environment.

Thank you again! I think I might need to upgrade the devices. 😆
