Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation Fault Caused by mhpmcounter10 Access Not Implemented in DUT and Spike #129

Open
Bill94l opened this issue Dec 9, 2024 · 1 comment

Comments

@Bill94l
Copy link

Bill94l commented Dec 9, 2024

Hi,

The attached riscv-dv generated code attempts to access mhpmcounter10, which is not implemented in either the DUT or Spike. This results in a segmentation fault and the code fails. Below is the specific line of code causing the issue:

    800001b6:	b0aa15f3          	csrrw	a1,mhpmcounter10,s4

After this issue #121 (comment) I modified the number of HPM counters in Spike to ensure consistency between the DUT and Spike but that doesn't change anything:
https://github.com/SpinalHDL/riscv-isa-sim/blob/5b37ab699f6176ea7f2d2b3187f75625eca7e511/riscv/processor.h#L22

#define N_HPMCOUNTERS 7 //29

Results:

  1. With the regression simulator:
./obj_dir/VNaxRiscv --name dv --output-dir failed --load-elf basic_test_1.o --start-symbol _start --pass-symbol pass --fail-symbol fail --no-putc-flush --seed 96102624 ${ARGS} --trace-ref

*** MISSMATCH PC DUT=80103000 REF=8010301c ***

TIME=574
LAST PC COMMIT=80103000
INCOMING SPIKE PC=80101bf8
ROB_ID=x1c
FAILURE dv
  1. With RVLS using SocSim
[info] **************************************************
[info] DUT-PC = 00000000800001b6, REF-PC=800001b6, INST = b0aa15f3
[info] Debug: CSR_MHPMCOUNTER3 = 0xb03, CSR_MHPMCOUNTER31 = 0xb1f, csrAddress = 0xb0a, csrReadData = 0x0
[info] Warning: csrmap[0xb0a] is null. Skipping write.
[info] # A fatal error has been detected by the Java Runtime Environment:
[info] #
[info] #  SIGSEGV (0xb) at pc=0x00007ffa1f6e01e8, pid=2445164, tid=0x00007ffa68e47640
[info] #
[info] # JRE version: OpenJDK Runtime Environment (8.0_432) (build 1.8.0_432-8u432-ga~us1-0ubuntu2~22.04-ga)
[info] # Java VM: OpenJDK 64-Bit Server VM (25.432-bga mixed mode linux-amd64 compressed oops)
[info] # Problematic frame:
[info] # C  [rvls.so+0x3a91e8]  processor_t::get_csr(int, insn_t, bool, bool)+0x58
[info] #
[info] # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
[info] #
[info] # An error report file with more information is saved as:
[info] # /media/big_dd/nanotrust/bi262934/naxriscv_tuleap/naxriscv_dv_vf/NaxRiscv/hs_err_pid2445164.log
[info] #
[info] # If you would like to submit a bug report, please visit:
[info] #   http://bugreport.java.com/bugreport/crash.jsp
[error] Nonzero exit code returned from runner: 134
[error] (Compile / runMain) Nonzero exit code returned from runner: 134
[error] Total time: 12 s, completed 10 déc. 2024 00:09:08

failed_test.zip

Could you please provide guidance on how to handle this mismatch or segmentation fault when accessing unimplemented HPM counters?

Thank you!

@Bill94l Bill94l changed the title Segmentation Fault Caused by mhpmcounter10 Access Not Implemented in DUT or Spike Segmentation Fault Caused by mhpmcounter10 Access Not Implemented in DUT and Spike Dec 9, 2024
@Dolu1990
Copy link
Member

Hi,

With the regression simulator:

I would say, better focus on getting things ok with RVLS, and omit this one.

With RVLS using SocSim

So, when i have crashed happening in RVLS (when called from SpinalSim), then what i do, is to ask the testbench to generate the trace.log file (--trace argument should do it), which contains all the execution trace in the format RVLS is able to read.
Then with that trace.log file, i run RVLS in debug mode (in a standalone mode), with the -f trace.log argument.
There i can use regular C/C++ debug tools and figure out what is going wrong.

Just be sure to compile riscv-isa-sim in debug mode by adding :
CFLAGS='-g -O0' CXXFLAGS='-g -O0'
as arguments of the configure command

So in short, figuring out why RVLS crash is the first thing to do :)
It maybe due to that :
https://github.com/SpinalHDL/rvls/blob/b5c15e129f8168c2317b2c129aef91adf935bfeb/src/hart.cpp#L234
RVLS will syncronise the pmucounter on access, as they are hardware specific. Thing is your changes in riscv-isa-sim seems to deleted the extra counters ? (rvls assume their CSR is defined)

The implement intends of NaxRiscv is to match the RISC-V spec, but there may be a few missmatches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants