about the gen.scala and ifetch #119

Open
duanjiulon opened this issue Jul 17, 2024 · 18 comments

@duanjiulon

duanjiulon commented Jul 17, 2024

Hi, I have recently been using your Nax gen.scala file to generate a core with three AXI interfaces (ibus, dbus, pbus). I connected the core through an AXI crossbar to an AXI system in which the on-chip RAM is mapped at 0x0 and the DDR at 0x40000000. The following problem occurred while debugging over AXI:

  1. When the program is loaded at address 0x0 through openocd+gdb, single-step debugging works.
  2. When the program is loaded into the DDR at 0x40000000 through openocd+gdb, the following error appears:
Loading section .init, size 0x6e lma 0x40000000
Loading section .text, size 0x16fc lma 0x40000070
Loading section .data, size 0x858 lma 0x4000176c
Start address 0x40000000, load size 8130
Transfer rate: 63 KB/sec, 2710 bytes/write.
(gdb) si
unable to resume hart 0
  dmstatus =0x00430c82
  was stepping, halting
unable to halt hart 0
  dmcontrol=0x00000001
  dmstatus =0x00430c82
Hart was not halted after single step!
unable to step rtos hart
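
For reference, the dmstatus value in this log can be decoded by hand. The sketch below assumes the dmstatus field layout of the RISC-V debug specification version 0.13; the helper names (`dmstatus_bit`, `dmstatus_version`, `dmstatus_print`) are illustrative and are not part of OpenOCD.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* dmstatus bit positions, assumed from the RISC-V debug specification 0.13. */
enum dmstatus_bit_pos {
    DMSTATUS_AUTHENTICATED = 7,
    DMSTATUS_ANYHALTED     = 8,
    DMSTATUS_ALLHALTED     = 9,
    DMSTATUS_ANYRUNNING    = 10,
    DMSTATUS_ALLRUNNING    = 11,
    DMSTATUS_ANYRESUMEACK  = 16,
    DMSTATUS_ALLRESUMEACK  = 17,
    DMSTATUS_IMPEBREAK     = 22
};

/* Extract a single flag from a raw dmstatus value. */
static unsigned dmstatus_bit(uint32_t dmstatus, int bit) {
    assert(bit >= 0 && bit < 32);
    return (dmstatus >> bit) & 1u;
}

/* The debug-spec version field occupies the low four bits. */
static unsigned dmstatus_version(uint32_t dmstatus) {
    return dmstatus & 0xfu;
}

/* Print the flags that matter when a halt or step request fails. */
static void dmstatus_print(uint32_t v) {
    printf("version=%u authenticated=%u allhalted=%u allrunning=%u allresumeack=%u\n",
           dmstatus_version(v),
           dmstatus_bit(v, DMSTATUS_AUTHENTICATED),
           dmstatus_bit(v, DMSTATUS_ALLHALTED),
           dmstatus_bit(v, DMSTATUS_ALLRUNNING),
           dmstatus_bit(v, DMSTATUS_ALLRESUMEACK));
}
```

Under that assumed layout, 0x00430c82 has allrunning set and allhalted clear, so the debug module still reports the hart as running after the halt request, which matches the "unable to halt hart 0" message.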

From the captured waveform, my preliminary conclusion is that the instruction fetch (ifetch) fails at address 0x40000000.
4A2D549A-3065-4A61-8777-2C704E7860FD
See the picture above.
In gen.scala, I set:

def plugins = {
    val l = Config1.plugins(
      withRdTime = true,
      aluCount    = 2,
      decodeCount = 2,
      debugTriggers = 4,
      withDedicatedLoadAgu = true,
      withRvc = true,
      withLoadStore = true,
      withMmu = true,
      withDebug = true,
      withEmbeddedJtagTap = false,
      jtagTunneled = false,
      withFloat = false,
      withDouble = false,
      withLsu2 = true,
      lqSize = 16,
      sqSize = 16,
      withCoherency = true,
      ioRange = a => a(31 downto 28) === 0xf// || !a(12)//(a(5, 6 bits) ^ a(12, 6 bits)) === 51
    )

May I ask which parameter could be causing this fetch issue? Thanks.

@duanjiulon
Author


60C5D223-9405-47B0-982E-06B03B26B0B1

0fdd12cd-e038-4a25-b084-1f83504b4812

@Dolu1990
Member

Hi,

The fetchRange parameter can also matter, but by default it is:
fetchRange : UInt => Bool = _(31 downto 28) =/= 0x1,

Which should be fine.
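
As a quick sanity check, the two range predicates quoted in this thread can be re-expressed in C and evaluated for the addresses involved. This is only a host-side sketch: the real predicates are the Scala lambdas `ioRange` and `fetchRange`, and the function names below are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Top nibble of a 32-bit address, i.e. a(31 downto 28) in the Scala lambdas. */
static uint32_t top_nibble(uint32_t addr) {
    return addr >> 28;
}

/* Mirrors: ioRange = a => a(31 downto 28) === 0xf */
static bool in_io_range(uint32_t addr) {
    return top_nibble(addr) == 0xfu;
}

/* Mirrors: fetchRange = _(31 downto 28) =/= 0x1 */
static bool in_fetch_range(uint32_t addr) {
    return top_nibble(addr) != 0x1u;
}
```

With these two predicates, both 0x0 and 0x40000000 are fetchable and neither is treated as I/O, so, assuming no other address filter is involved, the range configuration alone would not explain a fetch failure at 0x40000000.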

Did you first try the OpenOCD telnet interface?
Try using it to load binaries into memory and read them back.

Also, one important step is to run everything in simulation (including JTAG / OpenOCD); that way we aren't debugging a black box.
What do you generally use to run simulations?

@duanjiulon
Author

duanjiulon commented Jul 17, 2024 via email

@Dolu1990
Member

To reduce the possible sources of bugs, maybe try with:
aluCount = 1,
decodeCount = 1,
debugTriggers = 0,
withDedicatedLoadAgu = true,
withRvc = false,

This will reduce the CPU complexity and may work around a potential bug in the debug module.

@duanjiulon
Author

duanjiulon commented Jul 17, 2024

> Did you first try the OpenOCD telnet interface?
> Try using it to load binaries into memory and read them back.

  1. I tried your method. After loading through GDB, the OpenOCD terminal can read back the corresponding hexadecimal data with the mdw command. Loading goes through dbus, but as soon as I run or single-step the program, the core freezes. Running the program from the on-chip RAM address does not have this problem, while running it from the DDR address does, so I suspect an issue with ifetch.
    I also used the default fetchRange setting without any modification.
  2. The cores generated through the LiteX command line all have the 'coherent' option enabled by default. Can it be disabled before generating the LiteX SoC?

@duanjiulon
Author

> To reduce the possible sources of bugs, maybe try with: aluCount = 1, decodeCount = 1, debugTriggers = 0, withDedicatedLoadAgu = true, withRvc = false,
>
> This will reduce the CPU complexity and may work around a potential bug in the debug module.

Hi, I have recently embedded the Nax core into an SoC without memory coherency, and debugging the peripherals works without issues. However, I now need a timer to run the Dhrystone benchmark on this SoC. I noticed that the generated Nax core has a timer input:

input  wire [63:0]    PrivilegedPlugin_io_rdtime

assign _zz_EU0_CsrAccessPlugin_logic_fsm_readLogic_csrValue_17[31 : 0] = PrivilegedPlugin_io_rdtime[31 : 0];
assign _zz_EU0_CsrAccessPlugin_logic_fsm_readLogic_csrValue_18[31 : 0] = PrivilegedPlugin_io_rdtime[63 : 32];

May I ask which command I should use to read this timer?

@duanjiulon
Author

> Hi,
>
> The fetchRange parameter can also matter, but by default it is: fetchRange : UInt => Bool = _(31 downto 28) =/= 0x1,
>
> Which should be fine.
>
> Did you first try the OpenOCD telnet interface? Try using it to load binaries into memory and read them back.
>
> Also, one important step is to run everything in simulation (including JTAG / OpenOCD); that way we aren't debugging a black box. What do you generally use to run simulations?

Regarding the earlier program-loading issue: using an on-chip logic analyzer, I found that after GDB loads the program, there will be an authorization failure when reading, and the reason for the failure is that the Dcache has an abnormal writable backup. After loading into DDR, this signal cannot be pulled down normally, so I forced it low directly, and then both program loading (dbus) and instruction fetch (ibus) work. Is there anything else I should pay attention to here?

@Dolu1990
Member

> rdtime

To access it, you can use the CSR that GCC defines as "time":
https://github.com/SpinalHDL/NaxSoftware/blob/c63c0ce9311160a7965637fb7de5899c3a5110b8/baremetal/driver/riscv.h#L96
=>
x = csr_read(time);
should be OK.

Otherwise you can use the cycle counter:
x = csr_read(cycle);
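
If the core is built as RV32 (an assumption, suggested by the generated Verilog splitting `PrivilegedPlugin_io_rdtime` into two 32-bit CSR values), a single `csr_read(time)` returns only the low 32 bits. A common pattern is to read the high half, then the low half, then the high half again, and retry if the high half changed in between. The sketch below takes the two readers as function pointers so the loop can be tried on a host; on the target they would presumably wrap `csr_read(timeh)` and `csr_read(time)`. The names `csr_reader_t`, `read_time64`, and the `fake_*` helpers are hypothetical.

```c
#include <stdint.h>

/* Reader for one 32-bit CSR half; on the target this would wrap csr_read(). */
typedef uint32_t (*csr_reader_t)(void);

/* Combine the high and low halves into a 64-bit value without tearing. */
static uint64_t read_time64(csr_reader_t read_hi, csr_reader_t read_lo) {
    uint32_t hi, lo, hi2;
    do {
        hi  = read_hi();
        lo  = read_lo();
        hi2 = read_hi();
    } while (hi != hi2); /* the low half wrapped between the reads: retry */
    return ((uint64_t)hi << 32) | lo;
}

/* Host-side stand-in for the hardware counter (illustration only). */
static uint64_t fake_time;

static uint32_t fake_read_hi(void) {
    return (uint32_t)(fake_time >> 32);
}

/* Advances the fake counter by one tick after each low-half read. */
static uint32_t fake_read_lo(void) {
    uint32_t lo = (uint32_t)fake_time;
    fake_time += 1;
    return lo;
}
```

The retry matters only when the low half overflows between the two high-half reads; without it, the combined value could be off by 2^32 ticks.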

> there will be an authorization failure when reading

What kind of authorization failure?

> abnormal writable backup
> pulled down normally

What do you mean?

@duanjiulon
Author

duanjiulon commented Jul 31, 2024

> What do you mean?

See the captures below.
After I loaded the program over JTAG into the DDR region at 0x40000000, the signal DataCachePluginloggic_cache_iw_ritebackBusy stayed high. Checking the memory at that address with commands such as mdw confirmed that the program had been loaded successfully.
By comparison, when the program is loaded into ROM over JTAG, the result is the one shown in the second image.
write_back_busy_1
write_back_busy_2

@Dolu1990
Member

Ahhh, I forgot you were using LiteX.

writeback_slot_1_valid being stuck high is really weird, as it is quite decoupled from the rest of the CPU; it really sits at the border toward the memory system.

One thing: did you let the LiteX BIOS calibrate the DRAM after the reset and after starting OpenOCD?

@duanjiulon
Author

> Ahhh, I forgot you were using LiteX.
>
> writeback_slot_1_valid being stuck high is really weird, as it is quite decoupled from the rest of the CPU; it really sits at the border toward the memory system.
>
> One thing: did you let the LiteX BIOS calibrate the DRAM after the reset and after starting OpenOCD?

No, this is my own SoC, which no longer depends on LiteX and has no memory coherency. It is just the Dcache-to-AXI4 interface connected to an AXI fabric, which in turn connects to a DDR peripheral, so there is no BIOS.

@Dolu1990
Member

Ahhh, then the issue is likely the source / ID handling in the memory interconnect.
Can you probe valid + ready + id on the various cache/io_mem buses?
There is a good chance that the memory interconnect gives back the wrong ID in a response.
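
To make the suspected failure mode concrete: a master that tags requests with IDs matches each response to an in-flight request by ID, so a response carrying an ID that was never issued cannot be matched, and the request it was meant to complete could stay pending indefinitely. The sketch below is a simplified host-side model of that bookkeeping, with one response per request and an assumed 4-bit ID space. It is not NaxRiscv's actual implementation, and `id_tracker_t` and its functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_IDS 16u /* assumed ID space (4-bit IDs), for illustration only */

typedef struct {
    uint32_t outstanding[MAX_IDS]; /* requests in flight, per ID */
} id_tracker_t;

static void tracker_init(id_tracker_t *t) {
    for (unsigned i = 0; i < MAX_IDS; i++) {
        t->outstanding[i] = 0;
    }
}

/* Record a request issued with the given ID. */
static void tracker_request(id_tracker_t *t, unsigned id) {
    assert(id < MAX_IDS);
    t->outstanding[id]++;
}

/* Returns true if the response matches an in-flight request.
 * Returns false for an ID with nothing pending (a misrouted or rewritten ID). */
static bool tracker_response(id_tracker_t *t, unsigned id) {
    if (id >= MAX_IDS || t->outstanding[id] == 0) {
        return false;
    }
    t->outstanding[id]--;
    return true;
}

/* True while any request is still waiting for its response. */
static bool tracker_busy(const id_tracker_t *t) {
    for (unsigned i = 0; i < MAX_IDS; i++) {
        if (t->outstanding[i] != 0) {
            return true;
        }
    }
    return false;
}
```

In this model, a request sent with ID 3 that is answered with ID 0 leaves the tracker busy forever, which is the kind of symptom a stuck writeback-busy signal would be consistent with.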

@duanjiulon
Author

> Ahhh, then the issue is likely the source / ID handling in the memory interconnect. Can you probe valid + ready + id on the various cache/io_mem buses? There is a good chance that the memory interconnect gives back the wrong ID in a response.

This is the result after GDB loads the binary file:
write_back_busy_3

@Dolu1990
Member

Hi,

You also need to capture the rsp_id for both reads and writes.

@duanjiulon
Author

duanjiulon commented Jul 31, 2024

> rsp_id
write_back_busy_4
And when loading into the on-chip RAM (starting at 0x0):
write_back_busy_5

@Dolu1990
Member

How did you hook the SoC up to the external memory controller?
Maybe the issue is there?
Bad ID handling?

@Dolu1990
Member

Dolu1990 commented Aug 3, 2024

To be sure which side the issue is on, I think we need to look at the AXI signals (valid / ready / id) with the logic analyser,
as well as at the CPU side simultaneously.

@duanjiulon
Author

duanjiulon commented Aug 5, 2024

> To be sure which side the issue is on, I think we need to look at the AXI signals (valid / ready / id) with the logic analyser, as well as at the CPU side simultaneously.

Dear Dolu, I want to use Verilator to simulate this, but I also want to reproduce the JTAG scenario from the hardware run; that is, I want to load binaries over JTAG in simulation to generate waveforms. How should I use the resources in the existing git repository?
For example, the file mt48lc16m16a2 is used in your Spinal git repository, but when I tried it, Verilator could not compile this SDRAM simulation model properly. How did you test it? This is how I instantiate it:

  wire [10:0] io_sdram_ADDR;
  wire [1:0] io_sdram_BA;
  wire [31:0] io_sdram_DQ;
  wire [31:0] io_sdram_DQ_read;
  wire [31:0] io_sdram_DQ_write;
  wire  io_sdram_DQ_writeEnable;
  wire [1:0] io_sdram_DQM;
  wire  io_sdram_CASn;
  wire  io_sdram_CKE;
  wire  io_sdram_CSn;
  wire  io_sdram_RASn;
  wire  io_sdram_WEn;

  assign io_sdram_DQ_read = io_sdram_DQ;
  assign io_sdram_DQ = io_sdram_DQ_writeEnable ? io_sdram_DQ_write : 32'bZZZZZZZZZZZZZZZZ;

  mt48lc16m16a2 sdram(
    .Dq(io_sdram_DQ),
    .Addr(io_sdram_ADDR),
    .Ba(io_sdram_BA),
    .Clk(soc_clk),
    .Cke(io_sdram_CKE),
    .Cs_n(io_sdram_CSn),
    .Ras_n(io_sdram_RASn),
    .Cas_n(io_sdram_CASn),
    .We_n(io_sdram_WEn),
    .Dqm(io_sdram_DQM)
  );

AlSdrDdrSoC SoC_AXI_Fabric(
    .io_asyncReset(sysrst),
    .io_axiClk(soc_clk),
    .io_coreInstrAxi_ar_valid(io_coreInstrAxi_ar_valid),
    .io_coreInstrAxi_ar_ready(io_coreInstrAxi_ar_ready),
    .io_coreInstrAxi_ar_payload_addr(io_coreInstrAxi_ar_payload_addr),
    .io_coreInstrAxi_ar_payload_id(io_coreInstrAxi_ar_payload_id),
    .io_coreInstrAxi_ar_payload_len(io_coreInstrAxi_ar_payload_len),
    .io_coreInstrAxi_ar_payload_size(io_coreInstrAxi_ar_payload_size),
    .io_coreInstrAxi_ar_payload_burst(io_coreInstrAxi_ar_payload_burst),
    .io_coreInstrAxi_r_valid(io_coreInstrAxi_r_valid),
    .io_coreInstrAxi_r_ready(io_coreInstrAxi_r_ready),
    .io_coreInstrAxi_r_payload_data(io_coreInstrAxi_r_payload_data),
    .io_coreInstrAxi_r_payload_id(io_coreInstrAxi_r_payload_id),
    .io_coreInstrAxi_r_payload_resp(io_coreInstrAxi_r_payload_resp),
    .io_coreInstrAxi_r_payload_last(io_coreInstrAxi_r_payload_last),
    .io_coreDataAxi_aw_valid(io_coreDataAxi_aw_valid),
    .io_coreDataAxi_aw_ready(io_coreDataAxi_aw_ready),
    .io_coreDataAxi_aw_payload_addr(io_coreDataAxi_aw_payload_addr),
    .io_coreDataAxi_aw_payload_id(io_coreDataAxi_aw_payload_id),
    .io_coreDataAxi_aw_payload_len(io_coreDataAxi_aw_payload_len),
    .io_coreDataAxi_aw_payload_size(io_coreDataAxi_aw_payload_size),
    .io_coreDataAxi_aw_payload_burst(io_coreDataAxi_aw_payload_burst),
    .io_coreDataAxi_w_valid(io_coreDataAxi_w_valid),
    .io_coreDataAxi_w_ready(io_coreDataAxi_w_ready),
    .io_coreDataAxi_w_payload_data(io_coreDataAxi_w_payload_data),
    .io_coreDataAxi_w_payload_strb(io_coreDataAxi_w_payload_strb),
    .io_coreDataAxi_w_payload_last(io_coreDataAxi_w_payload_last),
    .io_coreDataAxi_b_valid(io_coreDataAxi_b_valid),
    .io_coreDataAxi_b_ready(io_coreDataAxi_b_ready),
    .io_coreDataAxi_b_payload_id(io_coreDataAxi_b_payload_id),
    .io_coreDataAxi_b_payload_resp(io_coreDataAxi_b_payload_resp),
    .io_coreDataAxi_ar_valid(io_coreDataAxi_ar_valid),
    .io_coreDataAxi_ar_ready(io_coreDataAxi_ar_ready),
    .io_coreDataAxi_ar_payload_addr(io_coreDataAxi_ar_payload_addr),
    .io_coreDataAxi_ar_payload_id(io_coreDataAxi_ar_payload_id),
    .io_coreDataAxi_ar_payload_len(io_coreDataAxi_ar_payload_len),
    .io_coreDataAxi_ar_payload_size(io_coreDataAxi_ar_payload_size),
    .io_coreDataAxi_ar_payload_burst(io_coreDataAxi_ar_payload_burst),
    .io_coreDataAxi_r_valid(io_coreDataAxi_r_valid),
    .io_coreDataAxi_r_ready(io_coreDataAxi_r_ready),
    .io_coreDataAxi_r_payload_data(io_coreDataAxi_r_payload_data),
    .io_coreDataAxi_r_payload_id(io_coreDataAxi_r_payload_id),
    .io_coreDataAxi_r_payload_resp(io_coreDataAxi_r_payload_resp),
    .io_coreDataAxi_r_payload_last(io_coreDataAxi_r_payload_last),
    //////////////////
    .io_sdram_ADDR(io_sdram_ADDR),
    .io_sdram_BA(io_sdram_BA),
    .io_sdram_DQ_read(io_sdram_DQ_read),
    .io_sdram_DQ_write(io_sdram_DQ_write),
    .io_sdram_DQ_writeEnable(io_sdram_DQ_writeEnable),
    .io_sdram_DQM(io_sdram_DQM),
    .io_sdram_CASn(io_sdram_CASn),
    .io_sdram_CKE(io_sdram_CKE),
    .io_sdram_CSn(io_sdram_CSn),
    .io_sdram_RASn(io_sdram_RASn),
    .io_sdram_WEn(io_sdram_WEn)

);
