
Simulation hits max cycle count for dhry, cmark_iccm, cmark_dccm #162

Closed
GeorgeWu1204 opened this issue Mar 2, 2024 · 11 comments

Comments

@GeorgeWu1204

Hello,
When I simulate the design with the existing benchmarks, the ones that use ICCM and DCCM (dhry, cmark_iccm, cmark_dccm) consistently fail. In these cases console.log contains no output, and the simulation result reports that the simulation hit the max cycle count. Could you provide some insight into the potential causes? Other benchmarks, such as "hello_world" and "cmark", pass the simulation. Thank you so much.
[screenshot attached]

@algrobman

Check whether the CPU is doing something useful rather than being stuck in an exception: look at the instruction execution trace in exec*.log.

@GeorgeWu1204

Thanks for the reply. I checked exec.log and noticed that all the instructions after #124 are zeros. Could you please give me some suggestions? Many thanks.
[screenshot: exec.log]

@algrobman

Your trace shows that the CPU loads 0 into ra (the return address) from the stack at instruction #122, instead of the 0x800006dc written by instruction #109. It then returns from the function to address 0 (instruction #124), where there is no code, only zeros, which are an unimplemented opcode for the CPU. The CPU therefore takes an exception and jumps to the address held in the mtvec CSR, which is also 0 because the program never sets mtvec. As a result, the CPU is stuck at address 0, constantly taking exceptions.

BTW, what are the start and end addresses of the DCCM?
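
The loop described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not a model of the actual core: the names `step` and `run` are hypothetical, unmapped memory is assumed to read as zero, and the only behavior modeled is "an all-zero instruction traps to mtvec".

```python
ILLEGAL = 0x00000000  # the all-zero encoding is an illegal instruction in RISC-V


def step(pc, memory, mtvec):
    """Return the next PC for a toy core that only models 'illegal -> trap'."""
    instr = memory.get(pc, ILLEGAL)  # assumption: unmapped memory reads as zero
    if instr == ILLEGAL:
        return mtvec  # trap: jump to the handler base held in mtvec
    return pc + 4  # any other instruction simply falls through


def run(pc, memory, mtvec, max_cycles):
    """Collect the sequence of PCs visited over max_cycles steps."""
    trace = []
    for _ in range(max_cycles):
        trace.append(pc)
        pc = step(pc, memory, mtvec)
    return trace


# A single nop at 0x80000000; everything else, including address 0, is zero.
memory = {0x80000000: 0x00000013}

# Returning to address 0 with mtvec left at 0: the PC never leaves 0.
stuck = run(pc=0x0, memory=memory, mtvec=0x0, max_cycles=5)
print([hex(a) for a in stuck])
```

With mtvec left at 0, every trap lands back on the same all-zero word, so the simulation can only end by hitting the cycle limit.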

@GeorgeWu1204

Thank you so much for pointing that out. The DCCM start and end addresses are f0040000 and f0043e30.

Does that mean there is a problem with how the benchmark is compiled? The command I ran is make -f $RV_ROOT/tools/Makefile verilator TEST=dhry

The Verilator version is v5.020.
The riscv64-unknown-elf-gcc version is 10.2.0.

@algrobman

You need to make sure that your program's data, stack, and other data sections fit within the physical size of the DCCM.
This is not related to the Verilator version, but to the way you build your design (how much memory you select).

@GeorgeWu1204

Thanks for the reply. The dccm_size is set to 512KB; the rest of the DCCM settings are shown here:
[screenshot: DCCM settings]
The DCCM pre-load region runs from f0040000 to f0043e30, which I believe is the right region. The dhry.map is:
[screenshot: dhry.map]
Unfortunately, the problem is still not resolved. How can I verify whether the data has been correctly loaded at f0040000? Or is there another way to debug this? Thank you so much for your help; it is greatly appreciated.
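
One way to approach the "was it loaded?" question from the host side is to list which address ranges the hex image actually contains. The sketch below assumes program.hex is a byte-wide Verilog hex image (`@ADDRESS` lines followed by space-separated hex bytes, the layout produced by `objcopy -O verilog`); that format is an assumption here, and `hex_ranges` and `covers` are hypothetical helper names.

```python
def hex_ranges(text):
    """Return [(start, end_exclusive), ...] covered by a byte-wide Verilog hex image."""
    ranges = []
    start = None  # start address of the current block
    addr = None   # next address to be filled
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("@"):
            # A new address marker closes the previous block, if it held data.
            if start is not None and addr > start:
                ranges.append((start, addr))
            start = addr = int(line[1:], 16)
        else:
            if addr is None:
                raise ValueError("data found before any @address marker")
            addr += len(line.split())  # one token per byte (assumed width)
    if start is not None and addr > start:
        ranges.append((start, addr))
    return ranges


def covers(ranges, address):
    """True if the address falls inside one of the image's ranges."""
    return any(lo <= address < hi for lo, hi in ranges)


# Made-up sample data, not taken from the thread.
sample = """@80000000
13 00 00 00 13 00 00 00
@F0040000
01 02 03 04
"""
ranges = hex_ranges(sample)
print([(hex(lo), hex(hi)) for lo, hi in ranges])
print(covers(ranges, 0xF0040000))
```

Running it on the real file (for example `hex_ranges(open("program.hex").read())`) would show whether a block starting at f0040000 is present at all. It says nothing about whether the testbench then writes those bytes into the right RAM instance.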

@algrobman

Your screenshot shows a 64 KB DCCM, which should be sufficient for Dhrystone.
It looks like the CPU is built with external DCCM/ICCM RAMs instantiated in tb_top, so run with waves and see whether, and why, you get a zero value from address 0xf0043cfc when instruction #122 is executed.

BTW, did you run the test out of the box, without any modifications, as the README suggests?
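
The size argument above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the DCCM base is taken to be 0xf0040000 (the start of the pre-load region quoted earlier), the size is the 64 KB read from the screenshot, and `in_dccm` is a hypothetical helper name.

```python
DCCM_BASE = 0xF0040000  # assumption: DCCM starts where the pre-load region starts
DCCM_SIZE_KB = 64       # size read from the screenshot in the thread


def in_dccm(address, base=DCCM_BASE, size_kb=DCCM_SIZE_KB):
    """True if the address lies inside the DCCM window [base, base + size)."""
    return base <= address < base + size_kb * 1024


preload_end = 0xF0043E30  # end of the pre-load region quoted in the thread
stack_slot = 0xF0043CFC   # address read by instruction #122

print(in_dccm(preload_end), in_dccm(stack_slot))
print(preload_end - DCCM_BASE, "bytes of", DCCM_SIZE_KB * 1024)
```

Both addresses fall well inside a 64 KB window under these assumptions, which is consistent with the suggestion to look at the RAM wiring in the waves rather than at the memory size.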

@GeorgeWu1204

Yes, I did not modify anything; I simply ran the command make -f $RV_ROOT/tools/Makefile TEST=dhry. I will look at the waves. Thanks for the suggestions.

@GeorgeWu1204

[screenshot: program.hex]
Sorry to bother you again. After continuing to track down the error, I noticed that the generated program.hex might be incomplete, as shown in the screenshot above. This could lead to the memory being zero for the addresses after #124. Could this issue be related to a problem with picolibc?

@algrobman

Hi, I think the DCCM/ICCM RAMs were misconnected when they were moved from the design to the testbench.
Update testbench/tb_top.sv:

Leave only these defines:

endtask



`define DRAM(bk) Gen_dccm_enable.dccm_loop[bk].ram.ram_core
`define IRAM(bk) Gen_iccm_enable.iccm_loop[bk].iccm_bank.ram_core


task slam_dccm_ram(input [31:0] addr, input[38:0] data);

and paste the following in place of the original code:

//////////////////////////////////////////////////////
// DCCM
//
if (pt.DCCM_ENABLE == 1) begin: Gen_dccm_enable
    `define EL2_LOCAL_DCCM_RAM_TEST_PORTS   .TEST1   (1'b0   ), \
                                            .RME     (1'b0   ), \
                                            .RM      (4'b0000), \
                                            .LS      (1'b0   ), \
                                            .DS      (1'b0   ), \
                                            .SD      (1'b0   ), \
                                            .TEST_RNM(1'b0   ), \
                                            .BC1     (1'b0   ), \
                                            .BC2     (1'b0   ), \

    localparam DCCM_INDEX_DEPTH = ((pt.DCCM_SIZE)*1024)/((pt.DCCM_BYTE_WIDTH)*(pt.DCCM_NUM_BANKS));  // Depth of memory bank
    // 8 Banks, 16KB each (2048 x 72)
    for (genvar i=0; i<pt.DCCM_NUM_BANKS; i++) begin: dccm_loop
   
            el2_ram #(DCCM_INDEX_DEPTH,39)  ram (
                                    // Primary ports
                                    .ME(el2_mem_export.dccm_clken[i]),
                                    .CLK(el2_mem_export.clk),
                                    .WE(el2_mem_export.dccm_wren_bank[i]),
                                    .ADR(el2_mem_export.dccm_addr_bank[i]),
                                    .D({el2_mem_export.dccm_wr_ecc_bank[i],el2_mem_export.dccm_wr_data_bank[i]} ),
                                    .Q({el2_mem_export.dccm_bank_ecc[i], el2_mem_export.dccm_bank_dout[i]}),
                                    .ROP ( ),
                                    // These are used by SoC
                                    `EL2_LOCAL_DCCM_RAM_TEST_PORTS
                                    .*
                                    );
    end : dccm_loop
end :Gen_dccm_enable

//////////////////////////////////////////////////////
// ICCM
//
if (pt.ICCM_ENABLE) begin : Gen_iccm_enable
for (genvar i=0; i<pt.ICCM_NUM_BANKS; i++) begin: iccm_loop
    el2_ram #(.depth(1<<pt.ICCM_INDEX_BITS), .width(39)) iccm_bank (
                                     // Primary ports
                                     .ME(el2_mem_export.iccm_clken[i]),
                                     .CLK(el2_mem_export.clk),
                                     .WE(el2_mem_export.iccm_wren_bank[i]),
                                     .ADR(el2_mem_export.iccm_addr_bank[i]),
                                     .D({el2_mem_export.iccm_bank_wr_ecc[i],el2_mem_export.iccm_bank_wr_data[i]}),
                                     .Q({el2_mem_export.iccm_bank_ecc[i], el2_mem_export.iccm_bank_dout[i]}),
                                     .ROP ( ),
                                     // These are used by SoC
                                     .TEST1    (1'b0   ),
                                     .RME      (1'b0   ),
                                     .RM       (4'b0000),
                                     .LS       (1'b0   ),
                                     .DS       (1'b0   ),
                                     .SD       (1'b0   ) ,
                                     .TEST_RNM (1'b0   ),
                                     .BC1      (1'b0   ),
                                     .BC2      (1'b0   )

                                      );

end : iccm_loop
end : Gen_iccm_enable
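
As a worked example of the DCCM_INDEX_DEPTH expression in the snippet above, the sketch below evaluates the same formula in Python. The parameter values passed in are hypothetical and chosen only for illustration; the real values come from the pt parameter structure of the generated configuration.

```python
def dccm_index_depth(size_kb, byte_width, num_banks):
    """Rows per bank: (DCCM_SIZE * 1024) / (DCCM_BYTE_WIDTH * DCCM_NUM_BANKS)."""
    total_bytes = size_kb * 1024
    bytes_per_row = byte_width * num_banks  # one word from every bank
    if total_bytes % bytes_per_row:
        raise ValueError("DCCM size is not a multiple of byte_width * num_banks")
    return total_bytes // bytes_per_row


# Hypothetical configuration: 64 KB DCCM, 4-byte words, 4 banks.
depth = dccm_index_depth(size_kb=64, byte_width=4, num_banks=4)
print(depth)
```

Each el2_ram instance in the snippet is then DCCM_INDEX_DEPTH rows deep and 39 bits wide, with the D and Q ports carrying the ECC bits concatenated with the data word.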



@GeorgeWu1204

Solved! Thank you so much! I really appreciate your help : )
