Interface to vector units #3599
A few thoughts from T1's perspective. There is no need to block this on T1; after our general release, we will try to contribute back to the upstream version of RC.
The tricky part may still be the decoder, and I will find some way to align it.
id_ctrl.vec := false.B
if (usingVector) {
  val v_decode = rocketParams.vector.get.decoder(p)
Interesting.
csr.io.vector.foreach { v =>
  v.set_vconfig.valid := wb_reg_set_vconfig && wb_reg_valid
  v.set_vconfig.bits := wb_reg_rs2.asTypeOf(new VConfig)
  v.set_vs_dirty := wb_valid && wb_ctrl.vec
  v.set_vstart.valid := wb_valid && wb_reg_set_vconfig
  v.set_vstart.bits := 0.U
}
For the CSRs, T1 just inserts an instruction for chaining, with CSR flushing.
Can you describe how that works? I'm not sure where csr maintenance is affected by chaining.
Do you have any plan to open source the vector unit?
________________________________
From: Jerry Zhao ***@***.***>
Sent: Thursday, March 21, 2024 9:51:27 AM
To: chipsalliance/rocket-chip ***@***.***>
Cc: Subscribed ***@***.***>
Subject: [chipsalliance/rocket-chip] Interface to vector units (PR #3599)
This PR adds an interface to vector units. This has been tested to support a fully compliant, not-yet-open-sourced, RVV-1.0 implementation, including support for precise traps and virtual memory.
API:
* RocketVectorUnitParams - specifies fields for controlling generation of vector unit LazyModule
* EarlyVectorDecoder - decodes read/write ctrl bits, and allows implementations to define subsets of supported insns (non-supported will trap)
* VectorUnit - integrates with ex/mem/wb stages of rocket, can set vstart/vxsat/fflags, can request replays, block memory operations in rocket-pipe
UArch:
* vset* instructions are decoded early and executed in the EX stage
* Load/store/arithmetic instructions are issued to the vector unit in the EX stage along with vtype, but may be killed before the WB stage
* The implementation can request replays for backpressure, or delay retirement of younger instructions to implement precise traps
* The implementation can access memory through a TL port or share the Rocket L1D$.
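The EarlyVectorDecoder item above says an implementation can declare the subset of instructions it supports, and that anything else traps. A minimal plain-Scala sketch of that idea follows, with no Chisel dependency. Every name in it (`EarlyDecodeSketch`, `Impl`, `Issue`, `opV`) is hypothetical and is not the PR's API; only the OP-V major opcode (0x57) and the funct3 operand categories come from the RVV 1.0 encoding.

```scala
object EarlyDecodeSketch {
  val OpV = 0x57 // OP-V major opcode from the RVV 1.0 encoding

  sealed trait Result
  case object NotVector extends Result // left to the scalar decoder
  case object Trap extends Result      // unsupported vector instruction: illegal-instruction trap
  case object Config extends Result    // vset* family (funct3 = 0b111)
  final case class Issue(readsIntRs1: Boolean, readsFpRs1: Boolean) extends Result

  // Hypothetical stand-in for an implementation's supported subset:
  // the set of OP-V funct3 categories it accepts.
  final case class Impl(supportedFunct3: Set[Int])

  def decode(inst: Int, impl: Impl): Result = {
    val opcode = inst & 0x7f
    val funct3 = (inst >>> 12) & 0x7
    if (opcode != OpV) NotVector
    else if (!impl.supportedFunct3(funct3)) Trap
    else funct3 match {
      case 7     => Config
      case 4 | 6 => Issue(readsIntRs1 = true, readsFpRs1 = false)  // OPIVX, OPMVX
      case 5     => Issue(readsIntRs1 = false, readsFpRs1 = true)  // OPFVF
      case _     => Issue(readsIntRs1 = false, readsFpRs1 = false) // OPIVV, OPFVV, OPMVV, OPIVI
    }
  }

  // Builds an OP-V instruction word from its fields, for the usage example.
  def opV(funct6: Int, vm: Int, vs2: Int, rs1: Int, funct3: Int, vd: Int): Int =
    (funct6 << 26) | (vm << 25) | (vs2 << 20) | (rs1 << 15) | (funct3 << 12) | (vd << 7) | OpV

  def main(args: Array[String]): Unit = {
    val intOnly = Impl(Set(0, 2, 3, 4, 6, 7)) // assumed config: no FP categories (1 = OPFVV, 5 = OPFVF)
    println(decode(opV(0, 1, 2, 3, 0, 1), intOnly)) // vadd.vv v1, v2, v3
    println(decode(opV(0, 1, 2, 3, 1, 1), intOnly)) // vfadd.vv v1, v2, v3
    println(decode(0x00000013, intOnly))            // scalar nop (addi x0, x0, 0)
  }
}
```

With the integer-only configuration assumed here, the integer add is issued, the floating-point add decodes to `Trap`, and the scalar nop is left to the scalar decoder. The real decoder in the PR produces hardware control bits rather than Scala values, so this only models the decision logic.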
Type of change: bug report | feature request | other enhancement
Impact: no functional change | API addition (no impact on existing code) | API modification
Development Phase: proposal | implementation
Release Notes
Commit Summary
* 0b1ef9c Add vector-unit interface
* cb4e742 Suppress vconfig for id-xcpt
* ad37acb Add set_vconfig to vector-unit
* bff48a8 Propagate request size/cmd to TLB resp
* d8f6caf Add SimpleHellaCacheIF mask
* f51bca4 Vector CSR data hazard
* 9dc08fe Pass vxrm to vector impl
* 80dffc8 Merge commit '50adbdb' into ifv
* 12139be scalar read, rm
* cd4b38b Simplify vector-fpu integration
* bf79222 add vector FP exceptions
* 66bd400 Fix scalar FP to vector
* 48602b9 Vector trap-check should block younger exceptions
* 28bbca5 Add vector ll scalar wb interface
* a68cfc1 Fix vector-to-scalar trace
* fbd0fb8 Add vector/fp interface
* 6d5b054 Vec should kill in-flight dcache
* af11ed4 StoreGen supported maxSize > dat.length
* 56a4da7 Set vector killm for all killm cases
* 203fc1f Fix vsetvl
* 5930109 Add diplomatic node to rocket vector unit
* de696aa Fix set vstart
* 3172ee6 Support swap12 in fpu external interface
* fb3c8dd Add scalar FPU-to-vector support
* 0292f13 Remove dontCare from fpuOpt
* 0bff786 Fix shared FPU for divSqrt ops
* 5bcdcb2 Merge commit '749a3ea' into ifv
* ef2876c Pass vconfig to vec-decode
* f3951e7 Fix vstart bypassing
* eb141eb Fix vector interface gating in FPU
* 072bc41 Fix vsetvl with rs1=x0
* 5a17c56 Allow pulling out full output from iterative imul
* f8105ce Add full_data to pipelined-mul-unit
* 4996086 Merge commit '8026b6b' into ifv
* 6e554f3 Add tlb_port to NBDcache as well
* d8afe64 Improve NBDCache performance
* e47a188 Add req.no_resp to ScratchpadSlavePort
* 174a4b9 add mem.req.no_resp to rocc examples
* 1e9fef1 Fix vector integration
* 527560a Merge remote-tracking branch 'origin/dev' into ifv
* 0b2d940 Support v-impls which issue vconfig to backend
* 28bf141 Fix TLFragmenter assert
* ea3d882 Only connect vector dcache port if requested by VU
* ea35ee2 Remove DebugROB requires
* c2651f4 Fix debug rob for some vector units
* 84409c8 Fix s1_data when coreDataBits > xLen
* db35cb8 Merge remote-tracking branch 'origin/dev' into ifv
File Changes
(11 files<https://github.com/chipsalliance/rocket-chip/pull/3599/files>)
* M src/main/scala/rocket/AMOALU.scala<https://github.com/chipsalliance/rocket-chip/pull/3599/files#diff-f86d6aea8c17b19e32ea4368819269cdec47fce3c48ed4f80b205a007e3754e0> (5)
* M src/main/scala/rocket/CSR.scala<https://github.com/chipsalliance/rocket-chip/pull/3599/files#diff-31a67b34f6b106bdfcc83c5d63ff32e9553378b3cf38a9a2a96806f9cad65cb5> (2)
* M src/main/scala/rocket/DCache.scala<https://github.com/chipsalliance/rocket-chip/pull/3599/files#diff-39d7fb4b419b50a2f0d5dbc70b8b7abe54dad7bb372b3f30d7b33788ab5fb13e> (2)
* M src/main/scala/rocket/DebugROB.scala<https://github.com/chipsalliance/rocket-chip/pull/3599/files#diff-dd3e81e3075324080ce2be2a9329d8e87487d1c67af6d84c5b49ebe4af9586b7> (3)
* M src/main/scala/rocket/IDecode.scala<https://github.com/chipsalliance/rocket-chip/pull/3599/files#diff-f0e859d296be8c3630256a08c98689518cc4b1e2af371cfe7d2fc823b7c6e47d> (10)
* M src/main/scala/rocket/RocketCore.scala<https://github.com/chipsalliance/rocket-chip/pull/3599/files#diff-76fe933894e1d00dd2f7bb620d9c79323422e56293ad5f94ab290d4668c292e3> (238)
* M src/main/scala/rocket/TLB.scala<https://github.com/chipsalliance/rocket-chip/pull/3599/files#diff-0284bb40d5e5c9a095c0355b32dfe0bf47de23e014ef3cfc8cfade453a643be4> (10)
* A src/main/scala/rocket/VectorUnit.scala<https://github.com/chipsalliance/rocket-chip/pull/3599/files#diff-b670e1f99e7ec5efca594cd89beaaf218738541e603eb5ce870646fef7d21852> (97)
* M src/main/scala/tile/FPU.scala<https://github.com/chipsalliance/rocket-chip/pull/3599/files#diff-cf4747af8cdedbd7c39b7112ee0ab1431f427e100f3405eb7fdc8e32691a8652> (255)
* M src/main/scala/tile/LazyRoCC.scala<https://github.com/chipsalliance/rocket-chip/pull/3599/files#diff-be1318286c8ab5791264fe60a16adc722523b92a506764d719aafec295eac31f> (19)
* M src/main/scala/tile/RocketTile.scala<https://github.com/chipsalliance/rocket-chip/pull/3599/files#diff-1aee95dcc87dc8a7d65f2f53e4aa9bf4a99d4f92d62fe3a7f4d1a91077aae819> (43)
Patch Links:
* https://github.com/chipsalliance/rocket-chip/pull/3599.patch
* https://github.com/chipsalliance/rocket-chip/pull/3599.diff
Yes
This is ready. The CLA failure is innocuous: the email in that commit is wrong, and it is not easy to fix.
The only issue we could spot is the handling of
val ex_avl = Mux(ex_ctrl.rxs1
Thanks to @qinjun-li and @SharzyL for reviewing it together.
@@ -10,6 +10,7 @@ import org.chipsalliance.cde.config.Parameters
class StoreGen(typ: UInt, addr: UInt, dat: UInt, maxSize: Int) {
  val size = Wire(UInt(log2Up(log2Up(maxSize)+1).W))
  size := typ
  val dat_padded = dat.pad(maxSize*8)
Why is this padding needed?
maxSize is the width of the memory word in bytes. In configurations with a wide DCache, the memory word is wider than dat, since dat is xLen bits wide.
The padding is needed to handle that case.
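The effect of the padding can be modeled in plain Scala without Chisel. This is a sketch under stated assumptions: the `Bits` case class and the values `xLen = 64` and `maxSize = 16` are illustrative choices, not taken from the PR. It shows that a full-memory-word slice is only in range once the xLen-wide value has been zero-extended to `maxSize*8` bits, which is what Chisel's `UInt.pad` does.

```scala
// Plain-Scala model (no Chisel dependency) of a fixed-width unsigned value.
final case class Bits(value: BigInt, width: Int) {
  require(value >= 0 && value.bitLength <= width, "value does not fit in width")

  // Models the zero-extension of Chisel's UInt.pad: widen to at least n bits.
  def pad(n: Int): Bits = if (n > width) copy(width = n) else this

  // Inclusive bit slice [hi:lo]; fails if the slice reaches past the declared width.
  def slice(hi: Int, lo: Int): Bits = {
    require(lo >= 0 && hi >= lo && hi < width, s"slice [$hi:$lo] out of range for width $width")
    Bits((value >> lo) & ((BigInt(1) << (hi - lo + 1)) - 1), hi - lo + 1)
  }
}

object StoreGenPadSketch {
  val xLen = 64    // assumed scalar register width in bits
  val maxSize = 16 // assumed memory word width in bytes (wide DCache)

  val dat = Bits(BigInt("DEADBEEFCAFEF00D", 16), xLen)
  val datPadded = dat.pad(maxSize * 8)

  def main(args: Array[String]): Unit = {
    // The padded value is as wide as the memory word, and its upper bits are zero.
    println(datPadded.width)
    println(datPadded.slice(maxSize * 8 - 1, 0).value == dat.value)
    // Without padding, the same slice is out of range.
    println(scala.util.Try(dat.slice(maxSize * 8 - 1, 0)).isFailure)
  }
}
```

In this model the unpadded slice fails a `require`, which stands in for the width mismatch the padding avoids when `maxSize*8` exceeds xLen.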
@@ -241,6 +248,7 @@ class Rocket(tile: RocketTile)(implicit p: Parameters) extends CoreModule()(p)
  val ex_reg_inst = Reg(Bits())
  val ex_reg_raw_inst = Reg(UInt())
  val ex_reg_wphit = Reg(Vec(nBreakpoints, Bool()))
  val ex_reg_set_vconfig = Reg(Bool())
T1 does set_vconfig at the WB stage.
I wanted to avoid pipeline bubbles due to dependency on new VL.
The other key requirement is performant precise traps. The checks for memory access faults in the M/W stages must occur ahead of the commit point, and they depend on the updated vconfig.
> I wanted to avoid pipeline bubbles due to dependency on new VL.
Understood. We do it another way: each vsetvl becomes a phantom instruction that is observed by the following instructions. But we didn't take precise traps into consideration.
Thanks @SharzyL @qinjun-li @sequencer for the detailed review. I will investigate and respond.
LGTM, feel free to merge as you wish.
This PR adds an interface to vector units. This has been tested to support a fully compliant, not-yet-open-sourced, RVV-1.0 implementation, including support for precise traps and virtual memory.
The interface should not affect existing configs with no vector unit.