Use one length and capacity variable for whole struct #19
Comments
Yes, this is a known limitation of this crate. I think it would be worthwhile to have a single pointer for each member of the struct, plus two
We could re-use part of |
An additional advantage would be that the compiler will know that the lengths of all fields are always equal, which may significantly help with optimizations. |
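A minimal sketch of why a shared length could matter, assuming a hypothetical `dot` helper over two field slices (not part of this crate). With independent `Vec`s the two lengths are unrelated as far as the compiler can tell, so the equality has to be asserted by hand; a single stored length would give the same invariant by construction. Whether the optimizer then removes the per-element bounds checks is not verified here.

```rust
/// Dot product over two field slices of a struct-of-arrays container.
/// `dot` is a hypothetical helper used only for illustration.
pub fn dot(x: &[f32], y: &[f32]) -> f32 {
    // With one shared length this would hold by construction.
    assert_eq!(x.len(), y.len(), "field slices out of sync");
    let mut acc = 0.0;
    for i in 0..x.len() {
        // `y[i]` is in bounds because the lengths were checked above.
        acc += x[i] * y[i];
    }
    acc
}
```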
Hey, I've been trying to make something similar for a few days. Using the experimental The other option would be to make some kind of The obvious downside of both is that they are based on an experimental API. Maybe it could be enabled with a feature. If you are fine with it, I'll try to make a version using |
By I think I would rather keep this crate compatible with stable Rust, since one of my goals with it is writing software that's distributed as source and compiled by non-programmers. Having to explain the differences between stable & nightly would be more work on top of explaining to them that they need a Rust compiler. One possible way forward would be to use an alternative RawVec from a crate, potentially copy-pasted from Rust's If the differences between standard Vec and |
That's the one. I'm intrigued to see if the compiler is smart enough to leverage the fact that all the vectors have the same length. I'll make a feature to enable it and make a PR. |
Oh, I thought this was the default, but it looks like it is not. I wonder why people would use this rather than typing out the Vecs themselves? It could probably be done on stable by using NonNull, keeping the pointers separately. |
The advantage of soa_derive is on the API side: being able to consider a XXXVec exactly like a Vec<XXX>, and use `vec.push(e: XXX)` and all other Vec methods directly.
One could type the Vec and forward to the methods directly, but this would be high boilerplate code.
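To give an idea of the boilerplate in question, here is a hand-written sketch for a hypothetical two-field `Particle` struct. The names are made up for illustration, and only three of the many `Vec` methods are forwarded; every additional method would need the same per-field treatment.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Particle {
    pub mass: f64,
    pub name: String,
}

/// Hand-written struct-of-arrays counterpart of `Vec<Particle>`.
#[derive(Debug, Default)]
pub struct ParticleVec {
    pub mass: Vec<f64>,
    pub name: Vec<String>,
}

impl ParticleVec {
    pub fn new() -> Self {
        Self::default()
    }

    /// Number of particles, taken from the first field vector.
    pub fn len(&self) -> usize {
        self.mass.len()
    }

    /// Splits the value and forwards each field to its own `Vec`.
    pub fn push(&mut self, p: Particle) {
        self.mass.push(p.mass);
        self.name.push(p.name);
    }

    /// Pops from every field and reassembles the value.
    pub fn pop(&mut self) -> Option<Particle> {
        let mass = self.mass.pop()?;
        let name = self.name.pop()?;
        Some(Particle { mass, name })
    }
}
```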
|
A bit of an update now that I've spent a few hours implementing a single-length version. An advantage that we didn't discuss is that having a single length makes it impossible to have the underlying arrays unsynchronized. With the current version it is as simple as Right now, the tests rely on having access to the underlying So, what I would do is:
This is quite some work but I feel it would be worth it. How do you feel about it? |
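The snippet from the comment above is not preserved, so the following is only an assumed illustration of the desynchronization problem: a hypothetical `PointVec` with one public `Vec` per field, where nothing stops a caller from pushing to a single field.

```rust
/// Hypothetical generated container with one public `Vec` per field.
#[derive(Default)]
pub struct PointVec {
    pub x: Vec<f32>,
    pub y: Vec<f32>,
}

impl PointVec {
    /// True when every field vector has the same length.
    pub fn is_synchronized(&self) -> bool {
        self.x.len() == self.y.len()
    }
}

/// Pushes to a single field, which the type system does not prevent.
pub fn desynchronize(v: &mut PointVec) {
    v.x.push(1.0);
}
```

With a single stored length and no direct access to the per-field storage, this state could not be constructed from safe code.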
I agree that preventing desynchronization is good, as long as we can still provide access to individual slices with or without
Do you have a specific example of this? They should work fine with slice access, not full Vec. If they don't, I agree they need fixing =)
Part of the appeal of soa_derive is to get access to all Vec functions without writing the code manually. I fear that splitting between core/extra would break this. At the same time, I see that re-implementing everything would be quite a lot of work, so how about creating a branch out of master to work on this new interface? This would allow piecewise changes and integration in other projects with git dependencies, while still landing the full interface in master.
I'm not sure I see the link between #24 and this, could you elaborate? Let me know if I'm unclear! |
@copying Can I see the rough plan of the new expanded structure? Using one length should be easier compared to using one capacity since the allocator should align it according to the size. |
Sorry @Luthaf, I think I explained myself poorly. The tests are using the underlying
Here I was referring to the tests. Dumb me didn't realise that I didn't write it.
The point here is that there can be improvements to the interface, and it might be better to make some of them before finishing the other implementation. Also, reading everything again, it sounds like I've implemented almost nothing. Actually, most of the code is done in a branch called |
@pickfire Right now the structure I'm working with is:
struct ExampleVec {
    field1: RawVec<T1>,
    field2: RawVec<T2>,
    len: usize
}
This structure doesn't work with a structure that has a field named
struct ExampleVecFields {
    pub field1: RawVec<T1>,
    pub field2: RawVec<T2>
}
struct ExampleVec {
    fields: ExampleVecFields,
    len: usize
} |
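A sketch of how the nested layout sidesteps name collisions. The name of the conflicting field is not preserved in the comment above; this example assumes, purely for illustration, that it is `len`. `Segment` and its fields are hypothetical, and plain `Vec`s stand in for `RawVec` so that the sketch stays in safe, stable Rust.

```rust
/// User struct that happens to have a field called `len`.
pub struct Segment {
    pub len: u32,
    pub weight: f32,
}

/// Per-field storage lives in its own struct, so user field names
/// cannot collide with the container's own bookkeeping.
#[derive(Default)]
struct SegmentVecFields {
    len: Vec<u32>,
    weight: Vec<f32>,
}

#[derive(Default)]
pub struct SegmentVec {
    fields: SegmentVecFields,
    /// The container's single length, distinct from `fields.len`.
    len: usize,
}

impl SegmentVec {
    pub fn push(&mut self, s: Segment) {
        self.fields.len.push(s.len);
        self.fields.weight.push(s.weight);
        self.len += 1;
    }

    /// Number of segments stored.
    pub fn len(&self) -> usize {
        self.len
    }

    /// Slice over the user's `len` field.
    pub fn lens(&self) -> &[u32] {
        &self.fields.len
    }

    /// Slice over the user's `weight` field.
    pub fn weights(&self) -> &[f32] {
        &self.fields.weight
    }
}
```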
Can derive even change it into this?
struct ExampleVecFields {
    pub field1: RawVec<T1>,
    pub field2: RawVec<T2>
}
struct ExampleVec {
    fields: ExampleVecFields,
    len: usize
} |
We use procedural macros, so you can inject as much code as you want :) PS: This package doesn't change the original structure; it only adds more code. |
@iMplode-nZ moving the discussion on single length here. Hm for the single length thing perhaps just make the storage generic and use a HList? Originally posted by @iMplode-nZ in #35 (comment) I really don't see how a HList would help here, could you clarify? Originally posted by @Luthaf in #35 (comment) So for using a HList, your
then the vector type would simply be something like this:
(Or something like this, this doesn't compile but you probably get the point right?) But this way you only need to implement a single trait and then just use Originally posted by @iMplode-nZ in #35 (comment) |
Except that in your example, since We could have a
struct Foo {
    x: i32,
    y: String,
}
struct RawVec<T> {
    ptr: *mut T,
    capacity: usize,
}
struct FooVec {
    len: usize,
    x: RawVec<i32>,
    y: RawVec<String>,
}
In both cases, we need to manually deal with unsafe code. I also personally find the |
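For a sense of the unsafe code this layout implies on stable Rust, here is a minimal sketch. `RawBuf` is a hypothetical stand-in for `RawVec` (a `NonNull` pointer plus a capacity), the growth policy is an arbitrary choice, and zero-sized types are not handled. It is not the crate's implementation.

```rust
use std::alloc::{self, Layout};
use std::ptr::{self, NonNull};
use std::slice;

/// Hypothetical stand-in for `RawVec<T>`: pointer and capacity, no length.
struct RawBuf<T> {
    ptr: NonNull<T>,
    cap: usize,
}

impl<T> RawBuf<T> {
    fn new() -> Self {
        assert!(std::mem::size_of::<T>() != 0, "ZSTs are not handled in this sketch");
        RawBuf { ptr: NonNull::dangling(), cap: 0 }
    }

    /// Reallocates so that the buffer can hold `new_cap` elements.
    fn grow(&mut self, new_cap: usize) {
        assert!(new_cap > self.cap);
        let new_layout = Layout::array::<T>(new_cap).expect("capacity overflow");
        let raw = if self.cap == 0 {
            // SAFETY: the layout has non-zero size (T is not a ZST, new_cap > 0).
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<T>(self.cap).expect("capacity overflow");
            // SAFETY: `ptr` was allocated by this allocator with `old_layout`.
            unsafe { alloc::realloc(self.ptr.as_ptr().cast::<u8>(), old_layout, new_layout.size()) }
        };
        self.ptr = match NonNull::new(raw.cast::<T>()) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }
}

impl<T> Drop for RawBuf<T> {
    fn drop(&mut self) {
        if self.cap != 0 {
            let layout = Layout::array::<T>(self.cap).expect("capacity overflow");
            // SAFETY: `ptr` was allocated with exactly this layout.
            unsafe { alloc::dealloc(self.ptr.as_ptr().cast::<u8>(), layout) }
        }
    }
}

/// Struct-of-arrays vector with one length shared by all fields.
pub struct FooVec {
    len: usize,
    x: RawBuf<i32>,
    y: RawBuf<String>,
}

impl FooVec {
    pub fn new() -> Self {
        FooVec { len: 0, x: RawBuf::new(), y: RawBuf::new() }
    }

    pub fn len(&self) -> usize {
        self.len
    }

    pub fn push(&mut self, x: i32, y: String) {
        let new_cap = if self.len == 0 { 4 } else { self.len * 2 };
        if self.len == self.x.cap {
            self.x.grow(new_cap);
        }
        if self.len == self.y.cap {
            self.y.grow(new_cap);
        }
        // SAFETY: both buffers have room for index `len` after the checks above.
        unsafe {
            self.x.ptr.as_ptr().add(self.len).write(x);
            self.y.ptr.as_ptr().add(self.len).write(y);
        }
        self.len += 1;
    }

    pub fn x(&self) -> &[i32] {
        // SAFETY: the first `len` elements of `x` are initialized.
        unsafe { slice::from_raw_parts(self.x.ptr.as_ptr(), self.len) }
    }

    pub fn y(&self) -> &[String] {
        // SAFETY: the first `len` elements of `y` are initialized.
        unsafe { slice::from_raw_parts(self.y.ptr.as_ptr(), self.len) }
    }
}

impl Drop for FooVec {
    fn drop(&mut self) {
        // SAFETY: the first `len` elements are initialized; the `RawBuf`
        // fields release the memory afterwards. `i32` needs no drop.
        unsafe {
            ptr::drop_in_place(ptr::slice_from_raw_parts_mut(self.y.ptr.as_ptr(), self.len));
        }
    }
}
```

Both slice accessors use the same `len`, so the fields cannot fall out of sync from safe code. In this sketch each buffer still carries its own capacity, as in the layout above.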
I believe https://docs.rs/soak is doing this already, since it has a different mechanism. |
Well, or make your SOAStorage contain a SOAElementStorage hierarchy, where the SOAStorage has the head and the size, and each SOAElementStorage contains the raw pointer and the next one. |
Hello again! I'm sorry that I've been off for such a long time, but I finally have the time to be able to do stuff! I didn't have the time to sit down and write what I've been thinking until now, but I've made this proof of concept that shows what I think would be a nice way of handling things. It doesn't contain any macro and some code has been generated by hand, but I think it's easy to see how it could be done. Here is a quick explanation of what I've done: Instead of making a vector structure each time, I've made a generic vector where you can set any buffer that implements the required traits:
After that I made 2 test files:
Note that the SoA implementation relies on the fact that it's composed of buffers. These buffers could actually be another SoA, giving nested vectors for "free". Also note that there is no restriction stating that the buffer should be able to resize: it should be possible to allocate a large slab once and fail if it ever tries to resize. One thing I find nice about this approach is that the code that would be generated contains very little logic: most code would be in the traits or base buffers. Let me know what you think! |
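The idea of a generic vector over interchangeable buffers can be sketched roughly as follows. This is not the PoC's code: the `Buffer` methods here are hypothetical, and `Option` slots stand in for uninitialized memory so that everything stays in safe Rust. `ArrayBuf` shows a buffer that is allocated once and refuses to resize.

```rust
use std::marker::PhantomData;

/// Hypothetical buffer interface: storage with a capacity but no length.
pub trait Buffer<T> {
    fn capacity(&self) -> usize;
    /// Tries to make room for `min_cap` elements; returns false on failure.
    fn try_grow(&mut self, min_cap: usize) -> bool;
    fn put(&mut self, index: usize, value: T);
    fn take(&mut self, index: usize) -> Option<T>;
}

/// Growable heap buffer.
pub struct HeapBuf<T>(Vec<Option<T>>);

impl<T> Default for HeapBuf<T> {
    fn default() -> Self {
        HeapBuf(Vec::new())
    }
}

impl<T> Buffer<T> for HeapBuf<T> {
    fn capacity(&self) -> usize {
        self.0.len()
    }
    fn try_grow(&mut self, min_cap: usize) -> bool {
        if min_cap > self.0.len() {
            self.0.resize_with(min_cap, || None);
        }
        true
    }
    fn put(&mut self, index: usize, value: T) {
        self.0[index] = Some(value);
    }
    fn take(&mut self, index: usize) -> Option<T> {
        self.0[index].take()
    }
}

/// Fixed-capacity buffer that never grows.
pub struct ArrayBuf<T, const CAP: usize>([Option<T>; CAP]);

impl<T, const CAP: usize> Default for ArrayBuf<T, CAP> {
    fn default() -> Self {
        ArrayBuf(std::array::from_fn(|_| None))
    }
}

impl<T, const CAP: usize> Buffer<T> for ArrayBuf<T, CAP> {
    fn capacity(&self) -> usize {
        CAP
    }
    fn try_grow(&mut self, min_cap: usize) -> bool {
        min_cap <= CAP
    }
    fn put(&mut self, index: usize, value: T) {
        self.0[index] = Some(value);
    }
    fn take(&mut self, index: usize) -> Option<T> {
        self.0[index].take()
    }
}

/// The vector logic is written once, whatever the buffer is.
pub struct Vector<T, Buf: Buffer<T>> {
    buf: Buf,
    len: usize,
    _marker: PhantomData<T>,
}

impl<T, Buf: Buffer<T>> Vector<T, Buf> {
    pub fn new(buf: Buf) -> Self {
        Vector { buf, len: 0, _marker: PhantomData }
    }

    pub fn len(&self) -> usize {
        self.len
    }

    /// Hands the value back if the buffer is full and cannot grow.
    pub fn try_push(&mut self, value: T) -> Result<(), T> {
        if self.len == self.buf.capacity() && !self.buf.try_grow((self.len * 2).max(4)) {
            return Err(value);
        }
        self.buf.put(self.len, value);
        self.len += 1;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        self.buf.take(self.len)
    }
}
```

A SoA buffer would implement the same trait by forwarding each call to one inner buffer per field, which is why very little logic would need to be generated.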
A small update: I've updated the PoC. I added support for fixed-size arrays as buffers, which was easy enough (so that would also solve #46 ). Also I've moved and renamed the code a bit. I also tried to use the built-in I'm not sure what the best next step would be. I'm thinking of either doing the actual macro or adding more methods to the vector, but I'm not too sure. |
Hey @copying! Thanks for working on this. Unfortunately, I don't have a lot of time to spend on this crate these days. I'll try to give a look at your prototype ASAP, hopefully within a week or two! |
Hey @Luthaf, it took me 2 years to come back, so don't feel too much pressure to read this or to look into the PoC quickly. Also, I'll have some vacation days this week, so I'll probably invest some time in this project! :) I realized that this thread is more of a collection of ideas, but I haven't actually explained what I've done in the proof of concept, so here it is.
The PoC parts
Note: this PoC is very minimal, and we may find some case that breaks this scheme. Also, naming is hard; I would really appreciate any suggestions to improve it :)
Composition
One of the strengths of this PoC is its ability to compose the different parts.
// Tries to be equivalent to Vec<Point>
type RegularPointVec = Vector<Point, CustomRawVec<Point>>;
// Using a fixed-size array
type ArrayPointVec = Vector<Point, ArrayRawVec<Point, 1000>>;
// Using a single, non-configurable raw-vec
type SoaPointVec = Vector<Point, SoaPointRawVec>;
// Using a generic raw-vec with different base raw-vecs (defined with markers)
type RegularSoaPointVec = Vector<Point, GenericSoaPointRawVec<CustomRawVecMarker>>;
type ArraySoaPointVec = Vector<Point, GenericSoaPointRawVec<ArrayRawVecMarker<1000>>>;
type ThirdPartySoaPointVec = Vector<Point, GenericSoaPointRawVec<SomeThirdPartyVecMarker>>;
This composition makes it possible to recreate the Vec code, or to build weirder layouts like SoA or AoSoA. Even fancier layouts may be possible (it looks awfully close to an allocation system).
Vector
One of the core concepts is that the vector code only changes because of changes in the underlying raw vector (buffer), not in the logic itself. This means that all the
pub struct Vector<T, Buf: Buffer<T>> {
    buf: Buf,
    len: usize,
    _marker: PhantomData<T>,
}
Buffer
The trait that defines the interface used by
MemoryPrimitives
This trait defines the underlying types and functions that the raw-vec will use to manage memory. It is intended to unify the interface so that a regular pointer and a SoA pointer can be used through the same code. This is simple but quite tedious to implement, as there are a lot of functions. Because of that, I created the trait
Basic raw-vecs
Markers
I wanted a way to define what the underlying raw-vec for a possibly-nested SoA was. I thought about using generic types, but a type passed as an argument needs to be fully qualified. When dealing with nested SoA raw-vecs, we cannot pass it fully qualified, as we do not know the underlying types. This is what I would like to do:
Vector<Point, GenericSoaPointRawVec<SomeThirdPartyVec<???>>>
This is not directly possible, so this is the trick I found to work around it, using generic associated types together with zero-sized structures:
// Trait implementation
pub trait BaseRawVecMarker: 'static {
type RawVec<T>: Buffer<T>;
}
// Use example
pub struct ArrayRawVecMarker<const CAP: usize>;
impl<const CAP: usize> BaseRawVecMarker for ArrayRawVecMarker<CAP> {
type RawVec<T> = ArrayRawVec<T, CAP>;
}
// Usage
type MyVec = GenericSoaPointRawVec<ArrayRawVecMarker<1000>>;
Notice how we no longer need to define
"Generated" code
In this PoC I've not created any macro yet, so I've basically generated the code by hand. That being said, it's easy to see that the amount of code will be reduced, as the raw vectors need far fewer functions. The complexity of such functions is the same as, if not less than, with the original code :) PS: If you have any questions or you feel that something is not clear, let me know. |
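A self-contained sketch of the marker trick described in the comment above, using generic associated types (stable since Rust 1.65). The names `BufferMarker`, `HeapBuf`, `ArrayBuf` and `SoaPointBuf` are hypothetical simplifications, and the `Buffer` trait is cut down to a single method, so this only shows how a zero-sized marker selects a buffer family without fixing the element type.

```rust
/// Minimal hypothetical buffer interface, just enough for the marker trick.
pub trait Buffer<T>: Default {
    fn capacity(&self) -> usize;
}

/// Heap-backed buffer family.
pub struct HeapBuf<T>(Vec<T>);

impl<T> Default for HeapBuf<T> {
    fn default() -> Self {
        HeapBuf(Vec::with_capacity(8))
    }
}

impl<T> Buffer<T> for HeapBuf<T> {
    fn capacity(&self) -> usize {
        self.0.capacity()
    }
}

/// Fixed-size buffer family.
pub struct ArrayBuf<T, const CAP: usize>([Option<T>; CAP]);

impl<T, const CAP: usize> Default for ArrayBuf<T, CAP> {
    fn default() -> Self {
        ArrayBuf(std::array::from_fn(|_| None))
    }
}

impl<T, const CAP: usize> Buffer<T> for ArrayBuf<T, CAP> {
    fn capacity(&self) -> usize {
        CAP
    }
}

/// Zero-sized marker naming a buffer family without fixing `T`.
pub trait BufferMarker: 'static {
    type Buf<T>: Buffer<T>;
}

pub struct HeapMarker;
impl BufferMarker for HeapMarker {
    type Buf<T> = HeapBuf<T>;
}

pub struct ArrayMarker<const CAP: usize>;
impl<const CAP: usize> BufferMarker for ArrayMarker<CAP> {
    type Buf<T> = ArrayBuf<T, CAP>;
}

/// SoA buffer for a point with an `f32` and a `u8` field. The field types
/// are chosen here; the buffer family is chosen by the marker.
pub struct SoaPointBuf<M: BufferMarker> {
    pub x: M::Buf<f32>,
    pub id: M::Buf<u8>,
}

impl<M: BufferMarker> Default for SoaPointBuf<M> {
    fn default() -> Self {
        SoaPointBuf { x: Default::default(), id: Default::default() }
    }
}

impl<M: BufferMarker> SoaPointBuf<M> {
    /// Usable capacity is bounded by the smallest field buffer.
    pub fn capacity(&self) -> usize {
        self.x.capacity().min(self.id.capacity())
    }
}
```

The caller writes `SoaPointBuf<ArrayMarker<16>>` and never has to spell out `ArrayBuf<f32, 16>` or `ArrayBuf<u8, 16>`.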
Hey! I've made a new repository with the PoC without the SoA code. The I've also made a small vector optimization (SVO) which is not in the standard. I haven't measured anything yet since it's neither finished nor optimized, but the ability to even allow something like this gives me a lot of hope! @Luthaf I'm guessing that you are still very busy, but I'd love to have your input at this point. Is there something I could do to make it simple/fast to take a look at it? Maybe some examples or something? |
As it stands, this crate appears to make separate fields each with their own `Vec`. This would duplicate the length and capacity values for each field. This may not be a huge problem, but it would triple the size of the struct as it grows and can lead to the different `Vec`s falling out of sync.
The main alternative would be to use `unsafe` and raw pointers (or `NonNull` pointers). That said, managing the `unsafe` would almost certainly be more effort than keeping the `Vec`s in sync. Regardless, I think this could be a useful discussion to be had. Is cutting the struct to a third worth having `unsafe` code to vet? (probably not)