[Based on #28] Getting alignment right #33

Open
wants to merge 34 commits into
base: master
984b0b0
Move to a NonNull-based exhume() interface
HadrienG2 Sep 18, 2019
e97de1b
Port unsafe_abomonate to NonNull-based interface
HadrienG2 Sep 18, 2019
5fd4957
Clarify that the tuple_abomonate macro takes types as input
HadrienG2 Sep 25, 2019
b182a38
Port tuple_abomonate to NonNull-based interface
HadrienG2 Sep 18, 2019
d4a78e4
Port Box<T> abomonation to NonNull-based interface
HadrienG2 Sep 18, 2019
421a486
Port Rust enum abomonation to NonNull-based interface
HadrienG2 Sep 18, 2019
5ef07a7
Port slice-like abomonations to NonNull-based interface (and deduplic…
HadrienG2 Sep 19, 2019
b23d2dc
Require Debug and use assert_eq for improved test failure messages
HadrienG2 Sep 19, 2019
b5f7b71
Clarify invalid data as a source of UB
HadrienG2 Sep 19, 2019
e4a4963
Remove outdated Abomonable notion
HadrienG2 Sep 19, 2019
b8629a3
Make sure that black_box does its job
HadrienG2 Sep 19, 2019
48b2b9a
Remove some strange whitespace
HadrienG2 Sep 26, 2019
7388af5
Snipe some unnecessary usage of Ok(())
HadrienG2 Sep 26, 2019
3f84783
Don't use manual inlining where benchmarks don't show a benefit
HadrienG2 Sep 19, 2019
cb8e2af
Add inline(always) where it matters in benchmarks
HadrienG2 Sep 19, 2019
17a0354
Avoid multiple codegen units in benchmarks
HadrienG2 Sep 19, 2019
ee283b0
Improve Abomonation documentation and make the trait unsafe
HadrienG2 Sep 20, 2019
6c12b2e
Take writer by value
HadrienG2 Sep 20, 2019
a4377f4
Do not rely on autoderef to select the right Abomonation impl
HadrienG2 Sep 25, 2019
cda9ebb
Make Box<T>::entomb simpler and more consistent with others
HadrienG2 Sep 25, 2019
e63516b
Add basic support for types which contain references
HadrienG2 Sep 29, 2019
7a326fe
Add support for this &[mut] T
HadrienG2 Oct 3, 2019
f2fe84d
Add support for str and &[mut] str
HadrienG2 Oct 3, 2019
1b7166b
Add support for [T] and &[mut] [T]
HadrienG2 Oct 3, 2019
8ba2a08
Add tests for Abomonated of reference
HadrienG2 Oct 11, 2019
f9c5ff7
Clarify and narrow down padding bytes UB
HadrienG2 Oct 21, 2019
7a995f7
Expose alignment of nontrivial abomonated types
HadrienG2 Oct 25, 2019
956715e
Expose alignment of trivial abomonated types (TODO: Remove once ecosy…
HadrienG2 Oct 25, 2019
15cc150
Add infrastructure for well-aligned binary I/O
HadrienG2 Oct 23, 2019
1051a47
Use aligned reads and writes in the abomonation implementation
HadrienG2 Oct 25, 2019
d8abfcf
Provide abstractions for properly aligning abomonated bytes
HadrienG2 Nov 8, 2019
a532059
Add support for zero-sized types to Coffin
HadrienG2 Nov 12, 2019
e424281
Add a default implementation of alignment()
HadrienG2 Nov 12, 2019
b3787b6
Whoops, wrong cfg trick
HadrienG2 Nov 14, 2019
6 changes: 6 additions & 0 deletions Cargo.toml
Expand Up @@ -14,3 +14,9 @@ license = "MIT"

[dev-dependencies]
recycler="0.1.4"

[profile.bench]
# Multiple codegen units speed up compilation, but make compilation output less
# deterministic and generally decrease codegen quality through worse inlining.
# Let's turn it off for benchmarking.
codegen-units = 1
6 changes: 3 additions & 3 deletions README.md
Expand Up @@ -3,7 +3,7 @@ A mortifying serialization library for Rust

Abomonation (spelling intentional) is a serialization library for Rust based on the very simple idea that if someone presents data for serialization it will copy those exact bits, and then follow any pointers and copy those bits, and so on. When deserializing it recovers the exact bits, and then corrects pointers to aim at the serialized forms of the chased data.

**Warning**: Abomonation should not be used on any data you care strongly about, or from any computer you value the data on. The `encode` and `decode` methods do things that may be undefined behavior, and you shouldn't stand for that. Specifically, `encode` exposes padding bytes to `memcpy`, and `decode` doesn't much respect alignment.
**Warning**: Abomonation should not be used on any data you care strongly about, or from any computer you value the data on. The `encode` and `decode` methods do things that may be undefined behavior, and you shouldn't stand for that. Specifically, `encode` exposes padding bytes to `memcpy`, and `decode` doesn't much respect alignment and may need to construct Rust references to invalid data.

Please consult the [abomonation documentation](https://frankmcsherry.github.com/abomonation) for more specific information.

Expand Down Expand Up @@ -49,7 +49,7 @@ Be warned that these numbers are not *goodput*, but rather the total number of b

## unsafe_abomonate!

Abomonation comes with the `unsafe_abomonate!` macro implementing `Abomonation` for structs which are essentially equivalent to a tuple of other `Abomonable` types. To use the macro, you must put the `#[macro_use]` modifier before `extern crate abomonation;`.
Abomonation comes with the `unsafe_abomonate!` macro implementing `Abomonation` for structs which are essentially equivalent to a tuple of other `Abomonation` types. To use the macro, you must put the `#[macro_use]` modifier before `extern crate abomonation;`.

Please note that `unsafe_abomonate!` synthesizes unsafe implementations of `Abomonation`, and it should be considered unsafe to invoke.

Expand Down Expand Up @@ -82,4 +82,4 @@ if let Some((result, remaining)) = unsafe { decode::<MyStruct>(&mut bytes) } {
}
```

Be warned that implementing `Abomonable` for types can be a giant disaster and is entirely discouraged.
Be warned that implementing `Abomonation` for types can be a giant disaster and is entirely discouraged.
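
The README warning above says that `decode` does not much respect alignment. As a rough illustration of what that concern means, here is a minimal sketch of an alignment check on a byte buffer before it is reinterpreted as a `T`. The helper `is_aligned_for` is hypothetical and is not part of the abomonation API; it only shows the kind of test that well-aligned I/O has to satisfy.

```rust
use std::mem::align_of;

/// Hypothetical helper (not part of abomonation): returns true if `bytes`
/// starts at an address that is suitably aligned for a value of type `T`.
fn is_aligned_for<T>(bytes: &[u8]) -> bool {
    (bytes.as_ptr() as usize) % align_of::<T>() == 0
}

fn main() {
    // Backing the buffer with u64 storage guarantees u64 alignment at offset 0.
    let storage = [0u64; 2];
    let bytes: &[u8] =
        unsafe { std::slice::from_raw_parts(storage.as_ptr() as *const u8, 16) };

    // The start of the buffer is aligned for u64...
    assert!(is_aligned_for::<u64>(bytes));
    // ...but a sub-slice starting one byte later is not.
    assert!(!is_aligned_for::<u64>(&bytes[1..]));
    // u8 has alignment 1, so any offset is acceptable for it.
    assert!(is_aligned_for::<u8>(&bytes[1..]));
}
```

A plain `Vec<u8>` gives no such guarantee for types with alignment greater than 1, which is why casting its contents to `&T` can be undefined behavior.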
12 changes: 7 additions & 5 deletions benches/bench.rs
Expand Up @@ -30,8 +30,9 @@ use test::Bencher;
#[bench] fn vec_u_vn_s_enc(bencher: &mut Bencher) { _bench_enc(bencher, vec![vec![(0u64, vec![(); 1 << 40], format!("grawwwwrr!")); 32]; 32]); }
#[bench] fn vec_u_vn_s_dec(bencher: &mut Bencher) { _bench_dec(bencher, vec![vec![(0u64, vec![(); 1 << 40], format!("grawwwwrr!")); 32]; 32]); }

fn _bench_enc<T: Abomonation>(bencher: &mut Bencher, record: T) {

fn _bench_enc<T>(bencher: &mut Bencher, record: T)
where for<'de> T: Abomonation<'de>
{
// prepare encoded data for bencher.bytes
let mut bytes = Vec::new();
unsafe { encode(&record, &mut bytes).unwrap(); }
Expand All @@ -40,12 +41,13 @@ fn _bench_enc<T: Abomonation>(bencher: &mut Bencher, record: T) {
bencher.bytes = bytes.len() as u64;
bencher.iter(|| {
bytes.clear();
unsafe { encode(&record, &mut bytes).unwrap(); }
unsafe { encode(&record, &mut bytes).unwrap() }
});
}

fn _bench_dec<T: Abomonation+Eq>(bencher: &mut Bencher, record: T) {

fn _bench_dec<T>(bencher: &mut Bencher, record: T)
where for<'de> T: Abomonation<'de> + Eq
{
// prepare encoded data
let mut bytes = Vec::new();
unsafe { encode(&record, &mut bytes).unwrap(); }
Expand Down
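
The benchmark helpers in this diff replace the bound `T: Abomonation` with the higher-ranked bound `for<'de> T: Abomonation<'de>`. The sketch below illustrates what such a bound expresses, using a stand-in trait `Decode<'de>` that is an assumption made for this example and not the real `Abomonation` trait: the function can decode from a buffer of any lifetime, including a local one, so only types that do not borrow from the buffer qualify.

```rust
// Stand-in trait for illustration only; NOT the real abomonation trait.
trait Decode<'de>: Sized {
    fn decode(bytes: &'de [u8]) -> Option<Self>;
}

// An owned type can be decoded from a buffer of any lifetime.
impl<'de> Decode<'de> for u8 {
    fn decode(bytes: &'de [u8]) -> Option<Self> {
        bytes.first().copied()
    }
}

// A borrowing type is only decodable for the lifetime of the buffer it views.
impl<'de> Decode<'de> for &'de [u8] {
    fn decode(bytes: &'de [u8]) -> Option<Self> {
        Some(bytes)
    }
}

// The higher-ranked bound requires `T: Decode<'de>` for every `'de`,
// including the lifetime of the local buffer created inside the function.
fn decode_from_local<T>(byte: u8) -> Option<T>
where
    for<'de> T: Decode<'de>,
{
    let local = vec![byte];
    T::decode(&local)
}

fn main() {
    assert_eq!(decode_from_local::<u8>(7), Some(7));
    // `decode_from_local::<&[u8]>(7)` is rejected by the compiler:
    // `&'a [u8]` implements `Decode<'a>` only, not `Decode<'de>` for all `'de`.
}
```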
10 changes: 6 additions & 4 deletions benches/clone.rs
Expand Up @@ -27,8 +27,9 @@ use test::Bencher;
#[bench] fn vec_u_vn_s_e_d(bencher: &mut Bencher) { _bench_e_d(bencher, vec![vec![(0u64, vec![(); 1 << 40], format!("grawwwwrr!")); 32]; 32]); }
#[bench] fn vec_u_vn_s_cln(bencher: &mut Bencher) { _bench_cln(bencher, vec![vec![(0u64, vec![(); 1 << 40], format!("grawwwwrr!")); 32]; 32]); }

fn _bench_e_d<T: Abomonation>(bencher: &mut Bencher, record: T) {

fn _bench_e_d<T>(bencher: &mut Bencher, record: T)
where for<'de> T: Abomonation<'de>
{
// prepare encoded data for bencher.bytes
let mut bytes = Vec::new();
unsafe { encode(&record, &mut bytes).unwrap(); }
Expand All @@ -42,8 +43,9 @@ fn _bench_e_d<T: Abomonation>(bencher: &mut Bencher, record: T) {
});
}

fn _bench_cln<T: Abomonation+Clone>(bencher: &mut Bencher, record: T) {

fn _bench_cln<T>(bencher: &mut Bencher, record: T)
where for<'de> T: Abomonation<'de> + Clone
{
// prepare encoded data
let mut bytes = Vec::new();
unsafe { encode(&record, &mut bytes).unwrap(); }
Expand Down
10 changes: 6 additions & 4 deletions benches/recycler.rs
Expand Up @@ -27,8 +27,9 @@ use test::Bencher;
// TODO : this reveals that working with a `vec![(); 1 << 40]` does not get optimized away.
// #[bench] fn vec_u_vn_s_rec(bencher: &mut Bencher) { _bench_rec(bencher, vec![vec![(0u64, vec![(); 1 << 40], format!("grawwwwrr!")); 32]; 32]); }

fn _bench_own<T: Abomonation+Clone>(bencher: &mut Bencher, record: T) {

fn _bench_own<T>(bencher: &mut Bencher, record: T)
where for<'de> T: Abomonation<'de> + Clone
{
// prepare encoded data
let mut bytes = Vec::new();
unsafe { encode(&record, &mut bytes).unwrap(); }
Expand All @@ -42,8 +43,9 @@ fn _bench_own<T: Abomonation+Clone>(bencher: &mut Bencher, record: T) {
}


fn _bench_rec<T: Abomonation+Recyclable>(bencher: &mut Bencher, record: T) {

fn _bench_rec<T>(bencher: &mut Bencher, record: T)
where for<'de> T: Abomonation<'de> + Recyclable
{
// prepare encoded data
let mut bytes = Vec::new();
unsafe { encode(&record, &mut bytes).unwrap(); }
Expand Down
93 changes: 65 additions & 28 deletions src/abomonated.rs
@@ -1,23 +1,27 @@

use std::mem::transmute;
use std::marker::PhantomData;
use std::ops::{Deref, DerefMut};

use super::{Abomonation, decode};

/// A type wrapping owned decoded abomonated data.
///
/// This type ensures that decoding and pointer correction has already happened,
/// and implements `Deref<Target=T>` using a pointer cast (transmute).
/// This type ensures that decoding and pointer correction has already happened.
/// It provides a way to move the deserialized data around, while keeping
/// on-demand access to it via the `as_ref()` method.
///
/// As an extra convenience, `Deref<Target=T>` is also implemented if T does
/// not contain references. Unfortunately, this trait cannot be safely
/// implemented when T does contain references.
///
/// # Safety
///
/// The safety of this type, and in particular its `transmute` implementation of
/// the `Deref` trait, relies on the owned bytes not being externally mutated
/// once provided. You could imagine a new type implementing `DerefMut` as required,
/// but which also retains the ability (e.g. through `RefCell`) to mutate the bytes.
/// This would be very bad, but seems hard to prevent in the type system. Please
/// don't do this.
/// The safety of this type, and in particular of access to the deserialized
/// data, relies on the owned bytes not being externally mutated after the
/// `Abomonated` is constructed. You could imagine a new type implementing
/// `DerefMut` as required, but which also retains the ability (e.g. through
/// `RefCell`) to mutate the bytes. This would be very bad, but seems hard to
/// prevent in the type system. Please don't do this.
///
/// You must also use a type `S` whose bytes have a fixed location in memory.
/// Otherwise moving an instance of `Abomonated<T, S>` may invalidate decoded
Expand Down Expand Up @@ -54,8 +58,11 @@ pub struct Abomonated<T, S: DerefMut<Target=[u8]>> {
decoded: S,
}

impl<T: Abomonation, S: DerefMut<Target=[u8]>> Abomonated<T, S> {

impl<'s, 't, T, S> Abomonated<T, S>
where S: DerefMut<Target=[u8]> + 's,
T: Abomonation<'t>,
's: 't
{
/// Attempts to create decoded data from owned mutable bytes.
///
/// This method will return `None` if it is unable to decode the data with
Expand Down Expand Up @@ -94,34 +101,64 @@ impl<T: Abomonation, S: DerefMut<Target=[u8]>> Abomonated<T, S> {
/// not change if the `bytes: S` instance is moved. Good examples are
/// `Vec<u8>` whereas bad examples are `[u8; 16]`.
pub unsafe fn new(mut bytes: S) -> Option<Self> {
// Fix type `T`'s inner pointers. Will return `None` on failure.
//
// FIXME: `slice::from_raw_parts_mut` is used to work around the borrow
// checker marking `bytes` as forever borrowed if `&mut bytes` is
// directly passed as input to `decode()`. But that is itself a
// byproduct of the API contract specified by the `where` clause
// above, which allows S to be `&'t mut [u8]` (and therefore
// require such a perpetual borrow) in the worst case.
//
// A different API contract might allow us to achieve the same
// result without resorting to such evil unsafe tricks.
//
decode::<T>(std::slice::from_raw_parts_mut(bytes.as_mut_ptr(),
bytes.len()))?;

// performs the underlying pointer correction, indicates success.
let decoded = decode::<T>(bytes.deref_mut()).is_some();

if decoded {
Some(Abomonated {
phantom: PhantomData,
decoded: bytes,
})
}
else {
None
}
// Build the Abomonated structure
Some(Abomonated {
phantom: PhantomData,
decoded: bytes,
})
}
}

impl<T, S: DerefMut<Target=[u8]>> Abomonated<T, S> {
impl<'t, T, S> Abomonated<T, S>
where S: DerefMut<Target=[u8]>,
T: Abomonation<'t>
{
/// Get a read-only view on the deserialized bytes
pub fn as_bytes(&self) -> &[u8] {
&self.decoded
}
}

/// Get a read-only view on the deserialized data
//
// NOTE: This method can be safely used even if type T contains references,
// because it makes sure that the borrow of `self` lasts long enough
// to encompass the lifetime of these references.
//
// Otherwise, it would be possible to extract an `&'static Something`
// from a short-lived borrow of a `Box<[u8]>`, then drop the `Box`,
// and end up with a dangling reference.
//
pub fn as_ref<'a: 't>(&'a self) -> &'a T {
unsafe { &*(self.decoded.as_ptr() as *const T) }
}
}

impl<T, S: DerefMut<Target=[u8]>> Deref for Abomonated<T, S> {
// NOTE: The lifetime constraint that was applied to `as_ref()` cannot be
// applied to a `Deref` implementation. Therefore, `Deref` can only be
// used on types T which do not contain references, as enforced by the
// higher-ranked trait bound below.
impl<T, S> Deref for Abomonated<T, S>
where S: DerefMut<Target=[u8]>,
for<'t> T: Abomonation<'t>,
{
type Target = T;
#[inline]
fn deref(&self) -> &T {
let result: &T = unsafe { transmute(self.decoded.get_unchecked(0)) };
result
self.as_ref()
}
}
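
The `as_ref()` and `Deref` changes above can be hard to follow in diff form. The following is a minimal, self-contained sketch of the same shape under simplifying assumptions: `Plain<'t>` is a hypothetical stand-in for `Abomonation<'t>`, and `Owner<T>` stores a single boxed `u64` instead of arbitrary bytes. It shows an accessor whose borrow of `self` must outlive `'t`, and a `Deref` impl that is only available under the higher-ranked bound.

```rust
use std::marker::PhantomData;
use std::ops::Deref;

/// Hypothetical stand-in for `Abomonation<'t>`. Implementors promise that a
/// valid `u64` in memory can be soundly viewed as `Self`.
unsafe trait Plain<'t>: Sized {}
unsafe impl<'t> Plain<'t> for u64 {}

/// Simplified analogue of `Abomonated<T, S>`: owns storage, exposes it as `T`.
struct Owner<T> {
    storage: Box<u64>,
    phantom: PhantomData<T>,
}

impl<'t, T: Plain<'t>> Owner<T> {
    fn new(value: u64) -> Self {
        Owner { storage: Box::new(value), phantom: PhantomData }
    }

    /// The `'a: 't` bound forces the borrow of `self` to last at least as
    /// long as any reference that `T` may contain.
    fn as_ref<'a: 't>(&'a self) -> &'a T {
        // Sound in this sketch only because of the `Plain` contract above.
        unsafe { &*(&*self.storage as *const u64 as *const T) }
    }
}

/// `Deref::deref` cannot carry the `'a: 't` bound, so it is restricted to
/// types that implement `Plain<'t>` for every `'t`.
impl<T> Deref for Owner<T>
where
    for<'t> T: Plain<'t>,
{
    type Target = T;
    fn deref(&self) -> &T {
        self.as_ref()
    }
}

fn main() {
    let owner: Owner<u64> = Owner::new(42);
    assert_eq!(*owner.as_ref(), 42);
    assert_eq!(*owner, 42);
}
```

Since `u64` contains no references, the lifetime bound is trivially satisfied here; the sketch only mirrors the signatures and does not reproduce the dangling-reference scenario described in the comments of the diff.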