-
Why should I use this and not X?
As a rule of thumb, if you only need to support a single language (e.g., Python), then X might be better, and you'll probably write FFI bindings tailored to that specific runtime anyway. However, once you target more than one language, everything you do needs to have a proper C representation. This crate aims to give you the best of both worlds: universally C-level and reasonably idiomatic in each backend, including Rust.
-
Who's the target audience for this?
Anyone writing 'system libraries' that need to be consumable from multiple environments at the same time. Our usage profile is:
- several libraries with 100+ functions (think Vulkan, OpenXR),
- callable from Unity, Unreal, Python,
- high "API uncertainty" in the beginning, requiring lots of iterations to get it right,
- predictable and "industry-grade" APIs in the end (final bindings ship to customers).
-
Where can I ask questions?
Use GitHub Discussions.
-
Why do I get `error[E0658]: macro attributes in #[derive] output are unstable`?
This happens when `#[ffi_type]` appears after `#[derive(...)]`. Just switch their order.
-
How should I design my APIs?
This is a broad question and depends on your use case. As a rule of thumb, we recommend being slightly conservative with your signatures and always thinking C. Other languages do not track lifetimes well, and it is easy to accidentally pass a pointer that has outlived its data, or to doubly alias a `&mut X` in reentrant functions.
-
I have a `Vec<T>` in Rust, how can I move it to C#, Python, ...?
Moving a `Vec<T>` as-is cannot work, as the type would be deallocated when passing the FFI boundary. A new `FFIVec<T>` pattern could be implemented, but is currently unsupported. The main design issue is that we would have to create helper methods on the user's behalf and manage ownership and (de)allocation on both sides of the boundary.
That said, if you want to pass arbitrarily long data from a Rust function `f` to FFI, you have 3 options:
- Accept a callback `f(c: MyCallback)`. This allows you to create data ad hoc within `f` and invoke `callback` with an `FFISlice`.
- Return a slice `f() -> FFISlice<T>`. For your users this is a bit nicer to call, but requires you to hold the `Vec<T>` somewhere else. Usually `f` would be a method of some service pattern. You also run the risk of UB if callers hold on to your slice for too long.
- Accept a mutable slice `f(slice: FFISliceMut<T>)` and write into it. This is a bit more verbose for your caller but usually the most flexible and performant option.
We recommend accepting callbacks if you have a small but unknown number of elements, and mutable slices if users can query the length by other means.
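The third option (caller-provided buffer) can be sketched in plain Rust as follows. This is only an illustration under stated assumptions: a raw pointer plus length stands in for `FFISliceMut<T>`, the names `produce_values`, `values_len` and `values_fill` are hypothetical, and the export attributes the crate would normally add are left out.

```rust
use std::slice;

/// Hypothetical data source standing in for a `Vec<T>` owned by the Rust side.
fn produce_values() -> Vec<u32> {
    vec![10, 20, 30, 40]
}

/// Lets the caller query how many elements it should make room for.
pub extern "C" fn values_len() -> usize {
    produce_values().len()
}

/// Copies as many elements as fit into the caller's buffer and returns that count.
///
/// # Safety
/// `out` must be null or point to `len` writable, properly aligned `u32` values.
pub unsafe extern "C" fn values_fill(out: *mut u32, len: usize) -> usize {
    if out.is_null() {
        return 0;
    }
    // The caller keeps ownership of the buffer; we only borrow it for this call.
    let target = unsafe { slice::from_raw_parts_mut(out, len) };
    let source = produce_values();
    let n = source.len().min(target.len());
    target[..n].copy_from_slice(&source[..n]);
    n
}
```

A caller would first ask `values_len()` for the required size, allocate a buffer in its own language, and then pass it to `values_fill`. No allocation crosses the boundary, which is why this variant avoids the ownership questions raised above.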
-
I'm trying to compose pattern X with Y but get errors.
While on the Rust side patterns compose easily, backends usually have some trouble creating working code for things like `FFISlice<FFIOption<CStrPointer>>`. For example, C# does not support generics on FFI types, thus a new `FFISliceT` type has to be generated for every use of `FFISlice<T>` in Rust. We generally do not recommend nesting type patterns.
-
How can I add a new pattern?
Adding support for new patterns is best done via a PR. Patterns mimicking common Rust features and improvements to existing patterns are welcome. As a rule of thumb they should
- be idiomatic in Rust (e.g., options, slices),
- have a safe and sound Rust implementation and 'reasonable' usability in other languages,
- be useful in more than one backend and come with a reference implementation,
- fall back to C primitives (e.g., 'class methods' are functions with receivers).
As an alternative, and discouraged for public backends, you might be able to get away with using "tags".
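To illustrate the last rule, here is a minimal sketch of what "class methods are functions with receivers" can look like at the C level. The type `Counter` and all function names are hypothetical and not part of this crate; export attributes are omitted.

```rust
/// Hypothetical service type; a backend could surface it as a class.
#[repr(C)]
pub struct Counter {
    value: u64,
}

/// "Constructor": hands ownership of a heap-allocated instance to the caller.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { value: 0 }))
}

/// "Method": a plain function whose first parameter is the receiver.
///
/// # Safety
/// `this` must be null or a pointer obtained from `counter_new` that is still alive.
pub unsafe extern "C" fn counter_increment(this: *mut Counter, by: u64) -> u64 {
    match unsafe { this.as_mut() } {
        Some(counter) => {
            counter.value = counter.value.wrapping_add(by);
            counter.value
        }
        None => 0, // a null receiver is treated as a no-op in this sketch
    }
}

/// "Destructor": takes ownership back and frees the instance.
///
/// # Safety
/// `this` must be null or a pointer from `counter_new` that was not destroyed before.
pub unsafe extern "C" fn counter_destroy(this: *mut Counter) {
    if !this.is_null() {
        drop(unsafe { Box::from_raw(this) });
    }
}
```

A backend that supports the pattern can wrap these three functions in a class, while a backend that does not can still expose them as ordinary functions.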
-
Why do you pin objects in the C# bindings and pass GCHandles to slice constructors?
This question relates to bindings generated like this:

    public static uint pattern_ffi_slice_1(uint[] ffi_slice)
    {
        var ffi_slice_pinned = GCHandle.Alloc(ffi_slice, GCHandleType.Pinned);
        var ffi_slice_slice = new Sliceu32(ffi_slice_pinned, (ulong) ffi_slice.Length);
        try
        {
            return pattern_ffi_slice_1(ffi_slice_slice);
        }
        finally
        {
            ffi_slice_pinned.Free();
        }
    }

-
Without pinning, the .NET runtime could relocate the memory while the FFI call is running. In other words, when you enter `pattern_ffi_slice_1`, `ffi_slice` might reside at `0x1234`, but during FFI execution the CLR GC can move the whole array to `0x1000` if it wants to optimize memory layout. Since this could happen while Rust is still accessing the old location, UB or an access violation would ensue. Pinning prevents that.
-
The reason `Sliceu32` in turn only accepts a `GCHandle` and not the `uint[]` array itself is that once an object is pinned, somebody needs to remember its proper lifetime and unpin it, but `Sliceu32` has no reserved field for that, being a low-level primitive. In most cases the method overload handling pinning is the right place, as lifetimes are guaranteed to be correct. If you need a "long-lived" FFI slice (which, I'd argue, is playing with fire from an interop perspective), you'll also need to handle proper pinning / unpinning (aka lifetime semantics) and race prevention elsewhere.
-
-
How can I get more performance with slices?
As mentioned above, the C# backend will pin slices by default. On our test machine this incurs a performance overhead of about 30-40ns per pinned slice, but uses only safe C#:
| Construct | ns per call |
| --- | --- |
| `pattern_ffi_slice_delegate(x => x[0])` | 195 |
| `pattern_ffi_slice_delegate(x => x.Copied[0])` | 1307 |
| `pattern_ffi_slice_delegate_huge(x => x[0])` | 190 |
| `pattern_ffi_slice_delegate_huge(x => x.Copied[0])` | 11844317 |
| `pattern_ffi_slice_2(short_vec, 0)` | 64 |
| `pattern_ffi_slice_2(long_vec, 0)` | 61 |
For a dramatic 2x - 150x (!) performance increase you can enable `use_unsafe` in the C# backend, which will use a `fixed` slice instead.
| Construct | ns per call |
| --- | --- |
| `pattern_ffi_slice_delegate(x => x[0])` | 52 |
| `pattern_ffi_slice_delegate(x => x.Copied[0])` | 87 |
| `pattern_ffi_slice_delegate_huge(x => x[0])` | 61 |
| `pattern_ffi_slice_delegate_huge(x => x.Copied[0])` | 79741 |
| `pattern_ffi_slice_2(short_vec, 0)` | 28 |
| `pattern_ffi_slice_2(long_vec, 0)` | 24 |
| `pattern_ffi_slice_4(short_byte, short_byte)` | 28 |
This gives more performance when working with slices, but requires `<AllowUnsafeBlocks>true</AllowUnsafeBlocks>` to be enabled in the C# project settings. In Unity it will force the entire game project to be ticked `Unsafe`, which might not be nice if you ship bindings to customers.
However, if you only consume your own bindings and don't give them to third parties, this is a non-issue.
-
Why `ctypes` and not `cffi`?
We had a `cffi` backend, but `ctypes` works out of the box and, according to our benchmarks, there was no significant speed difference between the two. Also, `ctypes` code looks nicer.
-
How can I return a ctypes struct in a ctypes callback?
Right now you can't. This is a known bug in Python and outside of our control.
Quickstart
- start a crate
- copy code of whatever backend comes closest (e.g., C)
- from `Interop::write_to` produce some output, fix errors as they appear
- create UI test against `interoptopus_reference_project` to ensure quality
Some Tips
Once you understand how Interoptopus abstracts APIs, writing a backend is quite simple:
-
The `Library` is the input to any backend, as in, it fully describes what you should generate. It mainly consists of these elements:
- Types
- Functions
- Constants
- Patterns
-
Any backend more or less just converts each of these things one-by-one. It usually writes all constants, then all (composite) types and enums, then all functions, then (optionally) all patterns.
-
Writing and converting types is usually the trickiest part, and might require that you sort types by dependencies (e.g., for C) or handle types differently depending on where they appear (e.g., in C# an `IntPtr` in a field might become a `ref T` in a function).
-
Patterns are fully optional. You can always just implement their "fallback type" (e.g., a `CStrPointer` is just a `*const u8`) and call it a day. However, when exporting larger APIs (like 100+ functions), producing idiomatic pattern bindings will be a good investment.
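The tips above can be sketched as a toy backend. Note that everything here is a simplified assumption for illustration: the `Library`, `Constant`, `Composite` and `Function` structs below are hypothetical stand-ins and not the crate's real types, and type names are plain strings. The sketch writes constants, then dependency-sorted structs, then functions, as a C header.

```rust
use std::collections::HashSet;
use std::fmt::Write;

// Hypothetical, heavily simplified stand-ins for what a backend receives.
pub struct Constant { pub name: String, pub value: u32 }
pub struct Composite { pub name: String, pub fields: Vec<(String, String)> } // (field, type)
pub struct Function { pub name: String, pub params: Vec<(String, String)>, pub ret: String }
pub struct Library {
    pub constants: Vec<Constant>,
    pub types: Vec<Composite>,
    pub functions: Vec<Function>,
}

/// Orders structs so that each one appears after the structs it embeds by value.
fn sorted_by_dependencies(types: &[Composite]) -> Vec<&Composite> {
    let mut done: HashSet<&str> = HashSet::new();
    let mut out = Vec::new();
    while out.len() < types.len() {
        let before = out.len();
        for t in types {
            if done.contains(t.name.as_str()) {
                continue;
            }
            // A field is satisfied if its type was already emitted or is not a known struct.
            let ready = t.fields.iter().all(|(_, ty)| {
                done.contains(ty.as_str()) || !types.iter().any(|o| &o.name == ty)
            });
            if ready {
                done.insert(t.name.as_str());
                out.push(t);
            }
        }
        if out.len() == before {
            break; // cyclic by-value dependency; this sketch simply stops
        }
    }
    out
}

/// Converts each element kind one by one: constants, types, functions.
pub fn write_c_header(lib: &Library) -> String {
    let mut out = String::new();
    for c in &lib.constants {
        writeln!(out, "#define {} {}", c.name, c.value).unwrap();
    }
    for t in sorted_by_dependencies(&lib.types) {
        writeln!(out, "typedef struct {} {{", t.name).unwrap();
        for (field, ty) in &t.fields {
            writeln!(out, "    {} {};", ty, field).unwrap();
        }
        writeln!(out, "}} {};", t.name).unwrap();
    }
    for f in &lib.functions {
        let params: Vec<String> =
            f.params.iter().map(|(name, ty)| format!("{} {}", ty, name)).collect();
        writeln!(out, "{} {}({});", f.ret, f.name, params.join(", ")).unwrap();
    }
    out
}
```

A real backend additionally has to handle enums, pointers, callbacks and patterns, but the overall shape (walk each element list and emit text) stays the same.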
How long will it take?
Judging from creating the existing backends, and assuming you've done some FFI calls from that language to a C library, I'd say:
- 1h - browsing an existing backend and understanding how CTypes work
- 2h - producing MVP output that can call a single `hello_world()`
- 4h - generate bindings for arbitrary functions with primitive parameters
- 1d - also produce `structs` and `enums`
- 2d - support the entire C API surface
- 3-5d - have clean, idiomatic wrappers for all patterns and run automated reference tests
This library naturally does "unsafe" things and any journey into FFI-land is a little adventure. That said, here are some assumptions and quality standards this project is based on:
-
Safe Rust calling safe Rust code must always be sound, with soundness boundaries on the module level, although smaller scopes are preferred. For example, creating an `FFISlice` from Rust and directly using it from Rust must never cause UB.
-
We must never willingly generate broken bindings. For low-level types we must never generate bindings which "cannot be used correctly" (e.g., map a `u8` to a `float`); for patterns we must generate bindings that are "correct if used according to specification".
-
There are situations where the (Rust) soundness of a binding invocation depends on conditions outside our control. In these cases we trust foreign code will invoke the generated functions correctly. For example, if a function is called with a `CStrPointer` type, we consider it safe and sound to obtain a `str` from this pointer, as `CStrPointer`'s contract specifies it must point to ASCII data.
-
Related to the previous point we generally assume functions and types on both sides are used appropriately w.r.t. Rust's FFI requirements and we trust you do that.
-
Any `unsafe` code in any abstraction we provide should be "well contained", properly documented and reasonably auditable.
-
If unsound Rust types or bindings were ever needed (e.g., because of a lack of Rust specification, like 'safely' mapping a trait's vtable) such bindings should be gated behind a feature flag (e.g., `unsound`) and only enabled via an explicit opt-in. Right now there are none, but this is to set expectations around discussions.
tl;dr: if it's fishy we probably want to fix it, but we rely on external code calling in 'according to documentation'.
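The string pointer example above can be made concrete with a small sketch. It is an assumption-laden illustration, not the crate's implementation: `StrPtr` is a hypothetical stand-in for a string pointer pattern, and treating a null pointer as an empty string is a choice made only for this example.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;
use std::str::Utf8Error;

/// Hypothetical stand-in for a string pointer pattern.
///
/// Contract (trusted, not checked): a non-null pointer references a
/// NUL-terminated string that stays valid while this value is in use.
#[repr(transparent)]
pub struct StrPtr(pub *const c_char);

impl StrPtr {
    /// Safe to call because we rely on foreign code upholding the contract above.
    /// What *can* be checked (the encoding) is checked and reported as an error.
    pub fn as_str(&self) -> Result<&str, Utf8Error> {
        if self.0.is_null() {
            return Ok(""); // this sketch treats null as the empty string
        }
        // SAFETY: relies on the documented contract of `StrPtr`.
        let c_str = unsafe { CStr::from_ptr(self.0) };
        c_str.to_str()
    }
}
```

The `unsafe` block is small, documented and sits directly next to the contract it depends on, in line with the "well contained" goal stated above.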
-
Around Rust '24 `#[no_mangle]` became unsafe, and you add it automatically, isn't that an issue?
Theoretically yes, practically no:
- It is true that with `#[no_mangle]` you could cause UB, for example by accidentally writing a `#[no_mangle] fn malloc() -> usize {...}`. Around Rust '24 the attribute was therefore made unsafe. The vast majority (probably all) of safe Rust projects should simply not use the attribute because of that.
- In FFI crates though, you must use the attribute to get normal C names, and there is practically no way of knowing which names you are not supposed to use. In other words, even if we made you type `#[ffi_function] #[unsafe(no_mangle)] fn _ZN2io5stdio6_print20h94cd0587c9a534faX3gE() {...}` (compare this Rust issue), and even if you tried to be diligent, you still wouldn't have any way of knowing whether what you just typed might cause UB (without using low-level symbol table analyzers after the fact).
- By that same logic, there are quite a few other 'safe' things you are not supposed to do from FFI crates, e.g., messing up calling conventions, panicking, or not specifying `#[no_mangle]`, some of which can be impossible to guard against.
- With all that said, us automatically handling these attributes does not create additional issues, but allows us to prevent some, and makes the library nicer to use.
Any Rust code ...
-
... likely used in applications must not panic, unless a panic is clearly user-requested (e.g., having an `unwrap()` on an `FFIOption`). If a function can fail it must return a `Result`. Allocations & co. are currently exempt if they lack a good `Result`-API, although the plan is to replace them eventually.
-
... reasonably only used during compilation or unit tests (e.g., proc macro code or code generation helpers) should panic, if panicking can lead to a better developer experience (e.g., clearer error messages).
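As a rough illustration of the first rule, the sketch below keeps the fallible logic in a function returning `Result` and only maps it to a C-friendly status code at the boundary. The names `ErrorCode`, `element_at` and `element_at_ffi` are hypothetical and chosen for this example only; export attributes are omitted.

```rust
/// Hypothetical status code returned across the FFI boundary.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorCode {
    Success = 0,
    Null = 100,
    OutOfBounds = 200,
}

/// Fallible Rust logic: reports failure through `Result` instead of panicking
/// (indexing with `values[index]` would panic on a bad index).
pub fn element_at(values: &[u32], index: usize) -> Result<u32, ErrorCode> {
    values.get(index).copied().ok_or(ErrorCode::OutOfBounds)
}

/// Boundary function: validates pointers and converts the `Result` into a code.
///
/// # Safety
/// `ptr` must be null or point to `len` readable `u32` values;
/// `out` must be null or point to one writable `u32`.
pub unsafe extern "C" fn element_at_ffi(
    ptr: *const u32,
    len: usize,
    index: usize,
    out: *mut u32,
) -> ErrorCode {
    if ptr.is_null() || out.is_null() {
        return ErrorCode::Null;
    }
    let values = unsafe { std::slice::from_raw_parts(ptr, len) };
    match element_at(values, index) {
        Ok(value) => {
            unsafe { out.write(value) };
            ErrorCode::Success
        }
        Err(code) => code,
    }
}
```

Callers in other languages then check the returned code, which a backend can turn into an exception or similar idiom where that fits.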
A clarification of how the license is meant to be applied:
- Our license only applies to code in this repository, not code generated by this repository.
- We do not claim copyright for code produced by backends included here; even if said code was based on a template in this repository.
- For the avoidance of doubt, anything produced by `Interop::write_to` or any item emitted by a proc macro is considered “generated by”.