Rethink about whether creating Address from usize should be unsafe #1227
Comments
The intent was that an Address should originate only from mmap, malloc, or similar sources, or be derived from existing Addresses. Any attempt to create an arbitrary Address was considered "unsafe". However, this approach hasn’t worked out very well. We can remove the "unsafe".
We discussed this topic today. The "validity" of an Address is not well-defined in general, but there are definitions of "validity" in different contexts. For example, a block must be 32 KB in size and aligned to 32 KB, and an ObjectReference (currently defined as an address) must be within an object and must be word-aligned. In those cases, deriving addresses in one way may be safer than others. For example, we can derive the address of blocks in a chunk, given the chunk address. And we can derive chunk addresses from a memory region from Currently, we have several types that wrap |
If I use the In this scenario, I think it is very "valid" to convert the |
Any address can be presented as I agree some 'unsafe' is unnecessary for On the contrary, the following is an example where unsafe is useful: mmtk-core/src/util/alloc/bumpallocator.rs Line 220 in 8640ab8
TL;DR: Currently, `Address` is unsafe to create from arbitrary values. But should it be? This paper says "Address creation from arbitrary integers is forbidden", but we should rethink that.

The `Address` type has many unsafe methods. Among them are methods for:

- creating `Address` instances: `zero`, `max`, `from_usize`
- accessing memory: `load`, `store`, ...
- converting to Rust references: `as_ref`, `as_mut_ref`

Understandably, memory accesses are unsafe because of potential races and unaligned or illegal memory operations, and conversions to Rust reference types should be unsafe because only the programmer can guarantee that such conversions don't violate Rust's ownership and borrowing rules.
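To make the distinction concrete, here is a minimal sketch of why memory access and reference conversion need `unsafe`. The `Address` type below is a simplified stand-in written for illustration, not mmtk-core's definition; the method names follow this issue, but the signatures and the `demo` helper are assumptions.

```rust
/// Simplified stand-in for an address type (an assumption for illustration,
/// not mmtk-core's actual `Address`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Address(usize);

impl Address {
    /// Turning a raw pointer into a plain number touches no memory.
    pub fn from_mut_ptr<T>(ptr: *mut T) -> Address {
        Address(ptr as usize)
    }

    /// Unsafe: the caller must guarantee that the address is mapped,
    /// aligned for `T`, holds an initialized `T`, and is not being
    /// written concurrently.
    pub unsafe fn load<T: Copy>(self) -> T {
        unsafe { *(self.0 as *const T) }
    }

    /// Unsafe: same requirements as `load`, plus the memory must be writable.
    pub unsafe fn store<T>(self, value: T) {
        unsafe { (self.0 as *mut T).write(value) }
    }

    /// Unsafe: the caller picks the lifetime `'a`, so only the caller can
    /// guarantee that the reference does not outlive the data or alias a
    /// mutable borrow.
    pub unsafe fn as_ref<'a, T>(self) -> &'a T {
        unsafe { &*(self.0 as *const T) }
    }
}

/// Hypothetical usage: every operation that touches memory sits in `unsafe`.
pub fn demo() -> (u64, u64) {
    let mut slot: u64 = 41;
    let addr = Address::from_mut_ptr(&mut slot as *mut u64);
    unsafe {
        let before: u64 = addr.load();
        addr.store(before + 1);
        let after: &u64 = addr.as_ref();
        (before, *after)
    }
}
```

In this sketch the only safe method is the one that produces a number; everything that dereferences that number carries an obligation the compiler cannot check.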
However, it's worth discussing the safety of creating `Address` instances. The following methods create `Address` instances, and some of them are marked as unsafe.

- `ObjectReference::to_raw_address(self)`
- `Address::from_ptr(ptr)`
- `Address::from_mut_ptr(ptr)`
- `Address::from_ref(r)`
- `add`, `sub`, `and`, `or`, ...
- `Address::zero()`
- `Address::max()`
- `Address::from_usize(raw)`
The 2016 paper Rust as a Language for High Performance GC Implementation by @qinsoon et al. states that "an address represents an arbitrary location in the memory space managed by the GC" and that "Address creation from arbitrary integers is forbidden". The paper did not explain why the latter is the case, and it apparently contradicts the former.

In the current Rust MMTk, `Address::zero()`, `Address::max()` and `Address::from_usize` are all marked as unsafe. Their doc comments refer to creating an "invalid address", but it is unclear what an "invalid address" is. Since creating `Address` from pointers and `ObjectReference` is considered safe, this seems to imply that an `Address` is supposed to point to somewhere "safe", such as inside an object or inside a memory region obtained by `mmap` or `malloc`. (Until #1195, though, the raw address of an `ObjectReference` was not guaranteed to be inside an object.)

And since address arithmetic is considered safe, we can get an `Address` safely from any `ObjectReference` and call `addr.and(0).add(0x12345678)` to "safely" create an address from an arbitrary value such as 0x12345678. That bypasses the unsafe annotations on the creation methods.

In fact, in the current Rust MMTk code base, we use `Address` to point to many things that are not inside objects. Types like `Chunk`, `Block` and `Line` are wrappers of `Address`. We can also derive sub-regions from their parents, such as iterating through all `Block`s in a `Chunk`, or all `Line`s in a `Block`, and both are considered safe. We also have linear scanning algorithms that go through every byte, and that is considered safe, too.

So I think it is pointless to mark `zero()`, `max()` and `from_usize()` as unsafe. `Address` is just what it is: an address, an arbitrary address. There is no validity guarantee for an `Address` anyway. It can be zero or not, word-aligned or not, inside the heap or not, addressable by a 64-bit Intel CPU or not. There should be no restriction on creating an `Address`. Only memory accesses and methods that create Rust references should be unsafe. And it should be unsafe to convert an `Address` to an `ObjectReference`, too, because `ObjectReference` does have a concept of validity (which can be checked with the valid-object (VO) bit).
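The arithmetic bypass described above can be sketched as follows. This is an illustration under stated assumptions, not mmtk-core code: `Address` is a simplified stand-in, `and` is assumed to return an `Address` so that the expression from this issue chains as written (the real signature may differ), and `arbitrary_address_without_unsafe` is a hypothetical helper.

```rust
/// Simplified stand-in for the address type discussed in this issue.
/// Method names follow the issue text; signatures are assumptions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Address(usize);

impl Address {
    /// Marked unsafe today: create an address from an arbitrary integer.
    pub unsafe fn from_usize(raw: usize) -> Address {
        Address(raw)
    }

    /// Safe: derive an address from a live reference.
    pub fn from_ref<T>(r: &T) -> Address {
        Address(r as *const T as usize)
    }

    /// Safe bitwise AND with a mask (assumed here to return an `Address`).
    pub fn and(self, mask: usize) -> Address {
        Address(self.0 & mask)
    }

    /// Safe offset addition.
    pub fn add(self, offset: usize) -> Address {
        Address(self.0.wrapping_add(offset))
    }

    pub fn as_usize(self) -> usize {
        self.0
    }
}

/// Hypothetical helper: builds any address using only safe methods.
pub fn arbitrary_address_without_unsafe(raw: usize) -> Address {
    let anchor = 0u8;
    // Start from a "legitimate" address, mask it down to zero, then add
    // the desired value. No `unsafe` block is needed anywhere.
    Address::from_ref(&anchor).and(0).add(raw)
}
```

Under these assumptions both paths yield the same value, so the `unsafe` marker on `from_usize` restricts nothing that safe code cannot already do; the guarantees that matter are the ones demanded at loads, stores, and conversions to references or `ObjectReference`.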