Skip to content
This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

Checked C overview

David Tarditi edited this page Jun 19, 2023 · 2 revisions

New pointer and array types

The Checked C extension adds new checked pointer and array types to C. These types do not have undefined behavior, unlike existing C pointer and array types. Runtime errors such as dereferencing a null pointer or accessing out-of-bounds memory result in signals that stop programs. For some pointer and array types, programmers must declare bounds for variables and members with those types to be able to access memory.

Conversions

The Checked C extension provides implicit and explicit conversion operations between the different pointer types. Implicit conversions must always be provably correct at compile time; they cannot be potential points of runtime errors.

Interoperation support

Checked C provides support for interoperation between code that uses existing unchecked types and code that uses new checked types. Programmers can provide bounds-safe interfaces for existing functions and data structures. The interfaces describe the behavior and requirements of thoese functions and data structures when using them with checked types. Programmers provide bounds declarations and can specify alternate checked types. A bounds declaration implies an alternate checked type, if one is not specified. The existing types of the functions and data structures remain unchanged. When the functions and data structures are used with checked types, implicit conversions are inserted and additional static checking is done.

Checked scopes

Intermixing checked and unchecked pointers can be problematic: unchecked pointers can corrupt memory and can indirectly cause checked pointer accesses to miss errors. Pointer casts can be problematic too because they can lead to memory corruption via type confusion.

To prevent these problems over regions of programs or entire programs, Checked C introduces checked scopes. There are two kinds of checked scopes, which provide different levels of checking: bounds-checked checked scopes and memory-safe checked scopes.

  • In bounds-checked checked scopes, declarations can use only checked pointer and array types, or they must have bounds-safe interfaces that describe the checked types to use. Unchecked pointer and array types cannot be used. This enforces bounds checking. Calls to variable argument functions and functions without prototypes are not allowed. Casts involving pointer types propagate bounds information, but do not maintain type safety.
  • Memory-safe checked scopes add restrictions on casts involving pointer types. For void pointers, the restrictions disallow implicit casts to or from void pointers and only allow explicit casts between void pointers and pointers to "plain-old data". The restrictions prevent memory-safety violations (assuming that memory allocation and deallocation is correct).

Memory-safe checked scopes are the default kind of checked scope. Programmers can use unchecked scopes within checked scopes if necessary.

Generic functions

The Checked C extension adds generic functions and generic function types. Generic functions provide a safe alternative to some uses of void pointer types for function argument and return types. Void pointers are well-known to bypass static typechecking. Functions that take or return void pointers (such as memcpy) also bypass static typechecking. This is dangerous because it can lead to type confusion errors. A type confusion error is when data of one type is used where data of another type is expected. This can lead to memory corruption or data disclosure.

With a generic function, a programmer can express that a function can operate over any kind of pointer type, just like a function that takes or returns void pointers. A programmer also expresses constraints on argument types and return types. For example, the arguments at a specific call to a generic function might need to have the same pointer type. In this way, type confusion is avoided.

Some existing C library functions, such as memcpy and bsearch, are naturally generic functions. Checked C provides support for giving those functions alternate generic function types. In unchecked scopes, programmers can opt-in to using the generic function type at a call by supplying additional type information. In checked scopes, the generic function type must be used and programmers must supply additional type information.

Generic structures

Programmer sometimes use void pointer types in C to build generic data structures such as lists, hash tables, or trees. A programmer may cast a pointer type to and from a void pointer. This introduces the possibility of type confusion errors because there is no guarantee that a void pointer will be cast back to its original type.

To avoid these type confusion problems, Checked C supports generic structure types. A programmer defines a structure, introducing type variable to used in place of void. The type variable is named, just like any other variable can be named. A programmer may then build in an instance of generic structure for a specific type. For example, a programmer could define a structure this a generic structure PtrList for a list of pointers. A programmer could then have a list of pointers to integers, by writing PtrList<int>.

Existential (abstract) structures

Sometimes programmers want to pair data with functions that operate on that data. A programmer may be implementing a list of callbacks, where each callback has some data to be passed to it when it is called. A programmer implementing a library may wish to hide some implementation details of the library, so that they can change the details later. In C, this information hiding is implemented by converting pointers to and from void *. This can be dangerous because type confusion can occur at the points of these casts.

Checked C provides existential structures to provide information hiding in a type-safe way. A programmer can say that there exists a structure containing two members, one which is a Ptr<T> and the other that is a function taking a Ptr<T> argument. A programmer can say there exists some type T used by the structure. The details of T are otherwise hidden (or abstracted). The operations for instances of structures with an existential type are limited. A programmer can use the members of an instance with each other, for example, call the function member with the data member as an argument. It would be incorrect to use members from a different instance because the underlying types may differ. Programmers can also copy an instance of the structure or create an instance with each other. Type checking enforces that the members are used safely with each other.

More information

These pages describe more about the extension. For a comprehensive description, see the PDF of the Checked C specification.

Examples

For some simple code, see the samples directory. For benchmarks and real-world code examples, see our list of examples.