-
Notifications
You must be signed in to change notification settings - Fork 6
Recall Design
Recall is designed for use by single-threaded applications, making it simple and performant.
In the multi-core era, it may seem odd that high-performance software is designed for use by a single thread, but there is a good reason for this.
Developers of low-latency systems will be aware of the Disruptor pattern, which showed that applying the Single Writer Principal could lead to extreme performance gains.
Recall is also allocation-free (in steady state), meaning that it will always use pieces of memory that are more likely to be resident in CPU caches. Recall will never cause unwanted garbage collection pauses, since it will not exhaust any memory region.
While allocation-free in steady state, Recall will allocate new buffers as data is recorded
into a Store
. For the most part, these should be direct ByteBuffer
s, which will not
greatly impact the JVM heap. Users can pre-empt these allocations by correctly sizing
a Store
or SequenceMap
before use, to ensure that necessary storage is available at
system start time.
Since the data structures in Recall are never accessed by multiple threads, there is no need for locking, so there is no contention. Resource contention is one of the main things that can limit throughput in multi-threaded programs, so side-stepping the problem altogether leads to significant gains over solutions designed for multi-threaded applications.
Highly peformant multi-threaded map implementations will use compare-and-swap (CAS) operations to avoid the cost of standard locks, and remain lock-free. While this does lead to typically better scalability than a map utilising locks, those CAS operations are still relatively expensive compared to the normal arithmetic operations that are used in single-threaded designs.
Recall's Store
object is an open-addressed hash map, storing a 64-bit long
value against
an arbitrary number of bytes. The developer is responsible for providing functions to
serialise and deserialise objects to and from the Store
's buffer. Each entry is of a fixed
length, eliminating the chance of compaction problems.
The mapping of long
to byte sequence is performed by a simple open-addressed hash map
(from the Agrona library).
Each entry in the map records the long
identifier against a
position in the Store
's buffer. This makes the "index" for the data structure incredibly
compact (~16b per entry), and has the nice side-effect of making inserts, updates and removals
occur in effectively constant time.
Due to the use of a separate index, Recall's Store
does not suffer from compaction problems,
and is always 100% efficient in terms of storage space used.
Record insertion involves adding a new key to the "index", serialising the entry, and increasing the write pointer by the record size.
Note that if insertion causes the map to exceed its load factor, then a re-hash
will occur, causing each entry to be copied into a new, larger buffer.
For this reason it is recommended to correctly size the Store
when it is
constructed.
Removing a record from the map involves copying the highest entry in the buffer over the top of the entry to be remove, updating the "index", and decreasing the write pointer by the record size.
Consider the following highly contrived example:
A trading exchange has the requirement to publish possible profit reports to holders of open positions as market prices fluctuate. When a user creates an open position, a distribution gateway will need to cache the position, and on every price update received from the exchange, publish some information indicating the possible profit associated with the position.
In order to meet the exchange's strict latency requirements, the gateway must be allocation-free, and is written according to the single-writer principle.
Orders are represented by the Order
class:
public final class Order {
private long accountId;
private long orderId;
private int instrumentId;
private double quantity;
private double price;
// getters and setters omitted
}
New Orders are received on the OrderEvents
interface:
public interface OrderEvents {
void onOrderPlaced(Order orderFlyweight);
}
Market data is received on the OrderBook
interface:
public interface OrderBook {
void onPriceUpdate(int instrumentId, double bid, double ask, long timestamp);
}
Profit updates are published on the ProfitEvents
interface:
public interface ProfitEvents {
void onProfitUpdate(long orderId, long accountId, double buyProfit, double sellProfit, long timestamp);
}
public final class ProfitPublisher implements OrderBook, OrderEvents {
private final SingleTypeStore<ByteBuffer, Order> store = createStore();
private final IdAccessor<Order> idAccessor = new OrderIdAccessor();
private final Encoder<Order> encoder = new OrderEncoder();
private final Decoder<Order> decoder = new OrderDecoder();
private final Int2ObjectMap<LongHashSet> instrumentToOrderSetMap = createMap();
private final Order flyweight = new Order();
private final ProfitEvents profitEvents; // initialised in constructor
public void onOrderPlaced(Order orderFlyweight) {
store.store(orderFlyweight, encoder, idAccessor);
instrumentToOrderSetMap.get(orderFlyweight.instrumentId()).add(orderFlyweight.id());
}
public void onPriceUpdate(int instrumentId, double bid, double ask, long timestamp) {
for (long orderId : instrumentToOrderSetMap.get(instrumentId)) {
store.load(flyweight, decoder, orderId);
double buyProfit = flyweight.quantity() * (flyweight.price() - ask);
double sellProfit = flyweight.quantity() * (flyweight.price() - bid);
profitEvents.onProfitUpdate(orderId, flyweight.accountId(), buyProfit, sellProfit, timestamp);
}
}
}
In this example, the incoming Order
s are serialised to a Store
. When a price update is received,
the system iterates over any Order
s associated with the specified instrument, and publishes
a profit update for the Order
.
Operation is allocation-free, meaning that the system will run without any garbage-collection pauses
that could cause unwanted latency spikes. There is no need for object-pooling, as domain objects are
serialised to the Store
's buffer until required.