NATS backend #935

peedrr · 2022-12-04T11:10:36Z

I am working on adding NATS as a backend. The following (lengthy) detail will eventually become the documentation to accompany the PR/module, so it is written as though the backend module already exists... which it doesn't 🙈. For now it is a statement of intent and planning doc while I work out how to actually achieve it all!

Any help, ideas or comments are appreciated 🙏🏼

Draft documentation:

Why NATS?

NATS is a lightweight, high-performance "message-oriented middleware". As in other messaging systems, Producers create and publish messages (in this case Cryptofeed is the Producer) and Consumers (e.g. a database, or perhaps a machine learning model) receive them, with the NATS server(s) sitting in the middle.

NATS fits many Cryptofeed use cases

NATS performs well at any scale. It can be up and running on a laptop with just two commands, or it can scale to a multi-cloud Kubernetes deployment (NATS is a CNCF Incubating project). This versatility makes it a great candidate for a Cryptofeed backend, as it can potentially cater for any size of deployment.

Simple de-duplication through Message Headers

Outages and failures can cause gaps in the data. Through the use of customizable Message Headers, the NATS backend can help create an extremely resilient application. Configure multiple replicas of Cryptofeed to retrieve the same exchange/pair feeds (perhaps even running each replica in a different datacenter) and publish all the messages to the NATS server(s), giving each message an ID header that is identical across replicas. NATS JetStream will then deduplicate the data within its duplicate-tracking window, making sure only one copy of each message is retained for use by the rest of your application.

Even if your application is running locally, putting NATS between replicated Cryptofeed instances and your database can help you avoid costly UNIQUE checks in the database, and avoid having multiple database connections trying to write the same data at the same time (which is considered an anti-pattern).
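To make the de-duplication idea concrete, here is a minimal sketch of how replicas could derive the same message ID for the same exchange event. `Nats-Msg-Id` is the header JetStream uses for de-duplication; everything else (`make_msg_id`, `dedup_headers`, the choice of fields, and the `DedupWindow` class, which is only a toy stand-in for the server-side behaviour and not NATS code) is hypothetical and assumes the exchange supplies a stable per-event ID such as a trade ID.

```python
import hashlib
import time
from collections import OrderedDict
from typing import Optional


def make_msg_id(exchange: str, symbol: str, channel: str, event_id: str) -> str:
    """Derive a deterministic ID so every replica computes the same value
    for the same exchange event (assumes event_id is stable, e.g. a trade ID)."""
    raw = "|".join((exchange, symbol, channel, event_id))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:32]


def dedup_headers(exchange: str, symbol: str, channel: str, event_id: str) -> dict:
    """Headers a producer could attach when publishing to a JetStream stream."""
    return {"Nats-Msg-Id": make_msg_id(exchange, symbol, channel, event_id)}


class DedupWindow:
    """Toy model of a duplicate window: an ID seen within the last
    window_seconds is rejected. Illustration only, not NATS code."""

    def __init__(self, window_seconds: float = 120.0):
        self.window_seconds = window_seconds
        self._seen = OrderedDict()  # msg_id -> time it was first accepted

    def accept(self, msg_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Evict IDs that have aged out of the window (oldest first).
        while self._seen:
            _, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self.window_seconds:
                break
            self._seen.popitem(last=False)
        if msg_id in self._seen:
            return False  # duplicate from another replica
        self._seen[msg_id] = now
        return True
```

Two replicas that receive the same Binance trade would both call `dedup_headers("BINANCE", "BTC-USDT", "trades", "12345")` and get identical headers, so only the first publish would be stored. Note that a duplicate arriving after the window has elapsed would be accepted again, which is why the window length matters for replicas that can lag.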

Replay the data feeds

NATS JetStream has another powerful feature: a subset of stored messages (e.g. all Binance Trades in December) can be replayed either at the rate they were originally received or as fast as a Consumer can process them. This kind of functionality can greatly help quant traders backtest a new strategy, for example.

JetStream is optionally (and easily) enabled on the NATS server, so it adds no complexity or dependencies to the Cryptofeed NATS Producer.
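Selecting a subset such as "all Binance Trades" would presumably be done with a subject filter, assuming subjects are laid out like `binance.btcusdt.trades`. NATS subjects support two wildcards: `*` matches exactly one token and `>` matches one or more trailing tokens. The sketch below re-implements that matching in plain Python purely to show which subjects a filter would select; `subject_matches` is a hypothetical helper, not part of any NATS client, and the time range ("in December") is not covered here.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a dot-separated subject matches a NATS-style pattern.

    '*' matches exactly one token; '>' matches one or more tokens and is
    only valid as the last token of the pattern.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # Must be the final pattern token, with at least one token left to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # subject ran out of tokens
        if p != "*" and p != s_tokens[i]:
            return False  # literal token mismatch
    # Without a trailing '>', both sides must have the same number of tokens.
    return len(p_tokens) == len(s_tokens)
```

With this layout, `binance.*.trades` would select trades for every Binance pair, while `binance.>` would select everything published for Binance.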

Design considerations of this implementation

Freedom to design your messages

NATS itself was designed to be the connective middleware for microservices, so the design choices made in this Cryptofeed implementation are largely based on microservices principles. Specifically, Cryptofeed with NATS is viewed as an isolated service which strives to remain as loosely coupled from other services as possible and to give the user the freedom to publish only the data they require, in the format they need. To achieve these objectives, the NATS backend tries not to be opinionated about what a message looks like, and so it allows Cryptofeed users to:

- define their own subject name, either:
  - statically, e.g. cryptofeed.data,
  - dynamically per message, e.g. binance.btcusdt.trades ... huobi.adabtc.ticker,
  - or as a mix of the two, e.g. cryptofeed.binance.trades.btcusdt,
- select which data to exclude from the message body, e.g. remove the exchange name and trading pair from the message (if, for example, they are already included in the subject),
- give Cryptofeed's internal data types a different, customizable label before including them in the message, e.g. symbol → instrument, amount → size,
- re-order the included data types.
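The options listed above could be sketched roughly as follows. This is only an illustration of the intended behaviour: `build_subject`, `shape_body`, their parameters, and the field names in the sample record are all hypothetical, since the backend does not exist yet. The subject template uses Python's `str.format` placeholders, and tokens are lower-cased and stripped of separators so that `BTC-USDT` becomes `btcusdt`, as in the examples above.

```python
import re


def _token(value) -> str:
    """Make a value safe to use as a single subject token:
    lower-case it and drop anything that is not a letter, digit or underscore."""
    return re.sub(r"[^a-z0-9_]", "", str(value).lower())


def build_subject(template: str, fields: dict) -> str:
    """Fill a template such as 'cryptofeed.{exchange}.{channel}.{symbol}'.
    A template without placeholders yields a static subject."""
    return template.format(**{k: _token(v) for k, v in fields.items()})


def shape_body(fields: dict, exclude=(), rename=None, order=None) -> dict:
    """Drop excluded fields, put the fields named in `order` first
    (using their original names), then apply the renames."""
    rename = rename or {}
    kept = {k: v for k, v in fields.items() if k not in exclude}
    keys = [k for k in (order or []) if k in kept]
    keys += [k for k in kept if k not in keys]  # remaining fields keep their order
    return {rename.get(k, k): kept[k] for k in keys}


# Hypothetical trade record, for illustration only.
trade = {
    "exchange": "BINANCE",
    "symbol": "BTC-USDT",
    "channel": "trades",
    "side": "buy",
    "amount": "0.5",
    "price": "17000.1",
}

subject = build_subject("cryptofeed.{exchange}.{channel}.{symbol}", trade)
body = shape_body(
    trade,
    exclude=("exchange", "symbol", "channel"),  # already carried by the subject
    rename={"amount": "size"},
    order=["price", "amount"],
)
```

Here `subject` is `cryptofeed.binance.trades.btcusdt` and `body` contains `price`, `size` and `side`, in that order. Keeping the two helpers separate means the same shaping step could be reused with a static subject such as `cryptofeed.data`.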

One (opinionated) Limitation

For simplicity and speed of integration, messages are encoded as JSON byte arrays. Other structs/schemas should be possible and could be implemented in the future; however, this current limitation has the advantage of avoiding further dependencies.
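As a rough idea of what "JSON byte arrays" could mean in practice, the sketch below serialises a message body with the standard library only. `encode_message` is a hypothetical name, and rendering `Decimal` values as strings is an assumption made here to avoid losing precision, not a decision taken for the backend.

```python
import json
from decimal import Decimal


def _json_default(value):
    """Fallback for types json cannot serialise natively."""
    if isinstance(value, Decimal):
        return str(value)  # keep full precision instead of converting to float
    raise TypeError(f"Cannot serialise {type(value).__name__}")


def encode_message(body: dict) -> bytes:
    """Serialise a message body to compact, UTF-8 encoded JSON bytes."""
    text = json.dumps(body, default=_json_default, separators=(",", ":"))
    return text.encode("utf-8")


payload = encode_message({"price": Decimal("17000.10"), "size": Decimal("0.5")})
```

A Consumer can recover the body with `json.loads(payload)` and convert the string values back to `Decimal` if it needs exact arithmetic.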

peedrr · 2022-12-04T13:24:58Z

@bmoscon: regarding customisable/dynamic Subject names and manipulating the contents of the message body, it feels like these features could also be useful for Redis Keys, Kafka Topics, etc. (and maybe other backends that I'm not so familiar with), so maybe they could/should be a couple of reusable utilities. It'll take quite a bit of extra thought and work to make them generic, so I would appreciate knowing whether you think it's worth it (i.e. do you get the sense that enough people are using Kafka/Redis etc. to justify the effort?).

And if it's worth doing as utilities, where should they live? The obvious place is cryptofeed/util; however, the rest of that directory seems to be exchange helpers, not backend utils. Then there's a _util.py file in the backends directory, but it seems to be unused?

Thanks
