Replies: 13 comments
-
In doing some more research, I've come across this StackOverflow answer (by @superroma, I believe), which says you can run multiple reSolve servers from the same event store:
I'm very glad to hear this! But I still don't fully understand the following:
Thanks for any help you can provide! I've worked with ES/CQRS systems in Elixir and Scala, and both had a clear way to cluster servers, so I'm looking forward to understanding the same about reSolve.
-
@dwmcc, sorry for the delay. I would like to answer this myself, but I'm on vacation this week; I will answer in detail next week. As a very short answer: several options are possible, some of which we have not tested yet, since we decided to focus on the serverless approach.
-
Hi @superroma - this is very much appreciated. Enjoy your vacation and I look forward to discussing this next week! Happy to help contribute to the documentation as I get through the process of running multiple hosts.
-
First let me briefly describe the state of the project. There is a "local" implementation based on Express.js and a "cloud" serverless implementation. Here is what is possible right now in the current version.

Processing commands

We use a serverless approach: nothing is stored in memory. When a command arrives, the aggregate state is calculated from the events in the event store, then the command is processed, the resulting event is saved, and the response is sent to the caller. We ensure aggregate consistency by using optimistic locking with aggregate versions, as described here: https://stackoverflow.com/a/54826095/318097. This works the same way in the local and cloud implementations.

Updating read models

After an event is saved to the event store, it should be applied to all read models. The cloud implementation uses AWS Step Functions to collect and distribute events to read models. The local implementation uses a simple event broker based on ZeroMQ; I'll explain the local one in detail. The event broker is the service that notifies other instances about new events. It should run on only one instance, and all other instances connect to it. It is very simple: the instance that stored an event notifies the event broker, and the broker notifies all other instances. This means that in the default configuration all instances will try to update their read models, and if they point to the same database, the first one will update it and the others will receive an "already updated" result. This can be further optimized by having different configs for different instances.

Exactly-once read model update problem

We spent a lot of time trying to build a central "event bus" that can reliably update any kind of read model, so that the projection function can be a simple SQL statement or appending a line to a file, with events arriving exactly once. We did not succeed: it is too complicated, and there are still cases where it fails. So at the moment a reliable read model implementation requires a "ledger" - a store/table that knows which events were applied and in which order. Unfortunately, this complicates the read model adapter; it has to be written in a certain way. Another option is to make projection functions idempotent - able to handle multiple calls with the same event. In any case, in a server cluster scenario multiple servers may attempt to apply the same event, but this is prevented by the adapter implementation: the first one wins, and the second receives an "already applied" error. There is no negotiation mechanism to decide which server will apply a given event.

Serving queries

If an instance has the specified read model, it simply serves the query the same way as in single-process mode.

This is what is implemented in the current codebase. The devs told me that they have not tested multi-instance use cases for quite a while, so there may be bugs. We'll test it ASAP and fix it. I can see issues with the current event broker implementation in the cluster scenario - if the server hosting it dies, it should be started somewhere else. As a solution, we could abstract this service away and have adapters for several systems like Kafka or RabbitMQ. We could provide an example/tutorial for a server cluster. Can you describe your scenario?
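To make the optimistic-locking part above concrete, here is a rough sketch of the idea using a plain PostgreSQL client; the table and column names are invented for illustration and are not reSolve's actual event store schema or API:

```js
// Illustrative only: the schema is made up, this is not reSolve's event store.
const { Pool } = require("pg");
const pool = new Pool(); // connection settings come from the PG* env vars

async function appendEvent(aggregateId, expectedVersion, type, payload) {
  try {
    // A UNIQUE (aggregate_id, aggregate_version) constraint enforces the
    // optimistic lock: two writers that both read version N will both try
    // to insert version N + 1, and the second insert fails.
    await pool.query(
      `INSERT INTO events (aggregate_id, aggregate_version, type, payload)
       VALUES ($1, $2, $3, $4)`,
      [aggregateId, expectedVersion + 1, type, JSON.stringify(payload)]
    );
  } catch (err) {
    if (err.code === "23505") {
      // unique_violation: another command won the race;
      // reload the aggregate from its events and retry the command
      throw new Error("Concurrent modification of aggregate " + aggregateId);
    }
    throw err;
  }
}
```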
-
Hi @superroma -- first off, thank you for your thorough response; it is much appreciated! Hope you had a restful vacation. Please bear with me on my reply - I have several specific questions about how clustering might function with this framework.
Perfect - you're resolving the aggregate consistency problem (in both multi-node and serverless scenarios) with a database constraint. 👍
What I'm considering for deployment is a Kubernetes cluster (wherein each instance will be able to privately reach the other instances). I'd have one instance specifically configured as the broker, and a pool of instances connected to the broker. To my understanding this will work, assuming it is configured properly. I do still have a few questions on the broker setup for my clustering scenario, specifically:
Perfect - as long as this is the case for the PostgreSQL adapter, I don't think there would be any trouble with this functionality when clustering.
Is this built into the PostgreSQL adapter for the read model? I'd assume so, but wanted to confirm. Ostensibly, the table would only need to be defined and created once, and once it exists we wouldn't need to run that statement again (outside of schema changes). I need to spend some time digging into the repository, but for my sanity -- does the read model adapter check for the existence of the table, and only run the init code if it doesn't exist or if the enumeration of fields has changed? The docs mention:
If I wanted to use an ORM, such as TypeORM, for defining and updating tables in the read models, would you suggest I use it directly (keeping in mind the "all instances will try to update their read models" constraint)?
Final questions:
To clarify, you mention "eventBus" in that quote -- is this another term for the eventBroker in the reSolve framework? If I deploy reSolve as described above -- one instance running the broker and a pool of instances connected to it -- and Kubernetes has the job of keeping the broker online and restarting it if it's unhealthy, should this operate as intended? If the broker goes offline, I would assume any API requests would fail until it comes back online? Would we be guarded against an instance entering an invalid state, for example when applying commands to an aggregate? Thanks Roman! Thoroughly appreciate your input here.
-
Dylan, thank you for your interest in reSolve and a possible contribution. First of all, the devs are not sure that the event broker currently works with more than one server - we will test it and fix it in the nearest time, and I will keep you informed.
Let me do that after our tests, because the implementation may change.
The instance that hosts the broker actually runs two processes - the broker and the reSolve server. So yes, it can serve API calls.
Yes, all supplied adapters implement the "ledger".
Yes. It does not alter tables, though; you reset the read model - meaning delete everything, init the db structure, and rebuild it from events.
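As a point of reference, the init code being discussed lives in the read model projection's Init handler. A minimal sketch might look like the following (simplified; treat the exact store API shape as approximate and check it against the current adapter):

```js
// Simplified sketch of a read model projection: Init creates the table
// structure, the event handler fills it. The store method names
// (defineTable / insert) and the table layout are illustrative.
const projection = {
  Init: async (store) => {
    await store.defineTable("Stories", {
      indexes: { id: "string" },
      fields: ["title", "createdAt"]
    });
  },
  STORY_CREATED: async (store, event) => {
    await store.insert("Stories", {
      id: event.aggregateId,
      title: event.payload.title,
      createdAt: event.timestamp
    });
  }
};

export default projection;
```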
This section is likely to be removed, since the "central ledger" approach does not seem to work reliably. You can still work without an adapter - just do anything in the init and projection functions - but then they should be idempotent, because reSolve will retry events. At the moment it is not possible to get the underlying db connection from the adapter, so you cannot init TypeORM from it, but that would not be hard to provide. I guess using an ORM would be an anti-pattern for a CQRS/ES app, because of the significant overhead (the code is serverless - stateless, remember?) and because ORMs assume normalization, while CQRS works best with denormalized read models - meaning you just store the answers to your queries. Perhaps this topic deserves its own article to explain.
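To illustrate the idempotent option: a rough sketch of a projection handler that works without a ledger, written against a plain SQL client (the `db` client, table layout, and event field names are invented for the example):

```js
// Sketch of an idempotent projection written without a ledger adapter.
// `db` is assumed to be any SQL client exposing query(text, values).
async function applyStoryCreated(db, event) {
  // The row is keyed by the event id, so if the same event is delivered
  // (or retried) twice, the second INSERT is a no-op instead of a duplicate.
  await db.query(
    `INSERT INTO stories (event_id, story_id, title)
     VALUES ($1, $2, $3)
     ON CONFLICT (event_id) DO NOTHING`,
    [event.eventId, event.aggregateId, event.payload.title]
  );
}
```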
It's a typo, I meant event broker.
Yes, everything should work this way (after our tests, of course). If the event broker dies, everything will keep working except read model updates. When the event broker starts again, it will continue where it left off - no events will be missed.
-
🎉 💯
👍
Fair point - we should keep the projection logic slim, since we do our state management in the aggregate. I was mainly curious about using an external library within the read models, or possibly even running raw SQL.
Thank you for taking the time for such thorough responses @superroma -- excited to see if the event broker is able to function with multiple servers without too much work! In the meantime, let me know if I can assist.
-
Hi Dylan, JFYI, we have tested the multi-instance config and, as I feared, it doesn't work correctly in the current version. We have fixed the problem, and the fix is likely to be included in the next release.
-
@superroma fantastic news! Excited to check it out. Thanks for the update.
-
Hi Dylan, we have released reSolve 0.28. In it we removed the event broker process. Now all instances register themselves in the event store and notify each other without explicit configuration, so all instances working with the same event store work well together. To demonstrate this without clusters and containers, we have added a "replica" config to the hacker news example (see resolve/examples/hacker-news/run.js, lines 76 to 81 at 39fbf8a, and https://github.com/reimagined/resolve/blob/dev/examples/hacker-news/config.dev.replica.js). So you can run two instances of the hacker news app on the same machine: they will use the same event store but have their own read models and listen on different ports. If you want, we can prepare an example with app containers and a db container - what container orchestration system do you prefer?
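Roughly, the idea of the replica config is that it only overrides what must differ per instance. An approximate sketch follows; the option names here are illustrative, and the real contents are in the config.dev.replica.js file linked above:

```js
// Approximate shape of a per-replica override config: the replica shares
// the event store with the primary instance but gets its own port and its
// own read model storage. Option names are illustrative only.
const devReplicaConfig = {
  port: 3001, // the primary instance listens on 3000
  readModelConnectors: {
    default: {
      module: "resolve-readmodel-lite",
      options: {
        databaseFile: "data/read-models-replica.db"
      }
    }
  }
  // the event store adapter is deliberately not overridden here,
  // so both instances append to and read from the same event store
};

export default devReplicaConfig;
```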
-
@superroma that would be very helpful for understanding, I think. Maybe Docker Compose, because it's quite simple? I personally would love to see a serverless example, perhaps using CDK - not sure if that is out of scope of what you were thinking, though? To be honest, I think some good docs on how everything fits together would go a long way toward understanding it all. For example, I think Wolkenkit does a good job here: https://docs.wolkenkit.io/3.1.0/getting-started/understanding-wolkenkit/architecture/ (the Wolkenkit docs in general are actually really good, worth a look).
-
Yes, we need a good architecture overview; we will work on it. Regarding Docker and serverless examples - are you implying that the resolve-cloud deployment works in containers? What we are describing here is how to run the "local" Express-based reSolve version in a multi-instance cluster. I would not call it serverless: it is mostly stateless and works similarly to the cloud version, but it is not serverless. We'll prepare an example using Docker Compose.
-
This I would be very interested in seeing, because it sounds the most scalable and lowest cost (pay only for what you use), but I understand if this is secret sauce you don't want to give away.
-
Is your feature request related to a problem? Please describe.
Reading through the reSolve documentation, I'm curious about whether clustering is natively supported: specifically, the ability for multiple reSolve servers to run in a cluster, where a given aggregate's state is accessible to every server across the cluster and events can be emitted and processed by any member of the cluster.
Looking through the Application Configuration documentation, the only potentially relevant information I've found is in the eventBroker section, where you can configure publisher and consumer settings. If I want to spin up multiple reSolve servers, should I set both of these configuration options on every server?
I'd also like to know whether configuring the eventBroker will allow multiple reSolve servers to cluster in the way I've mentioned.
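For concreteness, I'm imagining each server getting something like the following; the option names are my guess from the docs and may well be off:

```js
// Hypothetical per-server configuration: option names are guessed from the
// eventBroker section of the Application Configuration docs and are not
// verified - shown only to make the question concrete.
const appConfig = {
  eventBroker: {
    // would only one instance launch the broker process, with the rest
    // pointing their publisher/consumer addresses at it?
    launchBroker: false,
    publisherAddress: "tcp://broker.internal:3500",
    consumerAddress: "tcp://broker.internal:3501"
  }
};

export default appConfig;
```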
An example use-case for this is deploying a reSolve application into a Kubernetes cluster. The application will serve an API that dispatches commands to aggregates.
You will likely want multiple instances (pods) of the server running for availability's sake, but these instances should be aware of each other. Otherwise you have no way to guarantee that a given aggregate is unique across the cluster of servers, you risk aggregate state getting out of sync between servers, and so on.
Am I misunderstanding the design principles of the reSolve framework?
Any clarity is appreciated here. Thanks for your input.