[NEW] Multiple DB supports in cluster mode #1319

madolson · 2024-11-19T02:36:34Z

When moving from standalone to cluster mode, there are two API changes that end users need to consider: cross-slot commands and moving from multiple DBs to a single database. Although the cross-slot restriction is necessary for Valkey clusters to scale, there is no similar requirement for DBs. The decision to support only one database was an optimization, mainly to simplify the key-to-slot mapping.

This feature was not added in Redis since the old core team considered using multiple databases to be an anti-pattern compared to using prefixes. For example, instead of using database 1 and 0, you could have all keys prefixed with 0:: and 1:: and then build ACLs on top of that.
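A minimal sketch of the prefix approach in Python. The helper names and the user name `app1` are made up for illustration; the `~<pattern>` key-pattern and `+@all` parts follow the existing `ACL SETUSER` syntax.

```python
def prefixed_key(db: int, key: str) -> str:
    """Emulate 'database N' by prefixing the key with 'N::'."""
    return f"{db}::{key}"


def acl_rule_for_prefix(user: str, db: int) -> str:
    """Build an ACL SETUSER command that limits `user` to one prefix.

    Only the key pattern is restricted; '+@all' allows every command.
    'nopass' is used purely to keep the sketch short.
    """
    return f"ACL SETUSER {user} on nopass ~{db}::* +@all"


print(prefixed_key(1, "user:42"))
print(acl_rule_for_prefix("app1", 1))
```

Every client has to apply the prefix consistently, which is part of what makes this more awkward than a server-side database.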

This approach works, but has some drawbacks. One common workload it does not cover is loading a fresh dataset into a secondary database, performing a SWAPDB, and then asynchronously flushing the old data.
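To illustrate that workload, here is a toy in-memory model in Python. `ToyServer` is a hypothetical stand-in, not a client library or the server's implementation; it only mimics the observable effect of SWAPDB followed by a flush of the old data.

```python
class ToyServer:
    """In-memory stand-in for a standalone server with numbered DBs."""

    def __init__(self, num_dbs: int = 16) -> None:
        self.dbs = [dict() for _ in range(num_dbs)]

    def swapdb(self, a: int, b: int) -> None:
        # Like SWAPDB: exchange whole keyspaces; no keys are copied.
        self.dbs[a], self.dbs[b] = self.dbs[b], self.dbs[a]

    def flushdb(self, index: int) -> None:
        # Stand-in for FLUSHDB ASYNC: this toy simply drops the dict.
        self.dbs[index] = {}


server = ToyServer()
server.dbs[0] = {"price:1": "old"}                     # live data in DB 0
server.dbs[1] = {"price:1": "new", "price:2": "new"}   # fresh dataset staged in DB 1

server.swapdb(0, 1)   # clients of DB 0 now see the fresh dataset
server.flushdb(1)     # the old data is discarded
```

With prefixes there is no equivalent of the swap: the fresh keys would have to be renamed or clients would have to switch prefix.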

We will have some technical difficulty with implementing multiple databases now with the introduction of dict per slot, since we would have to duplicate all of the structure for each dictionary.
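A rough Python model of why this duplicates structure. It is not Valkey's actual C layout; `Keyspace` and `databases` are hypothetical names. The slot function follows the cluster rule (CRC16 of the key, or of its hash tag, modulo 16384), using `binascii.crc_hqx` with an initial value of 0 as the CRC16.

```python
import binascii

NUM_SLOTS = 16384


def key_slot(key: bytes) -> int:
    """Hash slot of a key: CRC16 of the key (or its hash tag) mod 16384."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:  # only a non-empty tag counts
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % NUM_SLOTS


class Keyspace:
    """One logical database: a lazily created dict per slot."""

    def __init__(self) -> None:
        self.slots = {}  # slot number -> {key: value}

    def set(self, key: bytes, value: bytes) -> None:
        self.slots.setdefault(key_slot(key), {})[key] = value

    def get(self, key: bytes):
        return self.slots.get(key_slot(key), {}).get(key)


# Multiple DBs means one such per-slot structure per database.
databases = [Keyspace() for _ in range(16)]
databases[0].set(b"foo", b"a")
databases[1].set(b"foo", b"b")
```

The same key lands in the same slot number in every database, but each database needs its own dictionary for that slot.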


wuranxx · 2024-11-19T02:53:30Z

Does this issue expect to support multiple databases in cluster mode?

I think this is a valuable feature. In production environments, many customers are accustomed to using the multi-DB feature. When migrating from standalone mode to a clustered setup, they expect the provider to offer this capability.

madolson · 2024-11-19T02:56:27Z

> Does this issue expect to support multiple databases in cluster mode?

Yes, thanks for commenting. I created this as a placeholder to follow up on, so I haven't fully added the details yet.

hpatro · 2024-11-20T04:46:07Z

> I think this is a valuable feature. In production environments, many customers are accustomed to using the multi-DB feature. When migrating from standalone mode to a clustered setup, they expect the provider to offer this capability.

@wuranxx Do they use it as a multi-tenant setup? I think we should also consider adding first-party ACL support for multiple DBs.

murphyjacob4 · 2024-11-20T05:33:43Z

> the ability to flush or swap DBs

Although not currently implemented, we could use DBs as an abstraction to allow users to control and monitor individual workloads that are isolated from one another but collocated on the same Valkey cluster. I think it is a common use case to have the same cluster hosting data for many workloads/microservices. Maybe databases are an easier avenue to solving this use case than just prefixes?

- It should be possible for the engine to collect per-DB stats (e.g. number of keys, memory footprint, etc). A command like DBINFO <db_num> could make it easy for users to monitor their server when they have many applications/use cases.
- We could support certain configurations on a per DB level. E.g. perhaps you could configure maxmemory per DB with eviction to support isolation between workloads? Or maybe you might want to enable certain settings on one DB but not another?
- In a cluster context, you could lock a DB to be hashed to a single slot (e.g. maybe its DB number/ID). For workloads that require cross-key commands, it could be a way without needing to manage hashtags in the client (logically, it is basically doing the same thing but with some syntactical sugar).
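A small Python sketch of the last point, comparing client-managed hashtags with a DB pinned to one slot. The pinning rule (`db % 16384`) and all function names are assumptions for illustration only; nothing like this exists in the engine today. The hashtag side relies on the existing cluster rule that only the text between `{` and `}` is hashed.

```python
import binascii

NUM_SLOTS = 16384


def hashtag_key(db: int, key: str) -> str:
    """What a client does today: wrap a shared tag in braces."""
    return "{db%d}:%s" % (db, key)


def slot_with_hashtag(db: int, key: str) -> int:
    # Only the tag between the braces is hashed, so the slot does not
    # depend on `key` at all.
    tag = ("db%d" % db).encode()
    return binascii.crc_hqx(tag, 0) % NUM_SLOTS


def slot_for_pinned_db(db: int) -> int:
    # Hypothetical server-side rule: the DB number itself picks the slot.
    return db % NUM_SLOTS


print(hashtag_key(3, "cart"), slot_with_hashtag(3, "cart"))
print(slot_for_pinned_db(3))
```

In both cases every key of the workload shares a slot, so cross-key commands are allowed; the difference is only whether the client or the server carries the bookkeeping.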

Sidenote: if we want to reverse direction on databases, I would like to float the idea of replacing the set number of DBs and the concept of a DB number (0-15) with a map from DB name to DB so users could create as many DBs as they want and name them how they please. By default, maybe DBs 0-15 are there, but perhaps you could DBCREATE <db_name> for additional DBs.

wuranxx · 2024-11-20T08:27:43Z

> > I think this is a valuable feature. In production environments, many customers are accustomed to using the multi-DB feature. When migrating from standalone mode to a clustered setup, they expect the provider to offer this capability.
>
> @wuranxx Do they use it as a multi tenant setup? I think we should also consider supporting first party ACL support for multiple DBs.

Since Redis/Valkey has never implemented ACL control for databases, customers have not raised related requirements.

I believe that adding ACL support for databases would be a much larger requirement, and it would be necessary to reconsider the role of databases within Valkey. Since databases have traditionally been regarded as an anti-pattern, there has been relatively little discussion on this topic.

hpatro · 2024-11-20T22:10:05Z

> Although not currently implemented, we could use DBs as an abstraction to allow users to control and monitor individual workloads that are isolated from one another but collocated on the same Valkey cluster. I think it is a common use case to have the same cluster hosting data for many workloads/microservices. Maybe databases are an easier avenue to solving this use case than just prefixes?

I was suggesting the use case that @murphyjacob4 has called out: different types of workloads on a single cluster using multiple DBs. This would warrant separate ACL rules for different workloads.

zuiderkwast · 2024-11-20T23:28:09Z

+1 on this feature. It would remove one of the few differences between cluster and standalone.

zuiderkwast · 2024-11-20T23:29:58Z

ACL for DB numbers sounds good too but this is orthogonal to this feature I believe. No dependencies between the two.

hpatro · 2024-11-21T00:27:49Z

> ACL for DB numbers sounds good too but this is orthogonal to this feature I believe. No dependencies between the two.

I thought of bringing it up because the stance had always been that we don’t want to support multiple DBs and hence ACL support for multiple DBs doesn’t need to be built. Let me file a separate issue to discuss it.

madolson · 2024-11-21T21:45:31Z

> Sidenote: if we want to reverse direction on databases, I would like to float the idea of replacing the set number of DBs and the concept of a DB number (0-15) with a map from DB name to DB so users could create as many DBs as they want and name them how they please. By default, maybe DBs 0-15 are there, but perhaps you could DBCREATE <db_name> for additional DBs.

I had thoughts on this long ago somewhere in Redis, but they are probably lost to time. I really like this idea though. I have mostly been calling them namespaces, but we could call them databases as well (I'm still going to call them namespaces here though). By default everything is placed into the default "0" namespace. There is no explicit "create a namespace"; you can simply call SELECT my_namespace and it will be created. We would add a new ACL for namespaces to control which ones you can select into, like $my_namespace. I agree with the idea of having each namespace configurable for stuff like eviction policy, but maybe also defaults like TTL or triggers (valkey-io/valkey-rfc#9).
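A toy Python sketch of how implicit creation plus a per-namespace ACL check could behave. Everything here is hypothetical: `NamespaceServer`, the `allowed` set standing in for `$my_namespace` rules, and the error type are illustrative choices, not a proposed API.

```python
class NamespaceServer:
    """Toy model: a map from namespace name to its own keyspace."""

    def __init__(self) -> None:
        # The default "0" namespace always exists.
        self.namespaces = {"0": {}}

    def select(self, name: str, allowed: set) -> dict:
        """Return the keyspace for `name`, creating it on first use.

        `allowed` models the namespaces the user's ACL permits.
        """
        if name not in allowed:
            raise PermissionError(f"no permission to select {name!r}")
        return self.namespaces.setdefault(name, {})


server = NamespaceServer()
acl = {"0", "my_namespace"}

ns = server.select("my_namespace", acl)  # created implicitly
ns["greeting"] = "hello"

try:
    server.select("other_team", acl)
    denied = False
except PermissionError:
    denied = True
```

Checking the ACL before creating the namespace means a denied SELECT leaves no empty namespace behind.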

For cluster mode, I think it makes more sense to have namespaces be a cluster-wide context instead of having them exist on a single shard. Cluster mode is inherently built to scale, so constraining something to just one shard seems like a poor way to scale.

One of the reasons I want to differentiate namespaces is that I think people already have assumptions about DBs that I don't really want to change.

roshkhatri · 2024-11-22T21:49:59Z

+1 to these features. Adding multiple DB support would make it possible to migrate from standalone to cluster mode. I also like the idea of having DBs as namespaces along with ACL support. This might make customers' lives easier by letting them maintain one cluster for different microservices/workloads with ACLs.

madolson changed the title ~~[NEW] Multiple DB supports~~ [NEW] Multiple DB supports in cluster mode Nov 19, 2024

hpatro mentioned this issue Nov 21, 2024

[NEW] Support database level ACL #1336

Open
