Multiple kuadrant instances #5
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,152 @@ | ||
# RFC 0000 | ||
|
||
- Feature Name: `multiple kuadrant instances` | ||
- Start Date: 2023-01-12 | ||
- RFC PR: [Kuadrant/architecture#0000](https://github.com/Kuadrant/architecture/pull/0000) | ||
- Issue tracking: [Kuadrant/architecture#0000](https://github.com/Kuadrant/architecture/issues/0000) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
This RFC proposes a new Kuadrant architecture design that enables **multiple Kuadrant instances** to run in a single cluster. | ||
|
||
![](https://i.imgur.com/ZsPibfO.png) | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
The main benefit of running multiple Kuadrant instances in a single cluster is that each tenant can have dedicated Kuadrant services. | ||
|
||
A dedicated Kuadrant deployment brings several benefits. To name a few: | ||
* Protection against external traffic load spikes. Other tenants' traffic spikes do not affect Authorino/Limitador throughput and latency, as they would with shared instances. | ||
* No need for the cluster administrator role to deploy a Kuadrant instance. A tenant administrator can manage the gateways and the Limitador and Authorino instances (including deployment modes). | ||
* The cluster administrator gets control and visibility across all the Kuadrant instances, while the tenant administrator only gets control over their specific gateway(s), Limitador and Authorino instances. | ||
* (looking for ideas for more benefits)... | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
### Kuadrant instance definition | ||
![](https://i.imgur.com/BfOXfnB.png) | ||
|
||
A Kuadrant instance is composed of: | ||
* One Limitador deployment instance | ||
* One Authorino deployment instance | ||
* A list of dedicated gateways. | ||
|
||
Some properties to highlight: | ||
|
||
* The policies are not included as part of the kuadrant instances. | ||
* A Kuadrant instance is not bounded by Kubernetes namespaces. | ||
* Gateways are not shared between kuadrant instances. Each gateway is managed by a single kuadrant instance. | ||
* The control plane has cluster scope and will be shared between instances. In other words, it is only in the data plane that each Kuadrant instance has dedicated services and resources. | ||
* Each Kuadrant instance owns one Limitador instance (possibly with multiple replicas) and one Authorino instance. Both are shared among all the gateways included in that Kuadrant instance. | ||
|
||
In the following diagram, policies RLP 1 and KAP 1 are applied to instance *A*, and policies RLP 2 and KAP 2 are applied to instance *B*. | ||
|
||
![](https://i.imgur.com/yChVsT6.png) | ||
|
||
### All the gateways referenced by a single policy must belong to the same kuadrant instance | ||
|
||
As of [v1beta1](https://gateway-api.sigs.k8s.io/v1alpha2/references/spec/#gateway.networking.k8s.io/v1beta1.CommonRouteSpec), the Gateway API allows an HTTPRoute to have multiple parent gateways. Thus, a Kuadrant policy might technically target multiple gateways managed by different Kuadrant instances. Kuadrant does **not** support this use case. | ||
|
||
![](https://i.imgur.com/ZpsBf4i.png) | ||
|
||
The main reason is rate limiting. The limits specified in a RateLimitPolicy would be enforced per Kuadrant instance, because each instance has its own Limitador. Traffic hitting one gateway would therefore be counted separately from traffic hitting the other gateway: the user would expect a limit of X rps overall but would actually get X rps per gateway. For consistency, the control plane will reject the policy when this configuration occurs. | ||
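To illustrate the counter split, here is a minimal Python sketch. `FixedWindowCounter` is a hypothetical stand-in for a Limitador instance (it is not Limitador's API) and models only "admit up to X requests per window". The gateway names and the limit are made up for the example.

```python
class FixedWindowCounter:
    """Hypothetical stand-in for one Limitador instance (a single window only)."""

    def __init__(self, max_requests: int) -> None:
        self.max_requests = max_requests
        self.seen = 0

    def allow(self) -> bool:
        """Admit the request if the window's budget is not exhausted."""
        if self.seen >= self.max_requests:
            return False
        self.seen += 1
        return True


def admitted(requests: list[str], limiter_for_gateway: dict[str, FixedWindowCounter]) -> int:
    """Count admitted requests; each request names the gateway it hits."""
    return sum(1 for gateway in requests if limiter_for_gateway[gateway].allow())


LIMIT = 5  # the "X" the policy author intended for one window
traffic = ["gw-a", "gw-b"] * 10  # 20 requests alternating between two gateways

# Both gateways in the same Kuadrant instance: one shared counter.
shared = FixedWindowCounter(LIMIT)
same_instance = admitted(traffic, {"gw-a": shared, "gw-b": shared})

# Gateways in different Kuadrant instances: one counter per instance.
split_instances = admitted(
    traffic,
    {"gw-a": FixedWindowCounter(LIMIT), "gw-b": FixedWindowCounter(LIMIT)},
)

print(same_instance, split_instances)
```

With a shared counter the sketch admits 5 requests; with one counter per instance it admits 10, which is the "X rps per gateway" behaviour the policy author did not ask for.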
Review comment: While I think this is a good first step and we indeed should reject such policies initially, I think there might be a way to support this as we get to architect for multiple cluster support. The use case isn't much different. What I'm thinking is that each Limitador would know about the others. Upon hitting a "shared" limit, they'd start sharing the counters for it. Other than latency being lower and cross-communication being easier, there isn't much difference between the two applications. Sharing this thought here as an FYI and as a possible path forward. Again, it looks perfectly acceptable to have that limitation initially. |
||
|
||
### The Kuadrant CRD | ||
|
||
Currently, the Kuadrant CRD has an empty spec. | ||
|
||
```yaml | ||
apiVersion: kuadrant.io/v1beta1 | ||
kind: Kuadrant | ||
metadata: | ||
  name: kuadrant-sample | ||
spec: {} | ||
``` | ||
|
||
Following the definition of a Kuadrant instance above, | ||
the proposed Kuadrant CRD adds a label __selector__ that specifies which gateways the instance manages. | ||
|
||
```yaml | ||
apiVersion: kuadrant.io/v1beta1 | ||
kind: Kuadrant | ||
metadata: | ||
  name: kuadrant-a | ||
spec: | ||
  gatewaysSelector: | ||
    matchLabels: | ||
      app: kuadrant-a | ||
``` | ||
Review comment (on lines +78 to +80): I've been thinking about the label selector approach for this application lately. Although it is consistent with other usages such as how Authorino selects AuthConfigs, how AuthConfigs select API key and X.509 cert issuer Secrets, how Istio selects workloads, among others, I think for this particular case of selecting (assigning) Gateways for a Kuadrant instance nevertheless an approach based on the GatewayClasses could be a better fit. A few advantages I see:
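The `gatewaysSelector` in the proposed CRD can be read with the usual Kubernetes `matchLabels` semantics: every key/value pair in the selector must be present on the object. The following Python sketch shows which gateways an instance would manage under that reading. The gateway names and labels are invented for the example, and the RFC does not prescribe this implementation.

```python
def matches(match_labels: dict[str, str], gateway_labels: dict[str, str]) -> bool:
    """matchLabels semantics: every selector pair must be present on the object.

    An empty selector matches every gateway, as with Kubernetes label selectors.
    """
    return all(gateway_labels.get(key) == value for key, value in match_labels.items())


# Hypothetical gateways and their labels.
gateways = {
    "gw-1": {"app": "kuadrant-a", "tier": "edge"},
    "gw-2": {"app": "kuadrant-b"},
    "gw-3": {},
}

selector = {"app": "kuadrant-a"}  # spec.gatewaysSelector.matchLabels of kuadrant-a

managed = sorted(name for name, labels in gateways.items() if matches(selector, labels))
print(managed)
```

In this example only `gw-1` is selected; extra labels on a gateway (`tier: edge`) do not prevent a match.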
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
### Wiring Kuadrant policies with Kuadrant instances | ||
Technically, Kuadrant policies do not belong to any Kuadrant instance. At any time, a policy can switch the network resource targeted in its `spec` from one gateway to another, either directly or indirectly via an HTTPRoute. The target references are dynamic by nature, and so is the list of gateways to which Kuadrant policies apply. | ||
Thus, Kuadrant's control plane needs a procedure to associate a policy with exactly **one** Kuadrant instance at any time. Once the control plane knows which Kuadrant instance is affected, the policy rules can be used to configure the Limitador and Authorino instances belonging to that Kuadrant instance. Since the Kuadrant instance associated with a policy can change, this procedure must be executed on every event related to the policy. | ||
|
||
When the policy's `targetRef` targets a Gateway, there is a direct reference to the gateway. | ||
|
||
When the policy's `targetRef` targets an HTTPRoute, Kuadrant follows the route's [`parentRefs`](https://gateway-api.sigs.k8s.io/v1alpha2/references/spec/#gateway.networking.k8s.io%2fv1beta1.CommonRouteSpec) attribute, which should reference the gateway or gateways directly. | ||
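The resolution procedure described in this section can be sketched as follows. This is a simplified Python model, not the control plane's code: `Policy`, `Cluster`, `UnsupportedTopology` and both functions are hypothetical names, namespaces are ignored, and treating "no managed gateway found" as an error is an assumption of the sketch.

```python
from dataclasses import dataclass, field


class UnsupportedTopology(Exception):
    """The policy cannot be tied to exactly one Kuadrant instance."""


@dataclass
class Policy:
    name: str
    target_kind: str  # "Gateway" or "HTTPRoute"
    target_name: str


@dataclass
class Cluster:
    # HTTPRoute name -> gateway names listed in its parentRefs
    route_parents: dict[str, list[str]] = field(default_factory=dict)
    # gateway name -> name of the Kuadrant instance managing it
    gateway_owner: dict[str, str] = field(default_factory=dict)


def gateways_for(policy: Policy, cluster: Cluster) -> list[str]:
    """Resolve the policy's targetRef to gateway names, directly or via the route."""
    if policy.target_kind == "Gateway":
        return [policy.target_name]
    if policy.target_kind == "HTTPRoute":
        return cluster.route_parents.get(policy.target_name, [])
    raise ValueError(f"unsupported targetRef kind: {policy.target_kind}")


def instance_for(policy: Policy, cluster: Cluster) -> str:
    """Return the single Kuadrant instance affected by the policy, or raise."""
    owners = {
        cluster.gateway_owner[gw]
        for gw in gateways_for(policy, cluster)
        if gw in cluster.gateway_owner
    }
    if len(owners) != 1:
        raise UnsupportedTopology(f"policy {policy.name} maps to instances {sorted(owners)}")
    return next(iter(owners))


cluster = Cluster(
    route_parents={"route-a": ["gw-1"], "route-shared": ["gw-1", "gw-2"]},
    gateway_owner={"gw-1": "kuadrant-a", "gw-2": "kuadrant-b"},
)
print(instance_for(Policy("rlp-1", "HTTPRoute", "route-a"), cluster))
```

A policy targeting `route-shared` would raise `UnsupportedTopology`, because its parent gateways belong to two different instances. Since targets can change, such a function would have to be re-run on every policy event.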
Review comment: This means that indirectly, a KAP or RLP would be applied to more than 1 Kuadrant instance. As mentioned before, the control plane in charge of associating a policy with 1 Kuadrant instance will need to decide which policy to apply in case there's another HTTPRoute sharing one or more of
Review comment: OK, it's mentioned that this use case is not supported; still, we might need to provide a way to mitigate it, since one might not have access to modify the network topology. |
||
|
||
Given a gateway, Kuadrant needs to find out which Kuadrant instance manages it. By design, there is exactly one. There are at least two options to implement that mapping: | ||
* Read all Kuadrant CR objects and pick the first one whose label selector matches the gateway. | ||
  * This approach works as long as the control plane ensures that each gateway is matched by only one Kuadrant gateway selector. The control plane must reject any new Kuadrant instance whose selector matches a gateway already "taken" by another Kuadrant instance. | ||
* Add an annotation to the gateway whose value is the name/namespace of the Kuadrant CR. | ||
  * This approach is commonly used, but it requires annotation management. | ||
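Both options can be sketched in a few lines of Python. Everything here is illustrative: the `Kuadrant` and `Gateway` classes are simplified models, the annotation key `example.com/kuadrant-instance` and its `namespace/name` value format are hypothetical, and `admit` only shows the kind of conflict check that the first option requires.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Kuadrant:
    namespace: str
    name: str
    match_labels: dict[str, str]


@dataclass
class Gateway:
    name: str
    labels: dict[str, str] = field(default_factory=dict)
    annotations: dict[str, str] = field(default_factory=dict)


# Hypothetical annotation key, used only for this illustration.
OWNER_ANNOTATION = "example.com/kuadrant-instance"


def selects(instance: Kuadrant, gateway: Gateway) -> bool:
    """True if every matchLabels pair of the instance is present on the gateway."""
    return all(gateway.labels.get(k) == v for k, v in instance.match_labels.items())


def owner_by_selector(gateway: Gateway, instances: list[Kuadrant]) -> Optional[Kuadrant]:
    """Option 1: the first Kuadrant CR whose selector matches the gateway."""
    return next((i for i in instances if selects(i, gateway)), None)


def admit(new: Kuadrant, existing: list[Kuadrant], gateways: list[Gateway]) -> bool:
    """Reject a new instance that selects a gateway already taken by another one."""
    return not any(
        selects(new, gw) and owner_by_selector(gw, existing) is not None
        for gw in gateways
    )


def owner_by_annotation(gateway: Gateway) -> Optional[tuple[str, str]]:
    """Option 2: read (namespace, name) of the owning Kuadrant CR from the gateway."""
    value = gateway.annotations.get(OWNER_ANNOTATION)
    if value is None:
        return None
    namespace, _, name = value.partition("/")
    return namespace, name


kuadrant_a = Kuadrant("tenant-a", "kuadrant-a", {"app": "kuadrant-a"})
gw_1 = Gateway("gw-1", {"app": "kuadrant-a"}, {OWNER_ANNOTATION: "tenant-a/kuadrant-a"})

print(owner_by_selector(gw_1, [kuadrant_a]).name)
print(owner_by_annotation(gw_1))
```

The selector lookup stays unambiguous only if something like `admit` runs when a Kuadrant CR is created or updated. The annotation lookup needs no such check but relies on the annotation being kept in sync.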
Review comment (on lines +97 to +98): Even though this would require the extra management, it might be the simpler and more flexible way. At the moment, the control plane assumes there's only one Kuadrant instance and annotates every single gateway. Both the Kuadrant CR and the targeted Gateways should be in sync.
Review comment: We could define the list of gateways that are meant to be managed in the Kuadrant CR.
Review comment: I'd suggest a list of Users can always "split" one
Some discussion about this also here: #7 (comment)
Review comment: Kuadrant manages gateway instances, not gateway classes. Kuadrant instance A may manage gateway X and instance B may manage gateway Y. Both X and Y gateways may share the same gateway class.
Review comment: I guess what I'm saying is that Kuadrant can use the GatewayClass to know which Gateways to manage. In your provided example,
|
||
|
||
|
||
### Just one external auth stage and one rate limiting stage | ||
Kuadrant configures the gateway with a single external authorization stage (backed by Authorino) and a single external rate limiting stage (backed by Limitador). | ||
Technically, multiple rate limiting or authN/authZ stages involving multiple instances of Authorino and Limitador could be implemented. | ||
Until there is a real use case that strictly requires it, this scenario is discarded. The main reason is complexity: it is already hard enough to reason about rate limiting and auth services with a single stage. Adding multiple rate limiting stages, or hitting multiple Limitador instances in a single stage (doable with the WASM module), would make the observed behavior too hard to reason about. Currently, no use case requires that scenario. | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
Users have not requested multitenancy. Ingress gateways are usually shared resources managed by cluster administrators, and a cluster may have only a few of them. Routing traffic to the ingress gateway is also a cluster administrator task. Cluster users usually do not control the life cycle of ingress gateways, so they are rarely in a position to run their own Kuadrant instance. | ||
|
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
- Why is this design the best in the space of possible designs? | ||
|
||
The gateway, not the policy, is the first-class entity in the design. API protection happens at the gateway, so the configuration needs to be done at the gateway. This design protects each instance's gateways by isolating them from the (mis)configurations and traffic spikes of other instances. | ||
|
||
- What other designs have been considered and what is the rationale for not choosing them? | ||
|
||
TODO | ||
|
||
- What is the impact of not doing this? | ||
|
||
This design is a step towards a consistent API for protecting other service APIs. It makes it easier to protect any API regardless of the nature of the traffic, whether north-south or east-west. It also makes it easier for cluster users to deploy their own (non-ingress) gateways and enable API protection declaratively. | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
TODO | ||
|
||
# Unresolved questions | ||
Review comment: It might already be work in progress, but a specification of how errors (and which error cases) would be surfaced to the user would be good. For instance, a policy is applied to a
Review comment: Currently, the addition of a new I think this is already important today and even more so in a context of multiple Kuadrant instances per cluster. Users could benefit from the info of which Gateways have been configured for a policy, which were expected to but have not (and why not), in the policy status. Currently, users can only partially work that out by reading, in the annotations of the Gateways, which policies affect each Gateway. |
||
[unresolved-questions]: #unresolved-questions | ||
|
||
- What parts of the design do you expect to resolve through the RFC process before this gets merged? | ||
|
||
Validate the main points of the design: | ||
a) Single auth/ratelimit stage in the processing pipeline of the gateway | ||
|
||
b) Gateways are not shared among kuadrant instances | ||
|
||
- What parts of the design do you expect to resolve through the implementation of this feature before stabilization? | ||
|
||
The wiring mechanism. | ||
|
||
- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? | ||
|
||
Supporting multiple gateway providers #7 | ||
|
||
# Future possibilities | ||
[future-possibilities]: #future-possibilities | ||
|
||
TODO |
Review comment: I think the key benefits are protection against noisy neighbours, isolation (particularly in the auth context), and the ability to scale independently based on usage.