Scaling Public & Private Registries #56
Scalability of key management depends more on the keys an individual user needs to verify images than on the total number of registries that exist. If a user can use a single trusted root key to find 1,000,000 artifact keys, the size of the registry doesn't matter as much. This is exactly why it's important to allow aggregation on public registries when possible. Users will need to discover keys for every image they download from every registry. The more these registries can consolidate roots, the less TOFU will sneak into the key management process as a shortcut. For example, with Notary v1-style repository roots, users need to discover root keys for every repository on a public registry. It would be much easier to instead have a smaller number of roots for large public registries (as proposed in the tuf-notary proposal), then use these roots to delegate keys to individual repositories. This minimizes the amount of external secure root key distribution that the user needs to do. The story is different for private registries, as aggregation may not be possible, and users will primarily be using a small number of these private registries. Because of this, users will only need to securely access root keys for a small number of private registries (or repositories on private registries).
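To make the difference concrete, here is a minimal sketch that counts how many root keys a user has to obtain out of band under each model. The registry and repository names are made up, and the aggregated case assumes exactly one root per registry, which is a simplification of "a smaller number of roots".

```python
def roots_to_bootstrap(repos_by_registry, aggregated):
    """Count the root keys a user must securely obtain out of band.

    repos_by_registry maps a registry name to the set of repositories
    the user pulls from on that registry (hypothetical data).
    """
    if aggregated:
        # Assumption: one root per registry; repository keys are then
        # discovered through delegation from that root.
        return len(repos_by_registry)
    # Notary v1-style: every repository has its own root to discover.
    return sum(len(repos) for repos in repos_by_registry.values())


# Hypothetical set of repositories one user pulls from.
pulls = {
    "registry-a.example": {"library/nginx", "library/redis", "acme/app"},
    "registry-b.example": {"tools/cli"},
}

per_repo_roots = roots_to_bootstrap(pulls, aggregated=False)
per_registry_roots = roots_to_bootstrap(pulls, aggregated=True)
```

With per-repository roots, the count grows with every new repository the user pulls from; with aggregated roots, it only grows when the user adds a new registry.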
-
When designing key management solutions, there are a number of key scaling aspects to consider.
Before getting into the discussions of short-lived, long-lived, or rolling start keys, it helps to consider the number and type of registries, orgs and repos we will need to support.
Public Registry Expansion
When we talk about content promotion within and across registries, there's a growing landscape of registries to consider. Normally, we wouldn't think much about how many public and private registries exist. However, as we talk about key management, and how we might prevent rollback attacks or other cross-registry compromises, it becomes a factor to consider.
Before we add private registries, it's also important to recognize how many public registries exist, as public registries cover both software registries and distributors.
Public Registries
Public registries are essentially distributors of content. They represent content from multiple sources. For many content producers, a public registry may be the primary form of distribution, or their content may be hosted there for redistribution to a specific target audience. This is akin to buying a product from a distributor, such as buying an Xbox from Best Buy, Target, Costco or other distributors. Some content creators may only sell through distributor channels as they're not big enough, or simply don't see the need to host their content directly.
Public registry distributors, or redistributors include:
Software Registries
Some content creators may choose to distribute directly, or their content may be redistributed. Using the Xbox example, some users may choose to buy direct from Microsoft, or they may choose a distributor. In the case of software, the initial acquisition of content isn't necessarily tied to where the user may get subsequent updates, or other content from the same author. Some authors may contract with distributors to host the catalog, but serve the content directly. This is the case with MCR syndicating its catalog to Docker Hub.
Software registries include:
Private Registries
Following best practices for consuming public content, users will promote the content they depend upon to a private registry. Private registries may be Cloud Hosted, or Product Instanced.
Cloud Hosted Private Registries
Privately Instanced Product/Project Registries
Scaling Key Management Across Public and Private Registries
With a minimum of 4 public registries and 4 software registries, we're already well beyond the single public registry. Just as retail businesses and distributors come and go, we'll continue to see the set of software registries and distributors expand and evolve.
To expand this further, it's a fair estimation that each cloud provider may have between 50,000 and 500,000 private registries. Within each of those private registries, customers sub-divide their permission boundaries. If we use a minimum average of 4 sub-divisions within a registry, we're talking about a minimum of 200,000 to 2,000,000 privately scoped permission boundaries per cloud provider. While the software registries don't typically subdivide their permissions, the public distributors likely do.
Within the sub-division of a registry, we can now start counting the number of repositories. While many groups may share a set of keys across their repos within their sub-divided registry, others may decide to manage separate dev, staging and prod root keys.
Considering we're still in the early phase of container registries, it's easy to predict 4 million privately scoped registries a year from now. With an average of 10 repos each, 40 million root keys is a reasonable estimate. Even if customers would allow us to create aggregated hashes of their data, which they don't, is it even reasonable to build a system that attempts to reconcile 40 million root keys, when many will be in air-gapped virtual networks?
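For anyone who wants to play with the numbers, the back-of-the-envelope math above can be written out as a short sketch. The constants are the estimates from this post, not measured data, and the last step assumes one root key per repository.

```python
# Estimates taken from the post; none of these are measured values.
REGISTRIES_PER_CLOUD_LOW = 50_000
REGISTRIES_PER_CLOUD_HIGH = 500_000
SUBDIVISIONS_PER_REGISTRY = 4       # minimum average permission boundaries
PROJECTED_SCOPED_REGISTRIES = 4_000_000
REPOS_PER_SCOPED_REGISTRY = 10      # assumed average


def scoped_boundaries(registries, subdivisions=SUBDIVISIONS_PER_REGISTRY):
    """Privately scoped permission boundaries for a number of registries."""
    return registries * subdivisions


low = scoped_boundaries(REGISTRIES_PER_CLOUD_LOW)
high = scoped_boundaries(REGISTRIES_PER_CLOUD_HIGH)

# Assumption: every repository carries its own root key.
root_keys = PROJECTED_SCOPED_REGISTRIES * REPOS_PER_SCOPED_REGISTRY

print(f"scoped boundaries per cloud provider: {low:,} to {high:,}")
print(f"projected root keys: {root_keys:,}")
```

Changing the sub-division or repos-per-registry averages shifts the total linearly, so even conservative inputs land in the tens of millions of root keys.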