-
Notifications
You must be signed in to change notification settings - Fork 9
Default enarx trust process #26
Default enarx trust process #26
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! It has the level of detail I was looking for in one of the earlier RFCs. Thank you for writing all of this up, @MikeCamel . I caught a few typos. Also had a few questions on the client <--> Keep communications after attestation, but I need to dig up the thread we had on it, so feel free to ignore my comments on there at the moment.
|
||
It should be noted that the key party for whom trust is considered | ||
important within the Enarx project is the workload owner - in this case, | ||
the tenant. It is trust relationships from this entity to others, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trust relationships from this entity to others
I'm not sure how to interpret this. "The trust relationships from this entity's point of view"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried to clarify this. Please let me know if I've succeeded!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's certainly more clear now, but I'm still not sure. If what matters is trust from me to you, does that mean it's important that I trust you or that you trust me? I can extrapolate this from the context since I already know what's supposed to be going on, but it takes me a minute to think about this phrasing. Could be just me, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've taken some suggested text from @axelsimon which I think helps.
|
||
The **Enarx runtime image** is typically stored on host storage (local or | ||
remote - the distinction is not relevant to the process). The loading | ||
process varies between architectures (TODO: is this true?), but is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have different code for each loading process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite what I was asking, which was really "do we believe that the actual process for loading will be different, e.g. one will have direct access to local storage, another listens on a socket, etc.?".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good questions. : ) I'm curious about the answers....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lkatalin Who do you think might have these "answers" you mention? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should definitely check with @npmccallum about whether the loading process varies between architectures.
Keep** at provision-time, but this must then be tied to the attestation | ||
measurement, and unique to each **Empty Keep** instance. The session key | ||
established in the state 1->2 transition may be sufficient for this | ||
purpose. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused by this paragraph. Are there scenarios in which keeping the same communication pathway as before (client agent -> host agent -> now the Keep) but encrypting with the session key would not be good enough, or in which the session key would not be sufficient to prove to the client agent that the Keep is the expected entity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, and I'm not sure what set of options we want to entertain. This is one of the paragraphs that I'd like the group to spend some time discussing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I can feel there might be some not-so-obvious cases here and that we should tread carefully, but right now, the path described by @lkatalin seems fine.
|
||
The **Enarx client agent** now creates an encrypted communications channel | ||
to the **Empty Keep**, using the session key established in the state 1->2 | ||
transition. The **Empty Keep** will be listening for a network connection, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we end up deciding that the network connection would actually be in the Keep, as opposed to having the host ferry encrypted communications between the Keep and client? I don't remember now, but will dig up the thread on this to refresh my memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that this is what the RFC says, but it's still in draft, awaiting review from @npmccallum ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you end up finding the thread, @lkatalin? It could help refresh everyone's memory and inform @npmccallum's review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I was thinking of this one: #4
The Tenant Workload MUST never be unencrypted on the network (in transit) | ||
within the Enarx-managed process. | ||
|
||
The Tenant Workload MUST never be unencrypted in storage once it has been | ||
transmitted by the Enarx client agent. | ||
|
||
The Tenant Workload MUST never be unencrypted in RAM once it has been | ||
transmitted by the Enarx client agent. | ||
|
||
The Tenant Workload MUST never be available in unencrypted form on the host. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are already in the general requirements section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I left them in just in case we wish to separate out the general requirements section to another document. Need to tidy up once we decide.
Thanks @MikeCamel , your new commit addresses basically all of my questions and concerns (except the ones we'll need to discuss as a group). I can approve once those are ironed out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some diagrams would help reduce the amount of prose here. The state machine could be represented as a diagram and then each state could contain prose to describe it.
Furthermore, the process flow description could be factored out into something like an mscgen process flow diagram which would further reduce the amount of prose needed. In many cases we won't eliminate prose here, but I think it would allow for more concision.
I'm also hesitant to approve an RFC with TODOs still embedded in the text.
- Status: [PROPOSED](/README.md#proposed) | ||
- Since: 2020-03-17 | ||
- Status Note: under discussion | ||
- Start Date: 2020-03-17 (date you started working on this idea) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(date you started working on this idea)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some diagrams would help reduce the amount of prose here. The state machine could be represented as a diagram and then each state could contain prose to describe it.
I think you're right, but I've tried to stick with standard RFC plain text. I can try to do state machines in ASCII art, I suppose: what do you think? Alternatively, some pictures could be referenced.
Furthermore, the process flow description could be factored out into something like an mscgen process flow diagram which would further reduce the amount of prose needed. In many cases we won't eliminate prose here, but I think it would allow for more concision.
Again, text is good and unambiguous, but... pictures are more concise.
I'm also hesitant to approve an RFC with TODOs still embedded in the text.
Agreed. I think that all(?) of the TODOs refer to other RFCs awaiting approval. Any others should probably be issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the state machines I was thinking of dot graphs. For example: source, output.
For the process flow diagrams I was thinking of mscgen graphs. For example: source, output
I think the visual representation of the process flow or the state machine would just pay back dividends in terms of making sense of it all.
RFCs to describe the default model by which Enarx allows this to take | ||
place, using the capabilities of TEEs, the underlying CPU(s) and | ||
firmware of the host and other parties. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reword:
"allows this to take place through the use of TEE capabilities supported by the host CPU(s)."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't agree: the capabilities are a combination of these three entities.
important within the Enarx project is the workload owner - in this case, | ||
the tenant. We should remember that all trust relationships are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reword:
"is the workload owner, also referred to as the tenant"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disagree - the workload owner may sometimes be the tenant (and is, in this default use case), but will not always be. We have other trust models to explore.
See rfc#0000x for an introduction to trust relationships. | ||
See rfc#0000x for an introduction to trust domains and their application | ||
to Enarx. | ||
See rfc#0000x for an introduction to endorsing authorities, trust anchors | ||
and trust pivots. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this in the spirit of RFCs being orthogonal. I also think we have too many un-merged RFCs to have half-resolved references between them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rfc#0000x described the following Enarx use case states for the default | ||
Enarx trust model (TODO: check correctness in rfc#0000x and normalise): | ||
- State 0 - initial state | ||
- State 1 - pre-attestation | ||
- State 2 - attested | ||
- State 3 - provisioned | ||
|
||
Non-Enarx use cases were also discussed, but these are irrelevant to this | ||
discussion, and therefore ignored in this RFC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above, let's remove this information cited from another RFC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was intended to set the scene. If we don't have references to other RFCs, how do you know to what to refer? I don't think you can have it both ways...
|
||
- host owner trust domain | ||
- host | ||
- host CPU + CPU firmware |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems a little redundant. When will we have one of these without the other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We won't (I suspect!), but I'm trying to be as explicit at possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, more important, we're utilising a combination of the capabilities that the two entities provide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite a few comments and suggestions. Again, lots of interesting things here.
One thing i'm wondering is if we could find a more explanatory title, i find "trust process" maybe a bit too… static? I'm not sure if "trust propagation" is the right way to express it, but there is something along those lines: how the trust evolves (and, in our terms, how items transfer from one trust domain to another) in Enarx.
Just an idea.
In any case, this RFC does a lot to explain and clarify how trust is transferred. Thanks!
Keep** at provision-time, but this must then be tied to the attestation | ||
measurement, and unique to each **Empty Keep** instance. The session key | ||
established in the state 1->2 transition may be sufficient for this | ||
purpose. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I can feel there might be some not-so-obvious cases here and that we should tread carefully, but right now, the path described by @lkatalin seems fine.
|
||
The **Enarx client agent** now creates an encrypted communications channel | ||
to the **Empty Keep**, using the session key established in the state 1->2 | ||
transition. The **Empty Keep** will be listening for a network connection, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you end up finding the thread, @lkatalin? It could help refresh everyone's memory and inform @npmccallum's review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All in all, I think this looks really good. I think having human-meaningful names for the state transitions would be helpful (and I suggested some). There is also one sentence that was a bit complex and had an un-closed parenthesis.
Also: wasn't there some discussion about having diagrams for this RFC? While certainly not necessary, I think those could be really helpful. Maybe in a future version?
In order to get this merged, I think the following will need to be complete:
- Fill out the TODOs and remove them. This may require team discussion, ex. at a design meeting.
- Fix the RFC numbers.
- Sort out the General vs. Other requirements. I recall there was some reason that there is redundancy between them, but I can't remember what that was.
the entire trust model process, and it is vital that all implementations | ||
check it very carefully. | ||
|
||
### State 0->1 change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be helpful to the human brain if each change had a name, ex: "TEE Creation and Enarx runtime loading"
default-enarx-trust-process.md
Outdated
host agent** over a network connection (which does not need to be | ||
encrypted, as no confidential data is passed across it, as because the | ||
**Enarx host agent** is not a trusted entity (part of the **tenant host | ||
domain** anyway). A mechanism (such as a cryptographic nonce), allowing the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
host agent** over a network connection (which does not need to be | |
encrypted, as no confidential data is passed across it, as because the | |
**Enarx host agent** is not a trusted entity (part of the **tenant host | |
domain** anyway). A mechanism (such as a cryptographic nonce), allowing the | |
host agent** over a network connection that does not need to be | |
encrypted, as no confidential data is passed across it since the | |
**Enarx host agent** part of the **tenant host | |
domain** and therefore not a trusted entity. A mechanism (such as a cryptographic nonce), allowing the |
Unbalanced parens!
the simplest option. | ||
|
||
|
||
### State 1->2 change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggested state change title for humans: "Attestation resulting in a Keep"
the Enarx runtime image, is now an **Empty Keep**, which includes a | ||
running **Enarx runtime instance**. | ||
|
||
### State 2->3 change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Provisioning"
Workload Image** into the **Empty Keep**, turning it into a **Provisioned | ||
Keep**. | ||
|
||
During the state 0->1 change transition, the **Orchestrator** provided the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once the transitions have names, the name should be referenced here.
direction). In the case of Enarx, it is trust relationships from the | ||
workload entity to other entities (primarily the host owner in this | ||
document), and maintenance of the tenant's trust domain, which are | ||
always the priority. Any other benefits which may accrue to other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case of Enarx, it is trust relationships from the
workload entity to other entities (primarily the host owner in this
document), and maintenance of the tenant's trust domain, which are
always the priority.
This sentence says some of the same things as the first sentence in this paragraph; could they be combined?
default-enarx-trust-process.md
Outdated
communicates with the **Enarx client agent**, which contacts the **Enarx | ||
host agent** over a network connection (which does not need to be | ||
encrypted). A mechanism (such as a cryptographic nonce), allowing the | ||
encrypted, as no confidential data is passed across it, as because the | ||
**Enarx host agent** is not a trusted entity (part of the **tenant host |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
**Enarx host agent** is not a trusted entity (part of the **tenant host | |
**Enarx host agent** is not a trusted entity (as it is part of the **tenant host |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are a lot of as
😄
project team members.
7800c13
to
37ae667
Compare
@MikeCamel Is this PR superseded by #39 or #41? If so, should we close this one? |
Closing - #41 preferred. |
This is an attempt a describing the default attestation process. It's long, complex, and my need breaking up into pieces, but once I'd started writing it, I wanted to get it all out of my head and done. Good luck reviewing...!