Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add identity to task execution metadata #5105

Conversation

noahjax
Copy link
Contributor

@noahjax noahjax commented Mar 26, 2024

Why are the changes needed?

For our specific use case it is critical to be able to determine the identity for the subject who launches a task. It seems like other information on the security context could also be more generally useful, so I included that in addition to the critical execution_identity field as part of this PR.

What changes were proposed in this pull request?

Add identity to TaskExecutionMetadata. In conjunction with flyteorg/flytekit#2282, this change will make identity available to flyte agents.

How was this patch tested?

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

#3936

Docs link

Copy link

welcome bot commented Mar 26, 2024

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

@noahjax noahjax marked this pull request as ready for review March 26, 2024 18:28
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. enhancement New feature or request security Issues related to Security improvements labels Mar 26, 2024
Copy link

codecov bot commented Mar 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 59.11%. Comparing base (cb6384a) to head (d2f9e0f).
Report is 39 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5105      +/-   ##
==========================================
+ Coverage   58.49%   59.11%   +0.61%     
==========================================
  Files         627      645      +18     
  Lines       54155    55576    +1421     
==========================================
+ Hits        31680    32854    +1174     
- Misses      19942    20129     +187     
- Partials     2533     2593      +60     
Flag Coverage Δ
unittests 59.11% <100.00%> (+0.61%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@noahjax noahjax requested a review from ddl-ebrown March 27, 2024 16:19
@noahjax noahjax mentioned this pull request Mar 28, 2024
3 tasks
pingsutw
pingsutw previously approved these changes Mar 29, 2024
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Mar 29, 2024
@pingsutw
Copy link
Member

@noahjax just curious, so you are going to use OAuth2TokenRequest in the agent not Secret, right?

My PR is going to pass the actual secret value to the agent because we save the secret in AWS/GCP secret manager, and only the propeller has permission to read it.

@noahjax
Copy link
Contributor Author

noahjax commented Mar 29, 2024

@pingsutw For my use case I actually only need the execution_identity, but I wanted to grab that in a way that could be useful in other cases as well. Your PR is probably more useful in general because it actually fetches secret values, but it would be awesome if in addition to those secret values we could also get some information from the run_as field to the agent. That might require some new message type instead of piggy-backing in the existing SecurityContext

@pingsutw
Copy link
Member

pingsutw commented Mar 29, 2024

@noahjax In that case, should we just pass Identity to the agent? flyte won't mount the Secret and OAuth2TokenRequest to the agent, since it's a long-running service

I think passing Identity to an agent is very useful. Many of our use cases will also need that.

@kumare3
Copy link
Contributor

kumare3 commented Apr 1, 2024

Cc @EngHabu
I agree with @pingsutw - but what are the security implications

@noahjax noahjax force-pushed the noahjax.add-owner-reference-to-create-task branch from 12f5703 to d2f9e0f Compare April 1, 2024 15:30
@noahjax
Copy link
Contributor Author

noahjax commented Apr 1, 2024

@kumare3 @pingsutw That makes sense, I updated the PR to use Identity instead of SecurityContext

@noahjax noahjax changed the title Add security context to task execution metadata Add identity to task execution metadata Apr 1, 2024
@noahjax noahjax requested a review from pingsutw April 1, 2024 16:56
@pingsutw
Copy link
Member

pingsutw commented Apr 2, 2024

cc @ddl-ebrown @noahjax we can merge it, right

@noahjax
Copy link
Contributor Author

noahjax commented Apr 2, 2024

Yeah go for it

@pingsutw pingsutw merged commit 94f4343 into flyteorg:master Apr 3, 2024
48 checks passed
Copy link

welcome bot commented Apr 3, 2024

Congrats on merging your first pull request! 🎉

Jeinhaus pushed a commit to Jeinhaus/flyte that referenced this pull request Apr 8, 2024
@ddl-ebrown ddl-ebrown deleted the noahjax.add-owner-reference-to-create-task branch April 30, 2024 16:58
return admin.TaskExecutionMetadata{
TaskExecutionId: &taskExecutionID,
Namespace: taskExecutionMetadata.GetNamespace(),
Labels: taskExecutionMetadata.GetLabels(),
Annotations: taskExecutionMetadata.GetAnnotations(),
K8SServiceAccount: taskExecutionMetadata.GetK8sServiceAccount(),
EnvironmentVariables: taskExecutionMetadata.GetEnvironmentVariables(),
Identity: taskExecutionMetadata.GetSecurityContext().RunAs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this implementation is actually incomplete based on our needs. When scheduling a job, this identity is actually the IdpId for the flytepropeller, not the user that scheduled the workflow

Ultimately we want to capture both values

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc: @pingsutw

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ddl-ebrown would you create another PR to address it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I can do that -- just need to know what we want to capture in the IDL and how we should name it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request lgtm This PR has been approved by a maintainer security Issues related to Security improvements size:XL This PR changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants