Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

work in progress - remote run #3986

Draft
wants to merge 14 commits into
base: master
Choose a base branch
from
Draft

work in progress - remote run #3986

wants to merge 14 commits into from

Conversation

Qmando
Copy link
Member

@Qmando Qmando commented Nov 2, 2024

Work in progress, not fully functional!

Create a new API, /service/{service}/{instance}/remote_run

Currently accepts additional a username, whether it's interactive, and whether it should recreate the deployment

Launches a deployment with -remote-run-{username}. If interactive, replaces command with sleep. If deployment exists, maybe terminate it and relaunch. Returns the name of the pod and the namespace.

Created CLI cmd to interact with this api. Gives you a kubectl command to run to get a shell in the container.

def remote_run(args) -> int:
"""Run stuff, but remotely!"""
system_paasta_config = load_system_paasta_config(
"/nail/home/qlo/paasta_config/paasta/"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was using this to override the API endpoint

Comment on lines 58 to 64
def wait_until_deployment_gone(kube_client, namespace, deployment_name):
for retry in range(10):
pod = find_pod(kube_client, namespace, deployment_name, 1)
if not pod:
return
sleep(5)
raise Exception("Pod still exists!")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work. The request times out before the deployment terminates. I think we'd need to return and the client needs to wait and retry.

paasta_tools/api/api.py Outdated Show resolved Hide resolved
paasta_tools/cli/cmds/remote_run_2.py Outdated Show resolved Hide resolved
paasta_tools/cli/cmds/remote_run_2.py Outdated Show resolved Hide resolved

# Create the app with a new name
formatted_application = deployment.format_kubernetes_app()
formatted_application.metadata.name += f"-remote-run-{user}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's the risk of the pod name being truncated, so we'll definitely also want to indicators of this being a remote-run in the pod's labels

paasta_tools/paasta_remote_run_2.py Outdated Show resolved Hide resolved
Comment on lines 2049 to 2053
selector=V1LabelSelector(
match_labels={
"paasta.yelp.com/service": self.get_service(),
"paasta.yelp.com/instance": self.get_instance(),
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: This was copied over from the Deployment function below and is causing errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm, yeah, I don't see why you would need a select at this stage

paasta_tools/api/api.py Outdated Show resolved Hide resolved
paasta_tools/kubernetes/application/controller_wrappers.py Outdated Show resolved Hide resolved
paasta_tools/kubernetes/application/controller_wrappers.py Outdated Show resolved Hide resolved
kind="Job",
metadata=self.get_kubernetes_metadata(git_sha),
spec=V1JobSpec(
active_deadline_seconds=3600,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we'll need to propagate this somehow from a parameter

Comment on lines 2049 to 2053
selector=V1LabelSelector(
match_labels={
"paasta.yelp.com/service": self.get_service(),
"paasta.yelp.com/instance": self.get_instance(),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm, yeah, I don't see why you would need a select at this stage

Comment on lines 2074 to 2087
# DO NOT ADD LABELS AFTER THIS LINE
config_hash = get_config_hash(
self.sanitize_for_config_hash(complete_config),
force_bounce=self.get_force_bounce(),
)
complete_config.metadata.labels["yelp.com/paasta_config_sha"] = config_hash
complete_config.metadata.labels["paasta.yelp.com/config_sha"] = config_hash

complete_config.spec.template.metadata.labels[
"yelp.com/paasta_config_sha"
] = config_hash
complete_config.spec.template.metadata.labels[
"paasta.yelp.com/config_sha"
] = config_hash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this stuff is very related to deployment bounces, so I'd say we don't need it in this case

audience = "remote-run-" + user
service_account = "default"
token_spec = V1TokenRequestSpec(
expiration_seconds=600, audiences=[audience] # minimum allowed by k8s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The audience will need to be the kubernetes one, otherwise it won't be usable for doing execs. You should be able to just leave it unset to achieve that.

def create_temp_exec_token(kube_client: KubeClient, namespace: str, user: str):
"""Create a short lived token for exec"""
audience = "remote-run-" + user
service_account = "default"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need to come from the main paasta config. We'll have to create a user that can just exec in to places.

@@ -4354,6 +4452,19 @@ def get_kubernetes_secret_env_variables(
return decrypted_secrets


def create_temp_exec_token(kube_client: KubeClient, namespace: str, user: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user parameter is not used here. Is the plan to include it in the service account name if we go for creating ephemeral service accounts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants