Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core feature] Remove sys_ptrace dependency in raw container task #2162

Open
2 tasks done
wild-endeavor opened this issue Feb 14, 2022 · 6 comments
Open
2 tasks done
Assignees
Labels
backlogged For internal use. Reserved for contributor team workflow. enhancement New feature or request flytecopilot help wanted Extra attention is needed
Milestone

Comments

@wild-endeavor
Copy link
Contributor

Motivation: Why do you think this is important?

We've been getting more questions like this in the open-source community.

Is there a way we can improve the operation a bit. Why do we need to have special permissions?

Getting inputs into the system:
Run a light-weight image as an init container - it should do nothing except run flytecopilot or something and just download the inputs.pb file and deserialize them. (Existing behavior where primitives are unpacked into {{ .inputs.x }} should also continue.)

Getting outputs out of the system:
The user-container should write one of the following:

  • An outputs.pb file
  • some json file.

The sidecar container should continue to be there. But instead of monitoring the process, it should monitor the outputs folder, which should be cross-mounted.

After it detects one of the two output files, it should begin a timer. After some amount of time, if an empty file named SUCCESS isn't present, it should fail the task.

Once the SUCCESS file is present, it should (serialize and) upload the file to the right location.

If the task doesn't have an output it should still write the SUCCESS file to indicate that it's finished and copilot would upload an empty outputs.pb file (if that's what the system needs).

After copilot is done uploading the outputs, the process should exit. The raw container plugin should be changed to monitor the sidecar container (if it doesn't already). Usual task-timeouts apply.

Also improve documentation a bit. Documentation should state precisely what the contract is between the raw container, and the Flyte system. If it's necessary to configure the cluster a certain way in order to run this task type we should state so clearly.

Goal: What should the final outcome look like, ideally?

see above

Describe alternatives you've considered

see above.

Propose: Link/Inline OR Additional context

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@wild-endeavor wild-endeavor added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels Feb 14, 2022
@EngHabu EngHabu added this to the 1.0.0 - Phoenix! milestone Feb 14, 2022
@kumare3
Copy link
Contributor

kumare3 commented Feb 15, 2022

@wild-endeavor problem is what if the process crashes? The SUCCESS file is already supported by co-pilot. It is not reliable. SYS_PTRACE allows you to know when the process has exited, potentially without writing the task.

After copilot is done uploading the outputs, the process should exit. The raw container plugin should be changed to monitor the sidecar container (if it doesn't already). Usual task-timeouts apply.
this is already the case.

Also improve documentation a bit. Documentation should state precisely what the contract is between the raw container, and the Flyte system. If it's necessary to configure the cluster a certain way in order to run this task type we should state so clearly.

Already an issue for this.

@EngHabu EngHabu removed the untriaged This issues has not yet been looked at by the Maintainers label Feb 23, 2022
@EngHabu EngHabu changed the title [Core feature] Remove sys_ptrace dependency in container task [Core feature] Remove sys_ptrace dependency in raw container task Mar 2, 2022
@EngHabu
Copy link
Contributor

EngHabu commented Mar 16, 2022

We also need to update docs and examples if we are going with the SUCCESS file convention.

Pushing to 1.0.1 for now for further discussion.

@EngHabu EngHabu modified the milestones: 1.0.0 - Phoenix!, 1.0.1 Mar 16, 2022
@EngHabu EngHabu modified the milestones: 1.0.1, 1.0.2 May 11, 2022
@eapolinario eapolinario modified the milestones: 1.0.2, 1.1.0 - Hawk Jun 15, 2022
@pingsutw pingsutw modified the milestones: 1.1.0 - Hawk, 1.2.0 Jun 22, 2022
@pingsutw pingsutw closed this as completed Jul 6, 2022
@eapolinario
Copy link
Contributor

@pingsutw , can you leavea comment explaining why this was closed?

@pingsutw
Copy link
Member

@eapolinario Spotify is still using sys_ptrace in the raw container, so we keep it in flytecopilot.
btw, we already have file access mode (Upload the literal once the SUCCESS file is present) in copilot. These two modes (file access and sys_ptrace) are configurable now.

@eapolinario eapolinario reopened this Oct 24, 2024
@eapolinario
Copy link
Contributor

Revive flyteorg/flyteplugins#264 and introduce the ability to configure the watcher type in the flytecopilot config.

@eapolinario eapolinario added the backlogged For internal use. Reserved for contributor team workflow. label Oct 24, 2024
@eapolinario
Copy link
Contributor

What do you think of making the registration of raw containers optional?

We register it as a task type here, but we could provide a flag to control the inclusion of raw-container as a valid task type.

@eapolinario eapolinario added the help wanted Extra attention is needed label Nov 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backlogged For internal use. Reserved for contributor team workflow. enhancement New feature or request flytecopilot help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

5 participants