Commit
[FEAT] Add better detection of Ray Job environment (#3148)
When a script runs as a Ray Job and the user has not invoked any Ray commands or called `ray.init()` explicitly, `ray.is_initialized()` returns `False`. As a result, Daft does not know that it is running inside a Ray cluster and does not default to the RayRunner. This can lead to unexpected behavior when using `daft-launcher`, because the user must know to call `daft.context.set_runner_ray()`.

This PR changes that behavior by looking up the `$RAY_JOB_ID` environment variable as a heuristic for whether Daft is currently running inside a Ray job.

To test, I ran a Ray job and called `daft.context.get_context()` after initializing a Daft dataframe:

<img width="1350" alt="image" src="https://github.com/user-attachments/assets/0a6d8ae4-034a-424d-a3d7-9311d08be454">

---------

Co-authored-by: EC2 Default User <[email protected]>
Co-authored-by: Jay Chia <[email protected]@users.noreply.github.com>
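
For illustration only, the heuristic described above could be sketched roughly as follows. This is not the code from the PR: the function names `running_in_ray_job`, `ray_is_initialized`, and `should_default_to_ray_runner` are hypothetical, and treating an empty `RAY_JOB_ID` as "not set" is an assumption made for this sketch.

```python
import os
from typing import Mapping, Optional


def running_in_ray_job(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Heuristic: a non-empty RAY_JOB_ID suggests we are inside a Ray job.

    Hypothetical helper; an empty value is treated as "not set" (assumption).
    """
    env = os.environ if environ is None else environ
    return bool(env.get("RAY_JOB_ID"))


def ray_is_initialized() -> bool:
    """Return ray.is_initialized(), or False if Ray is not installed."""
    try:
        import ray
    except ImportError:
        return False
    return ray.is_initialized()


def should_default_to_ray_runner(
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Prefer the Ray runner if Ray is already initialized or the
    environment looks like a Ray job (hypothetical decision function)."""
    return ray_is_initialized() or running_in_ray_job(environ)


if __name__ == "__main__":
    print(should_default_to_ray_runner())
```

Passing the environment as a parameter keeps the check easy to test without a live cluster. A user who wants to avoid relying on any heuristic can still call `daft.context.set_runner_ray()` explicitly, as noted above.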