[CHORE] Auto attach LLDB debugger to python #2940 #3020

Merged: 9 commits, Oct 15, 2024
33 changes: 33 additions & 0 deletions .vscode/launch-example.json
@@ -0,0 +1,33 @@
{
    "configurations": [
        {
            "name": "Debug Rust/Python",
            "type": "debugpy",
            "request": "launch",
            "program": "${workspaceFolder}/tools/attach_debugger.py",
            "args": [
                "${file}"
            ],
            "console": "internalConsole",
            "serverReadyAction": {
                "pattern": "pID = ([0-9]+)",
                "action": "startDebugging",
                "name": "Rust LLDB"
            }
        },
        {
            "name": "Rust LLDB",
            "pid": "0",
            "type": "lldb",
            "request": "attach",
            "program": "${command:python.interpreterPath}",
            "stopOnEntry": false,
            "sourceLanguages": [
                "rust"
            ],
            "presentation": {
                "hidden": true
            }
        }
    ]
}
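
The `serverReadyAction` entry above is what links the two configurations: when a line matching `pattern` appears in the debug console, VSCode starts the configuration named `Rust LLDB`. The matching itself happens inside VSCode; the short Python sketch below only illustrates, under that assumption, that the line printed by `tools/attach_debugger.py` satisfies the pattern. The names `READY_PATTERN` and `console_line` are made up for this example.

```python
import os
import re

# Pattern copied from the "serverReadyAction" entry of the launch configuration.
READY_PATTERN = re.compile(r"pID = ([0-9]+)")

# Same text that tools/attach_debugger.py prints to the debug console.
console_line = f"pID = {os.getpid()}"

# The single capture group holds the process ID as a string of digits.
match = READY_PATTERN.search(console_line)
assert match is not None
print(int(match.group(1)) == os.getpid())
```

If the `print` format in the script and the `pattern` in the configuration ever drift apart, the Rust debugger would presumably never be started, so the two need to be changed together.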
71 changes: 71 additions & 0 deletions CONTRIBUTING.md
@@ -48,6 +48,77 @@ To use a remote Ray cluster, run the following steps on the same operating syste
3. `make build-release`: an optimized build to ensure that the module is small enough to be successfully uploaded to Ray. Run this after modifying any Rust code in `src/`
4. `ray job submit --working-dir wd --address "http://<head_node_host>:8265" -- python script.py`: submit `wd/script.py` to be run on Ray

### Debugging

Debugging uses a VSCode launch configuration that starts the Python debugger on `tools/attach_debugger.py`, passing the path of the target script as its argument. The script looks up its own process ID, writes it into the `Rust LLDB` entry of `launch.json`, and prints it to the debug console, which prompts VSCode to attach the Rust (LLDB) debugger to the running Python process. After a short wait for LLDB to attach, the script compiles and runs the target script. Both debuggers then work side by side: breakpoints in Python code stop in the Python debugger, and breakpoints in Rust code stop in the Rust debugger.
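
The step that writes the process ID into `launch.json` can be sketched as follows. The regular expression is the one used in `tools/attach_debugger.py`; the helper name `set_lldb_pid` and the sample text are hypothetical and exist only for this illustration.

```python
import re

# Same pattern as tools/attach_debugger.py: the "Rust LLDB" name followed
# immediately by its "pid" entry, with the digits left outside the groups.
PID_PATTERN = re.compile(r'("Rust LLDB",\s*"pid":\s*")\d+(")')


def set_lldb_pid(launch_text: str, pid: int) -> str:
    """Return launch_text with the pid of the Rust LLDB entry replaced."""
    if not PID_PATTERN.search(launch_text):
        raise ValueError("no Rust LLDB pid entry found")
    # \g<1> and \g<2> re-insert the text around the old pid unchanged.
    return PID_PATTERN.sub(rf"\g<1>{pid}\g<2>", launch_text)


# Hypothetical fragment of a launch.json file.
sample = '{"name": "Rust LLDB",\n "pid": "0",\n "type": "lldb"}'
print(set_lldb_pid(sample, 4242))
```

Because the pattern requires `"pid"` to follow the `"name": "Rust LLDB"` line directly, reordering the keys of that entry would make the lookup fail.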

#### Preparation

- **CodeLLDB Extension for Visual Studio Code**:
This extension provides the `lldb` debugger type used to debug Rust code invoked from Python.

- **Setting Up the Virtual Environment Interpreter**
Open the command palette (Ctrl+Shift+P), run `Python: Select Interpreter`, and choose `.venv`.

- **Debug Settings in launch.json**
`tools/attach_debugger.py` expects this file at `.vscode/launch.json` in the project root and rewrites the `pid` value of the `Rust LLDB` entry on every run. The configuration below is also provided as `.vscode/launch-example.json`. See the [official VSCode documentation](https://code.visualstudio.com/docs/editor/debugging#_launch-configurations) for more information about the launch.json file.
<details><summary><code><b>launch.json</b></code></summary>

```json
{
    "configurations": [
        {
            "name": "Debug Rust/Python",
            "type": "debugpy",
            "request": "launch",
            "program": "${workspaceFolder}/tools/attach_debugger.py",
            "args": [
                "${file}"
            ],
            "console": "internalConsole",
            "serverReadyAction": {
                "pattern": "pID = ([0-9]+)",
                "action": "startDebugging",
                "name": "Rust LLDB"
            }
        },
        {
            "name": "Rust LLDB",
            "pid": "0",
            "type": "lldb",
            "request": "attach",
            "program": "${command:python.interpreterPath}",
            "stopOnEntry": false,
            "sourceLanguages": [
                "rust"
            ],
            "presentation": {
                "hidden": true
            }
        }
    ]
}
```

</details>

#### Running the debugger

1. Create a Python script containing Daft code. Ensure that your virtual environment is set up correctly.

2. Set breakpoints in any `.rs` or `.py` file.

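3. Keep the script open as the active editor tab, because the launch configuration passes `${file}` to the debugger. In the `Run and Debug` panel on the left, select `Debug Rust/Python` from the drop-down menu at the top and click the `Start Debugging` button.

At this point, the debugger should stop on breakpoints in any `.rs` file in the codebase, as well as on breakpoints in `.py` files.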

> **Note**:
> On some systems, the LLDB debugger will not attach unless [ptrace protection](https://linux-audit.com/protect-ptrace-processes-kernel-yama-ptrace_scope) is disabled.
> To disable it until the next reboot, run the following command:
> ```shell
> echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
> ```
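
To check the current setting before changing it, a small Python sketch such as the one below can be used. It assumes a Linux system that exposes the Yama setting at the path from the note above; the function name `read_ptrace_scope` is hypothetical, and the printed interpretations are a rough guide rather than a complete description of every level.

```python
from pathlib import Path
from typing import Optional

# Location of the Yama setting referenced in the note above (Linux only).
PTRACE_SCOPE_FILE = Path("/proc/sys/kernel/yama/ptrace_scope")


def read_ptrace_scope(path: Path = PTRACE_SCOPE_FILE) -> Optional[int]:
    """Return the ptrace_scope value, or None if the file does not exist."""
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        return None


scope = read_ptrace_scope()
if scope is None:
    print("Yama ptrace_scope is not present on this system")
elif scope == 0:
    print("ptrace_scope = 0: attaching to a running process is allowed")
else:
    print(f"ptrace_scope = {scope}: attaching may be restricted")
```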

### Benchmarking

Benchmark tests are located in `tests/benchmarks`. If you would like to run benchmarks, make sure to first do `make build-release` instead of `make build` in order to compile an optimized build of Daft.
84 changes: 84 additions & 0 deletions tools/attach_debugger.py
@@ -0,0 +1,84 @@
"""
This file was copied from the Polars project (https://github.com/pola-rs/polars/blob/main/py-polars/debug/launch.py)
under the license provided by Ritchie Vink and NVIDIA Corporation & Affiliates.

The following parameter determines the sleep time of the Python process after a signal
is sent that attaches the Rust LLDB debugger. If the Rust LLDB debugger attaches to the
current session too late, it might miss any set breakpoints. If this happens
consistently, it is recommended to increase this value.
"""

import os
import re
import sys
import time
from pathlib import Path

LLDB_DEBUG_WAIT_TIME_SECONDS = 1


def launch_debugging() -> None:
    """
    Debug Rust files via Python.

    Determine the pID for the current debugging session, attach the Rust LLDB launcher,
    and execute the originally-requested script.
    """
    if len(sys.argv) == 1:
        msg = (
            "attach_debugger.py is not meant to be executed directly; please use the "
            "`Debug Rust/Python` debugging configuration to run a python script that "
            "uses the daft library."
        )
        raise RuntimeError(msg)

    # Get the current process ID.
    pID = os.getpid()

    # Read the launch configuration so that the pid of the Rust LLDB entry can be
    # updated before the debugger is triggered.
    launch_file = Path(__file__).parents[1] / ".vscode/launch.json"
    if not launch_file.exists():
        msg = f"Cannot locate {launch_file}"
        raise RuntimeError(msg)
    with launch_file.open("r") as f:
        launch_info = f.read()

    # Overwrite the pid found in launch.json with the pid for the current process.
    # Match the initial "Rust LLDB" definition with the pid defined immediately after.
    pattern = re.compile('("Rust LLDB",\\s*"pid":\\s*")\\d+(")')
    found = pattern.search(launch_info)
    if not found:
        msg = (
            "Cannot locate pid definition in launch.json for Rust LLDB configuration. "
            "Please follow the instructions in the Debugging section of "
            "CONTRIBUTING.md for creating the launch configuration."
        )
        raise RuntimeError(msg)

    launch_info_with_new_pid = pattern.sub(rf"\g<1>{pID}\g<2>", launch_info)
    with launch_file.open("w") as f:
        f.write(launch_info_with_new_pid)

    # Print pID to the debug console. VSCode picks up on this signal and starts the
    # Rust LLDB configuration automatically.
    print(f"pID = {pID}")

    # Give the LLDB time to connect. Depending on how long it takes for your LLDB
    # debugging session to initialize, you may have to adjust this setting.
    time.sleep(LLDB_DEBUG_WAIT_TIME_SECONDS)

    # Update sys.argv so that when exec() is called, the first argument is the script
    # name itself, and the remaining are the input arguments.
    sys.argv.pop(0)
    file_to_execute = Path(sys.argv[0])
    with file_to_execute.open() as fh:
        script_contents = fh.read()

    # Run the originally requested file by compiling the script contents and
    # executing the code as the `__main__` module.
    exec(compile(script_contents, file_to_execute, mode="exec"), {"__name__": "__main__"})


if __name__ == "__main__":
    launch_debugging()
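
The final `exec(compile(...))` call is what makes the target script behave as if it had been started directly. The following is a minimal, standalone sketch of that mechanism; the helper `run_as_main` and the temporary `target.py` file are hypothetical and not part of this PR.

```python
import tempfile
from pathlib import Path


def run_as_main(script_path: Path) -> dict:
    """Compile and execute a script as if it were run with `python script.py`."""
    source = script_path.read_text()
    # Setting __name__ makes `if __name__ == "__main__":` blocks in the script run.
    namespace = {"__name__": "__main__"}
    # Passing the real path to compile() keeps the file name in tracebacks.
    exec(compile(source, str(script_path), mode="exec"), namespace)
    return namespace


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "target.py"
    target.write_text('if __name__ == "__main__":\n    ran_as_main = True\n')
    result = run_as_main(target)
    print(result["ran_as_main"])
```

Running the script inside the launcher's own process, rather than spawning a child process, means the PID that was written to `launch.json` is the same process that later loads the Rust extension code, which appears to be why the script is executed this way.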