Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trace mode: Various improvements #299

Merged
merged 12 commits into from
Oct 1, 2023

Conversation

neverpanic
Copy link
Member

@neverpanic neverpanic commented Oct 1, 2023

This set of changes fixes a number of issues with trace mode:

  • Better support for long paths, fixing a number of places where buffer overflows occurred.
  • Removal of DYLD_FORCE_FLAT_NAMESPACE=1 which is no longer required
  • Do not attempt to inject DYLD environment variables into setuid binaries (which would be stripped anyway)
  • Do not copy setuid binaries into the sip-workaround directory. The setuid bit would be stripped when copying anyway.
  • Add automated tests for darwintrace so that developers can see the coverage and when know earlier when it starts breaking
  • Improve trace mode performance by caching SQLite lookups on the trace mode server side
  • Re-sign binaries with an ad-hoc code signature when copying them. This fixes trace mode on x86_64 Ventura.
  • A few fixes to make sure the tests work as expected.

neverpanic and others added 12 commits September 28, 2023 02:41
This is no longer required since we switched to the interposing setup
and was removed from other locations in macports base a while ago.
The loader will ignore DYLD_INSERT_LIBRARIES on SUID/SGID binaries so we
may as well not set them when running such binaries. This could
potentially be extended with checks for other privileged binaries where
DYLD_INSERT_LIBRARIES will not work.
The S_ISUID and S_ISGID bits are in the st_mode field of the stat
struct, not in the st_flags field. This caused our check for SUID/SGID
binaries to be incorrect, which caused us to copy SUID/SGID binaries
into the sip-workaround directory, where they would then not run
(correctly) because they did not have the permissions required.
darwintrace.dylib did occasionally crash in __darwintrace_is_in_sandbox
due to stack canary violations, because the reimplementation of realpath
and splitting into path components assumed that paths will not be longer
than MAXPATHLEN. In practice, this assumption was wrong, and many build
systems will invoke file system operations with paths longer than that.

Switch to dynamically allocated strings and arrays and resize the
buffers as necessary. This also simplifies the code significantly and
fixes a number of corner cases that were previously not handled
correctly, such as walking up past the root directory or multiple
slashes in a row.

With these changes, quassel builds successfully in trace mode – without
them, the build abort(3)s.
Some of the corner cases (such as very long paths, for example) are hard
to test manually, or at least you don't know exactly whether a specific
code path was covered just from installing a port.

Improve this situation by adding automated tests to darwintrace.
99% of all time spent in process_line() in tracelib.c happens in
dep_check(). In dep_check(), time is spent as follows:

- 15-20 % is spent determining the filesystem's case sensitivity
- 50 % of the time is spent querying the port that provides the given
  path
- 25 % of the time is used to translate the port ID into a port name
- ~8 % are in the binary search of the port name in the list of
  dependencies

In a test build of the yubico-pam port, 13884 iterations of dep_check()
were run. Among those, there are only a little more than 3000 unique
paths. This means that we can get a cache hit rate of about 78 % for
a cache that stores previous lookup results.

This affects the time spent in dep_check() during a test build of
yubico-pam as follows (all numbers in return values of
mach_absolute_time()):

without cache:
  2131768084 total
  14188 calls
  ~ 150251 per call
with cache:
  1054866053 total
  14188 calls
  ~ 74349 per call

Caching thus reduces the duration by 50.5 %.
macOS Ventura seems to have broken trace mode because it kills signed
processes when preload libraries are present. Fix this by re-signing
binaries when copying them.

I tested this on x86_64 Ventura, where it works as expected.

Closes: https://trac.macports.org/ticket/66358
During debugging I have seen cases where stat(2) did leave the mtime
fields uninitialized, so make sure the entire struct is zeroized before
calling stat(2) so that there is at least a useful value rather than
reading from unitinialized memory.
This test broke because the return code of /usr/bin/crontab changed when
users do not have a crontab configured. Fix this by checking the output
of the command if the command fails.
Since it becomes increasingly harder and harder to get an unsigned env
binary that works as expected, include a stripped-down version of the
FreeBSD env.c implementation for testing.
Let's not break CI on arm64, even though this test failure points to an
actual problem.

See: https://trac.macports.org/ticket/66358#comment:39
@neverpanic neverpanic merged commit 5c43ff0 into macports:master Oct 1, 2023
4 checks passed
@neverpanic neverpanic deleted the cal-tracemode-optimizations branch October 1, 2023 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

1 participant