Support using dladdr1 for safe trace resolution #188

Open: wants to merge 1 commit into main


eullerborges

Currently, signal-safe tracing is limited to systems where the glibc version is >= 2.35. This is quite restrictive for older systems, and it results in traces silently failing to resolve even though resolution is possible by using dladdr1 (and is already done that way in get_frame_object_info).

Thus, I'm proposing to use that function on older systems; I managed to get safe traces working after this change. The issue with this, of course, is that these traces are no longer "safe", as the underlying calls involve memory allocations. The allocations might be avoidable, but that would require changing functions like [elf_]get_module_image_base to make them safe.

In summary, this is not ready to merge if we take the "safe" in get_safe_object_frame seriously, but it can get the discussion started. A simple solution might be to allow enabling this behavior behind a flag.
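
For illustration, here is a minimal sketch of the kind of fallback described above. It is not the code in this PR: `object_frame_sketch`, `resolve_with_dladdr1`, and the `allow_unsafe_fallback` parameter are hypothetical names made up for this example, and the opt-in is modeled as a runtime flag purely as an assumption. It only uses glibc's `dladdr1` with `RTLD_DL_LINKMAP` to find the object containing an address and its load bias, and it makes no claim about signal-safety.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // dladdr1 is a GNU extension
#endif
#include <dlfcn.h>
#include <link.h>
#include <cstdint>

// Hypothetical result type for this sketch (not the library's own struct).
struct object_frame_sketch {
    std::uintptr_t raw_address = 0;               // address as captured
    std::uintptr_t address_relative_to_object = 0; // raw address minus load bias
    const char* object_path = nullptr;            // owned by the dynamic loader
};

// Resolve `pc` to its containing object using dladdr1.
// NOTE: dladdr1 is not documented as async-signal-safe, so the caller must
// opt in explicitly via `allow_unsafe_fallback` (a hypothetical flag).
bool resolve_with_dladdr1(const void* pc, bool allow_unsafe_fallback,
                          object_frame_sketch& out) {
    if (!allow_unsafe_fallback) {
        return false; // stay on the strictly safe path: leave frame unresolved
    }
    Dl_info info;
    link_map* map = nullptr;
    // With RTLD_DL_LINKMAP, extra_info receives a pointer to the link_map
    // of the object that contains `pc`.
    if (dladdr1(pc, &info, reinterpret_cast<void**>(&map), RTLD_DL_LINKMAP) == 0
        || map == nullptr) {
        return false; // address does not belong to any loaded object
    }
    out.raw_address = reinterpret_cast<std::uintptr_t>(pc);
    // l_addr is the difference between addresses in the ELF file and in memory.
    out.address_relative_to_object = out.raw_address - map->l_addr;
    out.object_path = info.dli_fname;
    return true;
}
```

A caller would pass each captured program counter to `resolve_with_dladdr1` and fall back to leaving the frame unresolved when it returns `false`. On glibc older than 2.34 this likely needs `-ldl` at link time.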

@jeremy-rifkin
Owner

Hi, thanks for opening this and taking the time to put together a PR. While restrictive, the decision not to use dladdr for the safe tracing code path is deliberate: I have no reason to believe dladdr is signal-safe. I agree the code here can probably be reworked to avoid allocation, but the biggest blocker would be proving that dladdr is in fact signal-safe. This would have to be done historically, too: proving that it always has been signal-safe, or finding versions where it might not have been. Ideally there would be a guarantee that it will continue to be signal-safe, too. From a cursory look at the source code, dladdr1 at the very least takes locks, so that would require proving pthread_mutex_lock is and always has been signal-safe in the recursive case. It may very well be the case that dladdr1 is signal-safe; however, I am very hesitant to treat it as such without documentation about its signal-safety.
