Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(profiler): decrement refcount given strong reference of frame object in python 3.11 #4901

Merged
merged 8 commits into from
Jan 18, 2023

Conversation

Yun-Kim
Copy link
Contributor

@Yun-Kim Yun-Kim commented Jan 13, 2023

Description

#4125 added the use of a getter function PyThread_GetFrame() to provide compatibility for Python 3.11 in the profiler (specifically the stack collector). However, PyThread_GetFrame() returns a strong reference to a PyFrameObject and its reference count was not decremented in that change. This means that the frame object will not be garbage collected and likely the cause behind the memory leak mentioned in #4899.

Checklist

Relevant issue(s)

Fixes #4899

Testing strategy

Since this change ensures that all FrameType objects in the stack collector have their reference count properly decremented, a regression test was added to ensure that all FrameType objects from the stack collector are garbage collected. test_collect_ensure_all_frames_gc runs an arbitrary function _foo() multiple times, then makes sure that garbage collection has occurred. Finally, the test checks that all FrameType objects relating to that function _foo() have been collected by the garbage collector, or in other words are not present in the garbage collector.

Checking the garbage collector was the most direct way to test that this change indeed does avoid leaking any memory due to FrameType objects in the stack collector.

Reviewer Checklist

  • Title is accurate.
  • Description motivates each change.
  • No unnecessary changes were introduced in this PR.
  • Avoid breaking API changes unless absolutely necessary.
  • Tests provided or description of manual testing performed is included in the code or PR.
  • Release note has been added and follows the library release note guidelines, or else changelog/no-changelog label added.
  • All relevant GitHub issues are correctly linked.
  • Backports are identified and tagged with Mergifyio.

P403n1x87
P403n1x87 previously approved these changes Jan 13, 2023
@P403n1x87
Copy link
Contributor

@Yun-Kim can we perhaps add a test case where we assert that

import gc
from types import FrameType

assert all(not isinstance(_, FrameType) for _ in gc.get_objects())

?

@Yun-Kim
Copy link
Contributor Author

Yun-Kim commented Jan 14, 2023

can we perhaps add a test case where we assert that

import gc
from types import FrameType

assert all(not isinstance(_, FrameType) for _ in gc.get_objects())

?

@P403n1x87 I think we can do that in #4895 instead of this PR if the pypi build error (with Project being found when we should expect a PyFrameObject) is what you're referencing.

@P403n1x87
Copy link
Contributor

@Yun-Kim I think it might be worth adding the test case here too. This is the PR that fixes the frame object leaks so it would be nice if we could add the test case that would catch any potential regressions.

@Yun-Kim Yun-Kim marked this pull request as ready for review January 17, 2023 15:06
@Yun-Kim Yun-Kim requested a review from a team as a code owner January 17, 2023 15:06
@brettlangdon brettlangdon enabled auto-merge (squash) January 17, 2023 23:25
@brettlangdon
Copy link
Member

@Yun-Kim can you fill in the "testing strategy" details of the PR description? it would be good to have some human words describing how we decided to test it this way / are there other ways or ways that cannot be automated?

Copy link
Contributor

@P403n1x87 P403n1x87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 👍

@brettlangdon brettlangdon merged commit a01c18f into DataDog:1.x Jan 18, 2023
Yun-Kim added a commit that referenced this pull request Jan 19, 2023
…ion in Python 3.11 (#4937)

## Description
#4901 added a fix to properly decrement frametype objects in Python 3.11
in our stack collector, with a regression test. We found that this
regression test does not properly assert that frametype objects were all
garbage collected, which this test fixes.

Specifically, the previous change was looking for only frame objects
related to a function call `_foo()`. However, after reverting the fix
from #4901 this test still passed, because all frametype objects have
been garbage collected. This change makes a broader assertion instead
that all frametype objects have been garbage collected, which does
expectedly fail if the fix from #4901 is reverted.

<!-- If this is a breaking change, explain why it is necessary. Breaking
changes must append `!` after the type/scope. See
https://ddtrace.readthedocs.io/en/stable/contributing.html for more
details. -->

## Checklist
- [ ] Followed the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines)
when writing a release note.
- [ ] Add additional sections for `feat` and `fix` pull requests.
- [ ] [Library
documentation](https://github.com/DataDog/dd-trace-py/tree/1.x/docs)
and/or [Datadog's documentation
site](https://github.com/DataDog/documentation/) is updated. Link to doc
PR in description.

<!-- Copy and paste the relevant snippet based on the type of pull
request -->

<!-- START feat -->

## Motivation
<!-- Expand on why the change is required, include relevant context for
reviewers -->

## Design 
<!-- Include benefits from the change as well as possible drawbacks and
trade-offs -->

## Testing strategy
<!-- Describe the automated tests and/or the steps for manual testing.

<!-- END feat -->

<!-- START fix -->

## Relevant issue(s)
<!-- Link the pull request to any issues related to the fix. Use
keywords for links to automate closing the issues once the pull request
is merged. -->

## Testing strategy
<!-- Describe any added regression tests and/or the manual testing
performed. -->

<!-- END fix -->

## Reviewer Checklist
- [ ] Title is accurate.
- [ ] Description motivates each change.
- [ ] No unnecessary changes were introduced in this PR.
- [ ] Avoid breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [ ] Tests provided or description of manual testing performed is
included in the code or PR.
- [ ] Release note has been added and follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines),
or else `changelog/no-changelog` label added.
- [ ] All relevant GitHub issues are correctly linked.
- [ ] Change contains telemetry where appropriate (logs, metrics, etc.).
- [ ] Telemetry is meaningful, actionable and does not have the
potential to leak sensitive data.

Co-authored-by: Brett Langdon <[email protected]>
@Yun-Kim Yun-Kim deleted the yunkim/fix-profiler-311-memory-leak branch January 19, 2023 20:40
Yun-Kim added a commit to Yun-Kim/dd-trace-py that referenced this pull request Jan 23, 2023
…ect in python 3.11 (DataDog#4901)

## Description
DataDog#4125 added the use of a getter function `PyThread_GetFrame()` to
provide compatibility for Python 3.11 in the profiler (specifically the
stack collector). However, `PyThread_GetFrame()` returns a strong
reference to a `PyFrameObject` and its reference count was not
decremented in that change. This means that the frame object will not be
garbage collected and likely the cause behind the memory leak mentioned
in DataDog#4899.
 


## Checklist
- [ ] Followed the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines)
when writing a release note.
- [ ] Add additional sections for `feat` and `fix` pull requests.
- [ ] [Library
documentation](https://github.com/DataDog/dd-trace-py/tree/1.x/docs)
and/or [Datadog's documentation
site](https://github.com/DataDog/documentation/) is updated. Link to doc
PR in description.

<!-- Copy and paste the relevant snippet based on the type of pull
request -->

<!-- START feat -->

## Motivation
<!-- Expand on why the change is required, include relevant context for
reviewers -->

## Design 
<!-- Include benefits from the change as well as possible drawbacks and
trade-offs -->

## Testing strategy
<!-- Describe the automated tests and/or the steps for manual testing.

<!-- END feat -->

<!-- START fix -->

## Relevant issue(s)
<!-- Link the pull request to any issues related to the fix. Use
keywords for links to automate closing the issues once the pull request
is merged. -->

## Testing strategy
<!-- Describe any added regression tests and/or the manual testing
performed. -->

<!-- END fix -->

## Reviewer Checklist
- [ ] Title is accurate.
- [ ] Description motivates each change.
- [ ] No unnecessary changes were introduced in this PR.
- [ ] Avoid breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [ ] Tests provided or description of manual testing performed is
included in the code or PR.
- [ ] Release note has been added and follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines),
or else `changelog/no-changelog` label added.
- [ ] All relevant GitHub issues are correctly linked.
- [ ] Backports are identified and tagged with Mergifyio.

Co-authored-by: Brett Langdon <[email protected]>
Yun-Kim added a commit to Yun-Kim/dd-trace-py that referenced this pull request Jan 23, 2023
…ion in Python 3.11 (DataDog#4937)

## Description
DataDog#4901 added a fix to properly decrement frametype objects in Python 3.11
in our stack collector, with a regression test. We found that this
regression test does not properly assert that frametype objects were all
garbage collected, which this test fixes.

Specifically, the previous change was looking for only frame objects
related to a function call `_foo()`. However, after reverting the fix
from DataDog#4901 this test still passed, because all frametype objects have
been garbage collected. This change makes a broader assertion instead
that all frametype objects have been garbage collected, which does
expectedly fail if the fix from DataDog#4901 is reverted.

<!-- If this is a breaking change, explain why it is necessary. Breaking
changes must append `!` after the type/scope. See
https://ddtrace.readthedocs.io/en/stable/contributing.html for more
details. -->

## Checklist
- [ ] Followed the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines)
when writing a release note.
- [ ] Add additional sections for `feat` and `fix` pull requests.
- [ ] [Library
documentation](https://github.com/DataDog/dd-trace-py/tree/1.x/docs)
and/or [Datadog's documentation
site](https://github.com/DataDog/documentation/) is updated. Link to doc
PR in description.

<!-- Copy and paste the relevant snippet based on the type of pull
request -->

<!-- START feat -->

## Motivation
<!-- Expand on why the change is required, include relevant context for
reviewers -->

## Design 
<!-- Include benefits from the change as well as possible drawbacks and
trade-offs -->

## Testing strategy
<!-- Describe the automated tests and/or the steps for manual testing.

<!-- END feat -->

<!-- START fix -->

## Relevant issue(s)
<!-- Link the pull request to any issues related to the fix. Use
keywords for links to automate closing the issues once the pull request
is merged. -->

## Testing strategy
<!-- Describe any added regression tests and/or the manual testing
performed. -->

<!-- END fix -->

## Reviewer Checklist
- [ ] Title is accurate.
- [ ] Description motivates each change.
- [ ] No unnecessary changes were introduced in this PR.
- [ ] Avoid breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [ ] Tests provided or description of manual testing performed is
included in the code or PR.
- [ ] Release note has been added and follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines),
or else `changelog/no-changelog` label added.
- [ ] All relevant GitHub issues are correctly linked.
- [ ] Change contains telemetry where appropriate (logs, metrics, etc.).
- [ ] Telemetry is meaningful, actionable and does not have the
potential to leak sensitive data.

Co-authored-by: Brett Langdon <[email protected]>
mabdinur pushed a commit that referenced this pull request Jan 24, 2023
…ect in python 3.11 [backport] (#4951)

## Description
Backports for #4901 and #4937 to 1.7.

<!-- If this is a breaking change, explain why it is necessary. Breaking
changes must append `!` after the type/scope. See
https://ddtrace.readthedocs.io/en/stable/contributing.html for more
details. -->

## Checklist
- [ ] Followed the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines)
when writing a release note.
- [ ] Add additional sections for `feat` and `fix` pull requests.
- [ ] [Library
documentation](https://github.com/DataDog/dd-trace-py/tree/1.x/docs)
and/or [Datadog's documentation
site](https://github.com/DataDog/documentation/) is updated. Link to doc
PR in description.

<!-- Copy and paste the relevant snippet based on the type of pull
request -->

<!-- START feat -->

## Motivation
<!-- Expand on why the change is required, include relevant context for
reviewers -->

## Design 
<!-- Include benefits from the change as well as possible drawbacks and
trade-offs -->

## Testing strategy
<!-- Describe the automated tests and/or the steps for manual testing.

<!-- END feat -->

<!-- START fix -->

## Relevant issue(s)
<!-- Link the pull request to any issues related to the fix. Use
keywords for links to automate closing the issues once the pull request
is merged. -->

## Testing strategy
<!-- Describe any added regression tests and/or the manual testing
performed. -->

<!-- END fix -->

## Reviewer Checklist
- [ ] Title is accurate.
- [ ] Description motivates each change.
- [ ] No unnecessary changes were introduced in this PR.
- [ ] Avoid breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [ ] Tests provided or description of manual testing performed is
included in the code or PR.
- [ ] Release note has been added and follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines),
or else `changelog/no-changelog` label added.
- [ ] All relevant GitHub issues are correctly linked.
- [ ] Change contains telemetry where appropriate (logs, metrics, etc.).
- [ ] Telemetry is meaningful, actionable and does not have the
potential to leak sensitive data.

Co-authored-by: Brett Langdon <[email protected]>
emmettbutler pushed a commit that referenced this pull request Jan 30, 2023
…ect in python 3.11 (#4901)

## Description
#4125 added the use of a getter function `PyThread_GetFrame()` to
provide compatibility for Python 3.11 in the profiler (specifically the
stack collector). However, `PyThread_GetFrame()` returns a strong
reference to a `PyFrameObject` and its reference count was not
decremented in that change. This means that the frame object will not be
garbage collected and likely the cause behind the memory leak mentioned
in #4899.
 


## Checklist
- [ ] Followed the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines)
when writing a release note.
- [ ] Add additional sections for `feat` and `fix` pull requests.
- [ ] [Library
documentation](https://github.com/DataDog/dd-trace-py/tree/1.x/docs)
and/or [Datadog's documentation
site](https://github.com/DataDog/documentation/) is updated. Link to doc
PR in description.

<!-- Copy and paste the relevant snippet based on the type of pull
request -->

<!-- START feat -->

## Motivation
<!-- Expand on why the change is required, include relevant context for
reviewers -->

## Design 
<!-- Include benefits from the change as well as possible drawbacks and
trade-offs -->

## Testing strategy
<!-- Describe the automated tests and/or the steps for manual testing.

<!-- END feat -->

<!-- START fix -->

## Relevant issue(s)
<!-- Link the pull request to any issues related to the fix. Use
keywords for links to automate closing the issues once the pull request
is merged. -->

## Testing strategy
<!-- Describe any added regression tests and/or the manual testing
performed. -->

<!-- END fix -->

## Reviewer Checklist
- [ ] Title is accurate.
- [ ] Description motivates each change.
- [ ] No unnecessary changes were introduced in this PR.
- [ ] Avoid breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [ ] Tests provided or description of manual testing performed is
included in the code or PR.
- [ ] Release note has been added and follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines),
or else `changelog/no-changelog` label added.
- [ ] All relevant GitHub issues are correctly linked.
- [ ] Backports are identified and tagged with Mergifyio.

Co-authored-by: Brett Langdon <[email protected]>
emmettbutler pushed a commit that referenced this pull request Jan 30, 2023
…ion in Python 3.11 (#4937)

## Description
#4901 added a fix to properly decrement frametype objects in Python 3.11
in our stack collector, with a regression test. We found that this
regression test does not properly assert that frametype objects were all
garbage collected, which this test fixes.

Specifically, the previous change was looking for only frame objects
related to a function call `_foo()`. However, after reverting the fix
from #4901 this test still passed, because all frametype objects have
been garbage collected. This change makes a broader assertion instead
that all frametype objects have been garbage collected, which does
expectedly fail if the fix from #4901 is reverted.

<!-- If this is a breaking change, explain why it is necessary. Breaking
changes must append `!` after the type/scope. See
https://ddtrace.readthedocs.io/en/stable/contributing.html for more
details. -->

## Checklist
- [ ] Followed the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines)
when writing a release note.
- [ ] Add additional sections for `feat` and `fix` pull requests.
- [ ] [Library
documentation](https://github.com/DataDog/dd-trace-py/tree/1.x/docs)
and/or [Datadog's documentation
site](https://github.com/DataDog/documentation/) is updated. Link to doc
PR in description.

<!-- Copy and paste the relevant snippet based on the type of pull
request -->

<!-- START feat -->

## Motivation
<!-- Expand on why the change is required, include relevant context for
reviewers -->

## Design 
<!-- Include benefits from the change as well as possible drawbacks and
trade-offs -->

## Testing strategy
<!-- Describe the automated tests and/or the steps for manual testing.

<!-- END feat -->

<!-- START fix -->

## Relevant issue(s)
<!-- Link the pull request to any issues related to the fix. Use
keywords for links to automate closing the issues once the pull request
is merged. -->

## Testing strategy
<!-- Describe any added regression tests and/or the manual testing
performed. -->

<!-- END fix -->

## Reviewer Checklist
- [ ] Title is accurate.
- [ ] Description motivates each change.
- [ ] No unnecessary changes were introduced in this PR.
- [ ] Avoid breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [ ] Tests provided or description of manual testing performed is
included in the code or PR.
- [ ] Release note has been added and follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines),
or else `changelog/no-changelog` label added.
- [ ] All relevant GitHub issues are correctly linked.
- [ ] Change contains telemetry where appropriate (logs, metrics, etc.).
- [ ] Telemetry is meaningful, actionable and does not have the
potential to leak sensitive data.

Co-authored-by: Brett Langdon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Python 3.11 profiling causes memory leaks
3 participants