[BUG] Increased trace ingestion volume after upgrading from 1.67.0 to 1.67.1 #2915

chrisforrette · 2024-10-04T23:02:22Z

Version of dd-trace-go

1.67.1

Describe what happened:

Hello! While reviewing our Datadog usage for September, we noticed a large spike in trace ingestion volume that started on September 10th, which is when we deployed an update of dd-trace-go from 1.67.0 to 1.67.1. After reverting to 1.67.0 today, we are seeing trace volume closer to the expected levels.

I inspected the datadog.estimated_usage.apm.ingested_bytes metric and found that the spike was isolated to one sampling_resource_name value (which was an endpoint of another service that calls this service a lot):

Before the upgrade, this endpoint would generate around 4GB of traces per day; after the upgrade it was generating around 80GB to 100GB a day, roughly a 20x to 25x increase.

Also potentially worth noting, when I applied a "sum by" of the sampling_resource_name on the datadog.estimated_usage.apm.ingested_bytes metric, before the upgrade there was a sampling_resource_name:unknown value that was generating around 30GB of traces a day. The value went away when we upgraded and it has now returned after reverting it.

Please let me know if there are any other details that might be helpful to share.

Describe what you expected:

I would expect trace ingestion volume to remain about the same, or based on the 2 PRs included in the changelog:

...the latter PR seems to reduce "resampling" and with my limited understanding of how this library works I might assume lower trace ingestion volume based on that.

Steps to reproduce the issue:

I didn't reproduce this in an isolated way.

Additional environment details (Version of Go, Operating System, etc.):

Go version: 1.23
OS: Alpine Linux 3.20.1

Environment variables:

DD_TRACE_SAMPLE_RATE: 0.1
DD_TRACE_SAMPLING_RULES: [{"service": "primary.db", "sample_rate": 0.03}, {"service": "replica.db", "sample_rate": 0.03}]


darccio · 2024-10-08T18:16:25Z

@chrisforrette Thanks for reaching out! Sorry for taking a bit longer than usual to answer. We'll need you to open a support ticket so we can have access to your organization.

In the meantime, we are reviewing the released code to understand better what could lead to this behaviour.

Thanks!

mtoffl01 · 2024-10-08T18:37:05Z

You can open a support ticket following this link: https://help.datadoghq.com/hc/en-us/requests/new
In the ticket, please include a link to this GitHub issue and a link, from your account, to the graph pictured above.

chrisforrette · 2024-10-09T22:22:29Z

@darccio @mtoffl01 OK done! Help request number is: 1879303

chrisforrette added the bug unintended behavior that has to be fixed label Oct 4, 2024

github-actions bot added the needs-triage New issues that have not yet been triaged label Oct 4, 2024

darccio removed the needs-triage New issues that have not yet been triaged label Oct 8, 2024
