-
-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support sampling #150
Closed
kamilkowalski
wants to merge
51
commits into
spandex-project:master
from
kamilkowalski:distributed-sampling-priority
Closed
Support sampling #150
kamilkowalski
wants to merge
51
commits into
spandex-project:master
from
kamilkowalski:distributed-sampling-priority
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Closing as there's no updates and I won't have time to address this anytime soon. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Scope
This PR builds on #142 by @reachfh to:
priority
attribute (reading, updating, propagating)mix format
, Credo, Dialyzer and Coveralls, together with efficient cachingI know this PR is quite big in scope, but I figured since the code was already submitted like this then I'll keep it in one PR. Happy to open a couple of smaller PRs if that's better to review.
Approach
While the original approach was to set the default
priority
tonil
(a change from the previous1
) - I don't think there's a value that would fit all purposes. Datadog uses values from-1
to2
, while OpenTracing uses values greater or equal to0
, andnil
to allow the agent to fall back to its defaults.Defaulting it to a static value like
1
ornil
wouldn't fit all backends - so I've introduced aSpandex.Adapter.default_priority/0
callback to the Adapter behaviour. This callback is only called when starting a new trace - allowing the adapter to define the defaultpriority
for new traces, while maintaining thepriority
of the parent trace when propagating over service boundaries.This approach is somewhat backwards-compatible too - application developers will need to update
spandex
ANDspandex_datadog
because of the behaviour change. But ifspandex_datadog
always sets a sensible defaultpriority
- it will never benil
for that adapter, so any software that relied on that property being an integer will keep working.Datadog's implementation
I've been testing the behavior of dd-trace-rb and it seems the
_sampling_priority_v1
metric is always sent for top-level spans (entry spans for a given service). When using head-based sampling configured at the Agent level, the parameter is set to 0 when it was rejected or 1 when it should be kept (AUTO_REJECT
andAUTO_KEEP
respectively).Agent configuration can always be overridden by application developers - either by setting sampling rules on the library level, or by forcing them to be dropped/kept by methods within the library. Regardless of the method, this application-level configuration sets the
_sampling_priority_v1
param to either -1 or 2 (USER_REJECT
andUSER_KEEP
respectively).The Agent then uses that value to determine whether the trace should be kept (values greater than 0) or dropped (values less or equal to 0).
Additionally, top-level spans contain either a
_dd.agent_psr
or_dd.rule_psr
parameter describing the sampling rate used. The former is present if the trace has apriority
ofAUTO_REJECT
orAUTO_KEEP
- meaning the sample rate was determined based on Agent configuration. The latter is present when the priority was set toUSER_REJECT
orUSER_KEEP
- meaning that there was a sampling rule defined at the application level.Finally, whenever the library sends traces to the Datadog Agent - it replies with a JSON payload containing a
rate_by_service
attribute that describes what sampling rate should be applied to services and environments. The Agent calculates these values dynamically based on the target requests per second it's allowed to ingest - rate-limiting "chatty" services, and allowing "quiet" services to send all spans. The tracing library uses those values to perform sampling, in the absence of application-level rules.Since it's the
SpandexDatadog.Adapter
that will set the default priority and communicate with the Agent - the approach presented in this PR allows us to implement things in the exact same way as indd-trace-rb
.