Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allowance for using Stored Operation IDs for Semantic conventions for GraphQL Server #1011

Open
magicmark opened this issue Jul 10, 2023 · 5 comments
Assignees

Comments

@magicmark
Copy link

magicmark commented Jul 10, 2023

What are you trying to achieve?

As an alternative to using the GraphQL query name as the span name, I'd like to use its hash (stored operation ID) instead.

Context

We're looking at aligning our GraphQL infra with the following convention:

However, we use document IDs (a sha256 of the query) to uniquely identify queries across our infra. Our org does not allow arbitrary queries at all - every graphql request is known ahead of time, and the query will live somewhere in our codebase.

Why does this matter in the context of the span name?

Query names are not unique (different queries can be defined with the same name across different platforms, or even older/multiple versions of the same query) leading to confusion when trying to find the query in source code.

My concern is this:

Imagine a situation where one of our oncall engineers (who is unfamiliar with the GQL stack) is trying to diagnose a slow or failing operation - they might reasonably use the operation name (query name) as the lookup key when trying to find the query in our codebase to debug further. As mentioned above, this is very susceptible to finding the wrong thing.

"Just add the operation ID as a tag/attribute"

While it is possible to do this, there's still a footgun here in that folks have to know to click into the sapn to copy the operation ID instead of the span name.

Generally we try and build a path for folks to guide them to do the right thing by default - and the "right thing by default" here would be putting the operation name as the span name imo.

The ask

For GraphQL Servers that respect operation IDs only, the span name guidance could allow for using this instead?

FAQs

What about batching

This is actually a different but related point: the first span to our GQL server will be an HTTP request containing multiple queries

https://www.apollographql.com/blog/apollo-client/performance/query-batching

In a case where a server supports batching, I would propose this (where the span name is inside the square brackets)

-[ POST /graphql ]
----[ some-long-sha-256-hash-foo ]
----[ some-long-sha-256-hash-bar ]
----[ some-long-sha-256-hash-baz ]

What about Automatic Persisted Queries?

Apollo support Automatic Persisted Queries (APQ) which is a sort of hybrid. The server supports arbitrary queries, with the query name being the correct and only way to identify them. But subsequent requests have their raw query bodies substituted for IDs instead (which the server has stored in a lookup table). My understanding is that in this setup, IDs are not intended to be used as globally unique identifiers - and more similar to a cache key instead.

tl;dr servers using Apollo APQ should keep the status quo.

@maraisr
Copy link

maraisr commented Jul 17, 2023

What a great start to the discussion, thanks! ❤️

I'd like to offer you my perspective and say yeah, I think it is entirely reasonable to include the query hash in the span name, though with a preference to the OperationName.

Much like with REST, and including the resource name as the span name, eg PUT /my/resource. The equivalent for GraphQL could either be the OperationName or the query identifier.

Both could work, as in to put the query name in the span name, with its id as an attribute or the inverse.

[ 123ms ] |--------------|    * POST /graphql
[ 123ms ]     |----|          * 79c2f6ee6f496da71e2262cf786b6c13
[ 123ms ]        |----|       * Query.ProductCard

Though can say, I do prefer the query name as the span name, so a quick glance I know what is going on.

@benjie
Copy link

benjie commented Jul 22, 2023

Just a thought, but why not both? OperationName.hash has the advantage of being human readable (mostly) whilst also fixing the uniqueness problem.

@trask
Copy link
Member

trask commented Jul 24, 2023

@open-telemetry/technical-committee can you transfer this issue to https://github.com/open-telemetry/semantic-conventions? thx

@Oberon00
Copy link
Member

Oberon00 commented Jul 24, 2023

I am not sure this needs to be moved, as it seems to be a general question about the span name concept. The span name is not supposed to uniquely identify a code location (although in principle, it can). If you want that, there are code.* attributes

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#span

The span name concisely identifies the work represented by the Span, for example, an RPC method name, a function name, or the name of a subtask or stage within a larger computation. The span name SHOULD be the most general string that identifies a (statistically) interesting class of Spans, rather than individual Span instances while still being human-readable. That is, "get_user" is a reasonable name, while "get_user/314159", where "314159" is a user ID, is not a good name due to its high cardinality. Generality SHOULD be prioritized over human-readability.

@SonjaChevre
Copy link

hey @magicmark! just fyi I have created an issue to suggest working on better GraphQL semantic conventions - and linked your interesting use case: #182

Feel free to add feedback in there as well!

@trask trask transferred this issue from open-telemetry/opentelemetry-specification May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants