-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[ENH] OTel tracing throughout the codebase (#1238)
## Description of changes This PR adds OpenTelemetry tracing to ~all major methods throughout our codebase. It also adds configuration to specify where these traces should be sent. I focused more on laying the groundwork for tracing than on collecting all the data we need everywhere. Default behavior is unchanged: no tracing, no printing. *Summarize the changes made by this PR.* - New functionality - OpenTelemetryClient with relevant config. - Wrap most of our code in tracing. The only major design decision I made was to fully separate OpenTelemetry stuff and Posthog (product telemetry) stuff. Justification: It's tempting to combine OTel and product telemetry behind a single internal interface. I don't think this coupling is worth it. Product telemetry cares about a small and relatively static set of uses, whereas tracing by nature should be very deep in our codebase. I see two ways to couple them and problems with each: - Have a unified Telemetry interface only for the events our product telemetry cares about and use raw OTel for the rest. In other words, use this telemetry interface only for `collection.add()`s, `collection.delete()`s, etc. This seems weird to me: tracing code would be implicit in some cases but explicit in others, making the codebase less easily comprehensible. Also if an engineer later decides to add product telemetry to a codepath that already has tracing, they need to know to remove existing tracing. This increases the cognitive overhead required to work on Chroma, reducing the readability and maintainability of our codebase. - Have a unified Telemetry interface which does everything. In this case, it has the above behavior but also wraps all other OTel behavior we want. This seems weird to me because we're basically writing a wrapper around the complete set of OpenTelemetry behavior which only modifies a small part of it. This increases our maintenance burden for very little value. Instead we have two well-encapsulated telemetry modules which we can modify and use without worrying about the other telemetry. The OTel module provides some lightweight helpers to make OTel a little easier to use, but we can put raw OTel code throughout our codebase and it'll play nicely. ## Test plan *How are these changes tested?* - [ ] Tests pass locally with `pytest` for python, `yarn test` for js Manual testing: - Set environment variables to export traces to Honeycomb at various granularities. - Went through various basic Chroma flows and checked that traces show up in Honeycomb as expected. ![Screenshot 2023-10-12 at 10 39 11 AM](https://github.com/chroma-core/chroma/assets/64657842/49c95054-ef7f-42b1-bb14-4b372edf9343) ## Documentation Changes *Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs repository](https://github.com/chroma-core/docs)?* Docs PR to land before this does.
- Loading branch information
Showing
33 changed files
with
766 additions
and
195 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.