This repository has been archived by the owner on May 23, 2023. It is now read-only.

Add support for perfevents and an Observer API #135

Closed
wants to merge 7 commits

Conversation

hkshaw1990

Perf events can provide additional insights into what's happening within
a span. Although the latency of a span can help us identify the
bottlenecks within a trace, perf events can complement that by
identifying where the bottleneck was and which platform resource was
busy for that span. This patch is in addition to the perf events support
in zipkin-go-opentracing.

Signed-off-by: Hemant Kumar <[email protected]>
@hkshaw1990
Author

Issue in reference : opentracing/specification#36
PR to openzipkin : openzipkin-contrib/zipkin-go-opentracing#50
PR to opentracing-contrib/perfevents : opentracing-contrib/perfevents#1

tracer.go Outdated
// Perfevent is the platform metric's name(s) or an event name
// supported by perfevents package
// (https://github.com/opentracing-contrib/perfevents/go)
Perfevent string
Member

@yurishkuro yurishkuro Feb 15, 2017

  • nit: PerfEvent
  • It's not clear what the tracer is expected to do when receiving this option in the StartOptions. What is the significance of the string value? What is the domain of accepted values? Should it be an enum?
  • Instead of extending the StartOption API, can this instead be passed as one of the Tags? We already have precedent for that with the sampling.priority tag.

Author

Hi @yurishkuro, thanks for the review.

  • Sure, will change it to PerfEvent.
  • Yeah, the comments should have been clearer. PerfEvent is a list of perf events in the form of a comma-separated string (e.g., "cache-misses,cpu-cycles,instructions"). The tracer is expected to start the events in PerfEvent using the "perfevents" package (https://github.com/opentracing-contrib/perfevents/go); on span Finish(), the events must be closed. The domain of accepted values is the set of events supported by the "perfevents" package. Making it an enum could pose difficulties, since we also plan to support dynamic events and other kernel-sourced metrics in the future. (A rough sketch of the expected tracer-side handling follows below.)
  • I agree. Let me check that out, and if it proves sufficient, I will update this PR.
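For illustration, a rough sketch of the tracer-side handling described above: split the comma-separated value, open one counter per event when the span starts, and read and close them when the span finishes. The openEvent and readAndCloseEvent helpers below are hypothetical stand-ins, not the actual perfevents package API.

package main

import (
	"fmt"
	"strings"
)

// eventHandle is a placeholder for whatever descriptor the perfevents
// package hands back when a counter is opened.
type eventHandle struct{ name string }

// openEvent and readAndCloseEvent are hypothetical stand-ins for the real
// perfevents package calls; they only exist to make the sketch compile.
func openEvent(name string) eventHandle      { return eventHandle{name: name} }
func readAndCloseEvent(h eventHandle) uint64 { return 0 }

// startPerfEvents opens a counter for every event named in the
// comma-separated spec (e.g. "cache-misses,cpu-cycles,instructions") and
// returns a function that reads and closes them on span Finish().
func startPerfEvents(spec string) (finish func() map[string]uint64) {
	var handles []eventHandle
	for _, name := range strings.Split(spec, ",") {
		handles = append(handles, openEvent(strings.TrimSpace(name)))
	}
	return func() map[string]uint64 {
		values := make(map[string]uint64, len(handles))
		for _, h := range handles {
			values[h.name] = readAndCloseEvent(h)
		}
		return values
	}
}

func main() {
	finish := startPerfEvents("cache-misses,cpu-cycles,instructions")
	fmt.Println(finish()) // event name -> final counter value
}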

@hkshaw1990
Author

hkshaw1990 commented Feb 21, 2017

Hi @yurishkuro
I have added one more commit to this PR, removing the "Perfevent" option and creating a new tag, "PerfEvent", which addresses the previous review comments. I have changed the zipkin PR accordingly (openzipkin-contrib/zipkin-go-opentracing#50).

// package (https://github.com/opentracing-contrib/perfevents/go) . On span
// Finish(), the events must be closed.

PerfEvent = stringTagName("perfevents")
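As a minimal usage sketch of the tag-based approach: the tracer below stands for any opentracing.Tracer whose implementation understands the "perfevents" tag (such as the zipkin-go-opentracing PR), and the operation name is just an example.

package main

import (
	opentracing "github.com/opentracing/opentracing-go"
)

func queryDB(tracer opentracing.Tracer) {
	span := tracer.StartSpan("query-db")
	defer span.Finish() // the tracer is expected to read and close the counters here

	// Ask the tracer to collect these perf events for the duration of the span.
	span.SetTag("perfevents", "cache-misses,cpu-cycles,instructions")

	// ... do the work being measured ...
}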
Contributor

would it perhaps make more sense to create a Tracer that wraps a given "real" Tracer and provides the perfevents for each span therein?

Member

@bhs it might be useful to think about "tracer plugin" architecture. For example, when I tried a POC of emitting metrics via a decorator tracer (jaegertracing/jaeger-client-go#84), it worked but was quite inefficient, so I decided to go with an observer pattern, which means the implementation is now going to be specific to our tracer (even though the observer API itself isn't - jaegertracing/jaeger-client-go#94). The perf events monitoring seems to be in a similar category that could probably be done simply as an observer.

Contributor

@yurishkuro I think that makes plenty of sense. I would like to avoid things in opentracing-contrib that require action by Tracer implementors... it just seems unrealistic / unsustainable.

More generally I see a benefit (due to intra-process propagation) from separating OT API and SPI interfaces. It looks good on paper, but I think it's tricky to do right with sampling and noop impls taken into account.

Sorry for this tangent, @hkshaw1990 ... it's for a good cause :)

Member

@hkshaw1990 could you take a look at jaegertracing/jaeger-client-go#94 w.r.t. using a similar API to drive perf event collection?

Author

@yurishkuro and @bhs, while I do like the idea of having an observer interface, I will have to see how the perfevents will fit in there. I will investigate jaegertracing/jaeger-client-go#94 and get back to you on this.

Author

@yurishkuro I went through the PRs (jaegertracing/jaeger-client-go#94 and jaegertracing/jaeger-client-go#103) regarding an "observer" interface. I did a small prototype implementation for the perfevents package as well: defining the Observer and SpanObserver API in opentracing-go, implementing an observer in the perfevents package, and using zipkin to register the observers. While this model worked well for starting the perf events (OnSetTag), reading the values and closing the event descriptors (OnFinish), and maintaining the count (in the SpanObserver struct), we ran into issues when logging these event values for a span: we could not access the values in the (client) tracer or in a sample app which creates the perfevents observer. The perf event values are per-span. Could you please point out if we missed anything obvious?

Member

I see, your use case is slightly different from what I created the observer for, i.e. I used it for emitting metrics, something that's triggered by span events but does not have to come back to the span as data. I think if the OnStartSpan() method passed the actual span to the observer, then you could store it in the SpanObserver created for that span (a circular ref - hope Go can manage) and write to that span from OnFinish().
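To make that concrete, here is a minimal sketch of a perf-events span observer that receives the live span in OnStartSpan, keeps the reference, and writes the collected values back onto the span from OnFinish. The readAndCloseCounter helper is a hypothetical stand-in for the perfevents package, and the callback signatures follow the shape discussed in this thread rather than a finalized API.

package perfobserver

import (
	"strings"

	opentracing "github.com/opentracing/opentracing-go"
)

// readAndCloseCounter is a hypothetical stand-in for the perfevents package:
// it returns the final value of a counter and releases it.
func readAndCloseCounter(event string) uint64 { return 0 }

// Observer creates one spanObserver per started span, receiving the live
// span so that results can be written back to it later.
type Observer struct{}

func (Observer) OnStartSpan(sp opentracing.Span, operationName string,
	options opentracing.StartSpanOptions) *spanObserver {
	return &spanObserver{span: sp}
}

// spanObserver keeps a reference back to the span it observes (the circular
// reference mentioned above) so OnFinish can log the event values onto it.
type spanObserver struct {
	span   opentracing.Span
	events []string
}

func (o *spanObserver) OnSetOperationName(operationName string) {}

func (o *spanObserver) OnSetTag(key string, value interface{}) {
	if key != "perfevents" {
		return
	}
	if spec, ok := value.(string); ok {
		// A real implementation would open the perf counters here.
		o.events = strings.Split(spec, ",")
	}
}

func (o *spanObserver) OnFinish(options opentracing.FinishOptions) {
	// Read and close the counters, then record the values on the span itself.
	for _, ev := range o.events {
		o.span.LogKV(ev, readAndCloseCounter(ev))
	}
}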

Author

Thanks @yurishkuro, sending the actual span to the OnStartSpan() method works for this case. I have updated this PR to add a new Observer API in opentracing-go which is largely based on your work (jaegertracing/jaeger-client-go#94). We will still need the previous commit, which adds a PerfEvent tag, to identify the user-specified perf events through tags. Let me know what you think.

Metrics are useful for gaining insight into a distributed application,
but there can be a lot of metrics in different domains. Adding such
metrics to any one (client) tracer breaks cross-platform compatibility.
This commit adds a new observer and span observer API which defines a
standard interface for adding such metrics. To expose the metrics, one
would have to implement the observer interface. The (client) tracers
would just have to install callbacks for methods such as StartSpan,
SetOperationName, SetTag and Finish. The registered callbacks would then
be called on the span events if an observer is created.

This is based on the work done by Yuri here :
jaegertracing/jaeger-client-go#94

Signed-off-by: Yuri Shkuro <[email protected]>
Signed-off-by: Hemant Kumar <[email protected]>
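For orientation, the API described by this commit is roughly the following shape; the signatures here are paraphrased from the discussion and from jaegertracing/jaeger-client-go#94, so treat them as an approximation rather than the exact diff.

package opentracing

// Observer can be registered with a Tracer to be notified when new spans start.
type Observer interface {
	// OnStartSpan is invoked by the tracer for each new span; it receives the
	// live span and returns a SpanObserver for the remaining life-cycle events.
	OnStartSpan(sp Span, operationName string, options StartSpanOptions) SpanObserver
}

// SpanObserver receives span life-cycle events from the tracer,
// which should call the registered (observer) callbacks.
type SpanObserver interface {
	// Callback called from opentracing.Span.SetOperationName()
	OnSetOperationName(operationName string)
	// Callback called from opentracing.Span.SetTag()
	OnSetTag(key string, value interface{})
	// Callback called from opentracing.Span.Finish()
	OnFinish(options FinishOptions)
}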
observer.go Outdated
OnFinish(options FinishOptions)
}

// observer is a dispatcher to other observers
Member

I think this and the code below are implementation details that shouldn't be in opentracing-go (especially since they are private and as such unusable anyway).

We need more people to look at the observer API before committing it. @bhs you mentioned a separate SPI, any thoughts? At the moment this observer API is somewhat standalone, i.e. it can only be plugged into the tracer via a vendor-specific initialization process (e.g. in Jaeger via tracer options).

Author

@yurishkuro thanks for the review. I have added a commit which removes the implementation details from opentracing-go. Thanks for pointing this out.

observer.go Outdated
@@ -0,0 +1,15 @@
package opentracing

// Observer can be registered with the Tracer to receive notifications
Contributor

  1. Sorry for the insane delay on this PR -- mea culpa.

  2. "Registered" how?

Author

Hi @bhs, I have added a commit explaining, in the comments for the Observer API, how it's registered.
Thanks for the review!

Member

@hkshaw1990 I think the comments are a bit too specific. I would keep it simpler:

// Observer can be registered with the Tracer to receive notifications about new Spans.
// Tracers are not required to support the Observer API. The actual registration of the
// observer is implementation specific, e.g. it might look like this:
//
//     observer := myobserver.New()
//     tracer := client.NewTracer(..., client.WithObserver(observer))

@bhs I prefer not to introduce an official API for registering the observer, because if observers are passed to the Tracer constructor, the internal data structure can be immutable and not require a mutex for access.
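A minimal sketch of that registration style, assuming a hypothetical client tracer package: observers are collected while constructor options are applied, and the resulting slice is never mutated afterwards, so spans can iterate over it without locking.

package client

// Observer stands in for whatever observer interface the tracer supports;
// it is left abstract because this sketch is only about registration.
type Observer interface{}

// tracerImpl is a hypothetical tracer implementation.
type tracerImpl struct {
	observers []Observer // written once in NewTracer, read-only afterwards
}

// TracerOption configures the tracer at construction time.
type TracerOption func(*tracerImpl)

// WithObserver registers an observer with the tracer being constructed.
func WithObserver(obs Observer) TracerOption {
	return func(t *tracerImpl) { t.observers = append(t.observers, obs) }
}

// NewTracer applies all options; once it returns, the observer list is fixed,
// so concurrent spans can read it without a mutex.
func NewTracer(opts ...TracerOption) *tracerImpl {
	t := &tracerImpl{}
	for _, opt := range opts {
		opt(t)
	}
	return t
}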

Contributor

@yurishkuro that makes perfect sense; all I really care about is that the expectations are well-documented.

@hkshaw1990 I would feel a little more comfortable about this PR if there was some sort of clear disclaimer about the Observer being an alpha API that's subject to change.

Author

@yurishkuro I have added your suggestions and made the comments a bit more generic.
@bhs I have added a "Note" about the Observer API being in alpha.

// Callback called from opentracing.Span.SetOperationName()
OnSetOperationName(operationName string)
// Callback called from opentracing.Span.SetTag()
OnSetTag(key string, value interface{})
Contributor

I'm wondering if it would be better and easier to pass the operation name and set of tags into OnFinish (and eliminate these first two methods). That way the observer doesn't need to have its own copy of the Tag map (and the associated mutex, etc).

Member

It would require the tracer implementation to keep track of all tags, which is not always the case, for example when the trace is unsampled, or when the tracer interprets and/or remaps some of the tags.

Contributor

@yurishkuro yes, but since this feature is something Tracers opt-in to, I thought that still might make the most sense... if they have a different "noop" (sic) impl depending on whether there's an observer, so be it.

Otherwise the observer must do the potentially wasteful accounting.

Member

it's a tradeoff, I suppose. If OnFinish() requires all this extra data, now the tracer must do potentially wasteful accounting, even if none of the observers actually need the tags. It also makes the assumption that observers are only interested in tags at the end (e.g. sampling.priority tag is clearly time-dependent).

Contributor

@bhs bhs Mar 13, 2017

/shrug... I am primarily thinking about the OT->metrics use case where I'm assuming the tags are of interest. I do see your point that it's a tradeoff and that there are use cases for either.

@hkshaw1990 in terms of unblocking your own progress, I guess you could move the Observer interface into whatever Tracer you want to modify to depend on Observer. That way you wouldn't get blocked behind these sorts of tradeoffs where we lack information.

In parallel I'd love to see work on coupling OpenTracing and Prometheus (for example), and that would give us another caller to help refine the right interface. At the moment this feels like we're designing in a bit of a vacuum.

Member

In parallel I'd love to see work on coupling OpenTracing and Prometheus (for example), and that would give us another caller to help refine the right interface. At the moment this feels like we're designing in a bit of a vacuum.

@bhs do you mean to use Prometheus (P) just for metrics? In our shared library jaeger-lib we have an adapter from go-kit's metrics to the jaeger-client metrics API, specifically with the intention of allowing implementations like P to be pluggable. Thus my POC for metrics could be made to work with P fairly easily; I might spike it in the near future (what stopped me so far was the learning curve of setting up P). But specifically to your point of whether the Observer is a sufficient interface for that - yes, it is, since I already have it working with other metrics implementations (e.g. expvar, in the hotrod example).

Contributor

@yurishkuro sorry for the delay.

I meant to use Prometheus (p8s???) for metrics only, yes. That was just an example, though. go-kit's metrics facade is mostly fine, too...

I realize this PR may seem a bit stalled... My remaining concern is just that we don't know whether to do a per-SetTag (etc) observer callback, or a single callback at Finish time. I still lean towards the latter based on my understanding of the potential use-cases.

@tedsuo you have been thinking about prometheus stuff due to kubecon talks, etc... I wanted to cc you here since this is very relevant.

Member

Yep, let's get more opinions.

If we did the tags-on-finish approach, we'd need to force yet another implementation detail on tracers - the tags would have to be passed in either a map or an array of some structs. Our tracers internally keep tags in a list, not a map.
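For comparison, the two shapes being weighed read roughly like this; the interface names are illustrative only and not part of the PR.

package observersketch

import opentracing "github.com/opentracing/opentracing-go"

// Per-event callbacks: the tracer forwards each mutation as it happens and
// the observer keeps whatever state it cares about (the shape in this PR).
type incrementalSpanObserver interface {
	OnSetOperationName(operationName string)
	OnSetTag(key string, value interface{})
	OnFinish(options opentracing.FinishOptions)
}

// Tags-on-finish: the tracer accumulates the operation name and tags and
// delivers them once; the tag collection could be a map or a slice of
// key/value structs, depending on how the tracer stores tags internally.
type finishOnlySpanObserver interface {
	OnFinish(operationName string, tags map[string]interface{}, options opentracing.FinishOptions)
}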

@yurishkuro
Member

How do we proceed with this? In order to keep API compatibility, perhaps it would make sense to add other events like logs, so that the observer fully reflects the span life cycle.

@bhs regarding the discussion about getting tags one at a time or all at the end, perhaps those should simply be two different types of observers, and the tracer implementation may choose which one to support. Wdyt?

// which should call the registered (observer) callbacks.
type SpanObserver interface {
// Callback called from opentracing.Span.SetOperationName()
OnSetOperationName(operationName string)


Note that the operation name could be set multiple times throughout the life cycle of a span. You can only know the operation name for sure in OnFinish. In my Java-based approach for something similar, the operationName is a parameter of SpanEventListener#onFinish.

Author

@felixbarny hi, I am not too sure about making OnSetOperationName() mandatory for the observers, but @yurishkuro might be a better person to answer this, since jaeger-client-go's RPC metrics have a callback registered for this event. However, if it's just about logging/setting the operation name for a span, I do agree with you. But if an observer wants to do something critical based on the operation name, then this should probably remain as is?

@bhs
Contributor

bhs commented Jun 10, 2017

@yurishkuro

@bhs regarding the discussion about getting tags one at a time or all in the end, perhaps those should be simply two different types of observers, and tracer implementation may choose which one to support. Wdyt?

In Go that seems reasonable since interfaces don't even need to be part of the main OT package. I'm still skeptical about the interpretation of tags as a multimap (rather than "non-multi"-map, I mean), but we don't need to get into that here. :)

@hkshaw1990
Author

Moving this PR to opentracing-contrib/go-observer (opentracing-contrib/go-observer#1) and closing this.
