
Feature Request: support parallel http requests #97

Open
LukeWinikates opened this issue Aug 19, 2022 · 4 comments

Comments

@LukeWinikates
Contributor

Hi Friends,

My team uses wavefront-sdk-go via the telegraf output. We found that when sending metrics over http through the telegraf output, we are limited to a single http request at a time. We knew the Wavefront Proxy could handle much more throughput than this, so we worked around the limit by creating multiple instances of the wavefront output in our telegraf config, which parallelizes the http requests. This works, but it complicates the telegraf configuration and adds a modest overhead that we would like to avoid.

I think it would be interesting for wavefront-sdk-go to support a worker pool model directly. We could default the pool size to 1, but allow parallelization up to a reasonable threshold.

I imagine we would add something like a MaxParallelHttpRequests field to the configuration struct so users can set this value, and then add a reasonable implementation of channels and goroutines to reporter.go to make the parallelization happen.

I wanted to gauge what other contributors to the sdk would think about this feature. Thinking of @keep94 and @oppegard, and I wanted to get @jbau's take as well if he has one.

Thanks in advance for your time and attention 🙇‍♂️🙌💻

@keep94
Contributor

keep94 commented Aug 19, 2022

Luke, your idea of parallel http requests sounds interesting. I just wonder how it will work with the Flush() function of a sender object. When you create a sender, you can effectively turn off autoflushing, send a bunch of metrics and spans (which get buffered), and then flush manually. Flush() returns an error indicating whether or not everything buffered up was sent successfully. This use of the API implies a single goroutine using the sender, so I am wondering how a pool of http connections would work. Are you possibly thinking of rewriting Flush() to use multiple http connections to do the flushing?

Regarding releasing a v1, I feel that there are several artifacts in the public API that shouldn't be exposed. For instance, I think the functions that convert metrics to strings to be sent over the wire could be made private.

Back to the parallel http requests: if Flush() were rewritten to use multiple http connections, it would still have to block until the last http connection finished sending its portion of metrics/traces, because by contract Flush() has to return an error indicating whether or not everything was successfully sent.

@LukeWinikates
Contributor Author

I'll see if I can put together a proof of concept for this soon. I'd like to make sure it doesn't introduce any breaking changes, so that it can be an added feature for a v1.x release.

@oppegard
Contributor

oppegard commented Sep 14, 2022

@LukeWinikates: an HTTP thread/goroutine pool to allow parallel requests would be great. Please work with @keep94 on the specifics.

@LukeWinikates
Contributor Author

I've been looking at the mutex and atomic integer usage in the SDK. As currently written, I don't believe parallel requests can happen, but the Flush method already uses a mutex. I think this is partly because parallel request support was a goal at one point, and partly a proactive defense to keep SDK users from getting into a weird state if they call Flush from multiple goroutines.

I'm still interested in attempting a WaitGroup-based implementation, and I wonder whether we could reduce or eliminate the use of sync/atomic and sync.Mutex in favor of channels and goroutines for synchronizing state.
