Duplicate events coming from lambda workers #140

cursedquail · 2024-03-01T16:31:30Z

Versions

Lambda runtime: al2023
Lambda extension: 11.2.3, arm64

Steps to reproduce

I have no steps to reproduce, only solid evidence that this is happening.

Invoke is the top-level span name for a Lambda invocation. While there may be many of them in a given trace, there should only be one for a given span :). If I zoom into any of these, they appear to be exactly the same event: same timestamp, columns, data, etc.

Our Lambda pipeline should be pretty standard. We're:

- configuring beeline to write events to stdout
- running the Lambda extension, which points directly at the Honeycomb API

Additional context

I suspect that what's happening here is the following:

1. The Logs API yeets some logs at the extension process.
2. The extension process receives them, turns them into events, and gives them to libhoney.
3. libhoney queues the events to be sent to Honeycomb and makes the HTTP request to the API.
4. Because libhoney is acting asynchronously to the logs HTTP server, the logs HTTP server returns its response without waiting for the send to finish.
5. Lambda freezes the extension while the events are in flight.
6. Lambda wakes up the extension some time later.
7. From the extension's POV, the upstream API just timed out (it has no way of knowing that it was frozen). So it retries!
8. Some of those retries work, and we get duplicate events.
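
To make the suspected sequence concrete, here is a minimal Go model of it. Everything in it is hypothetical (`fakeAPI`, `post`, `sendWithRetry`, and the `frozen` flag are illustration only, not the extension's or libhoney's real code). It only shows how a client-side timeout after a freeze, followed by a retry, could store the same event twice.

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout is what the sender observes when it wakes up after a freeze.
var errTimeout = errors.New("client-side timeout")

// fakeAPI is a hypothetical stand-in for the upstream events API.
// It records every event it receives, whether or not the sender
// ever gets to read the response.
type fakeAPI struct {
	received []string
}

// post delivers the batch to the server. When frozen is true, the sender is
// suspended before it reads the response, so it sees a timeout even though
// the server already stored the events.
func (a *fakeAPI) post(batch []string, frozen bool) error {
	a.received = append(a.received, batch...)
	if frozen {
		return errTimeout
	}
	return nil
}

// sendWithRetry resends the batch until an attempt succeeds or maxAttempts
// is reached. The first frozenAttempts attempts are interrupted by a freeze.
func sendWithRetry(a *fakeAPI, batch []string, frozenAttempts, maxAttempts int) error {
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		err = a.post(batch, attempt < frozenAttempts)
		if err == nil {
			return nil
		}
	}
	return err
}

func main() {
	api := &fakeAPI{}
	// One freeze during the first attempt, then a successful retry.
	if err := sendWithRetry(api, []string{"invoke-span"}, 1, 3); err != nil {
		fmt.Println("send failed:", err)
	}
	fmt.Println(len(api.received)) // prints 2: the same event was stored twice
}
```

In this model the duplicate appears only because the server accepted the first request before the sender was frozen; if the first request had never reached the server, the retry would be harmless.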

If I'm right, then I think what's needed is to synchronously flush the events after each "batch", thus ensuring libhoney does its work while the logs server is processing the event.
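
A rough sketch of that idea in Go, assuming the event client exposes some blocking flush. The `flusher` interface, `recordingClient`, and `handleBatch` are hypothetical names for illustration and do not reflect the extension's actual structure; in real code the flush would be whatever blocking call the client library provides.

```go
package main

import "fmt"

// flusher is a hypothetical view of an event client: Enqueue queues an event
// for sending, and Flush blocks until everything queued has been sent.
type flusher interface {
	Enqueue(event string)
	Flush()
}

// recordingClient is a fake client that tracks what is queued versus sent.
type recordingClient struct {
	queued []string
	sent   []string
}

func (c *recordingClient) Enqueue(e string) { c.queued = append(c.queued, e) }

// Flush moves everything from the queue to sent before returning.
func (c *recordingClient) Flush() {
	c.sent = append(c.sent, c.queued...)
	c.queued = nil
}

// handleBatch is what the logs HTTP handler would run before responding:
// queue every event in the batch, then flush synchronously so that nothing
// is left in flight once the handler returns.
func handleBatch(c flusher, logs []string) {
	for _, l := range logs {
		c.Enqueue(l)
	}
	c.Flush()
}

func main() {
	c := &recordingClient{}
	handleBatch(c, []string{"event-a", "event-b"})
	fmt.Println(len(c.queued), len(c.sent)) // prints "0 2"
}
```

The trade-off is that each batch now waits on the upstream request, so the logs handler responds more slowly, in exchange for having no sends outstanding when the handler returns.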


JamieDanielson · 2024-03-07T16:53:57Z

ℹ️ FYI for oncall folks: there's a bit of additional context in the internal Slack channel.

cursedquail added the type: bug Something isn't working label Mar 1, 2024

cursedquail mentioned this issue Mar 1, 2024

Not receiving shutdown events #141

Open

robbkidd added the status: oncall Flagged for awareness from Honeycomb Telemetry Oncall label Mar 1, 2024

MikeGoldsmith removed the status: oncall Flagged for awareness from Honeycomb Telemetry Oncall label Mar 27, 2024
