Exporter stops sending any spans to collector after timed out export #4406
Labels
bug
Something isn't working
needs:author-response
waiting for author to respond
priority:p2
Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect
Comments
pichlermarc added the priority:p2 label and removed the triage label on Jan 10, 2024
Hi @aelmekeev. Do you have any news about this case? Thanks
@dgoscn I'll try to give it a test next week, thank you!
@aelmekeev I think this has been fixed by #4287, could you re-try with the latest version? 🙂
@pichlermarc I just tested this and I believe the issue is resolved, hence I'm closing it. For context, this is what I've done:
With "@opentelemetry/api": "^1.7.0",
"@opentelemetry/core": "^1.18.1",
"@opentelemetry/exporter-metrics-otlp-http": "^0.45.1",
"@opentelemetry/exporter-trace-otlp-http": "^0.45.1",
"@opentelemetry/host-metrics": "^0.34.0",
"@opentelemetry/instrumentation": "^0.45.1",
"@opentelemetry/instrumentation-express": "^0.33.3",
"@opentelemetry/instrumentation-graphql": "^0.36.0",
"@opentelemetry/instrumentation-http": "^0.45.1",
"@opentelemetry/resources": "^1.18.1",
"@opentelemetry/sdk-metrics": "^1.18.1",
"@opentelemetry/sdk-trace-base": "^1.18.1",
"@opentelemetry/sdk-trace-node": "^1.18.1",
"@opentelemetry/semantic-conventions": "^1.18.1"
With "@opentelemetry/api": "^1.7.0",
"@opentelemetry/core": "^1.21.0",
"@opentelemetry/exporter-metrics-otlp-http": "^0.48.0",
"@opentelemetry/exporter-trace-otlp-http": "^0.48.0",
"@opentelemetry/host-metrics": "^0.35.0",
"@opentelemetry/instrumentation": "^0.48.0",
"@opentelemetry/instrumentation-express": "^0.35.0",
"@opentelemetry/instrumentation-graphql": "^0.37.0",
"@opentelemetry/instrumentation-http": "^0.48.0",
"@opentelemetry/resources": "^1.21.0",
"@opentelemetry/sdk-metrics": "^1.21.0",
"@opentelemetry/sdk-trace-base": "^1.21.0",
"@opentelemetry/sdk-trace-node": "^1.21.0",
"@opentelemetry/semantic-conventions": "^1.21.0"
Glad this solved your problem!
…On Fri, 23 Feb 2024, 15:27 Alex Elmekeev, ***@***.***> wrote:
@pichlermarc <https://github.com/pichlermarc> I just tested this and I
believe the issue is resolved, hence I'm closing it. For context, this is
what I've done:
1. diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.ALL)
2. Set OTEL_BSP_MAX_EXPORT_BATCH_SIZE and OTEL_BSP_MAX_QUEUE_SIZE to
10 locally
3. Start the app and poke it so it will send traces to collector
4. Break the connection between the app and the collector (stop port
forwarding in my case)
5. Poke the app again to observe no traces received by collector
6. Fix the connection between the app and the collector
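Step 2 shrinks both limits to 10 so that the buffer fills after only a handful of requests. As a rough illustration of why that makes the problem quick to observe, the sketch below models a bounded span queue that drops new spans once it is full, which is the behaviour the OpenTelemetry specification describes for the batch processor's maximum queue size. `BoundedSpanQueue` and its methods are hypothetical names made up for this example; this is not the SDK's implementation.

```typescript
// Hypothetical, simplified stand-in for a batch span processor's buffer.
// Names and structure are assumptions for illustration only.
class BoundedSpanQueue {
  private readonly items: string[] = [];
  private readonly maxQueueSize: number;
  private readonly maxExportBatchSize: number;
  private dropped = 0;

  constructor(maxQueueSize: number, maxExportBatchSize: number) {
    this.maxQueueSize = maxQueueSize;
    this.maxExportBatchSize = maxExportBatchSize;
  }

  // Returns false when the span is dropped because the queue is already full.
  enqueue(spanName: string): boolean {
    if (this.items.length >= this.maxQueueSize) {
      this.dropped += 1;
      return false;
    }
    this.items.push(spanName);
    return true;
  }

  // Removes and returns up to maxExportBatchSize spans for one export call.
  takeBatch(): string[] {
    return this.items.splice(0, this.maxExportBatchSize);
  }

  get size(): number {
    return this.items.length;
  }

  get droppedCount(): number {
    return this.dropped;
  }
}

// Mirrors step 2: both limits set to 10.
const queue = new BoundedSpanQueue(10, 10);
for (let i = 0; i < 15; i++) {
  queue.enqueue(`span-${i}`);
}
console.log(queue.size, queue.droppedCount); // 10 5
```

If nothing ever drains the queue (for example because exports have stopped), every span after the tenth is dropped in this model, so the collector sees no new traces almost immediately.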
With
"@opentelemetry/api": "^1.7.0",
"@opentelemetry/core": "^1.18.1",
"@opentelemetry/exporter-metrics-otlp-http": "^0.45.1",
"@opentelemetry/exporter-trace-otlp-http": "^0.45.1",
"@opentelemetry/host-metrics": "^0.34.0",
"@opentelemetry/instrumentation": "^0.45.1",
"@opentelemetry/instrumentation-express": "^0.33.3",
"@opentelemetry/instrumentation-graphql": "^0.36.0",
"@opentelemetry/instrumentation-http": "^0.45.1",
"@opentelemetry/resources": "^1.18.1",
"@opentelemetry/sdk-metrics": "^1.18.1",
"@opentelemetry/sdk-trace-base": "^1.18.1",
"@opentelemetry/sdk-trace-node": "^1.18.1",
"@opentelemetry/semantic-conventions": "^1.18.1"
1. No errors related to traces.
2. App is not connecting back to collector.
With
"@opentelemetry/api": "^1.7.0",
"@opentelemetry/core": "^1.21.0",
"@opentelemetry/exporter-metrics-otlp-http": "^0.48.0",
"@opentelemetry/exporter-trace-otlp-http": "^0.48.0",
"@opentelemetry/host-metrics": "^0.35.0",
"@opentelemetry/instrumentation": "^0.48.0",
"@opentelemetry/instrumentation-express": "^0.35.0",
"@opentelemetry/instrumentation-graphql": "^0.37.0",
"@opentelemetry/instrumentation-http": "^0.48.0",
"@opentelemetry/resources": "^1.21.0",
"@opentelemetry/sdk-metrics": "^1.21.0",
"@opentelemetry/sdk-trace-base": "^1.21.0",
"@opentelemetry/sdk-trace-node": "^1.21.0",
"@opentelemetry/semantic-conventions": "^1.21.0"
1. Observe error while the app has no connection to the collector:
[serve] {"stack":"Error: connect ECONNREFUSED 127.0.0.1:8080\n at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1159:16)\n at TCPConnectWrap.callbackTrampoline (internal/async_hooks.js:130:17)","message":"connect ECONNREFUSED 127.0.0.1:8080","errno":"-61","code":"ECONNREFUSED","syscall":"connect","address":"127.0.0.1","port":"8080","name":"Error"}
2. App is able to connect back to collector!
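The two runs differ in a telling way: the older versions stay silent and never recover, while the newer ones log the connection error and then resume. One hypothetical mechanism that would produce exactly those symptoms is an "export in flight" flag that is cleared only on the success path. The sketch below is an assumption made for illustration and is not the SDK's code; `FlagOnSuccessOnly`, `FlagInFinally`, `exportBatch` and `Send` are invented names. It also shows how cleanup in `finally` can matter even though either `then` or `catch` always runs for a settled promise.

```typescript
// Hypothetical sketch only; not taken from the OpenTelemetry SDK.
// `Send` stands in for a single exporter call.
type Send = () => Promise<void>;

// Clears the flag only when the export succeeds; failures are swallowed.
class FlagOnSuccessOnly {
  isExporting = false;
  attempts = 0;

  exportBatch(send: Send): Promise<void> {
    if (this.isExporting) {
      return Promise.resolve(); // skipped: an export is believed to be in flight
    }
    this.isExporting = true;
    this.attempts += 1;
    return send()
      .then(() => {
        this.isExporting = false;
      })
      .catch(() => {
        // Error swallowed and flag left set, so later calls are skipped silently.
      });
  }
}

// Reports the error and clears the flag on every path.
class FlagInFinally {
  isExporting = false;
  attempts = 0;
  errors: string[] = [];

  async exportBatch(send: Send): Promise<void> {
    if (this.isExporting) {
      return;
    }
    this.isExporting = true;
    this.attempts += 1;
    try {
      await send();
    } catch (err) {
      this.errors.push(String(err)); // the failure becomes visible
    } finally {
      this.isExporting = false; // runs after success and after failure
    }
  }
}

async function demo(): Promise<{ stuck: number; recovered: number }> {
  const failing: Send = () => Promise.reject(new Error("connect ECONNREFUSED"));
  const working: Send = () => Promise.resolve();

  const a = new FlagOnSuccessOnly();
  await a.exportBatch(failing); // connection broken
  await a.exportBatch(working); // connection fixed, but the call is skipped

  const b = new FlagInFinally();
  await b.exportBatch(failing); // connection broken, error recorded
  await b.exportBatch(working); // connection fixed, export runs again

  return { stuck: a.attempts, recovered: b.attempts };
}

demo().then((result) => console.log(result)); // stuck is 1, recovered is 2
```

Whether the actual fix works this way is not established in this thread; the sketch only shows that a stuck flag is enough to explain a silent, permanent stop.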
What happened?
Steps to Reproduce
We have noticed that sometimes, after the collector fails to respond in time, the application silently stops sending any traces to the collector until it is restarted.
I think this might be related to the issue introduced in #3958 that @Zirak has tried to address as part of #4287. Raising this to get some visibility to the maintainers.
Expected Result
library should not stop sending spans to collector
Actual Result
library stops sending spans to collector
Additional details
With our settings for queue and batch size, this seems to happen relatively often (4 times in the last 5 days) during peak traffic.
Although I must admit that I don't understand why adding `finally` can fix this, as in my understanding either `then` or `catch` would always be triggered for a `Promise`, but bumping `OTEL_BSP_MAX_QUEUE_SIZE` did help with this issue.
OpenTelemetry Setup Code
package.json
Relevant log output