
Switch to OpenTelemetry tracing library #1824

Merged 3 commits into master on Jul 21, 2022
Conversation

@SaveTheRbtz (Contributor) commented Jul 18, 2022

Issue: onflow/flow-go#2796

Description

This diff replaces the deprecated OpenTracing-based library with the standardized OpenTelemetry API. This is also needed because we are migrating Flow production tracing to the OTEL Collector, which prefers native OTEL formats and libraries.

OpenTelemetry says the following in their docs:

There is no hard requirement to migrate existing application code to OpenTelemetry. However, we do recommend migrating from any OpenTracing instrumentation libraries – libraries used to instrument web frameworks, HTTP clients, database clients, etc. – to their OpenTelemetry equivalents. This will improve support, as many OpenTracing libraries will be retired and may no longer be updated.

PS. While here, I regenerated go.cap, though I have no idea what I'm doing there.
PPS. This commit will break flow-go interfaces, so it needs to be coupled with a similar diff there. (Not sure if the "Breaking change" label should be used for this, or if it is only for user-visible changes.)


  • Targeted PR against master branch
  • Linked to GitHub issue with discussion and accepted design OR link to spec that describes this work
  • Code follows the standards mentioned here
  • Updated relevant documentation
  • Re-reviewed Files changed in the GitHub PR explorer
  • Added appropriate labels

@codecov bot commented Jul 18, 2022

Codecov Report

Merging #1824 (dc493e3) into master (226a46f) will decrease coverage by 0.00%.
The diff coverage is 86.20%.

@@            Coverage Diff             @@
##           master    #1824      +/-   ##
==========================================
- Coverage   77.70%   77.70%   -0.01%     
==========================================
  Files         294      294              
  Lines       62210    62200      -10     
==========================================
- Hits        48340    48330      -10     
  Misses      12147    12147              
  Partials     1723     1723              
Flag       Coverage Δ
unittests  77.70% <86.20%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
runtime/interpreter/interpreter.go 89.51% <ø> (ø)
runtime/runtime.go 86.17% <0.00%> (ø)
runtime/interpreter/interpreter_tracing.go 88.67% <89.28%> (-1.80%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 226a46f...dc493e3.

@github-actions bot commented Jul 18, 2022

Cadence Benchstat comparison

This branch was compared with the base branch onflow:master commit 226a46f.
The command `for i in {1..N}; do go test ./... -run=XXX -bench=. -benchmem -shuffle=on; done` was used.
Bench tests were run a total of 7 times on each branch.

Collapsed results for better readability

name                                              old time/op    new time/op    delta
CheckContractInterfaceFungibleTokenConformance-2    198µs ± 3%     193µs ± 3%     ~     (p=0.093 n=6+6)
ContractInterfaceFungibleToken-2                   55.3µs ± 7%    54.0µs ± 4%     ~     (p=0.383 n=7+7)
InterpretRecursionFib-2                            4.67ms ± 7%    4.47ms ± 3%     ~     (p=0.073 n=7+6)
NewInterpreter/new_interpreter-2                   1.48µs ± 3%    1.48µs ± 2%     ~     (p=0.700 n=7+6)
NewInterpreter/new_sub-interpreter-2               2.94µs ± 3%    3.24µs ± 4%  +10.09%  (p=0.001 n=7+7)
ParseArray-2                                       11.4ms ± 7%    11.3ms ± 5%     ~     (p=1.000 n=7+7)
ParseDeploy/byte_array-2                           16.7ms ± 3%    16.8ms ± 3%     ~     (p=0.445 n=7+6)
ParseDeploy/decode_hex-2                           1.50ms ± 2%    1.54ms ± 4%     ~     (p=0.234 n=6+7)
ParseFungibleToken/With_memory_metering-2           280µs ± 5%     280µs ± 6%     ~     (p=1.000 n=7+7)
ParseFungibleToken/Without_memory_metering-2        214µs ± 1%     215µs ± 3%     ~     (p=1.000 n=5+6)
ParseInfix-2                                       9.64µs ± 2%    9.63µs ± 4%     ~     (p=0.485 n=6+6)
QualifiedIdentifierCreation/One_level-2            3.59ns ± 3%    3.57ns ± 3%     ~     (p=0.620 n=7+7)
QualifiedIdentifierCreation/Three_levels-2          195ns ± 3%     193ns ± 2%     ~     (p=0.559 n=7+7)
RuntimeFungibleTokenTransfer-2                     1.50ms ±25%    1.66ms ±18%     ~     (p=0.259 n=7+7)
RuntimeResourceDictionaryValues-2                  8.67ms ± 2%    8.72ms ± 7%     ~     (p=0.534 n=6+7)
ValueIsSubtypeOfSemaType-2                          119ns ± 1%     121ns ± 2%     ~     (p=0.223 n=6+6)

name                                              old alloc/op   new alloc/op   delta
CheckContractInterfaceFungibleTokenConformance-2   63.6kB ± 0%    63.6kB ± 0%     ~     (all equal)
ContractInterfaceFungibleToken-2                   25.1kB ± 0%    25.1kB ± 0%     ~     (all equal)
InterpretRecursionFib-2                            1.51MB ± 0%    1.51MB ± 0%     ~     (p=0.991 n=7+7)
NewInterpreter/new_interpreter-2                     928B ± 0%      928B ± 0%     ~     (all equal)
NewInterpreter/new_sub-interpreter-2               1.46kB ± 0%    1.46kB ± 0%     ~     (all equal)
ParseArray-2                                       2.82MB ± 2%    2.87MB ± 0%     ~     (p=0.138 n=7+6)
ParseDeploy/byte_array-2                           4.40MB ± 4%    4.41MB ± 5%     ~     (p=0.456 n=7+7)
ParseDeploy/decode_hex-2                            214kB ± 0%     214kB ± 0%     ~     (p=0.140 n=7+7)
ParseFungibleToken/With_memory_metering-2          36.3kB ± 0%    36.3kB ± 0%     ~     (p=0.066 n=7+7)
ParseFungibleToken/Without_memory_metering-2       36.3kB ± 0%    36.3kB ± 0%     ~     (p=0.761 n=7+7)
ParseInfix-2                                       2.17kB ± 0%    2.17kB ± 0%     ~     (p=0.650 n=7+7)
QualifiedIdentifierCreation/One_level-2             0.00B          0.00B          ~     (all equal)
QualifiedIdentifierCreation/Three_levels-2          64.0B ± 0%     64.0B ± 0%     ~     (all equal)
RuntimeFungibleTokenTransfer-2                      228kB ± 0%     229kB ± 1%     ~     (p=0.051 n=6+7)
RuntimeResourceDictionaryValues-2                  2.25MB ± 0%    2.25MB ± 0%     ~     (p=0.902 n=7+7)
ValueIsSubtypeOfSemaType-2                          48.0B ± 0%     48.0B ± 0%     ~     (all equal)

name                                              old allocs/op  new allocs/op  delta
CheckContractInterfaceFungibleTokenConformance-2    1.05k ± 0%     1.05k ± 0%     ~     (all equal)
ContractInterfaceFungibleToken-2                      441 ± 0%       441 ± 0%     ~     (all equal)
InterpretRecursionFib-2                             33.5k ± 0%     33.5k ± 0%     ~     (all equal)
NewInterpreter/new_interpreter-2                     13.0 ± 0%      13.0 ± 0%     ~     (all equal)
NewInterpreter/new_sub-interpreter-2                 42.0 ± 0%      42.0 ± 0%     ~     (all equal)
ParseArray-2                                        70.0k ± 0%     70.0k ± 0%     ~     (p=0.592 n=7+7)
ParseDeploy/byte_array-2                             105k ± 0%      105k ± 0%     ~     (p=0.592 n=7+7)
ParseDeploy/decode_hex-2                             77.0 ± 0%      77.0 ± 0%     ~     (all equal)
ParseFungibleToken/With_memory_metering-2           1.06k ± 0%     1.06k ± 0%     ~     (all equal)
ParseFungibleToken/Without_memory_metering-2        1.06k ± 0%     1.06k ± 0%     ~     (all equal)
ParseInfix-2                                         66.0 ± 0%      66.0 ± 0%     ~     (all equal)
QualifiedIdentifierCreation/One_level-2              0.00           0.00          ~     (all equal)
QualifiedIdentifierCreation/Three_levels-2           2.00 ± 0%      2.00 ± 0%     ~     (all equal)
RuntimeFungibleTokenTransfer-2                      4.16k ± 0%     4.16k ± 0%     ~     (p=0.396 n=7+7)
RuntimeResourceDictionaryValues-2                   37.5k ± 0%     37.5k ± 0%     ~     (p=1.000 n=7+7)
ValueIsSubtypeOfSemaType-2                           1.00 ± 0%      1.00 ± 0%     ~     (all equal)

@SaveTheRbtz SaveTheRbtz added the Language Breaking Change Breaks Cadence contracts deployed on Mainnet label Jul 18, 2022
@turbolent (Member) left a comment

I'm not familiar with tracing, which was added by Ramtin.

@ramtinms Could you please have a look here?

@turbolent (Member)

@SaveTheRbtz Could you please benchmark locally and see if this change has a significant impact?

@simonhf commented Jul 18, 2022

> @SaveTheRbtz Could you please benchmark locally and see if this change has a significant impact?

@turbolent FYI I think there are two schools of thought about this:

  1. It doesn't matter how fast or slow OpenTracing is, because we can just tweak the sampling rate appropriately. E.g. say the new API is twice as slow; we'd just halve the sampling rate from every 16th TX to every 32nd TX... no harm done?
  2. Use a faster way to collect the actual timing info (the part which is speed critical), and then do all the other processing asynchronously (the part where it doesn't matter how long it takes because we have lots of spare CPU).

I did some timing of the old Jaeger instrumentation here [1] and discovered that, surprisingly, it's even slower than using zerolog just for the stats-collection part, which presumably doesn't actually send any data off to the Jaeger server.

FYI I also suggest using a different form of instrumentation which only records the minimum info -- e.g. timestamp and context -- during the time-critical code, into a simple memory buffer. Then, at the end of the time-critical code, the recorded info can be processed asynchronously and even sent to multiple places concurrently, e.g. zlog, Prometheus, and Jaeger. I show a working PoC on dummy code here [1] too. This method is the absolute fastest way to record the minimum info, and in tests it only spends about 50 nanoseconds per record event.

This means that if we wanted to, we could record everything without sampling and it would still be much faster than sampling with Jaeger. Why? In tests, just recording in the time-critical code is about 50 times faster than logging, which is already faster than Jaeger. Therefore, if Jaeger is sampling every 16th TX, because this other technique is 50 times faster, it's possible to sample all TXs and still be faster. This also means that we can add almost arbitrary extra timing instrumentation, still sample all TXs, and still be faster...

[1] https://www.notion.so/dapperlabs/Examine-instrumentation-timing-noodle-on-best-design-9abf3b0ce0c34b7fac6e4dc0c4f67603#e190789d2e33497aaa395f34da11712b
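The buffer-and-defer pattern simonhf describes can be sketched in plain Go with only the standard library. All names here are illustrative, not taken from the linked PoC: the hot path does nothing but append a timestamp and a short label to a preallocated slice; formatting and fan-out happen afterwards.

```go
package main

import (
	"fmt"
	"time"
)

// record is the minimal per-event payload captured on the hot path:
// a timestamp plus a small context label.
type record struct {
	when time.Time
	name string
}

// tracer appends records to a preallocated in-memory buffer; no
// locking, formatting, or I/O happens inside the timed region.
type tracer struct {
	buf []record
}

func newTracer(capacity int) *tracer {
	return &tracer{buf: make([]record, 0, capacity)}
}

// mark is the only call made in time-critical code.
func (t *tracer) mark(name string) {
	t.buf = append(t.buf, record{when: time.Now(), name: name})
}

// flush runs after the critical section; this is where records could be
// fanned out to zerolog, Prometheus, Jaeger, etc. Here it just prints
// the elapsed time between consecutive marks and resets the buffer.
func (t *tracer) flush() {
	for i := 1; i < len(t.buf); i++ {
		fmt.Printf("%s -> %s: %v\n",
			t.buf[i-1].name, t.buf[i].name, t.buf[i].when.Sub(t.buf[i-1].when))
	}
	t.buf = t.buf[:0]
}

func main() {
	tr := newTracer(1024)
	tr.mark("start")
	tr.mark("parsed")
	tr.mark("executed")
	tr.flush()
}
```

Because `mark` only does a slice append plus one `time.Now()` call, its cost is dominated by the clock read, which is consistent with the tens-of-nanoseconds figure quoted above.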

@SaveTheRbtz (Contributor, Author) commented Jul 19, 2022

@turbolent On the Cadence side, the perf impact is negligible, since tracing only operates on (already escaping) data structures. The newer version of the code may even be slightly faster, given that it does not have a separate time.Now() call. That said, let me run Cadence-specific tests just in case.

PS. Flow tracing microbenchmarks are here: onflow/flow-go#2806 (I'm missing a second reviewer there =))
PPS. I'm of the first school of thought: IMHO, if we get into nanosecond territory, we should just use pprof (incl. trace). For everything else we can always tweak the sampling probability. (Tests are there to check that we do not block and maintain a sane number of allocations.)

It is quite hard to get a clean run on my laptop, and even on a Linux server (likely due to Turbo Boost), but at least it shows that allocations did not change.
name                                              old time/op    new time/op    delta
pkg:github.com/onflow/cadence/runtime goos:darwin goarch:arm64
RuntimeFungibleTokenTransfer-8                       348µs ± 2%     348µs ± 4%    ~     (p=0.541 n=8+9)
RuntimeResourceDictionaryValues-8                   2.35ms ± 3%    2.37ms ± 5%    ~     (p=0.631 n=10+10)
pkg:github.com/onflow/cadence/runtime/interpreter goos:darwin goarch:arm64
ValueIsSubtypeOfSemaType-8                          41.6ns ± 1%    41.5ns ± 2%    ~     (p=0.305 n=9+10)
pkg:github.com/onflow/cadence/runtime/parser goos:darwin goarch:arm64
ParseDeploy/byte_array-8                            6.84ms ± 1%    6.90ms ± 1%  +0.95%  (p=0.000 n=10+9)
ParseDeploy/decode_hex-8                             597µs ± 2%     600µs ± 1%    ~     (p=0.113 n=10+9)
ParseInfix-8                                        4.00µs ± 1%    4.01µs ± 1%    ~     (p=0.085 n=10+10)
ParseFungibleToken/Without_memory_metering-8        89.7µs ± 1%    90.3µs ± 1%    ~     (p=0.065 n=10+9)
ParseFungibleToken/With_memory_metering-8            109µs ± 2%     114µs ± 3%  +3.95%  (p=0.000 n=10+10)
ParseArray-8                                        4.62ms ± 1%    4.73ms ± 2%  +2.43%  (p=0.000 n=9+10)
pkg:github.com/onflow/cadence/runtime/sema goos:darwin goarch:arm64
QualifiedIdentifierCreation/One_level-8             2.06ns ± 1%    2.04ns ± 1%  -0.81%  (p=0.001 n=9+9)
QualifiedIdentifierCreation/Three_levels-8          63.2ns ± 2%    62.3ns ± 0%  -1.49%  (p=0.000 n=9+9)
pkg:github.com/onflow/cadence/runtime/tests/checker goos:darwin goarch:arm64
ContractInterfaceFungibleToken-8                    21.0µs ± 1%    20.9µs ± 0%  -0.78%  (p=0.001 n=9+10)
CheckContractInterfaceFungibleTokenConformance-8    65.7µs ± 1%    64.8µs ± 3%  -1.49%  (p=0.019 n=9+9)
pkg:github.com/onflow/cadence/runtime/tests/interpreter goos:darwin goarch:arm64
NewInterpreter/new_interpreter-8                     524ns ± 2%     521ns ± 1%    ~     (p=0.055 n=10+10)
NewInterpreter/new_sub-interpreter-8                1.15µs ± 3%    1.13µs ± 1%  -1.44%  (p=0.027 n=10+10)
InterpretRecursionFib-8                             1.64ms ± 3%    1.61ms ± 2%  -2.35%  (p=0.006 n=10+9)

name                                              old alloc/op   new alloc/op   delta
pkg:github.com/onflow/cadence/runtime goos:darwin goarch:arm64
RuntimeFungibleTokenTransfer-8                       233kB ± 1%     233kB ± 1%    ~     (p=0.549 n=10+9)
RuntimeResourceDictionaryValues-8                   2.23MB ± 0%    2.23MB ± 0%    ~     (p=0.497 n=10+9)
pkg:github.com/onflow/cadence/runtime/interpreter goos:darwin goarch:arm64
ValueIsSubtypeOfSemaType-8                           48.0B ± 0%     48.0B ± 0%    ~     (all equal)
pkg:github.com/onflow/cadence/runtime/parser goos:darwin goarch:arm64
ParseDeploy/byte_array-8                            4.38MB ± 3%    4.38MB ± 1%    ~     (p=0.853 n=10+10)
ParseDeploy/decode_hex-8                             214kB ± 0%     214kB ± 0%    ~     (p=0.438 n=10+10)
ParseInfix-8                                        2.17kB ± 0%    2.17kB ± 0%    ~     (p=1.000 n=10+10)
ParseFungibleToken/Without_memory_metering-8        36.3kB ± 0%    36.3kB ± 0%    ~     (p=0.641 n=10+10)
ParseFungibleToken/With_memory_metering-8           36.3kB ± 0%    36.3kB ± 0%    ~     (p=0.126 n=8+10)
ParseArray-8                                        2.78MB ± 1%    2.78MB ± 1%    ~     (p=0.853 n=10+10)
pkg:github.com/onflow/cadence/runtime/sema goos:darwin goarch:arm64
QualifiedIdentifierCreation/One_level-8              0.00B          0.00B         ~     (all equal)
QualifiedIdentifierCreation/Three_levels-8           64.0B ± 0%     64.0B ± 0%    ~     (all equal)
pkg:github.com/onflow/cadence/runtime/tests/checker goos:darwin goarch:arm64
ContractInterfaceFungibleToken-8                    25.1kB ± 0%    25.1kB ± 0%  -0.00%  (p=0.003 n=10+8)
CheckContractInterfaceFungibleTokenConformance-8    63.6kB ± 0%    63.6kB ± 0%  -0.00%  (p=0.029 n=10+7)
pkg:github.com/onflow/cadence/runtime/tests/interpreter goos:darwin goarch:arm64
NewInterpreter/new_interpreter-8                      928B ± 0%      928B ± 0%    ~     (all equal)
NewInterpreter/new_sub-interpreter-8                1.46kB ± 0%    1.46kB ± 0%    ~     (all equal)
InterpretRecursionFib-8                             1.51MB ± 0%    1.51MB ± 0%    ~     (p=0.598 n=10+10)

name                                              old allocs/op  new allocs/op  delta
pkg:github.com/onflow/cadence/runtime goos:darwin goarch:arm64
RuntimeFungibleTokenTransfer-8                       4.16k ± 0%     4.16k ± 0%    ~     (all equal)
RuntimeResourceDictionaryValues-8                    37.4k ± 0%     37.4k ± 0%    ~     (all equal)
pkg:github.com/onflow/cadence/runtime/interpreter goos:darwin goarch:arm64
ValueIsSubtypeOfSemaType-8                            1.00 ± 0%      1.00 ± 0%    ~     (all equal)
pkg:github.com/onflow/cadence/runtime/parser goos:darwin goarch:arm64
ParseDeploy/byte_array-8                              105k ± 0%      105k ± 0%    ~     (p=0.671 n=10+10)
ParseDeploy/decode_hex-8                              77.0 ± 0%      77.0 ± 0%    ~     (all equal)
ParseInfix-8                                          66.0 ± 0%      66.0 ± 0%    ~     (all equal)
ParseFungibleToken/Without_memory_metering-8         1.06k ± 0%     1.06k ± 0%    ~     (all equal)
ParseFungibleToken/With_memory_metering-8            1.06k ± 0%     1.06k ± 0%    ~     (all equal)
ParseArray-8                                         70.0k ± 0%     70.0k ± 0%    ~     (p=0.838 n=10+10)
pkg:github.com/onflow/cadence/runtime/sema goos:darwin goarch:arm64
QualifiedIdentifierCreation/One_level-8               0.00           0.00         ~     (all equal)
QualifiedIdentifierCreation/Three_levels-8            2.00 ± 0%      2.00 ± 0%    ~     (all equal)
pkg:github.com/onflow/cadence/runtime/tests/checker goos:darwin goarch:arm64
ContractInterfaceFungibleToken-8                       441 ± 0%       441 ± 0%    ~     (all equal)
CheckContractInterfaceFungibleTokenConformance-8     1.05k ± 0%     1.05k ± 0%    ~     (all equal)
pkg:github.com/onflow/cadence/runtime/tests/interpreter goos:darwin goarch:arm64
NewInterpreter/new_interpreter-8                      13.0 ± 0%      13.0 ± 0%    ~     (all equal)
NewInterpreter/new_sub-interpreter-8                  42.0 ± 0%      42.0 ± 0%    ~     (all equal)
InterpretRecursionFib-8                              33.5k ± 0%     33.5k ± 0%    ~     (all equal)

@turbolent (Member)

@simonhf @SaveTheRbtz Thank you for the info. Looks good!

@@ -1,10 +1,8 @@
github.com/onflow/cadence ()

github.com/davecgh/go-spew/spew (file)
github.com/onflow/cadence/runtime/interpreter ()
Contributor: is this expected?

Member: I guess no entry is equivalent to `()`? @fxamacker do you maybe know?

@ramtinms (Contributor) left a comment

Looks great to me. Thanks for using attributes instead of events (aka logs) here; it makes more sense.

@turbolent turbolent merged commit 807f1df into master Jul 21, 2022
@turbolent turbolent deleted the rbtz/otel0 branch July 21, 2022 22:18
bors bot added a commit to onflow/flow-go that referenced this pull request Jul 26, 2022
2823: [All] OpenTracing to OpenTelemetry migration r=SaveTheRbtz a=SaveTheRbtz

Issue: #2796

OpenTracing spec is deprecated: opentracing/specification#163.  Switching to the supported OpenTelemetry implementation.

Note that the new tracing code honors [OTEL env vars](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.12.0/specification/protocol/exporter.md) instead of the JAEGER ones, e.g.:
`JAEGER_ENDPOINT` -> `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT`, `OTEL_EXPORTER_OTLP_TRACES_INSECURE`, etc.

(cc: `@sjonpaulbrown` `@haroldsphinx`)

The new API is slightly slower and consumes more memory, but makes substantially fewer allocations (~ -50%):
```
name                              old time/op    new time/op    delta
StartSpanFromParent-8                485ns ± 6%     639ns ± 1%  +31.71%  (p=0.000 n=10+9)
StartTransactionSpan/cacheHit-8      993ns ± 4%     927ns ± 3%   -6.69%  (p=0.000 n=10+10)
StartTransactionSpan/cacheMiss-8    2.24µs ± 1%    2.19µs ± 1%   -2.42%  (p=0.000 n=10+10)

name                              old alloc/op   new alloc/op   delta
StartSpanFromParent-8                 577B ± 0%      864B ± 0%  +49.74%  (p=0.000 n=10+10)
StartTransactionSpan/cacheHit-8       776B ± 1%      944B ± 0%  +21.65%  (p=0.000 n=10+8)
StartTransactionSpan/cacheMiss-8    2.16kB ± 1%    2.39kB ± 0%  +10.62%  (p=0.000 n=10+10)

name                              old allocs/op  new allocs/op  delta
StartSpanFromParent-8                 7.00 ± 0%      4.00 ± 0%  -42.86%  (p=0.000 n=10+10)
StartTransactionSpan/cacheHit-8       10.3 ± 7%       6.0 ± 0%  -41.75%  (p=0.000 n=10+10)
StartTransactionSpan/cacheMiss-8      35.3 ± 2%      18.0 ± 0%  -49.01%  (p=0.000 n=10+10)
```

An example with both extensive tracing and Cadence tracing enabled:
<img width="1459" alt="Screen Shot 2022-07-25 at 10 21 07 PM" src="https://user-images.githubusercontent.com/169976/180929610-edbfdc54-e604-401f-aea5-501af2f861b2.png">

- [x] test against localnet's Tempo instance.
- [x] noop tracer.
- [x] log tracer.
- [x] merge cadence bits onflow/cadence#1824.
- [x] merge onflow/flow-core-contracts#298.
- [x] update cadence.

Co-authored-by: Alexey Ivanov <[email protected]>
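The env-var switch mentioned in the commit message above can be exercised in shell roughly as follows; the endpoint value is a placeholder, not a real collector address from this PR:

```shell
# Placeholder endpoint; the real value depends on your collector/Tempo setup.
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="http://localhost:4318/v1/traces"
export OTEL_EXPORTER_OTLP_TRACES_INSECURE="true"

# Jaeger-specific variables are no longer read after this migration:
unset JAEGER_ENDPOINT

echo "$OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"
```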
Labels: Improvement, Language Breaking Change (Breaks Cadence contracts deployed on Mainnet), Performance
4 participants