Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multi: update main package and btcutil to chainhash/v1.1.0, use optimized dsha256 #2075

Merged
merged 5 commits into from
Dec 19, 2023

Conversation

Roasbeef
Copy link
Member

In this commit, we update the top-level btcd package to use the latest
version of btcutil and also the chainhash package. With this version
bump, we can now use the new optimized dsha256 routine where applicable.

With this commit, I've covered most of the areas we'll hash an entire
transaction/block/header, but we may want to optimize some other areas
further, in particular, the witness sighash calc.

@Roasbeef Roasbeef force-pushed the btcutil-psbt-update-dec-2023 branch from fa537e1 to 9922761 Compare December 19, 2023 01:06
@coveralls
Copy link

coveralls commented Dec 19, 2023

Pull Request Test Coverage Report for Build 7268575324

  • 90 of 104 (86.54%) changed or added relevant lines in 4 files are covered.
  • 9 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.02%) to 56.588%

Changes Missing Coverage Covered Lines Changed/Added Lines %
txscript/sighash.go 81 95 85.26%
Files with Coverage Reduction New Missed Lines %
peer/peer.go 9 73.49%
Totals Coverage Status
Change from base Build 7264110533: -0.02%
Covered Lines: 28602
Relevant Lines: 50544

💛 - Coveralls

@Roasbeef
Copy link
Member Author

Here's a benchmark run vs master:

benchmark                         old ns/op     new ns/op     delta
BenchmarkCalcSigHash-8            1081499       1244944       +15.11%
BenchmarkCalcWitnessSigHash-8     34499         32224         -6.59%

benchmark                         old allocs     new allocs     delta
BenchmarkCalcSigHash-8            712            623            -12.50%
BenchmarkCalcWitnessSigHash-8     445            801            +80.00%

benchmark                         old bytes     new bytes     delta
BenchmarkCalcSigHash-8            1290508       933085        -27.70%
BenchmarkCalcWitnessSigHash-8     46992         19936         -57.58%

@Roasbeef
Copy link
Member Author

Here's another comparison, old is w/o the last commit, new is w/ it:

benchmark                          old ns/op     new ns/op     delta
BenchmarkCalcSigHash-32            1866235       1975372       +5.85%
BenchmarkCalcWitnessSigHash-32     65135         45449         -30.22%

benchmark                          old allocs     new allocs     delta
BenchmarkCalcSigHash-32            623            623            +0.00%
BenchmarkCalcWitnessSigHash-32     534            801            +50.00%

benchmark                          old bytes     new bytes     delta
BenchmarkCalcSigHash-32            933080        933081        +0.00%
BenchmarkCalcWitnessSigHash-32     58384         19936         -65.85%

@Roasbeef
Copy link
Member Author

Roasbeef commented Dec 19, 2023

Here's a run with the same computer as the first run above (master vs this PR), but this time with GOAMD64=v3:

benchmark                          old ns/op     new ns/op     delta
BenchmarkCalcSigHash-32            1989442       1837011       -7.66%
BenchmarkCalcWitnessSigHash-32     59216         43734         -26.14%

benchmark                          old allocs     new allocs     delta
BenchmarkCalcSigHash-32            712            623            -12.50%
BenchmarkCalcWitnessSigHash-32     445            801            +80.00%

benchmark                          old bytes     new bytes     delta
BenchmarkCalcSigHash-32            1290506       933081        -27.70%
BenchmarkCalcWitnessSigHash-32     46992         19936         -57.58%

Copy link
Collaborator

@guggero guggero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice optimization! One go module issue, the rest are just non-blocking ideas for further optimizations since we touch an area that is used very often.

github.com/btcsuite/btcd/chaincfg/chainhash v1.1.0
github.com/davecgh/go-spew v1.1.1
github.com/stretchr/testify v1.7.0
)

require (
github.com/aead/siphash v1.0.1 // indirect
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running go mod tidy on this file with Go 1.19 or 1.21 removed all these lines again for some reason... Should we bump the Go version of this package to fix? I think with 1.18 or 1.19 there was some change that made the two files more stable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird, I built with go version go1.21.5 darwin/arm64.

I think maybe part of it is that our Go version declaration modules is a bit all over the place?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, could be. I bumped everything to 1.19 in my module cleanup PR: #1825

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, no you're right, go mod tidy eliminated a lot of stuff.

txscript/sighash.go Outdated Show resolved Hide resolved
@@ -289,7 +292,14 @@ func calcWitnessSignatureHashRaw(subScript []byte, sigHashes *TxSigHashes,
binary.LittleEndian.PutUint32(bHashType[:], uint32(hashType))
sigHash.Write(bHashType[:])

return chainhash.DoubleHashB(sigHash.Bytes()), nil
sigHashBytes := chainhash.DoubleHashRaw(func(w io.Writer) error {
// TODO(rosabeef): put entire calc func into this? then no
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do this, yes. Also, there's another instance of DoubleHashB just a few lines above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slight decrease here in time/op:

⛰   benchstat old.txt new.txt
name                  old time/op    new time/op    delta
CalcWitnessSigHash-8    31.9µs ± 2%    31.5µs ± 1%  -1.31%  (p=0.000 n=10+10)

name                  old alloc/op   new alloc/op   delta
CalcWitnessSigHash-8    19.9kB ± 0%    19.9kB ± 0%    ~     (all equal)

name                  old allocs/op  new allocs/op  delta
CalcWitnessSigHash-8       801 ± 0%       801 ± 0%    ~     (all equal)

sigHashBytes := chainhash.DoubleHashRaw(func(w io.Writer) error {
// First write out, then encode the transaction's version
// number.
var bVersion [4]byte
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use a couple of these 4 byte arrays. Since PutUint32 overwrites all bytes, I think it would be safe to re-use a single one and just call it scratch? It's likely all on the stack so probably doesn't make a huge difference though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to use binary.Write first, but that made things slower, so then I switched to just using a single [8]byte and slicing into that.

Ended up with a nice speed up on all marks:

name                  old time/op    new time/op    delta
CalcWitnessSigHash-8    31.5µs ± 0%    29.2µs ± 0%   -7.05%  (p=0.000 n=10+10)

name                  old alloc/op   new alloc/op   delta
CalcWitnessSigHash-8    19.9kB ± 0%    18.5kB ± 0%   -7.14%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
CalcWitnessSigHash-8       801 ± 0%       445 ± 0%  -44.44%  (p=0.000 n=10+10)

// The script code for a p2wkh is a length prefix
// varint for the next 25 bytes, followed by a
// re-creation of the original p2pkh pk script.
w.Write([]byte{0x19})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this would make a big difference, but to avoid multiple allocations, we could do:

w.Write([]byte{0x19, OP_DUP, OP_HASH160, OP_DATA_20})

instead. But non-blocking of course.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can give it a shot to compare w/ a benchmark.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't make much off a diff seemingly:

name                  old time/op    new time/op    delta
CalcWitnessSigHash-8    29.1µs ± 0%    29.6µs ± 2%  +1.70%  (p=0.000 n=9+9)

name                  old alloc/op   new alloc/op   delta
CalcWitnessSigHash-8    18.5kB ± 0%    18.5kB ± 0%    ~     (all equal)

name                  old allocs/op  new allocs/op  delta
CalcWitnessSigHash-8       445 ± 0%       445 ± 0%    ~     (all equal)

I guess the compiler can optimize those instances.

txscript/sighash.go Show resolved Hide resolved
In this commit, we update the top-level btcd package to use the latest
version of btcutil and also the chainhash package. With this version
bump, we can now use the new optimized dsha256 routine where applicable.

With this commit, I've covered most of the areas we'll hash an entire
transaction/block/header, but we may want to optimize some other areas
further, in particular, the witness sighash calc.
…sighash

In this commit, we optimize the sighash calc further by writing directly
into the buffer used for serialization by the sha256.New() instance
rather than to an intermediate buffer, which is then write to the hash
buffer.
We can write direly into the hash writer vs serializing into a buffer,
then writing that into the hash writer.
We used to use a lot of small buffers for serialization, but now we'll
use one buffer large enough, and slice into it when needed.

``
name                  old time/op    new time/op    delta
CalcWitnessSigHash-8    31.5µs ± 0%    29.2µs ± 0%   -7.05%  (p=0.000 n=10+10)

name                  old alloc/op   new alloc/op   delta
CalcWitnessSigHash-8    19.9kB ± 0%    18.5kB ± 0%   -7.14%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
CalcWitnessSigHash-8       801 ± 0%       445 ± 0%  -44.44%  (p=0.000 n=10+10)
```
@Roasbeef Roasbeef force-pushed the btcutil-psbt-update-dec-2023 branch from 9922761 to 19008ed Compare December 19, 2023 23:02
@Roasbeef
Copy link
Member Author

The final comparison:

goos: linux
goarch: amd64
pkg: github.com/btcsuite/btcd/txscript
cpu: 13th Gen Intel(R) Core(TM) i9-13900K
                      │   old.txt    │               new.txt               │
                      │    sec/op    │   sec/op     vs base                │
CalcSigHash-32          2.151m ±  9%   1.910m ± 4%  -11.23% (p=0.009 n=10)
CalcWitnessSigHash-32   63.55µ ± 15%   42.25µ ± 6%  -33.53% (p=0.000 n=10)
geomean                 369.8µ         284.0µ       -23.18%

                      │    old.txt    │               new.txt                │
                      │     B/op      │     B/op      vs base                │
CalcSigHash-32          1260.3Ki ± 0%   911.2Ki ± 0%  -27.70% (p=0.000 n=10)
CalcWitnessSigHash-32    45.89Ki ± 0%   18.08Ki ± 0%  -60.61% (p=0.000 n=10)
geomean                  240.5Ki        128.3Ki       -46.63%

                      │  old.txt   │               new.txt                │
                      │ allocs/op  │ allocs/op   vs base                  │
CalcSigHash-32          712.0 ± 0%   623.0 ± 0%  -12.50% (p=0.000 n=10)
CalcWitnessSigHash-32   445.0 ± 0%   445.0 ± 0%        ~ (p=1.000 n=10) ¹
geomean                 562.9        526.5        -6.46%

@Roasbeef Roasbeef merged commit 0d666ff into btcsuite:master Dec 19, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants