Merge pull request #133 from gogoex/bech32-mod-2
Add bech32_mod for double public key encoding/decoding
aguycalled authored Dec 4, 2023
2 parents 1fefaff + 7bc3b97 commit 9177239
Showing 14 changed files with 818 additions and 7 deletions.
4 changes: 2 additions & 2 deletions .cirrus.yml
@@ -81,7 +81,7 @@ task:
cpu: 2
memory: 5G
docker_arguments:
CI_IMAGE_NAME_TAG: ubuntu:lunar
CI_IMAGE_NAME_TAG: ubuntu:23.04
FILE_ENV: "./ci/test/00_setup_env_native_tidy.sh"
# For faster CI feedback, immediately schedule the linters
<< : *CREDITS_TEMPLATE
@@ -210,7 +210,7 @@ task:
cpu: 6 # Increase CPU and Memory to avoid timeout
memory: 24G
docker_arguments:
CI_IMAGE_NAME_TAG: ubuntu:lunar
CI_IMAGE_NAME_TAG: ubuntu:23.04
FILE_ENV: "./ci/test/00_setup_env_native_tsan.sh"
env:
<< : *CIRRUS_EPHEMERAL_WORKER_TEMPLATE_ENV
3 changes: 3 additions & 0 deletions build_msvc/bench_bench_navcoin/bench_bench_navcoin.vcxproj
@@ -18,6 +18,9 @@
<ClCompile Include="..\..\src\bench\bech32.cpp">
<ObjectFileName>$(IntDir)bench_bech32.obj</ObjectFileName>
</ClCompile>
<ClCompile Include="..\..\src\bench\bech32_mod.cpp">
<ObjectFileName>$(IntDir)bench_bech32_mod.obj</ObjectFileName>
</ClCompile>
<ClCompile Include="..\..\src\bench\bench.cpp">
<ObjectFileName>$(IntDir)bench_bench.obj</ObjectFileName>
</ClCompile>
2 changes: 1 addition & 1 deletion ci/test/00_setup_env_i686_multiprocess.sh
@@ -8,7 +8,7 @@ export LC_ALL=C.UTF-8

export HOST=i686-pc-linux-gnu
export CONTAINER_NAME=ci_i686_multiprocess
export CI_IMAGE_NAME_TAG=ubuntu:20.04
export CI_IMAGE_NAME_TAG="docker.io/amd64/ubuntu:20.04"
export PACKAGES="cmake python3 llvm clang g++-multilib"
export DEP_OPTS="DEBUG=1 MULTIPROCESS=1"
export GOAL="install"
6 changes: 3 additions & 3 deletions ci/test/00_setup_env_mac.sh
@@ -7,12 +7,12 @@
export LC_ALL=C.UTF-8

export CONTAINER_NAME=ci_macos_cross
export CI_IMAGE_NAME_TAG=ubuntu:20.04 # Check that Focal can cross-compile to macos
export CI_IMAGE_NAME_TAG=ubuntu:22.04 # Check that Jammy can cross-compile to macos
export HOST=x86_64-apple-darwin
export PACKAGES="cmake libz-dev libtinfo5 python3-setuptools xorriso"
export PACKAGES="cmake libz-dev libtinfo5 python3-setuptools xorriso zip"
export XCODE_VERSION=12.2
export XCODE_BUILD_ID=12B45b
export RUN_UNIT_TESTS=false
export RUN_FUNCTIONAL_TESTS=false
export GOAL="deploy"
export BITCOIN_CONFIG="--enable-reduce-exports"
export BITCOIN_CONFIG="--enable-reduce-exports LDFLAGS=-Wno-error=unused-command-line-argument"
2 changes: 1 addition & 1 deletion ci/test/00_setup_env_native_nowallet_libbitcoinkernel.sh
@@ -7,7 +7,7 @@
export LC_ALL=C.UTF-8

export CONTAINER_NAME=ci_native_nowallet_libbitcoinkernel
export CI_IMAGE_NAME_TAG="ubuntu:20.04"
export CI_IMAGE_NAME_TAG="docker.io/ubuntu:20.04"
# Use minimum supported python3.8 and clang-8, see doc/dependencies.md
export PACKAGES="python3-zmq clang-8 llvm-8 libc++abi-8-dev libc++-8-dev"
export DEP_OPTS="NO_WALLET=1 CC=clang-8 CXX='clang++-8 -stdlib=libc++'"
287 changes: 287 additions & 0 deletions doc/bech32-mod-gen-poly.md
@@ -0,0 +1,287 @@
# Bech32_mod generator polynomial generation

## Summary
We modified Bitcoin's bech32 implementation so that it works with 165-character bech32 strings and is guaranteed to detect up to 5 errors.

To accomplish this, we replaced the degree-6 generator polynomial originally used by bech32 with a degree-8 one, as [Bitcoin Cash's cashaddr implementation](https://github.com/bitcoin-cash-node/bitcoin-cash-node/blob/master/src/cashaddr.cpp) and Monero's [Jamtis](https://gist.github.com/tevador/50160d160d24cfc6c52ae02eb3d17024) have done.

To find a degree-8 polynomial that meets our needs, we followed the Jamtis polynomial search procedure, which is explained in detail in [this document](https://gist.github.com/tevador/5b3fbbd0877a3412ede07263c6b2663d), with a small modification to meet our requirements.

Here are the requirements we had:

1. The generator polynomial should be capable of detecting every combination of up to 5 errors in a 165-character bech32 string.
   - We encode 96-byte double public keys in bech32 format. Converting a 96-byte vector that uses all 8 bits of each byte into one that uses only 5 bits per element yields a 154-element vector (96 * 8 / 5 = 153.6, rounded up). In addition, an 8-character checksum, a 2-character HRP and a 1-character separator are needed. Putting these together, the resulting bech32 string is 165 characters long.
2. The generator polynomial should have the lowest false-positive error rate for the 7- and 8-error cases when the input string is 50 characters long.
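
The length arithmetic behind the first requirement can be checked with a short sketch. The constant names are illustrative only and are not taken from the implementation.

```python
# Illustrative constants; the names are assumptions, not identifiers from the code base.
KEY_BYTES = 96        # size of a double public key in bytes
CHECKSUM_CHARS = 8    # a degree-8 generator yields 8 checksum characters
HRP_CHARS = 2         # human-readable part
SEPARATOR_CHARS = 1

# Regroup 8-bit bytes into 5-bit symbols, rounding up for the padded last symbol.
data_chars = (KEY_BYTES * 8 + 4) // 5
total_chars = HRP_CHARS + SEPARATOR_CHARS + data_chars + CHECKSUM_CHARS

print(data_chars, total_chars)
```

The two printed values are the number of 5-bit data symbols and the total string length.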

To find a polynomial satisfying the above requirements, we first generated 10 million random degree-8 polynomials and computed false-positive error rates for them.

Among them, two generator polynomials satisfied the first requirement:

```
U1PIRGA7
AJ4RJKVB
```

For both polynomials, we computed the false-positive error rates for the 7- and 8-error cases. `U1PIRGA7` performed better, so we chose it as the generator polynomial.

## Actual procedure

### 1. Generation of 10 million random degree-8 polynomials
We used [gen_crc.py](https://gist.github.com/tevador/5b3fbbd0877a3412ede07263c6b2663d#:~:text=2.1-,gen_crc.py,-The%20gen_crc.py), the script used in the Jamtis search, which is shown below:

```python
# gen_crc.py

import random

CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUV"

def gen_to_str(val, degree):
    # Encode the `degree` 5-bit coefficients of val, highest degree first.
    gen_str = ""
    for i in range(degree):
        gen_str = CHARSET[int(val) % 32] + gen_str
        val //= 32  # integer division keeps val an int
    return gen_str

def gen_crc(degree, count, seed=None):
    random.seed(seed)
    for i in range(count):
        while True:
            r = random.getrandbits(5 * degree)
            # Reject candidates whose constant coefficient is zero.
            if (r % 32) != 0:
                break
        print(gen_to_str(r, degree))

gen_crc(8, 10000000, 0x584d52)
```

### 2. Calculation of false-positive error rates

To evaluate the generated polynomials, we used [crccollide.cpp](https://github.com/sipa/ezbase32/blob/master/crccollide.cpp), which was developed by Bitcoin developers. We compiled it with the default parameters:

```bash
$ g++ ezbase32/crccollide.cpp -o crccollide -lpthread -O3
```

Then we ran it with 5 errors and a 120-character threshold.

```bash
$ mkdir results1
$ parallel -a list.txt ./crccollide {} 5 120 ">" results1/{}.txt
```

The execution took approximately 25 days on a Core i5-13500

```bash
39762158.30s user 2845631.54s system 1975% cpu 599:14:56.86 total
```

and generated a huge number of output files in the `results1` directory.

We then removed the polynomials whose results fell below the threshold:

```bash
$ find results1 -name "*.txt" -type f -size -2k -delete
```

`16,976` polynomials were left in the `results1` directory.

```bash
$ ls -1 results1 | wc -l
16976
```

Each file in the `results1` directory looked like:

```bash
...
A00C78KL 123 0.000000000000000 0.000000000000000 0.000000000000000 0.000000000000000 0.000000000000000 1.031711484752184 # 100% done
A00C78KL 124 0.000000000000000 0.000000000000000 0.000000000000000 0.000000000000000 0.010575746914933 1.030752602001270 # 100% done
...
```

The columns are:
1. Generator polynomial, encoded as 8 base-32 characters (`0`-`9`, `A`-`V`)
1. Input string size
1. False positive error rate for 1-error case
1. False positive error rate for 2-error case
1. False positive error rate for 3-error case
1. False positive error rate for 4-error case
1. False positive error rate for 5-error case
1. False positive error rate for 6-error case
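
As a minimal sketch of how one such line maps onto these columns, the snippet below parses the sample row shown above. `ResultRow` and `parse_row` are hypothetical helper names, not part of crccollide or of the scripts in this document.

```python
from typing import List, NamedTuple

class ResultRow(NamedTuple):
    gen: str            # generator polynomial, 8 base-32 characters
    length: int         # input string size
    rates: List[float]  # rates[k - 1] is the false-positive rate for k errors

def parse_row(line: str) -> ResultRow:
    # Drop the trailing "# 100% done" comment before splitting on whitespace.
    data = line.split('#', 1)[0]
    tokens = data.split()
    return ResultRow(tokens[0], int(tokens[1]), [float(t) for t in tokens[2:]])

row = parse_row(
    "A00C78KL 124 0.000000000000000 0.000000000000000 0.000000000000000 "
    "0.000000000000000 0.010575746914933 1.030752602001270 # 100% done"
)

# A rate of exactly zero means no undetected error pattern of that weight.
detects_up_to_4 = all(r == 0.0 for r in row.rates[:4])
print(row.gen, row.length, row.rates[4], detects_up_to_4)
```

For this row the 5-error rate is non-zero, so at length 124 the polynomial `A00C78KL` is only guaranteed to detect up to 4 errors.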

### 3. Extraction of polynomials satisfying the requirements

To extract the polynomials that detect every combination of up to 5 errors, we used the `err6-high-perf.py` script below, a modified version of Jamtis's [crc_res.py](https://gist.github.com/tevador/5b3fbbd0877a3412ede07263c6b2663d#:~:text=2.3-,crc_res.py,-The%20crc_res.py) script:

```python
# err6-high-perf.py

import os
from typing import Optional, Tuple

def get_rate(filename, num_char) -> Optional[Tuple[str, float]]:
    gen = ''
    with open(filename) as file:
        for line in file:
            tokens = line.split()
            # The "<gen>: starting" line names the generator for this file.
            if len(tokens) == 2 and tokens[1] == "starting":
                gen = tokens[0].rstrip(':')
                continue
            if tokens[0] == gen:
                curr_num_char = int(tokens[1])
                if curr_num_char != num_char:
                    continue
                err4 = float(tokens[1 + 4])
                err5 = float(tokens[1 + 5])
                err6 = float(tokens[1 + 6])
                # Reject generators that miss any 4- or 5-error pattern.
                if err4 > 0 or err5 > 0:
                    return None
                return (gen, err6)
    return None

num_char = 165
dirpath = 'results1'
top_n = 10

gens = []

for entry in os.listdir(dirpath):
    filename = os.path.join(dirpath, entry)
    if os.path.isfile(filename):
        res = get_rate(filename, num_char)
        if res is not None:
            gens.append(res)

# Rank the surviving generators by their 6-error false-positive rate.
gens.sort(key=lambda x: x[1])

for gen in gens[:top_n]:
    print(f"{gen[0]}")
```

This script extracted 2 polynomials.

```bash
$ ./err6-high-perf.py > gens.txt
$ cat gens.txt
U1PIRGA7
AJ4RJKVB
```

Then we built [crccollide.cpp](https://github.com/sipa/ezbase32/blob/master/crccollide.cpp) again with the `LENGTH=50` and `ERRORS=4` parameters and calculated the false-positive error rates of the extracted generators for the 7- and 8-error cases:

```bash
$ g++ ezbase32/crccollide.cpp -o crccollide_50_4 -lpthread -O3 -DLENGTH=50 -DERRORS=4 -DTHREADS=4
$ mkdir results2
$ parallel -a gens.txt ./crccollide_50_4 {} ">" results2/{}.txt
```

Comparing the results manually, we found that `U1PIRGA7` performed slightly better and selected it as the generator polynomial.

### 4. Generation of mod constants
With the `enc-gen-to-sage-code.py` script below, we generated `SageMath` code that defines `U1PIRGA7` as `G`:

```python
# enc-gen-to-sage-code.py

import sys

if len(sys.argv) < 2:
    exit(f'Usage: {sys.argv[0]} [8-char-poly]')

gen = sys.argv[1]

CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUV"
degree = 8

def gen_to_str(gen):
    # Encode the packed integer back into base-32 characters.
    gen_str = ""
    for i in range(degree):
        gen_str = CHARSET[int(gen) % 32] + gen_str
        gen //= 32  # integer division keeps gen an int
    return gen_str

def str_to_gen(s):
    # Return the packed integer and the list of 5-bit coefficients.
    acc = 0
    coeffs = []
    for c in s:
        acc <<= 5
        i = CHARSET.index(c)
        coeffs.append(i)
        acc += i
    return (acc, coeffs)

def pf_coeffs(coeffs):
    # Format the coefficients as a SageMath polynomial definition.
    terms = [f'x^{len(coeffs)}']
    for (i, coeff) in enumerate(coeffs):
        if i == len(coeffs) - 1:
            terms.append(f'c({coeff})')
        else:
            terms.append(f'c({coeff})*x^{len(coeffs)-i-1}')
    term_str = ' + '.join(terms)
    return f'G = {term_str}'

acc_coeffs = str_to_gen(gen)
print(acc_coeffs)

# Sanity check: the encoding must round-trip.
recovered_gen = gen_to_str(acc_coeffs[0])
if recovered_gen != gen:
    exit(f'Expected recovered generator to be {gen}, but got {recovered_gen}')

print(pf_coeffs(acc_coeffs[1]))
```

The output was:

```bash
$ ./enc-gen-to-sage-code.py U1PIRGA7
(1032724529479, [30, 1, 25, 18, 27, 16, 10, 7])
G = x^8 + c(30)*x^7 + c(1)*x^6 + c(25)*x^5 + c(18)*x^4 + c(27)*x^3 + c(16)*x^2 + c(10)*x^1 + c(7)
```

Then we embedded the generated `G = ...` line into the `SageMath` script below, a modified version of the script in the `bech32.cpp` comment, and ran it to generate the `C++` code that computes the remainder modulo the generator polynomial.

```python
B = GF(2) # Binary field
BP.<b> = B[] # Polynomials over the binary field
F_mod = b**5 + b**3 + 1
F.<f> = GF(32, modulus=F_mod, repr='int') # GF(32) definition
FP.<x> = F[] # Polynomials over GF(32)
E_mod = x**2 + F.fetch_int(9)*x + F.fetch_int(23)
E.<e> = F.extension(E_mod) # GF(1024) extension field definition
for p in divisors(E.order() - 1): # Verify e has order 1023.
    assert((e**p == 1) == (p % 1023 == 0))

c = lambda n: F.fetch_int(n)
G = x^8 + c(30)*x^7 + c(1)*x^6 + c(25)*x^5 + c(18)*x^4 + c(27)*x^3 + c(16)*x^2 + c(10)*x^1 + c(7)

print(G) # Print out the generator

mod_consts = []

for i in [1,2,4,8,16]: # Compute {1,2,4,8,16}*(g(x) mod x^8), packed in hex integers.
    v = 0
    for coef in reversed((F.fetch_int(i)*(G % x**8)).coefficients(sparse=True)):
        v = v*32 + coef.integer_representation()
    mod_consts.append("0x%x" % v)

for (i, mod_const) in enumerate(mod_consts):
    p = 2**i
    s = f' if (c0 & {p}) c ^= {mod_const}; // {{{p}}}k(x) ='
    print(s)
```

The generated `C++` code was:

```c++
if (c0 & 1) c ^= 0xf0732dc147; // {1}k(x) =
if (c0 & 2) c ^= 0xa8b6dfa68e; // {2}k(x) =
if (c0 & 4) c ^= 0x193fabc83c; // {4}k(x) =
if (c0 & 8) c ^= 0x322fd3b451; // {8}k(x) =
if (c0 & 16) c ^= 0x640f37688b; // {16}k(x) =
```
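
These five constants can be cross-checked without SageMath. The plain-Python sketch below assumes only what the script above states: GF(32) is defined modulo `b^5 + b^3 + 1`, and each constant packs the eight low-order coefficients of `{1,2,4,8,16}*G`, highest degree first. The helper names `gf32_double` and `pack` are illustrative.

```python
# GF(32) reduction polynomial b^5 + b^3 + 1, as in the SageMath script.
F_MOD = 0b101001

def gf32_double(a: int) -> int:
    """Multiply a GF(32) element by 2 (that is, by b), reducing modulo F_MOD."""
    a <<= 1
    return a ^ F_MOD if a & 0b100000 else a

def pack(coeffs) -> int:
    """Pack 5-bit coefficients, highest degree first, into one integer."""
    v = 0
    for c in coeffs:
        v = (v << 5) | c
    return v

# Coefficients of U1PIRGA7 for x^7 .. x^0 (the leading x^8 term is implicit).
coeffs = [30, 1, 25, 18, 27, 16, 10, 7]

mod_consts = []
for _ in range(5):
    mod_consts.append(pack(coeffs))
    # Doubling every coefficient gives the next multiple: {2}G, {4}G, ...
    coeffs = [gf32_double(c) for c in coeffs]

print([hex(v) for v in mod_consts])
```

The printed list should match the five hexadecimal constants in the generated `C++` code, in the same order.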

We replaced the corresponding part of the `PolyMod` function with this code to use `U1PIRGA7` as the generator polynomial.
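
To illustrate how constants of this kind are consumed, here is a minimal Python sketch of a degree-8 polynomial-remainder loop. The state-update lines (a 40-bit state, `c >> 35`, the `0x07ffffffff` mask) are modelled on the structure of cashaddr's `PolyMod` and are an assumption; the actual `PolyMod` in `bech32_mod.cpp` is not shown in this document and may differ, for example in its initial value or final XOR.

```python
# Constants generated above for U1PIRGA7.
MOD_CONSTS = [0xf0732dc147, 0xa8b6dfa68e, 0x193fabc83c, 0x322fd3b451, 0x640f37688b]

def polymod(values):
    """Remainder of (x^n + v(x)) modulo the generator, packed as eight 5-bit coefficients."""
    c = 1
    for d in values:
        c0 = c >> 35                       # coefficient about to overflow the 40-bit state
        c = ((c & 0x07ffffffff) << 5) ^ d  # multiply by x and add the next symbol
        for i, k in enumerate(MOD_CONSTS):
            if (c0 >> i) & 1:              # same tests as `if (c0 & 1)`, `if (c0 & 2)`, ...
                c ^= k
    return c

# Feeding the generator's own coefficients forms x^8 + G(x), a multiple of the generator.
g_coeffs = [30, 1, 25, 18, 27, 16, 10, 7]
print(polymod(g_coeffs))
print(hex(polymod([0] * 8)))
```

The first call reduces the generator itself and so leaves a zero remainder. The second reduces `x^8`, whose remainder is the `{1}` constant.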

2 changes: 2 additions & 0 deletions src/Makefile.am
@@ -154,6 +154,7 @@ BITCOIN_CORE_H = \
banman.h \
base58.h \
bech32.h \
bech32_mod.h \
blockencodings.h \
blockfilter.h \
blsct/arith/mcl/mcl.h \
@@ -780,6 +781,7 @@ libbitcoin_common_a_CXXFLAGS = $(AM_CXXFLAGS) $(PIE_FLAGS)
libbitcoin_common_a_SOURCES = \
base58.cpp \
bech32.cpp \
bech32_mod.cpp \
blsct/arith/elements.cpp \
blsct/double_public_key.cpp \
blsct/public_key.cpp \
1 change: 1 addition & 0 deletions src/Makefile.bench.include
@@ -15,6 +15,7 @@ bench_bench_navcoin_SOURCES = \
bench/addrman.cpp \
bench/base58.cpp \
bench/bech32.cpp \
bench/bech32_mod.cpp \
bench/bench.cpp \
bench/bench.h \
bench/bench_bitcoin.cpp \
2 changes: 2 additions & 0 deletions src/Makefile.test.include
@@ -77,6 +77,7 @@ BITCOIN_TESTS =\
test/base58_tests.cpp \
test/base64_tests.cpp \
test/bech32_tests.cpp \
test/bech32_mod_tests.cpp \
test/bip32_tests.cpp \
test/blockchain_tests.cpp \
test/blockencodings_tests.cpp \
@@ -269,6 +270,7 @@ test_fuzz_fuzz_SOURCES = \
test/fuzz/banman.cpp \
test/fuzz/base_encode_decode.cpp \
test/fuzz/bech32.cpp \
test/fuzz/bech32_mod.cpp \
test/fuzz/bitdeque.cpp \
test/fuzz/block.cpp \
test/fuzz/block_header.cpp \