
Optimize ASCII digit and hexdigit conversion by using subtraction #21

Open · wants to merge 2 commits into main
Conversation

@okaneco commented Dec 8, 2022

- Subtract digits by their "base" digit to reduce the number of match arms when converting from an ASCII digit or hexdigit to u8
- Use FromPrimitive::from_u8 in ascii_to_digit after checking for a valid ASCII digit range
- Condense the ascii_to_hexdigit match arms and convert with from_u8
- Add the FromPrimitive trait bound to the required trait impls
- Add tests to verify correct parsing of digits and hex digits


I saw some significant wins across the board on the benchmarks; hopefully it wasn't just my CPU. ascii_to_hexdigit showed the biggest improvement, likely due to the simplification of its match arms.

Unfortunately, the only way I could find to do this involved adding the FromPrimitive trait bound from num-traits in several places. I believe adding another trait bound to a public-facing trait makes this a breaking change, but I'm not sure. If there's another way to do this, I'd be glad to change it.
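As a rough sketch of the subtraction idea, simplified here to return Option<u8> directly (the actual PR converts the result to the generic integer type with FromPrimitive::from_u8):

```rust
// Sketch of the subtraction approach, simplified to a concrete u8 output.
// The PR's real code converts the result with FromPrimitive::from_u8.
fn ascii_to_digit(character: u8) -> Option<u8> {
    // Wrapping subtraction maps b'0'..=b'9' onto 0..=9; any other byte
    // wraps to a value >= 10 and fails the range check.
    let n = character.wrapping_sub(b'0');
    if n < 10 {
        Some(n)
    } else {
        None
    }
}

fn main() {
    assert_eq!(ascii_to_digit(b'7'), Some(7));
    assert_eq!(ascii_to_digit(b'/'), None); // byte just below b'0'
    assert_eq!(ascii_to_digit(b':'), None); // byte just above b'9'
    println!("ok");
}
```

A single subtraction and comparison replaces ten match arms, which is where the simplification comes from.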

@pacman82 (Owner) commented Dec 8, 2022

Hello @okaneco,

Thanks for your PR.

I saw some significant wins across the board with this on the benchmarks

This is interesting. ascii_to_digit and ascii_to_hexdigit are supposed to be optimized so they can be evaluated at compile time. No doubt the function itself runs faster, but that should not matter much, since all of its possible results (there are only 10 or 16, after all) need not be evaluated at runtime for typical use cases like parsing a u32.

I think adding another trait bound to a public facing trait makes it a breaking change but I'm not sure.

Yes, it is. E.g., imagine a wrapper around a String which defines the empty text ("") as zero and a single capital I as one, with addition being string concatenation. atoi could then parse the input 3 into the value "III".

It is part of the charm of this library that its definition of what an integer number is comes close to the one a math student might learn (the intersection of all sets with a neutral element and a successor for each element), so it is applicable to pretty much anything that can be interpreted as a natural number.

There is precedent for this conflict: in math, integers do not overflow. That is why a separate set of traits was introduced to handle the overflow-checking behavior of primitives. This allows, e.g., FromRadix10 to still work on num-bigint's BigInt, which cannot overflow; that is rather its point.
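To make the checked/unchecked split concrete, here is a hypothetical helper (not the crate's API) showing what overflow-aware parsing of a primitive looks like; a big-integer type would never need these checks, which is why the behaviors live in separate traits:

```rust
// Hypothetical illustration of checked parsing for a primitive type.
// A BigInt-style type could skip every checked_* call, since it cannot
// overflow; hence the separate traits for checked behavior.
fn parse_u8_checked(bytes: &[u8]) -> Option<u8> {
    let mut acc: u8 = 0;
    for &b in bytes {
        // Reject non-digit bytes.
        let d = b.checked_sub(b'0').filter(|d| *d < 10)?;
        // Detect overflow at each accumulation step.
        acc = acc.checked_mul(10)?.checked_add(d)?;
    }
    Some(acc)
}

fn main() {
    assert_eq!(parse_u8_checked(b"255"), Some(255));
    assert_eq!(parse_u8_checked(b"256"), None); // u8 overflow detected
    println!("ok");
}
```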

It is not unreasonable to assume there is a high correlation between types that can overflow and the ones that can be constructed from primitives. Let me think about all this for a bit.

Cheers, Markus

@okaneco (Author) commented Dec 8, 2022

No worries. Thanks for explaining the notion of an integer in this library, that makes more sense to me now.

I'm not an expert in assembly, but I think this is roughly what's happening. The current ascii_to_digit is already well optimized, except that this PR's version using RangeInclusive::contains emits an lea instead of an add. I think that ends up slightly more efficient because the addition can be done in place rather than going through the arithmetic unit. The current implementation is equivalent to:

let n = character.wrapping_sub(b'0');
if n < 10 {
    Some(n as I)
} else {
    None
}

https://rust.godbolt.org/z/rsvq197fc

Edit (2023/01/14): this was wrong, see #21 (comment)

ascii_to_hexdigit seems more straightforward: the current implementation compiles to a massive lookup table.

https://rust.godbolt.org/z/YjYGcW81n

@pacman82 (Owner) commented
Had a spare moment. I can confirm the speedups. I'll need a bit more spare time to think about the interface.

Use a bitwise operation to convert ASCII lowercase to uppercase, allowing
hex digits to share the same conditional branch
@okaneco (Author) commented Jan 14, 2023

Take your time, not a problem.

Small update: I was able to achieve another 15-35% speedup on 3 of the 4 hex benchmarks, which makes those three benchmarks twice as fast as main on my machine.

I tried rewriting ascii_to_hexdigit to be non-breaking by removing the use of FromPrimitive. The following version is ~30% faster than main, but still ~30% slower than the FromPrimitive version.

// `nth(i)` is a crate helper that builds the value i from `Zero` and `One`.
fn ascii_to_hexdigit<I>(character: u8) -> Option<I>
where
    I: Zero + One,
{
    // Unsetting the 6th bit converts ASCII alphabetic lowercase to uppercase.
    //
    // b'A' = 0b_0100_0001 (decimal 65), b'F' = 0b_0100_0110 (decimal 70)
    // b'a' = 0b_0110_0001 (decimal 97), b'f' = 0b_0110_0110 (decimal 102)
    // b'a' & 0b_1101_1111 converts 'a' to 'A'.
    let mask = 0b_1101_1111;

    if matches!(character, b'0'..=b'9') {
        match character {
            b'0' => Some(nth(0)),
            b'1' => Some(nth(1)),
            b'2' => Some(nth(2)),
            b'3' => Some(nth(3)),
            b'4' => Some(nth(4)),
            b'5' => Some(nth(5)),
            b'6' => Some(nth(6)),
            b'7' => Some(nth(7)),
            b'8' => Some(nth(8)),
            b'9' => Some(nth(9)),
            _ => None,
        }
    } else {
        match character & mask {
            b'A' => Some(nth(10)),
            b'B' => Some(nth(11)),
            b'C' => Some(nth(12)),
            b'D' => Some(nth(13)),
            b'E' => Some(nth(14)),
            b'F' => Some(nth(15)),
            _ => None,
        }
    }
}
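As a quick sanity check of the bit trick used above (valid for ASCII letters only):

```rust
fn main() {
    // Clearing bit 5 (mask 0b_1101_1111) maps ASCII lowercase letters
    // onto their uppercase counterparts.
    let mask = 0b_1101_1111u8;
    assert_eq!(b'a' & mask, b'A');
    assert_eq!(b'f' & mask, b'F');
    // Digits never reach the masked branch, since they are matched first.
    println!("ok");
}
```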

There was a mistake in the first Compiler Explorer link I posted for ascii_to_digit: when the third line of the ascii_to_digit_contains version is changed to Some((n - b'0') as i32), all three functions produce the same instructions. I'm not sure why it performs better in the benchmark.
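For reference, the corrected contains-based variant described above looks like this (with a concrete i32 output for illustration):

```rust
// The contains-based variant with the corrected third line,
// Some((n - b'0') as i32); with this fix it compiles to the same
// instructions as the wrapping_sub version.
fn ascii_to_digit_contains(n: u8) -> Option<i32> {
    if (b'0'..=b'9').contains(&n) {
        // The subtraction cannot underflow here: the range check
        // guarantees n >= b'0'.
        Some((n - b'0') as i32)
    } else {
        None
    }
}

fn main() {
    assert_eq!(ascii_to_digit_contains(b'3'), Some(3));
    assert_eq!(ascii_to_digit_contains(b'z'), None);
    println!("ok");
}
```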
