Trying to understand context coding #3

exilef · 2018-10-25T19:03:58Z

Hi Christoph, I was trying to understand how IZ does its magic and stumbled upon the code in table.cpp that I found interesting and struggle to understand completely.

I understood that you use CONTEXT_BITS = 4 to store the integer values 0..8 (so CONTEXT_COUNT = 9) that encode how many bits are used to code each color channel of a given pixel.

But what I struggle to understand is the meaning of MAX_CODE_LENGTH = 6, and how are the entries of staticdCount are chosen and then used to calculate the entries of staticdBits (for encoding) and decodeTable (for decoding). I figured out that you code the current context depending on the last one, but why and how exactly is this done?

Is this a standard approach that I am not aware of? Or have you explained the idea somewhere else?

Any further information is appreciated!

The text was updated successfully, but these errors were encountered:

cfeck · 2018-10-26T00:49:54Z

Hi, thanks for your interest in IZ.

In IZ, the context value is the number of bits needed to code the largest component difference between the predicted pixel and the actual pixel. The higher the context value, the more noisy is the pixel (assuming we cannot predict noise).

For example, if we predicted (R',G',B')=(123,45,67), but actual pixel is (R,G,B)=(120,50,60), then the deltas are (-3,5,-7). The sign bit is moved to the LSB, so we get the unsigned values (5,10,13). The largest value needs 4 bits: the context value for this pixel is 4. Taking into account wrapping of the 8 bit components, it can be shown that the context can have values 0 ... 8.

To code the example pixel, the deltas are coded as XXXX-0101-1010-1101, where XXXX is the code to store the context value 4, and the remaining bits are the binary representation of the unsigned (r,g,b) deltas.

We could always code the context value using CONTEXT_BITS (== 4) bits, but since log_2(9) = 3.17 < 4, we waste a lot, so we try to use less bits for the context value in the common cases. It is coded depended on the previous context value, because we expect that noisy pixels are surrounded by noisy pixels and flat pixels are surrounded by flat pixels.

In staticdCount, the number of bits for each code is listed. Each row is for one previous context, the column is for the context value to code (the table is 16x16 instead of 9x9 due to alignment). For our example, let's assume the previous pixel had the context value 3, then according to staticdCount[3][4] we would need 2 bits to code the 3→4 context value update. Our deltas then need 14 bits: XX-0101-1010-1101.

IZ uses fixed canonical Huffman codes, with up to MAX_CODE_LENGTH bits (6 in the implementation). If we use less than 4 bits for the common cases, we may need more than 4 bits for the uncommon cases. The maximum code length of 6 bits was found by experimentation, but the source code in table.cpp also suggests a code table for MAX_CODE_LENGTH == 4.

To understand the table generation, let's look at the first row (1, 3, 2, 5, 5, 6, 6, 6, 6) for previous context 0. It says that we store a single bit for context update 0→0, three bits for context update 0→1, two bits for context update 0→2 and so on.

staticdBits contains the actual bits for the canonical Huffman codes. For the first row in the table, the codes are: 0, 110, 10, 11100, 11101, 111100, 111101, 111110, 111111. (Exercise: Order these by code length to confirm they are canonical!) Sorting uses plain old qsort() in the implementation.

decodeTable needs to get the context value from the bits in the stream. If your stream starts with the bits 110110110110... then you would see that it is indeed context value 1 (leading 110 bits). You only need to examine up to MAX_CODE_LENGTH bits, which means decodeTable has MAX_CODE_VALUE (== 64) entries for each previous context. Eight of them (110xxx) all map to the value 1. The first 32 of them (0xxxxx) map to the value 0, and so on. Enable CHECKTABLE to see the resulting table entries; they start with 32x 0, then 16x 2, then 8x 1 etc.

All code tables are static. An improvement would be to dynamically adapt tables to the actual image statistics. That is done in Alex's Qic coder; I never got around to implement that for IZ.

Please ask if something remains unclear.

exilef · 2018-10-30T01:03:59Z

Thanks a lot for your detailed explanation! I now much better understand what is going on under the hood!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Trying to understand context coding #3

Trying to understand context coding #3

exilef commented Oct 25, 2018

cfeck commented Oct 26, 2018

exilef commented Oct 30, 2018

Trying to understand context coding #3

Trying to understand context coding #3

Comments

exilef commented Oct 25, 2018

cfeck commented Oct 26, 2018

exilef commented Oct 30, 2018