Trying to understand context coding #3
Hi, thanks for your interest in IZ.

In IZ, the context value is the number of bits needed to code the largest component difference between the predicted pixel and the actual pixel. The higher the context value, the noisier the pixel (assuming we cannot predict noise).

For example, if we predicted (R',G',B')=(123,45,67), but the actual pixel is (R,G,B)=(120,50,60), then the deltas are (-3,5,-7). The sign bit is moved to the LSB, so we get the unsigned values (5,10,13). The largest value, 13, needs 4 bits, so the context value for this pixel is 4. Taking into account the wrapping of the 8-bit components, it can be shown that the context can take the values 0 ... 8.

To code the example pixel, the deltas are coded as [...]. We could always code the context value using [...]. IZ uses fixed canonical Huffman codes, with up to [...].

To understand the table generation, let's look at the first row (1, 3, 2, 5, 5, 6, 6, 6, 6) for previous context 0. It says that we store a single bit for context update 0→0, three bits for context update 0→1, two bits for context update 0→2, and so on.
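The context computation from the worked example can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the code from IZ: the names `signToLsb`, `bitCount` and `contextValue` are hypothetical, and the sign-to-LSB mapping was chosen because it reproduces the values (5,10,13) given above.

```cpp
#include <algorithm>
#include <cstdint>

// Move the sign bit to the LSB: 0,-1,1,-2,2,... becomes 0,1,2,3,4,...
// Assumption: this mapping matches the example (-3,5,-7) -> (5,10,13).
inline unsigned signToLsb(int delta) {
    return delta >= 0 ? static_cast<unsigned>(delta) * 2u
                      : static_cast<unsigned>(-delta - 1) * 2u + 1u;
}

// Number of bits needed to represent v (0 bits for v == 0).
inline int bitCount(unsigned v) {
    int n = 0;
    while (v != 0) {
        ++n;
        v >>= 1;
    }
    return n;
}

// Hypothetical helper: context value of one RGB pixel, i.e. the bit count
// of the largest mapped component difference.
inline int contextValue(const std::uint8_t predicted[3],
                        const std::uint8_t actual[3]) {
    unsigned largest = 0;
    for (int c = 0; c < 3; ++c) {
        // Difference modulo 256, folded into the signed range -128..127
        // to account for the wrapping of 8-bit components.
        int delta = (actual[c] - predicted[c]) & 0xFF;
        if (delta >= 128) delta -= 256;
        largest = std::max(largest, signToLsb(delta));
    }
    return bitCount(largest);
}
```

With the pixel from the example, `contextValue` returns 4. The extreme wrapped delta of -128 maps to 255, which needs 8 bits, so the result always lies in the range 0 ... 8.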
All code tables are static. An improvement would be to dynamically adapt the tables to the actual image statistics. That is done in Alex's Qic coder; I never got around to implementing that for IZ. Please ask if something remains unclear.
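Because the tables are static, the codewords of a canonical Huffman code follow from the code lengths alone. As a rough sketch, the row (1, 3, 2, 5, 5, 6, 6, 6, 6) quoted above can be turned into codewords as follows. The `Code` struct and the `canonicalCodes` function are hypothetical names, and the bit assignment uses the convention known from DEFLATE (shorter codes first, ties broken by symbol index); IZ may assign the actual bit patterns differently, only the lengths are taken from this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One codeword: its bit pattern (right-aligned) and its length in bits.
struct Code {
    std::uint32_t bits;
    int length;
};

// Assign canonical Huffman codewords from a list of code lengths.
// A length of 0 means the symbol is unused.
std::vector<Code> canonicalCodes(const std::vector<int>& lengths) {
    int maxLen = 0;
    for (int len : lengths) {
        if (len > maxLen) maxLen = len;
    }

    // Count how many symbols use each code length.
    std::vector<int> count(maxLen + 1, 0);
    for (int len : lengths) {
        if (len > 0) ++count[len];
    }

    // Compute the first codeword of each length.
    std::vector<std::uint32_t> next(maxLen + 1, 0);
    std::uint32_t code = 0;
    for (int len = 1; len <= maxLen; ++len) {
        code = (code + count[len - 1]) << 1;
        next[len] = code;
    }

    // Hand out consecutive codewords to the symbols in index order.
    std::vector<Code> out(lengths.size(), Code{0, 0});
    for (std::size_t s = 0; s < lengths.size(); ++s) {
        int len = lengths[s];
        if (len > 0) out[s] = Code{next[len]++, len};
    }
    return out;
}
```

For the row above, this convention gives the codewords 0, 110, 10, 11100, 11101, 111100, 111101, 111110 and 111111 for the context updates 0→0 through 0→8. The lengths satisfy the Kraft equality (1/2 + 1/8 + 1/4 + 2/32 + 4/64 = 1), so the code is complete and no bit pattern is wasted.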
Thanks a lot for your detailed explanation! I now understand much better what is going on under the hood!
Hi Christoph, I was trying to understand how IZ does its magic and stumbled upon the code in `table.cpp`, which I found interesting but struggle to understand completely.

I understood that you use `CONTEXT_BITS = 4` to store the integer values 0..8 (so `CONTEXT_COUNT = 9`) that encode how many bits are used to code each color channel of a given pixel.

But what I struggle to understand is the meaning of `MAX_CODE_LENGTH = 6`, and how the entries of `staticdCount` are chosen and then used to calculate the entries of `staticdBits` (for encoding) and `decodeTable` (for decoding). I figured out that you code the current context depending on the last one, but why and how exactly is this done?

Is this a standard approach that I am not aware of? Or have you explained the idea somewhere else?
Any further information is appreciated!