
Choose a standard dict size #14

Open
nemequ opened this issue Nov 12, 2015 · 2 comments

Comments


nemequ commented Nov 12, 2015

AFAICT there is no guidance in LZHAM as to what the default dictionary size should be. Using 0 results in an error (instead of choosing a sensible default, as happens for most parameters).

The test program will use a dict_size_log2 of 28 on x86_64 and LZHAM_MAX_DICT_SIZE_LOG2_X86 (aka 26) on x86, but based on the comments in lzham.h:144, "The values of m_dict_size_log2, m_table_update_rate, m_table_max_update_interval, and m_table_update_interval_slow_rate MUST match during compression and decompression." Since 28 > LZHAM_MAX_DICT_SIZE_LOG2_X86, AFAICT it isn't possible to decompress something on x86 which has been compressed using the default parameters on x86_64.

lzham_lzcomp_internal.h seems to like a default value of 22.

I think lzham.h should define a LZHAM_DEFAULT_DICT_SIZE_LOG2, which should be <= LZHAM_MAX_DICT_SIZE_LOG2_X86 (somewhere around 22-24 seems reasonable, IMHO). That should then be used by lzhamtest regardless of architecture, unless a different value is specified on the command line.

@richgel999 (Owner)

Yes, this is definitely reasonable and I'll make this change in lzham_codec_devel (which will be v1.1).


jspohr commented Jan 21, 2018

Hi,
I have a question regarding the dict size, which didn't seem to warrant a separate issue. Under "Usage", the readme says:

Always try to use the smallest dictionary size that makes sense for the file or block you are compressing, i.e. don't use a 128MB dictionary for a 15KB file. The codec doesn't automatically choose for you because in streaming scenarios it has no idea how large the file or block will be.
The larger the dictionary, the more RAM is required during compression and decompression. I would avoid using more than 8-16MB dictionaries on iOS.

Could you please clarify whether there are any downsides to a large dictionary, besides memory usage? If I understand correctly, unbuffered decompression does not allocate memory for the dictionary, and a PC with virtual memory doesn't commit the parts of the dictionary that the compressor doesn't write to.
So in my scenario, where I compress individual game assets offline on PC (none of them larger than a few megs), and decompress unbuffered on device, can't I just use a large dictionary regardless of file size, and be done with it?
Thanks in advance for your answer, it's highly appreciated!
