Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Large file error #19

Open
BorisMansencal opened this issue Nov 4, 2016 · 3 comments
Open

Large file error #19

BorisMansencal opened this issue Nov 4, 2016 · 3 comments

Comments

@BorisMansencal
Copy link

Hi,

First of all, thank you for this excellent open source codec.

I am using LZHAM to compress a rather large buffer (6140582460 bytes). I compress by chunks of maximum UINT32_MAX (=4294967295) bytes , but it fails (on one particular buffer).

Is UINT32_MAX allowed for lzham_compress_memory() or is it too large ?

In release I have a core dumped.
When compiled in Debug mode, I have the following backtrace:

#0  0x00007fffda8329c8 in raise () at /lib64/libc.so.6
#1  0x00007fffda83465a in abort () at /lib64/libc.so.6
#2  0x00007fffda82b187 in __assert_fail_base () at /lib64/libc.so.6
#3  0x00007fffda82b232 in  () at /lib64/libc.so.6
#4  0x00007fffd9f4c238 in lzham_debug_break() () at /net/Tools/lzham_codec/lzhamdecomp/lzham_platform.cpp:67
#5  0x00007fffd9f2ff7a in lzham_assert(char const*, char const*, unsigned int) (pExp=0x7fffd9f538c8 "new_capacity && (new_capacity > m_capacity)", pFile=0x7fffd9f53840 "/net/Tools/lzham_codec/lzhamdecomp/lzham_vector.cpp", line=28) at /net/Tools/lzham_codec/lzhamdecomp/lzham_assert.cpp:24
#6  0x00007fffd9f525d1 in lzham::elemental_vector::increase_capacity(unsigned int, bool, unsigned int, void (*)(void*, void*, unsigned int), bool) (this=0x7ffe3e6155f8, min_new_capacity=2147603813, grow_hint=true, element_size=1, pMover=0x0, nofail=true)
    at /net/Tools/lzham_codec/lzhamdecomp/lzham_vector.cpp:28
#7  0x00007fffda16909c in lzham::vector<unsigned char>::increase_capacity(unsigned int, bool, bool) (this=0x7ffe3e6155f8, min_new_capacity=2147603813, grow_hint=true, nofail=true) at /net/Tools/lzham_codec/lzhamcomp/../lzhamdecomp/lzham_vector.h:573
#8  0x00007fffda168877 in lzham::vector<unsigned char>::try_resize(unsigned int, bool) (this=0x7ffe3e6155f8, new_size=2147603813, grow_hint=true)
    at /net/Tools/lzham_codec/lzhamcomp/../lzhamdecomp/lzham_vector.h:183
#9  0x00007fffda170238 in lzham::vector<unsigned char>::insert(unsigned int, unsigned char const*, unsigned int) (this=0x7ffe3e6155f8, index=2147331299, p=0xa6e3e0 "A9\222<υ\266\241\354\301\256\r\346\022\277\271hQ\374F\"V\226\035\356['3\255\346\t\357\217\n\324\304\344n\034\325G\320\300\031\231\223\375\061$\213\314\321;}IY\271\335x\201\r\202\071bf\315~\210D/\aF^\347\256k\315\037fj\243'AߪJ\032\372.\307\317\061y\227ԋ`\352\307U\271\201\067n=ޚ\377\231\227?*Gf\347\270\363\275\271\372ؾ\205\061\002t\246\321\061\026\021+\347\b\221d\263F\275\373\354\376\331mޣ\233t\342\070m\025\354\220-|6\t/\177\037ߖΪN^I\030<LQ\307\066\aD\377\300ޝ\027\312\301\262\017\027\325\365A\245\004\222I\244z\231\377pyi"..., n=272514)
    at /net/Tools/lzham_codec/lzhamcomp/../lzhamdecomp/lzham_vector.h:251
#10 0x00007fffda16f8e4 in lzham::vector<unsigned char>::append(lzham::vector<unsigned char> const&) (this=0x7ffe3e6155f8, other=...)
    at /net/Tools/lzham_codec/lzhamcomp/../lzhamdecomp/lzham_vector.h:297
#11 0x00007fffda16e93d in lzham::lzcompressor::compress_block_internal(void const*, unsigned int) (this=0x7ffe3e612010, pBuf=0x7fff5226f010, buf_len=524288) at /net/Tools/lzham_codec/lzhamcomp/lzham_lzcomp_internal.cpp:1951
#12 0x00007fffda16d31c in lzham::lzcompressor::compress_block(void const*, unsigned int) (this=0x7ffe3e612010, pBuf=0x7fff5226f010, buf_len=524288)
    at /net/Tools/lzham_codec/lzhamcomp/lzham_lzcomp_internal.cpp:1459
#13 0x00007fffda16a85a in lzham::lzcompressor::put_bytes(void const*, unsigned int) (this=0x7ffe3e612010, pBuf=0x7ffe5a46f010, buf_len=4294967295)
    at /net/Tools/lzham_codec/lzhamcomp/lzham_lzcomp_internal.cpp:515
#14 0x00007fffda167e3f in lzham::lzham_lib_compress_memory(lzham_compress_params const*, unsigned char*, unsigned long*, unsigned char const*, unsigned long, unsigned int*) (pParams=0x7fffffffdbb0, pDst_buf=0x7ffce0992010 "", pDst_len=0x7fffffffdb98, pSrc_buf=0x7ffe5a46f010 "?", src_len=4294967295, pAdler32=0x0) at /net/Tools/lzham_codec/lzhamcomp/lzham_lzcomp.cpp:353
#15 0x00007ffff6b2ee06 in lzham_compress_memory(lzham_compress_params const*, lzham_uint8*, size_t*, lzham_uint8 const*, size_t, lzham_uint32*) (pParams=0x7fffffffdbb0, pDst_buf=0x7ffce0992010 "", pDst_len=0x7fffffffdb98, pSrc_buf=0x7ffe5a46f010 "?", src_len=4294967295, pAdler32=0x0)
    at /net/Tools/lzham_codec/lzhamdll/lzham_api.cpp:81

It seems I have a vector of capacity 2147487728, that size is to be increased to at least 2147603813,
but next_pow2(2147603813)=4294967296 > UINT32_MAX=4294967295.

Is it a bug or should I use smaller chunks ?

This is on Linux x86_64 (Fedora 22, gcc 5.3.1), on an Intel Xeon CPU E5-2695 v3.

Thank you for you help.

Boris.

@richgel999
Copy link
Owner

Hi Boris - I'll check into this ASAP. I'm guessing it's a bug in
lzham_compress_memory(),
because the main (low-level) API has been tested on extremely large files.

On Fri, Nov 4, 2016 at 10:27 AM, Boris Mansencal [email protected]
wrote:

Hi,

First of all, thank you for this excellent open source codec.

I am using LZHAM to compress a rather large buffer (6140582460 bytes). I
compress by chunks of maximum UINT32_MAX (=4294967295) bytes , but it fails
(on one particular buffer).

Is UINT32_MAX allowed for lzham_compress_memory() or is it too large ?

In release I have a core dumped.
When compiled in Debug mode, I have the following backtrace:

#0 0x00007fffda8329c8 in raise () at /lib64/libc.so.6
#1 0x00007fffda83465a in abort () at /lib64/libc.so.6
#2 0x00007fffda82b187 in assert_fail_base () at /lib64/libc.so.6
#3 0x00007fffda82b232 in () at /lib64/libc.so.6
#4 0x00007fffd9f4c238 in lzham_debug_break() () at /net/Tools/lzham_codec/lzhamdecomp/lzham_platform.cpp:67
#5 0x00007fffd9f2ff7a in lzham_assert(char const
, char const
, unsigned int) (pExp=0x7fffd9f538c8 "new_capacity && (new_capacity > m_capacity)", pFile=0x7fffd9f53840 "/net/Tools/lzham_codec/lzhamdecomp/lzham_vector.cpp", line=28) at /net/Tools/lzham_codec/lzhamdecomp/lzham_assert.cpp:24
#6 0x00007fffd9f525d1 in lzham::elemental_vector::increase_capacity(unsigned int, bool, unsigned int, void ()(void, void_, unsigned int), bool) (this=0x7ffe3e6155f8, min_new_capacity=2147603813, grow_hint=true, element_size=1, pMover=0x0, nofail=true)
at /net/Tools/lzham_codec/lzhamdecomp/lzham_vector.cpp:28
#7 0x00007fffda16909c in lzham::vector::increase_capacity(unsigned int, bool, bool) (this=0x7ffe3e6155f8, min_new_capacity=2147603813, grow_hint=true, nofail=true) at /net/Tools/lzham_codec/lzhamcomp/../lzhamdecomp/lzham_vector.h:573
#8 0x00007fffda168877 in lzham::vector::try_resize(unsigned int, bool) (this=0x7ffe3e6155f8, new_size=2147603813, grow_hint=true)
at /net/Tools/lzham_codec/lzhamcomp/../lzhamdecomp/lzham_vector.h:183
#9 0x00007fffda170238 in lzham::vector::insert(unsigned int, unsigned char const_, unsigned int) (this=0x7ffe3e6155f8, index=2147331299, p=0xa6e3e0 "A9\222<υ\266\241\354\301\256\r\346\022\277\271hQ\374F"V\226\035\356['3\255\346\t\357\217\n\324\304\344n\034\325G\320\300\031\231\223\375\061$\213\314\321;}IY\271\335x\201\r\202\071bf\315~\210D/\aF^\347\256k\315\037fj\243'AߪJ\032\372.\307\317\061y\227ԋ`\352\307U\271\201\067n=ޚ\377\231\227?Gf\347\270\363\275\271\372ؾ\205\061\002t\246\321\061\026\021+\347\b\221d\263F\275\373\354\376\331mޣ\233t\342\070m\025\354\220-|6\t/\177\037ߖΪN^I\030<LQ\307\066\aD\377\300ޝ\027\312\301\262\017\027\325\365A\245\004\222I\244z\231\377pyi"..., n=272514)
at /net/Tools/lzham_codec/lzhamcomp/../lzhamdecomp/lzham_vector.h:251
#10 0x00007fffda16f8e4 in lzham::vector::append(lzham::vector const&) (this=0x7ffe3e6155f8, other=...)
at /net/Tools/lzham_codec/lzhamcomp/../lzhamdecomp/lzham_vector.h:297
#11 0x00007fffda16e93d in lzham::lzcompressor::compress_block_internal(void const
, unsigned int) (this=0x7ffe3e612010, pBuf=0x7fff5226f010, buf_len=524288) at /net/Tools/lzham_codec/lzhamcomp/lzham_lzcomp_internal.cpp:1951
#12 0x00007fffda16d31c in lzham::lzcompressor::compress_block(void const_, unsigned int) (this=0x7ffe3e612010, pBuf=0x7fff5226f010, buf_len=524288)
at /net/Tools/lzham_codec/lzhamcomp/lzham_lzcomp_internal.cpp:1459
#13 0x00007fffda16a85a in lzham::lzcompressor::put_bytes(void const_, unsigned int) (this=0x7ffe3e612010, pBuf=0x7ffe5a46f010, buf_len=4294967295)
at /net/Tools/lzham_codec/lzhamcomp/lzham_lzcomp_internal.cpp:515
#14 0x00007fffda167e3f in lzham::lzham_lib_compress_memory(lzham_compress_params const_, unsigned char_, unsigned long_, unsigned char const_, unsigned long, unsigned int_) (pParams=0x7fffffffdbb0, pDst_buf=0x7ffce0992010 "", pDst_len=0x7fffffffdb98, pSrc_buf=0x7ffe5a46f010 "?", src_len=4294967295, pAdler32=0x0) at /net/Tools/lzham_codec/lzhamcomp/lzham_lzcomp.cpp:353
#15 0x00007ffff6b2ee06 in lzham_compress_memory(lzham_compress_params const_, lzham_uint8_, size_t_, lzham_uint8 const_, size_t, lzham_uint32_) (pParams=0x7fffffffdbb0, pDst_buf=0x7ffce0992010 "", pDst_len=0x7fffffffdb98, pSrc_buf=0x7ffe5a46f010 "?", src_len=4294967295, pAdler32=0x0)
at /net/Tools/lzham_codec/lzhamdll/lzham_api.cpp:81

It seems I have a vector of capacity 2147487728, that size is to be
increased to at least 2147603813,
but next_pow2(2147603813)=4294967296 > UINT32_MAX=4294967295.

Is it a bug or should I use smaller chunks ?

This is on Linux x86_64 (Fedora 22, gcc 5.3.1), on an Intel Xeon CPU
E5-2695 v3.

Thank you for you help.

Boris.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#19, or mute the thread
https://github.com/notifications/unsubscribe-auth/AGnRGc1Vd1udgJdsRoNK1rYAsFSCOKdwks5q62r8gaJpZM4Kpy2t
.

@BorisMansencal
Copy link
Author

Hi,
There may be a bug in lzham_compress_memory(), I don't know.

However, I am wondering if elemental_vector should be changed to better handle large sizes.
Currently, you have the following code:

bool elemental_vector::increase_capacity(uint min_new_capacity, bool grow_hint, uint element_size, object_mover pMover, bool nofail)
{
  [...]
      // new_capacity must be 64-bit when compiling on x64.
      size_t new_capacity = (size_t)min_new_capacity;
      if ((grow_hint) && (!math::is_power_of_2(static_cast<uint64>(new_capacity))))
         new_capacity = static_cast<uint>(math::next_pow2(static_cast<uint64>(new_capacity)));
      LZHAM_ASSERT(new_capacity && (new_capacity > m_capacity));
 [...]
}

I think there should be a check before the assert (when grow_hint==true):

if (new_capacity < min_new_capacity)
    return false;

The code could be changed to something like this:

bool elemental_vector::increase_capacity(uint min_new_capacity, bool grow_hint, uint element_size, object_mover pMover, bool nofail)
{
  [...]
      // new_capacity must be 64-bit when compiling on x64.
      size_t new_capacity = (size_t)min_new_capacity;
      if ((grow_hint) && (!math::is_power_of_2(static_cast<uint64>(new_capacity)))) {
         new_capacity = static_cast<uint>(math::next_pow2(static_cast<uint64>(new_capacity)));
         if (new_capacity < min_new_capacity) {
            new_capacity = UINT32_MAX;
            if (new_capacity < min_new_capacity)
           return false;
         }
      }
      LZHAM_ASSERT(new_capacity && (new_capacity > m_capacity));
 [...]
}

Here, if we have the grow_hint set to true, and next power of two is greater than UINT32_MAX, we set the new_capacity to UINT32_MAX (so not a power of two...).
With this code, the previous case (m_capacity=2147487728, min_new_capacity=2147603813, grow_hint=true) would not fail.

I have not tried to compress my buffer with this modification yet. I will ASAP.

@BorisMansencal
Copy link
Author

Actually, setting new_capacity to UINT32_MAX does not work.
Because when we call realloc, further in increase_capacity function,
returned actual_size may be greater than desired_size=element_size*new_capacity.
Then actual_size/element_size may not fit in an unsigned int.

I don't know how much bigger actual_size can be (it seems very implementation dependent according to malloc_usable_size man page). On my platform, if I set new_capacity to maximum UINT32_MAX-4096, I am able to compress my buffer.

So my code for elemental_vector looks like this:

bool elemental_vector::increase_capacity(uint min_new_capacity, bool grow_hint, uint element_size, object_mover pMover, bool nofail)
{
  [...]
      // new_capacity must be 64-bit when compiling on x64.
      size_t new_capacity = (size_t)min_new_capacity;
      if ((grow_hint) && (!math::is_power_of_2(static_cast<uint64>(new_capacity)))) {
         new_capacity = static_cast<uint>(math::next_pow2(static_cast<uint64>(new_capacity)));
         if (new_capacity < min_new_capacity) {
            new_capacity = UINT32_MAX-4096;
            if (new_capacity < min_new_capacity)
           return false;
         }
      }
      LZHAM_ASSERT(new_capacity && (new_capacity > m_capacity));
 [...]
}

Besides, I think the two following asserts could be added at the end of elemental_vector::increase_capacity(), just before the return true:

      LZHAM_ASSERT(min_new_capacity <= m_capacity );
      LZHAM_ASSERT(m_size <= m_capacity);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants