Make ppt_record_parser.IterStream.readinto() always return desired length #715

adepasquale · 2021-09-14T08:28:08Z

Found this bug while using oleobj.py on a PowerPoint file:

$ oleobj file.ppt
oleobj 0.56.1 - http://decalage.info/oletools
[redacted]
WARNING  Wanted to read 4096, got 2542

The extracted embedded file did not match the hash of the real embedded file, so I traced the code back, starting from the warning message here:

oletools/oletools/oleobj.py

Lines 645 to 646 in a7d1050

    
           log.warning('Wanted to read {0}, got {1}' 
        
                       .format(next_size, len(data)))

The problem is that olefile.py expects read() to return all of the requested bytes (except for the last sector):
https://github.com/decalage2/olefile/blob/cc0bdc07194fb7dc21e75a95c9e771e5240952b2/olefile/olefile.py#L666-L676

ppt_record_parser.IterStream is derived from io.RawIOBase, whose read() is unfortunately not guaranteed to return the requested number of bytes.
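To illustrate the short-read behaviour, here is a minimal sketch. `ChunkedRaw` is a hypothetical stand-in written for this example, not the actual IterStream code; it only assumes a stream that serves data from an iterator of chunks and hands out at most one chunk per readinto() call. The chunk sizes are picked to mirror the warning above.

```python
import io


class ChunkedRaw(io.RawIOBase):
    """Hypothetical raw stream fed by an iterator of byte chunks."""

    def __init__(self, chunks):
        super().__init__()
        self._chunks = iter(chunks)
        self._leftover = b""

    def readable(self):
        return True

    def readinto(self, target):
        # Serve at most one chunk per call, which a raw stream is allowed to do.
        if not self._leftover:
            self._leftover = next(self._chunks, b"")
        n = min(len(target), len(self._leftover))
        target[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


stream = ChunkedRaw([b"a" * 2542, b"b" * 4096])
short = stream.read(4096)
# RawIOBase.read() calls readinto() once, so fewer than 4096 bytes come back
# even though more data is still available.
print("Wanted to read 4096, got", len(short))
```

A caller that treats every read() before the last sector as complete will silently mix up offsets from this point on.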

Since the IterStream implementation was already buffered, I simply changed readinto() to return the desired length whenever possible; you might want to change the base class to io.BufferedIOBase instead.
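The idea of the change can be sketched roughly as follows. `FullReadStream` is a hypothetical class made up for illustration and is not the code in this PR; it assumes the data comes from an iterator of byte chunks and that readinto() should keep pulling chunks until the target buffer is full or the source is exhausted.

```python
import io


class FullReadStream(io.RawIOBase):
    """Hypothetical stream whose readinto() fills the buffer whenever possible."""

    def __init__(self, chunks):
        super().__init__()
        self._chunks = iter(chunks)
        self._leftover = b""

    def readable(self):
        return True

    def readinto(self, target):
        filled = 0
        while filled < len(target):
            if not self._leftover:
                nxt = next(self._chunks, None)
                if nxt is None:
                    break  # source exhausted: a short read is only allowed here
                self._leftover = nxt
                continue  # also skips over empty chunks
            n = min(len(target) - filled, len(self._leftover))
            target[filled:filled + n] = self._leftover[:n]
            self._leftover = self._leftover[n:]
            filled += n
        return filled


stream = FullReadStream([b"a" * 2542, b"b" * 4096])
first = stream.read(4096)   # spans both chunks
second = stream.read(4096)  # only the tail is left
third = stream.read(4096)   # end of stream
```

With this loop only the final read can come up short, which is the behaviour olefile.py relies on. Wrapping an unmodified raw stream in io.BufferedReader is another standard-library way to get full-length reads, if changing the class itself is not desired.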


decalage2 self-requested a review September 15, 2021 18:24

decalage2 self-assigned this Sep 15, 2021

decalage2 added 🐛 bug ppt_parser labels Sep 15, 2021

decalage2 added this to the oletools 0.60 milestone Sep 15, 2021
