Fix several problems with filename handling #284

Merged: bgilbert merged 2 commits into openslide:main from the filenames branch on Oct 20, 2024

Conversation

bgilbert

  • Starting in 1.2.0, OpenSlide() and OpenSlide.detect_format() have failed to accept filename arguments formatted as bytes because str(b'abc') == "b'abc'". In addition, filename arguments with invalid types (such as None) have been stringified and passed to OpenSlide, rather than raising an exception during conversion; we even had tests for this (!).
  • lowlevel has always encoded filename arguments to UTF-8, but on non-Windows it should have used the Python filesystem encoding instead (usually UTF-8 but not always). On Windows, OpenSlide 4.0.0+ expects UTF-8 rather than arbitrary bytes. (OpenSlide < 4.0.0 expects the system codepage, which isn't very useful in practice because of its limited character set, so we ignore that case for now.)
  • Type hints did not allow filename arguments to be bytes, nor did they allow os.PathLike subclasses which were not pathlib.Path (such as pathlib.PurePath).
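The three problems above can be reproduced with the standard library alone. This is a minimal sketch, not code from the PR; the sample filenames are made up for illustration.

```python
import os
import pathlib

# str() never rejects its argument, so it cannot validate a filename.
assert str(b'abc') == "b'abc'"  # bytes turn into their repr, not the name
assert str(None) == 'None'      # invalid types are silently stringified

# os.fspath() passes str and bytes through, unwraps os.PathLike objects,
# and raises TypeError for everything else.
assert os.fspath(b'abc') == b'abc'
assert os.fspath(pathlib.PurePosixPath('a/b.svs')) == 'a/b.svs'
try:
    os.fspath(None)
except TypeError:
    rejected = True
else:
    rejected = False

# A PurePath implements os.PathLike without being a pathlib.Path, so a
# type hint of pathlib.Path alone excludes it.
pure = pathlib.PurePosixPath('slide.svs')
is_pathlike = isinstance(pure, os.PathLike)
is_concrete_path = isinstance(pure, pathlib.Path)
```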

Accept str, bytes, or os.PathLike for all filename arguments, and properly convert them to bytes for OpenSlide.
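A conversion along these lines might look like the sketch below. The helper name `filename_to_bytes`, the `Filename` alias, and the `windows` parameter are hypothetical and exist only for illustration; the actual implementation in lowlevel may differ, in particular in how it treats bytes on Windows.

```python
import os
import sys
from typing import Any, Union

# Hypothetical alias covering the three accepted kinds of filename argument.
Filename = Union[str, bytes, "os.PathLike[Any]"]


def filename_to_bytes(
    filename: Filename, windows: bool = sys.platform == 'win32'
) -> bytes:
    """Convert a filename argument to bytes (illustrative sketch only)."""
    # os.fspath() returns str or bytes unchanged, unwraps os.PathLike,
    # and raises TypeError for anything else (None, int, ...).
    path = os.fspath(filename)
    if windows:
        # Per the description above, OpenSlide 4.0.0+ expects UTF-8 on
        # Windows. Assumption: bytes supplied by the caller are already
        # UTF-8 and are passed through untouched.
        if isinstance(path, bytes):
            return path
        return path.encode('utf-8')
    # Elsewhere, use the Python filesystem encoding and its error handler.
    return os.fsencode(path)
```

Making the platform an explicit parameter is a choice made here so that both branches can be exercised on any machine; a real implementation would more likely check the platform internally.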

In addition, allow bytes in type hints for _utf8_p arguments in lowlevel. The high-level API doesn't accept bytes for any of the affected functionality, but lowlevel does, so encode that in the type signatures.
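As a rough illustration of what a `str | bytes` signature means for such arguments, the sketch below uses a hypothetical stand-in function, `utf8_param`. It assumes that a `_utf8_p`-style conversion encodes str as UTF-8 and passes bytes through unchanged, which the description implies but does not spell out.

```python
from typing import Union


def utf8_param(value: Union[str, bytes]) -> bytes:
    """Hypothetical stand-in for a _utf8_p-style argument conversion."""
    if isinstance(value, bytes):
        # Assumption: bytes are taken as already encoded.
        return value
    if isinstance(value, str):
        return value.encode('utf-8')
    # Reject other types instead of stringifying them.
    raise TypeError(f'expected str or bytes, got {type(value).__name__}')
```

With the hint widened to include bytes, a type checker accepts both `utf8_param('openslide.vendor')` and `utf8_param(b'openslide.vendor')`, matching what the function does at runtime.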

openslide-bot commented Oct 20, 2024

DCO signed off ✔️

All commits have been signed off. You have certified to the terms of the Developer Certificate of Origin, version 1.1. In particular, you certify that this contribution has not been developed using information obtained under a non-disclosure agreement or other license terms that forbid you from contributing it under the GNU Lesser General Public License, version 2.1.

Fixes: 98c11bd ("Add support for pathlib.Path instances (openslide#123)")
Fixes: 5644229 ("tests: test passing invalid types to OpenSlide constructor")
Signed-off-by: Benjamin Gilbert <[email protected]>
The high-level API doesn't accept bytes for any of the affected
functionality, but lowlevel does, so encode that in the type signatures.

Signed-off-by: Benjamin Gilbert <[email protected]>
bgilbert merged commit aa2a01f into openslide:main on Oct 20, 2024
52 checks passed
bgilbert deleted the filenames branch on October 20, 2024 13:19