Commit

PdfBitmap.to_numpy() Use 2d shape for single-channel bitmap
mara004 committed Oct 26, 2024
1 parent d54d041 commit dc5db75
Showing 2 changed files with 6 additions and 4 deletions.
1 change: 1 addition & 0 deletions docs/devel/changelog_staging.md
@@ -9,6 +9,7 @@
 - Rendering / Bitmap
   * Removed `PdfDocument.render()` (see deprecation rationale in v4.25 changelog). Instead, use `PdfPage.render()` with a loop or process pool.
   * Removed `PdfBitmap.get_info()` and `PdfBitmapInfo`, which existed only on behalf of data transfer with `PdfDocument.render()`.
+  * `PdfBitmap.to_numpy()`: If the bitmap is single-channel (grayscale), use a 2d shape to avoid needlessly wrapping each pixel value in a list.
   * `PdfBitmap.from_pil()`: Removed `recopy` param.
   * Removed pdfium color scheme param from rendering, as it's not really useful: one can only set colors for certain object types, which are then forced on all instances of that type. This may flatten different colors into one, leading to a loss of visual information. To achieve a "dark theme" for light PDFs, we suggest to instead post-process rendered images with selective lightness inversion, as is now implemented in pypdfium2's rendering CLI.
 - Pageobjects
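The changelog context above recommends post-processing rendered images with lightness inversion rather than using a pdfium color scheme. As a rough illustration only, the sketch below inverts the HLS lightness of a single 8-bit RGB pixel using the standard library. The function name `invert_lightness` is hypothetical, and the selection logic that makes the CLI's inversion "selective" is not shown in this commit, so it is not reproduced here.

```python
import colorsys

def invert_lightness(rgb):
    """Invert the HLS lightness of one 8-bit RGB pixel (hypothetical helper).

    Hue and saturation are kept, so colors stay recognizable while
    light backgrounds turn dark and dark text turns light.
    """
    r, g, b = (c / 255 for c in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    inverted = colorsys.hls_to_rgb(h, 1.0 - l, s)
    return tuple(round(c * 255) for c in inverted)

white_inverted = invert_lightness((255, 255, 255))
black_inverted = invert_lightness((0, 0, 0))
gray_inverted = invert_lightness((64, 64, 64))
```

A real implementation would presumably operate on whole image arrays and decide per region whether to invert; this per-pixel version only shows the color transform itself.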
9 changes: 5 additions & 4 deletions src/pypdfium2/_helpers/bitmap.py
@@ -197,7 +197,8 @@ def to_numpy(self):
         The array contains as many rows as the bitmap is high.
         Each row contains as many pixels as the bitmap is wide.
-        The length of each pixel corresponds to the number of channels.
+        If there is more than one channel, each pixel will be an array of values per channel (shape depth 3).
+        If there is only one channel, the pixels are the values of that channel directly (shape depth 2).
         The resulting array is supposed to share memory with the original bitmap buffer,
         so changes to the buffer should be reflected in the array, and vice versa.
@@ -210,11 +211,11 @@ def to_numpy(self):

         array = numpy.ndarray(
             # layout: row major
-            shape = (self.height, self.width, self.n_channels),
+            shape = (self.height, self.width, self.n_channels) if self.n_channels > 1 else (self.height, self.width),
             dtype = ctypes.c_ubyte,
             buffer = self.buffer,
-            # number of bytes per item for each nesting level (outer->inner, i. e. row, pixel, value)
-            strides = (self.stride, self.n_channels, 1),
+            # number of bytes per item for each nesting level (outer->inner: row, pixel, value - or row, value for a single-channel bitmap)
+            strides = (self.stride, self.n_channels, 1) if self.n_channels > 1 else (self.stride, 1),
         )

         return array
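
To see the effect of the shape and strides change outside of pypdfium2, here is a minimal stand-alone sketch. It assumes a plain `bytearray` in place of the bitmap buffer; `bitmap_to_array` and its parameters are hypothetical names that mirror the attributes used in the diff, not part of the library.

```python
import ctypes
import numpy

def bitmap_to_array(buffer, width, height, stride, n_channels):
    """Hypothetical stand-alone analogue of the shape/strides logic in the diff."""
    if n_channels > 1:
        # row, pixel, value
        shape = (height, width, n_channels)
        strides = (stride, n_channels, 1)
    else:
        # row, value: no extra axis for a single channel
        shape = (height, width)
        strides = (stride, 1)
    return numpy.ndarray(shape=shape, dtype=ctypes.c_ubyte, buffer=buffer, strides=strides)

# 3x2 grayscale bitmap whose rows are padded to a stride of 4 bytes
gray_buf = bytearray([1, 2, 3, 0,
                      4, 5, 6, 0])
gray = bitmap_to_array(gray_buf, width=3, height=2, stride=4, n_channels=1)

# 2x1 three-channel bitmap with a stride of 8 bytes (2 padding bytes)
color_buf = bytearray([10, 20, 30, 40, 50, 60, 0, 0])
color = bitmap_to_array(color_buf, width=2, height=1, stride=8, n_channels=3)

print(gray.shape, color.shape)
```

Because the array is a view on the buffer rather than a copy, writing through `gray` also changes `gray_buf`, which matches the memory-sharing note in the docstring.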