v5 devel branch #307

Draft
wants to merge 138 commits into
base: main
138 commits
9856cfc
Work on API-breaking changes (bookmarks)
mara004 Apr 4, 2024
0183d80
toc: update API test
mara004 Apr 4, 2024
d699a6c
test_cli: also capture stderr/logging
mara004 Apr 4, 2024
8d0d36f
Update test expectations
mara004 Apr 4, 2024
847281c
toc: better explain level == maxdepth scenario
mara004 Apr 4, 2024
f235226
Start tracking changes
mara004 Apr 4, 2024
4bfb461
slightly improve docs for get_count()
mara004 Apr 4, 2024
ac7903f
address various nits
mara004 Apr 4, 2024
517630a
Continue on document and bitmap
mara004 Apr 4, 2024
677c498
Work on `PdfImage.extract()`
mara004 Apr 4, 2024
4de863d
Fix some object pointer checks against None
mara004 Apr 4, 2024
ccfe923
Address `run check` findings
mara004 Apr 4, 2024
2360165
Expand constructor assignments
mara004 Apr 4, 2024
c581f5a
autorelease: add task
mara004 Apr 4, 2024
81f2b4a
slightly improve wording for v4.25 changelog
mara004 Apr 4, 2024
ddc3f3a
Remove deprecated version API
mara004 Apr 5, 2024
3acc545
Simplify version impl
mara004 Apr 5, 2024
9d87ae0
readme: remove python 3.7.6/3.8.1 incompat
mara004 Apr 5, 2024
df24061
Remove color scheme from rendering
mara004 Apr 5, 2024
b563200
Backport get_quad_points() from devel_old
mara004 Apr 5, 2024
30996b3
Apply renamings, update pageobjects CLI
mara004 Apr 5, 2024
02e23cf
Update changelog
mara004 Apr 5, 2024
55e969c
Update readme examples
mara004 Apr 5, 2024
c0fdc77
doc/comments
mara004 Apr 6, 2024
c9b8f15
style nits
mara004 Apr 8, 2024
e81c8d1
Take over PdfPosConv, with design explanation (untested)
mara004 Apr 8, 2024
d501ac2
Start merging back tests and tests_old
mara004 Apr 8, 2024
06d45c1
Add a somewhat elaborate posconv embeddertest
mara004 Apr 8, 2024
51ad9d0
make posconv api more flexible
mara004 Apr 8, 2024
702b198
update changelog
mara004 Apr 9, 2024
bb75dac
Actually get flattening to work (it was just a usage mistake)
mara004 Apr 9, 2024
4bbea67
slightly improve docs
mara004 Apr 9, 2024
29776b7
bitmap: fix assertion blunder
mara004 Apr 9, 2024
238e425
Add PdfPosConv unittest
mara004 Apr 9, 2024
c4756b8
update changelog & add task
mara004 Apr 9, 2024
cd064b9
Move get_posconv() to bitmap
mara004 Apr 9, 2024
ad4ec2c
Add experimental position normalizer
mara004 Apr 9, 2024
00a738b
docs: include changelog_staging also with non-main branches
mara004 Apr 9, 2024
6ec5bb4
XXX print out tag info
mara004 Apr 9, 2024
08f05d6
XXX show git status
mara004 Apr 9, 2024
953d454
continue on RTD
mara004 Apr 9, 2024
d72f498
slightly improve docs
mara004 Apr 9, 2024
ea9b3ad
Improve PdfBitmap.new_native() logic
mara004 Apr 10, 2024
0d9e478
Warn about pos normalizer having to be re-created
mara004 Apr 10, 2024
8532ce6
bases: style nits
mara004 Apr 10, 2024
30afc79
Remove PdfPosNormalizer experiment
mara004 Apr 10, 2024
1fcbfd8
fix `__all__` blunder
mara004 Apr 11, 2024
4fe9d0d
get_count(): fix doc blunder
mara004 Apr 11, 2024
ccc0b18
move up helper function
mara004 Apr 11, 2024
bf3e616
update changelog
mara004 Apr 11, 2024
2d90a63
Add error code annotation to PdfiumError (CC #308)
mara004 Apr 16, 2024
3f40c17
Prepare `get_text_range()` for pdfium change
mara004 Apr 17, 2024
2338365
docs
mara004 Apr 19, 2024
2bd4757
CLI/render: fix XFA document length recognition
mara004 Apr 21, 2024
875f911
add minor note on matrix multiplication (e, f)
mara004 Apr 21, 2024
cfb2f0d
posconv: ensure page is non-null
mara004 Apr 23, 2024
b5367c0
textpage/search: allow passthrough of caller flags
mara004 May 1, 2024
1ce93c3
cli/pageobjects: skip empty pages
mara004 May 1, 2024
df23f6c
Remove planned changes
mara004 May 1, 2024
d577349
minor readme improvements
mara004 May 5, 2024
640e80a
CLI/arrange: rm pointless var, better release implicit fh
mara004 May 5, 2024
8437cfd
CLI: clean up some comments
mara004 May 5, 2024
bdcd274
Merge remote-tracking branch 'origin/main' into devel_new
mara004 May 6, 2024
87a6547
Prepare for future release
mara004 May 6, 2024
1886f89
retain get_text_range() check for now
mara004 May 6, 2024
027b909
round off docs for `PdfBitmap.new_native()`
mara004 May 6, 2024
2f135e6
PdfImage.extract(): fix for filenames containing non-extension dot
mara004 May 6, 2024
cdc0c06
CLI/extract-images: increase default recursion depth
mara004 May 6, 2024
0f0dfb1
update changelog
mara004 May 6, 2024
37bde64
PdfImage.extract(): fix for filenames containing non-extension dot
mara004 May 6, 2024
863d85d
get_text_range(): adapt allocation to pdfium version
mara004 May 6, 2024
486a7af
Merge remote-tracking branch 'origin/main' into devel_new
mara004 May 6, 2024
555ba5e
PdfImage.extract(): slightly simplify path handling
mara004 May 7, 2024
d818677
Merge remote-tracking branch 'origin/main' into devel_new
mara004 May 7, 2024
c9115bd
slightly simplify get_filters(skip_simple=True)
mara004 May 7, 2024
b1e44f4
Merge remote-tracking branch 'origin/main' into devel_new
mara004 May 9, 2024
247873f
Update changelog according to backports
mara004 May 9, 2024
377b45e
Merge remote-tracking branch 'origin/main' into devel_new
mara004 May 9, 2024
9e509e7
Merge remote-tracking branch 'origin/main' into devel_new
mara004 May 9, 2024
72b60ed
consts: clean up comment
mara004 May 10, 2024
1eab5cb
PdfPage.get_objects(): increase default recursion depth
mara004 May 13, 2024
ca9c964
sligthly update docs for PdfImage.extract() again
mara004 May 13, 2024
f75e075
Add warning about textpage handles when removing text objects
mara004 May 13, 2024
bc8e18c
Explain PdfObject.close()
mara004 May 13, 2024
9a02214
Autoclose textpage handles when removing text pageobject
mara004 May 13, 2024
e38085f
Add some tasks regarding AutoCloseable.close()
mara004 May 13, 2024
d232689
Consistently call `PdfObject` `pageobject` in docs
mara004 May 13, 2024
428f4c3
PdfiumError: don't state the obvious
mara004 May 13, 2024
7cf09d8
docs/conf.py: comment out namedtuple handler
mara004 May 13, 2024
5c66a32
PdfBitmap: slightly improve docs for `new_foreign{_simple}()`
mara004 May 14, 2024
c8e4a06
Handle GetCharIndexAtPos() conforming with pdfium docs
mara004 May 15, 2024
14a7fbb
PdfPage.get_objects(): don't register objects as kids
mara004 May 18, 2024
992e9fe
abstractly reformulate bases task
mara004 May 18, 2024
d8515cb
Merge remote-tracking branch 'origin/main' into devel_new
mara004 May 18, 2024
439989f
Merge remote-tracking branch 'origin/main' into devel_new
mara004 May 18, 2024
59d0e99
CLI/extract-images: Fix another dotted filepath blunder
mara004 May 27, 2024
af81b47
Remove separate `_textpage_wrefs`
mara004 May 31, 2024
1347730
Merge remote-tracking branch 'origin/main' into devel_new
mara004 May 31, 2024
45de679
Clarify `Cannot close object; library is destroyed` condition
mara004 Jun 4, 2024
3596eb0
Correct PdfBookmark.get_count() docstring
mara004 Jul 4, 2024
85eadfb
rendering: lightness inversion for PIL
mara004 Jul 11, 2024
c907e1e
Add OpenCV lightness inversion
mara004 Jul 11, 2024
736101d
Implement opencv image exclusion
mara004 Jul 11, 2024
822c1b7
opencv: fill all polygons in one go
mara004 Jul 11, 2024
2746244
Revert "opencv: fill all polygons in one go"
mara004 Jul 11, 2024
e68d3da
Add some line breaks
mara004 Jul 11, 2024
428e970
pil/polygon: don't draw an outline
mara004 Jul 12, 2024
2bb6766
Add missing mkdir with refbindings (fixes #320)
mara004 Jul 12, 2024
775fb49
lightness inversion: expand pixel formats compat
mara004 Jul 12, 2024
bc42d19
Remove wrong comments
mara004 Jul 12, 2024
7694cea
[Experimental] Defer imports of optional dependencies
mara004 Jul 13, 2024
d7fc983
changelog: add ref to selective lightness inversion
mara004 Jul 13, 2024
7899758
Do engine imports in parent process with fork context
mara004 Jul 13, 2024
9d715cf
Use LazyLoader for deferred top-level imports
mara004 Jul 13, 2024
db65e00
Consistently use unary operator for inversion
mara004 Jul 13, 2024
7803b27
style
mara004 Jul 13, 2024
b495a1f
add task
mara004 Jul 13, 2024
e45150a
Update some wordings
mara004 Jul 14, 2024
d3e9a43
readme: slightly update wording in raw api guide
mara004 Jul 16, 2024
86bc8b1
Add reference to VikParuchuri's `pdftext`
mara004 Jul 16, 2024
f0dbf9c
version: clean up trailer
mara004 Jul 21, 2024
f33fa36
readme: improve raw api
mara004 Jul 21, 2024
c2aa668
Update a few docstrings
mara004 Jul 21, 2024
eb8b1b5
Rename "byte buffer" to "byte stream"
mara004 Jul 21, 2024
d29435d
doc nits
mara004 Jul 24, 2024
f15ac1b
fix typo
mara004 Jul 25, 2024
4cda54c
Update to new FPDFPageObj_TransformF()
mara004 Aug 1, 2024
7cc3cbe
Fix caller-side imports of deferred modules
mara004 Aug 1, 2024
bbc7f98
`PdfMatrix.mirror()`: Fix misleading terminology
mara004 Aug 11, 2024
98ed536
changelog: explicitly mention previous `_flatten()`
mara004 Aug 11, 2024
ee2f035
changelog nit
mara004 Aug 26, 2024
b3f7804
Update licensing docs
mara004 Sep 19, 2024
ef0854e
changelog: fix typo
mara004 Sep 19, 2024
d54d041
PdfPage.flatten(): add note regarding invalidation of handles
mara004 Sep 19, 2024
51d8899
`PdfBitmap.to_numpy()` Use 2d shape for single-channel bitmap
mara004 Oct 26, 2024
7f12cee
version.py: minor cleanup
mara004 Oct 27, 2024
195ce71
CLI(renderer/pageobjects): slightly improve code style
mara004 Oct 30, 2024
5362127
Fix some dirty code in pdfium build script
mara004 Nov 25, 2024
2 changes: 1 addition & 1 deletion .github/workflows/conda.yaml
@@ -121,7 +121,7 @@ jobs:
run: |
conda install -y pytest pillow numpy
conda install -y pypdfium2_${{ inputs.package }} --override-channels -c ./conda_dist/ -c pypdfium2-team -c bblanchon -c defaults
pytest tests/ tests_old/
pytest tests/

publish:

7 changes: 3 additions & 4 deletions .github/workflows/trigger_conda_raw.yaml
@@ -3,10 +3,9 @@

name: Trigger conda_raw release
on:
# NOTE temporarily commented out, awaiting merge of the v5 branch
# schedule:
# # pdfium-binaries triggers conda on the first Monday of month at 4 o'clock UTC, so we'll want to rebuild after that, but before the next main release where we want to use the package
# - cron: '0 4 8 * *' # monthly, 8th day
schedule:
# pdfium-binaries triggers conda on the first Monday of month at 4 o'clock UTC, so we'll want to rebuild after that, but before the next main release where we want to use the package
- cron: '0 4 8 * *' # monthly, 8th day
workflow_dispatch:

jobs:
11 changes: 5 additions & 6 deletions .github/workflows/trigger_main.yaml
@@ -5,12 +5,11 @@

name: Trigger main release
on:
# NOTE temporarily commented out, awaiting merge of the v5 branch
# # https://github.com/bblanchon/pdfium-binaries/blob/master/.github/workflows/trigger.yml
# # https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule
# # https://crontab.guru/
# schedule:
# - cron: '0 4 10 * *' # monthly, 10th day
# https://github.com/bblanchon/pdfium-binaries/blob/master/.github/workflows/trigger.yml
# https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule
# https://crontab.guru/
schedule:
- cron: '0 4 10 * *' # monthly, 10th day
workflow_dispatch:

jobs:
1 change: 0 additions & 1 deletion .gitignore
@@ -5,7 +5,6 @@ build/
dist/
conda/*/out/
tests/output/
tests_old/output/

data/
!data/.gitkeep
4 changes: 2 additions & 2 deletions .reuse/dep5
@@ -51,7 +51,7 @@ Files:
tests/resources/attachments.pdf
tests/resources/mona_lisa.jpg
Copyright: 2022 PDFium developers
License: BSD-3-Clause OR Apache-2.0
License: BSD-3-Clause, Apache-2.0
Comment:
Obtained from:
https://pdfium.googlesource.com/pdfium/+/refs/heads/main/testing/resources/bookmarks_circular.pdf
@@ -67,7 +67,7 @@ Files:
Copyright:
2022 PDFium developers
2024 geisserml <[email protected]>
License: BSD-3-Clause OR Apache-2.0
License: BSD-3-Clause, Apache-2.0

Files: tests/resources/images.pdf
Copyright:
2 changes: 1 addition & 1 deletion .reuse/dep5-wheel
@@ -26,4 +26,4 @@ Copyright:
2024 PDFium developers
2024 Developers of projects mentioned in PdfiumThirdParty
2024 Benoît Blanchon and pdfium-binaries contributors
License: (Apache-2.0 OR BSD-3-Clause) AND LicenseRef-PdfiumThirdParty
License: (BSD-3-Clause, Apache-2.0) AND LicenseRef-PdfiumThirdParty
145 changes: 76 additions & 69 deletions README.md

Large diffs are not rendered by default.

6 changes: 3 additions & 3 deletions autorelease/config.json
@@ -1,5 +1,5 @@
{
"beta": false,
"major": false,
"beta": true,
"major": true,
"humble": null
}
}
4 changes: 2 additions & 2 deletions conda/helpers/recipe/meta.yaml
@@ -51,10 +51,10 @@ about:
description: |
This package provides python helpers around pdfium.
Dependants are suggested to pin to a major version, but any tighter pinning is discouraged since it increases the risk for conflicts, and would lock you out from future fixes.
license: Apache-2.0 OR BSD-3-Clause
license: BSD-3-Clause, Apache-2.0
license_file:
- LICENSES/Apache-2.0.txt
- LICENSES/BSD-3-Clause.txt
- LICENSES/Apache-2.0.txt
- LICENSES/CC-BY-4.0.txt
dev_url: https://github.com/pypdfium2-team/pypdfium2
doc_url: https://pypdfium2.readthedocs.io
4 changes: 2 additions & 2 deletions conda/raw/recipe/meta.yaml
@@ -52,10 +52,10 @@ about:
description: |
This package provides raw ctypes bindings to pdfium.
Important: DO NOT PIN to an exact version, as pypdfium2_raw itself pins pdfium-binaries to achieve ABI safety.
license: Apache-2.0 OR BSD-3-Clause
license: BSD-3-Clause, Apache-2.0
license_file:
- LICENSES/Apache-2.0.txt
- LICENSES/BSD-3-Clause.txt
- LICENSES/Apache-2.0.txt
- LICENSES/CC-BY-4.0.txt
dev_url: https://github.com/pypdfium2-team/pypdfium2
doc_url: https://pypdfium2.readthedocs.io
8 changes: 4 additions & 4 deletions docs/devel/changelog.md
@@ -39,7 +39,7 @@
## 4.26.0 (2024-01-10)

- Updated PDFium from `6164` to `6233`.
- Pin ctypesgen in sdist to prevent reoccurrence of {issue}`264` / {issue}`286`. As a drawback, the pin is never committed, so the sdist is not simply reproducible at this time due to dependence on the latest commit hash of the ctypesgen fork at build time.
- Pin ctypesgen in sdist to prevent re-occurrence of {issue}`264` / {issue}`286`. As a drawback, the pin is never committed, so the sdist is not simply reproducible at this time due to dependence on the latest commit hash of the ctypesgen fork at build time.
- Wheel tags: Added back `manylinux2014` in addition to `manylinux_{glibc_ver}` to be on the safe side. Suspected relation to the above issues.


@@ -59,10 +59,10 @@
#### Rationale for `PdfDocument.render()` deprecation

- The parallel rendering API unfortunately was an inherent design mistake: Multiprocessing is not meant to transfer large amounts of pixel data from workers to the main process.
- This was such a heavy drawback that it basically outweighed the parallelization, so there was no real performance advantage, only higher memory load.
- As a related problem, the worker pool produces bitmaps at an indepedent speed, regardless of where the receiving iteration might be, so bitmaps could queue up in memory, possibly causing an enormeous rise in memory consumption over time. This effect was pronounced e.g. with PNG saving via PIL, as exhibited in Facebook's `nougat` project.
- Bitmap transfer is so expensive that it essentially outweighed parallelization, so there was no real performance advantage, only higher memory load.
- As a related problem, the worker pool produces bitmaps at an independent speed, regardless of where the receiving iteration might be, so bitmaps could queue up in memory, possibly causing an enormous rise in memory consumption over time. This effect was pronounced e.g. with PNG saving via PIL, as seen in Facebook's `nougat` project.
- Instead, each bitmap should be processed (e.g. saved) in the job which created it. Only a minimal, final result should be sent back to the main process (e.g. a file path).
- This means we cannot reasonably provide a generic parallel renderer, instead it needs to be implemented by callers.
- This means we cannot reasonably provide a generic parallel renderer; instead it needs to be implemented by callers.
- Historically, note that there had been even more faults in the implementation:
* Prior to `4.22.0`, the pool was always initialized with `os.cpu_count()` processes by default, even when rendering fewer pages.
* Prior to `4.20.0`, a full-scale input transfer was conducted on each job (rendering it unusable with bytes input). However, this can and should be done only once on process creation.
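
The rationale above (process each bitmap in the job that created it, send back only a minimal result, load the input once per process) could be sketched on the caller side roughly as follows. This is only an illustration under assumptions, not the project's own renderer: the helper names `output_path`, `pick_workers`, `_init_worker`, `render_one` and `render_all` are hypothetical, and the sketch assumes that `pypdfium2` and Pillow are installed and that the input is a file path.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

_pdf = None  # per-process document handle, set by _init_worker()


def output_path(out_dir, index):
    # Hypothetical naming scheme: 1-based, zero-padded page number.
    return Path(out_dir) / f"page_{index + 1:04d}.png"


def pick_workers(n_pages, cpu_count):
    # Never start more processes than there are pages to render.
    return max(1, min(n_pages, cpu_count))


def _init_worker(input_path):
    # Open the document once per worker process rather than once per job.
    global _pdf
    import pypdfium2 as pdfium  # imported in the worker, where it is needed
    _pdf = pdfium.PdfDocument(input_path)


def render_one(job):
    index, out_dir, scale = job
    page = _pdf[index]
    image = page.render(scale=scale).to_pil()
    target = output_path(out_dir, index)
    image.save(target)   # the bitmap is consumed inside the worker
    return str(target)   # only a file path travels back to the parent


def render_all(input_path, n_pages, out_dir, scale=2):
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    jobs = [(i, str(out_dir), scale) for i in range(n_pages)]
    workers = pick_workers(n_pages, os.cpu_count() or 1)
    with ProcessPoolExecutor(
        max_workers=workers,
        initializer=_init_worker,
        initargs=(str(input_path),),
    ) as pool:
        return list(pool.map(render_one, jobs))
```

On platforms that spawn worker processes, `render_all()` should be called from under an `if __name__ == "__main__":` guard. The worker documents are left open until the pool shuts down, which is a simplification.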
39 changes: 35 additions & 4 deletions docs/devel/changelog_staging.md
@@ -4,7 +4,38 @@
<!-- List character: dash (-) -->

# Changelog for next release
- `PdfPage.get_objects()`: Don't register pageobjects as children, because they don't need to be closed by the caller when part of a page. This avoids excessive caching of weakrefs that are not cleaned up with the object they refer to.
- Autorelease: Swapped default condition for minor/patch update, as pypdfium2 changes are likely more API-significant than pdfium updates. Added ability for manual override.
- Fixed conda packaging: It is now required to explicitly specify `-c defaults` with `--override-channels`, presumably due to an upstream change.
- Bumped workflows to Python 3.12.

*API changes*
- Rendering / Bitmap
* Removed `PdfDocument.render()` (see deprecation rationale in v4.25 changelog). Instead, use `PdfPage.render()` with a loop or process pool.
* Removed `PdfBitmap.get_info()` and `PdfBitmapInfo`, which existed only on behalf of data transfer with `PdfDocument.render()`.
* `PdfBitmap.to_numpy()`: If the bitmap is single-channel (grayscale), use a 2d shape to avoid needlessly wrapping each pixel value in a list.
* `PdfBitmap.from_pil()`: Removed `recopy` param.
* Removed pdfium color scheme param from rendering, as it's not really useful: one can only set colors for certain object types, which are then forced on all instances of that type. This may flatten different colors into one, leading to a loss of visual information. To achieve a "dark theme" for light PDFs, we suggest to instead post-process rendered images with selective lightness inversion, as is now implemented in pypdfium2's rendering CLI.
- Pageobjects
* Renamed `PdfObject.get_pos()` to `.get_bounds()`.
* Renamed `PdfImage.get_size()` to `.get_px_size()`.
* `PdfImage.extract()`: Removed `fb_render` param because it does not fit in this API. If the image's rendered bitmap is desired, use `.get_bitmap(render=True)` in the first place.
- Renamed misleading `PdfMatrix.mirror()` parameters `v, h` to `invert_x, invert_y`, as the terms horizontal/vertical flip commonly refer to the transformation applied, not the axis around which is being flipped (i.e. the previous `v` meant flipping around the Y axis, which is vertical, but the resulting transform is inverting the X coordinates and thus actually horizontal). No behavior change if you did not use keyword arguments.
- `PdfDocument.get_toc()`: Replaced `PdfOutlineItem` namedtuple with method-oriented wrapper classes `PdfBookmark` and `PdfDest`, so callers may retrieve only the properties they actually need. This is closer to pdfium's original API and exposes the underlying raw objects. Provides signed count as-is rather than splitting in `n_kids` and `is_closed`. Also distinguishes between `dest is None` and a dest with unknown mode.
- `get_text_range()`: Removed implicit translation of default calls to `get_text_bounded()`, as pdfium reverted `FPDFText_GetText()` to UCS-2, which resolves the allocation concern. However, callers are encouraged to explicitly use `get_text_bounded()` for full Unicode support.
- Removed legacy version flags.
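
To illustrate the `invert_x` / `invert_y` terminology from the `PdfMatrix.mirror()` entry above, here is a small standalone sketch. It does not use pypdfium2; `mirror_matrix` and `apply` are hypothetical helpers that only assume the usual PDF affine convention `x' = a*x + c*y + e`, `y' = b*x + d*y + f`.

```python
def mirror_matrix(invert_x, invert_y):
    """Return (a, b, c, d, e, f) of a matrix negating the selected coordinates."""
    a = -1 if invert_x else 1
    d = -1 if invert_y else 1
    return (a, 0, 0, d, 0, 0)


def apply(matrix, x, y):
    """Apply an affine matrix in PDF convention to the point (x, y)."""
    a, b, c, d, e, f = matrix
    return (a * x + c * y + e, b * x + d * y + f)


# Inverting X flips around the (vertical) Y axis, i.e. a horizontal flip.
print(apply(mirror_matrix(invert_x=True, invert_y=False), 3, 5))
```

The parameter names describe which coordinates are negated, which avoids the axis-versus-direction ambiguity of the former `v, h`.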

*Improvements and new features*
- Added `PdfPosConv` helper and `PdfBitmap.get_posconv(page)` for bidirectional translation between page and bitmap coordinates.
- Added `PdfObject.get_quad_points()` to get the corner points of an image or text object.
- Exposed `PdfPage.flatten()` (previously semi-private `_flatten()`), after having found out how to correctly use it. Added check and updated docs accordingly.
- Added context manager support to `PdfDocument`, so it can be used in a `with`-statement, because opening from a file path binds a file descriptor, which should be released explicitly, given OS limits on the number of open FDs.
- If document loading failed, `err_code` is now assigned to the `PdfiumError` instance so callers may programmatically handle the error subtype.
- Corrected some null pointer checks: we have to use `bool(ptr)` rather than `ptr is None`.
- Improved startup performance by deferring imports of optional dependencies to the point where they are actually needed, to avoid overhead if you do not use them.
- Simplified version classes (no API change expected).
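
The null pointer point above can be reproduced with plain `ctypes`, independent of pdfium. In this minimal sketch, `IntPtr` and `is_null` are names chosen for illustration only: a null pointer of a `POINTER(...)` type is still a Python object, so an identity check against `None` does not detect it, whereas its truth value does.

```python
import ctypes

IntPtr = ctypes.POINTER(ctypes.c_int)

null_ptr = IntPtr()                 # a NULL pointer, but not Python's None
value = ctypes.c_int(42)
valid_ptr = ctypes.pointer(value)   # points at `value`


def is_null(ptr):
    # Truth value testing is the reliable check for ctypes pointer objects.
    return not bool(ptr)


print(null_ptr is None, is_null(null_ptr), is_null(valid_ptr))
```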

*Project*
- Merged `tests_old/` back into `tests/`.
- Docs: Improved logic when to include the unreleased version warning and upcoming changelog.

<!-- TODO
See https://github.com/pypdfium2-team/pypdfium2/blob/devel_old/docs/devel/changelog_staging.md
for how to proceed. Note that some things have already been backported, and some rejected.
-->
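
As a rough illustration of the lightness inversion suggested in the API changes above as a replacement for the color scheme parameter, the following per-pixel sketch uses only the standard library. It is not the rendering CLI's implementation (which, going by the commit list, works on PIL and OpenCV images), and it leaves out the selective part, i.e. excluding image regions. `invert_lightness` is a hypothetical helper.

```python
import colorsys


def invert_lightness(rgb):
    """Invert the HLS lightness of an 8-bit RGB pixel, keeping hue and saturation."""
    r, g, b = (channel / 255 for channel in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    r2, g2, b2 = colorsys.hls_to_rgb(h, 1.0 - l, s)
    return tuple(round(channel * 255) for channel in (r2, g2, b2))


# White paper turns black, black text turns white.
print(invert_lightness((255, 255, 255)), invert_lightness((0, 0, 0)))
```

Unlike a plain RGB negation, this keeps hues, so a saturated color at medium lightness stays the same color.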
2 changes: 1 addition & 1 deletion docs/source/changelog.rst
@@ -4,7 +4,7 @@
Changelog
=========

.. ifconfig:: build_type == 'latest'
.. ifconfig:: have_changes

.. warning::
This is a documentation build for an unreleased version of pypdfium2, so it is possible that new changes are not logged yet.
53 changes: 18 additions & 35 deletions docs/source/conf.py
@@ -8,32 +8,21 @@
import os
import sys
import time
import collections
# import collections
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parents[2] / "setupsrc"))
from pypdfium2_setup.packaging_base import (
run_cmd,
ProjectDir,
parse_git_tag,
get_next_changelog,
)


def _get_build_type():

# RTD uses git checkout --force origin/... which results in a detached HEAD state, so we cannot easily get the branch name
# Thus query for an RTD-specific environment variable instead
rtd_vn = os.environ.get("READTHEDOCS_VERSION_NAME", None)
if rtd_vn:
return rtd_vn

branch = run_cmd(["git", "branch", "--show-current"], cwd=ProjectDir, capture=True)
if branch == "main":
return "latest"
else:
return branch


build_type = _get_build_type()
# RTD modifies conf.py, so we have to ignore dirty state if on RTD
is_rtd = os.environ.get("READTHEDOCS", "").lower() == "true"
tag_info = parse_git_tag()
have_changes = tag_info["n_commits"] > 0 or (tag_info["dirty"] and not is_rtd)
if get_next_changelog():
assert have_changes

project = "pypdfium2"
author = "pypdfium2-team"
@@ -70,7 +59,6 @@ def _get_build_type():
"members": True,
"undoc-members": True,
"show-inheritance": True,
# "inherited-members": True,
"member-order": "bysource",
}
intersphinx_mapping = {
@@ -81,21 +69,16 @@ def _get_build_type():

# https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-rst_prolog
# .. |br| raw:: html

# <br/>
rst_prolog = """
.. |build_type| replace:: %(build_type)s
""" % dict(
build_type = build_type,
)


def remove_namedtuple_aliases(app, what, name, obj, skip, options):
if type(obj) is collections._tuplegetter:
return True
return skip
rst_prolog = f"""
.. |have_changes| replace:: {have_changes}
"""

# def remove_namedtuple_aliases(app, what, name, obj, skip, options):
# if type(obj) is collections._tuplegetter:
# return True
# return skip

def setup(app):
app.connect('autodoc-skip-member', remove_namedtuple_aliases)
app.add_config_value("build_type", "latest", "env")
# app.connect('autodoc-skip-member', remove_namedtuple_aliases)
app.add_config_value("have_changes", True, "env")
5 changes: 2 additions & 3 deletions docs/source/index.rst
@@ -4,7 +4,7 @@
pypdfium2
=========

Welcome to the documentation for the support model of pypdfium2 (|build_type| build).
Welcome to the documentation for the support model of pypdfium2.

.. toctree::
:maxdepth: 2
@@ -16,9 +16,8 @@ Welcome to the documentation for the support model of pypdfium2 (|build_type| bu

.. toctree::
:maxdepth: 1
:caption: Progress
:caption: Release Notes

planned_changes
changelog


10 changes: 0 additions & 10 deletions docs/source/planned_changes.md

This file was deleted.

7 changes: 2 additions & 5 deletions docs/source/python_api.rst
@@ -76,9 +76,6 @@ Version

.. automodule:: pypdfium2.version

.. deprecated:: 4.22
The legacy members ``V_PYPDFIUM2, V_LIBPDFIUM, V_BUILDNAME, V_PDFIUM_IS_V8, V_LIBPDFIUM_FULL`` will be removed in version 5.

Document
********
.. automodule:: pypdfium2._helpers.document
@@ -87,8 +84,8 @@ Page
****
.. automodule:: pypdfium2._helpers.page

Page Objects
************
Pageobjects
***********
.. automodule:: pypdfium2._helpers.pageobjects

Text Page
4 changes: 2 additions & 2 deletions docs/source/shell_api.rst
@@ -46,8 +46,8 @@ Image Converter
.. command-output:: pypdfium2 imgtopdf --help


Page Objects Info
*****************
Pageobjects Info
****************
.. command-output:: pypdfium2 pageobjects --help


2 changes: 1 addition & 1 deletion req/converters.txt
@@ -1,3 +1,3 @@
# NOTE In order to use numpy, the rendering CLI further needs `opencv-python`, but we don't currently cover that internally. As the import is guarded, we don't have to require it here.
# NOTE In order to use numpy, the rendering CLI further needs `opencv-python[-headless]`, but we don't currently cover that internally. As the import is guarded, we don't have to require it here.
pillow
numpy
10 changes: 5 additions & 5 deletions run
@@ -8,13 +8,13 @@
args="${@:2}"

function check() {
autoflake src/ setupsrc/ tests/ tests_old/ setup.py docs/source/conf.py --recursive --remove-all-unused-imports --ignore-pass-statements --ignore-init-module-imports
codespell --skip="./docs/build,./tests/resources,./tests/output,./tests_old/output,./data,./sourcebuild,./dist,./.git,__pycache__,.mypy_cache,.hypothesis" -L "tabe,splitted,fith,flate"
autoflake src/ setupsrc/ tests/ setup.py docs/source/conf.py --recursive --remove-all-unused-imports --ignore-pass-statements --ignore-init-module-imports
codespell --skip="./docs/build,./tests/resources,./tests/output,./data,./sourcebuild,./dist,./.git,__pycache__,.mypy_cache,.hypothesis" -L "tabe,splitted,fith,flate"
reuse lint
}

function clean() {
rm -rf pypdfium2*.egg-info/ src/pypdfium2*.egg-info/ build/ dist/ data/* tests/output/* tests_old/output/* conda/bundle/out/ conda/helpers/out/ conda/raw/out/
rm -rf pypdfium2*.egg-info/ src/pypdfium2*.egg-info/ build/ dist/ data/* tests/output/* conda/bundle/out/ conda/helpers/out/ conda/raw/out/
}

function packaging_pypi() {
@@ -35,10 +35,10 @@ set -x
case $1 in

test)
python3 -m pytest tests/ tests_old/ $args;;
python3 -m pytest tests/ $args;;

coverage)
python3 -m coverage run --omit "tests/*,tests_old/*,src/pypdfium2_raw/bindings.py,setupsrc/*" -m pytest tests/ tests_old/ $args
python3 -m coverage run --omit "tests/*,src/pypdfium2_raw/bindings.py,setupsrc/*" -m pytest tests/ $args
python3 -m coverage report;;

docs-build)
3 changes: 1 addition & 2 deletions setup.py
@@ -81,7 +81,7 @@ def run_setup(modnames, pl_name, pdfium_ver):
kwargs = dict(
name = "pypdfium2",
description = "Python bindings to PDFium",
license = "Apache-2.0 OR BSD-3-Clause",
license = "BSD-3-Clause, Apache-2.0, PdfiumThirdParty",
license_files = LICENSES_SHARED,
python_requires = ">= 3.6",
cmdclass = {},
@@ -132,7 +132,6 @@ def run_setup(modnames, pl_name, pdfium_ver):
kwargs["package_data"]["pypdfium2_raw"] = [VersionFN, BindingsFN, libname]
kwargs["cmdclass"]["bdist_wheel"] = bdist_factory(pl_name)
kwargs["distclass"] = BinaryDistribution
kwargs["license"] = f"({kwargs['license']}) AND LicenseRef-PdfiumThirdParty"
kwargs["license_files"] += LICENSES_WHEEL

if "pypdfium2" in kwargs["package_data"]: