Releases: weblyzard/inscriptis

Custom HTML Handling and HTML engine improvements

05 Mar 16:19
667b356
  • add working support for specifying custom HTML tags (fixes #81)
  • improved html_engine.py
  • improved typing across all modules
  • added unittests for
    • inscript
    • inscriptis-api
  • documentation update

Fix documentation build and update publish script

17 Jan 06:33
  • fix building documentation on readthedocs.org
  • update publish script

Code cleanup, improved Web service and distribution

16 Jan 15:05
16404b0
  • added official Python 3.12 support
  • Inscriptis command line client
    • renamed inscript.py to inscript and install client via pip
    • added --timeout argument.
  • Inscriptis Web service:
    • migrate the Web service to FastAPI and uvicorn
    • enable installation as an optional extra via pip install inscriptis[web-service]
  • code cleanup
  • migrate to pyproject.toml and poetry for package distribution
  • use black for code formatting
  • improved tox config and code checks

Official Python 3.11 support

07 Dec 08:18

Maintenance release adding Python 3.11 to the build pipeline.

Fixed handling of invalid length specifications

29 Aug 13:28
e7e4ade

This is a bugfix release correcting the handling of invalid length specifications (bug #63).

Correct handling of tail text in HTML comments

02 Aug 08:53
d7e2afa
  • fix: correctly handle tail text in HTML comments, which used to confuse the HTML to text conversion (fixes #45).
  • fix: updated unittests to correctly work with lxml in Ubuntu 22.04.
  • add: updated and extended flake8 testing.

Support for custom HTML table separators and Python 3.10

22 Oct 10:52
6fa9516
  • support custom HTML table separators (addresses #29).
  • extended documentation on the command line client and added a link to the JOSS paper on inscriptis.
  • officially support Python 3.10 and add it to the build pipeline.
  • fixed dependency resolution for tox builds.

Zenodo DOI and integrated feedback obtained through the Journal of Open Source Software review process

11 Oct 14:51
  • improved documentation based on feedback provided by @reality, @rlskoeser and @sbenthall as part of the Journal of Open Source Software review process.
  • the Inscriptis web service has been included in the Python package and can now be started with:
     export FLASK_APP="inscriptis.service.web"
     python3 -m flask run

Integrated feedback obtained through the Journal of Open Source Software review process

11 Oct 14:11
  • improved documentation based on feedback provided by @reality, @rlskoeser and @sbenthall as part of the Journal of Open Source Software review process.
  • the Inscriptis web service has been included in the Python package and can now be started with:
     export FLASK_APP="inscriptis.service.web"
     python3 -m flask run

Improved document model, parsing of borderline cases & HTML annotation support

12 Jul 08:48
5e5fcc3

Changes

HTML parsing:

  • new: improved model for handling text blocks and lines
  • chg: improved HTML parsing of tables, enumerations and margins; fixed borderline cases
  • chg: improved whitespace handling
  • add: cover more borderline cases with unit tests

Inscriptis core:

  • new: annotation support
  • new: processing of annotation rules and annotation output
  • new: type hints
  • add: extended and improved documentation

Inscript command line client:

  • new: added --annotation-rules option for annotation support.
  • new: added --post-processor option to export and visualize annotations (HTML, XML and surface form export)
  • chg: apply --encoding to Web URLs as well

Misc:

  • chg: migrated to the semantic versioning scheme described at https://semver.org/.

Note

In terms of functionality, this release corresponds to Inscriptis 2.0rc2.