Releases: axa-group/Parsr

v0.6: Merge branch 'develop'

21 Nov 14:12

Changes

  1. Added Jupyter Notebook
  2. Improved Headings detection module (high reduction of false positives)
  3. Improved Table detection module
  4. Improved Paragraph detection module
  5. Improved Git README
  6. Several GUI & Server bug fixes

v0.5: Merge branch 'develop'

14 Nov 21:20

Changes

  1. New List detection Module (bullet and numeric type list)
  2. Improved Link detection module for the pdfminer extractor
  3. Improved Heading detection module (font usage ratio used to detect headings)
  4. Markdown exporter updated to export tables using standard Markdown table syntax instead of HTML
  5. Improved overall output accuracy
  6. Several GUI improvements

Dependencies

  1. Added GraphicsMagick for GUI thumbnail generation

v0.4

24 Oct 11:57

Changes

  1. Greatly improved LinesToParagraph module.
  2. Greatly improved Headings detection module.
  3. Promotion of pdfminer to primary PDF extractor, plus related output cleaning.
  4. Improved text redundancy/duplication detection and treatment.
  5. Leaner docker implementation for faster deploys.
  6. Several Vue UI improvements (demo/vue-viewer), including text inspector, forward, next buttons, and more.
  7. Several bugfixes in Markdown export, plus more flexible tables with rowspans and colspans.
  8. Windows deployment improvements under both bare-metal and docker flavors.

v0.3

19 Sep 13:03

Changes

  • Vital bugfix related to the table extraction module
  • Word inspection mode in the Vue UI, including hierarchy highlighting
  • Made pdf2json the default extractor for PDFs

v0.2

16 Sep 14:32

Changes

  • New responsive Vue-based UI for visualising output under demo/vue-viewer
  • New PDF extractor option: pdfminer
  • New image extractor option: google-vision
  • Multilingual documentation (starting with Chinese for now)
  • Better naming of the different areas of the pipeline (input, output, ...) to make the code easier to understand
  • Improved header detection
  • Externalized default configuration files for modules (module/abc/defaultConfig.json)
  • Several bugfixes

v0.1

04 Sep 07:06

Initial Release