Releases: dbmdz/solr-ocrhighlighting

0.1

06 Jun 16:22

Changes:

  • f1b6ebf Release 0.1
  • 6dd907a docs: Fix unit test link
  • 59f2ac9 Change namespace to de.digitalcollections
  • 7786d5d GitLab CI: Move zipping of release binaries to publish step
  • a7484ad CI: Build linux binary for offsets-parser
  • c52958c Set up CI with Azure Pipelines
  • 7c356bb Throw an exception in case a resolved file cannot be found
  • 994ba36 Add more helpful error messages for out-of-band exceptions
  • ab757bf Bump version.solr from 8.1.0 to 8.1.1
  • 0f146e6 Merge branch 'yogendrasoni-master'
  • caf10cd fix null pointer exception when selecting non external field
  • 9d31466 Vendor Guava Utf8 util for compatibility with older Solr versions
  • c7e9d11 Update offsets-parser/README.md
  • 83726d8 Add README for offsets-parser
  • 2bd3cf4 Use Guava method to determine encoded UTF8 size of Java strings
  • a58be88 Fix IndexOutOfBounds error when adjusting final offset
  • 1b32007 CI: Only publish docs on master branch
  • af891a7 Test multi-file snippets for UTF8 files.
  • 2327c8d MultiCharIter: More sanity checks to prevent invalid state
  • 7f5aaac Fix ALTO test to reflect changes in whitespace handling
  • 23a0537 docs: Update with info on n:1 file:docs mapping
  • 59c45b3 Add support for n:1 mapping of files:docs (closes #28)
  • abc8cee Add IterableCharSeq implementation that treats multiple files as one
  • d511425 ALTO: fix a few bugs in the passage formatter
  • 2edd8ba ascii_escape: Add mode to overwrite input file
  • 04c8538 ContextBreakIterator: Unify this references
  • 99876b9 Implement multi-page snippets (#29)
  • 747f8fc Bump version.solr from 8.0.0 to 8.1.0
  • 489b4c3 Fix test runner for Java 12
  • 047328a Fix git branch-switcharoo messups
  • 258ce09 miniocr: Fix handling of edge coordinates
  • 79793f3 Get rid of code for handling external UTF16 files
  • 2040cfe ascii_escape: Fix py2/py3 compatibility
  • 0794008 docs: Fix example screenshot
  • 9e57d19 docs: Resolve remaining TODOs in README
  • 4792fce docs: Remove netlify requirements.txt
  • 5062271 docs: Add CI config for publishing to GitHub Pages
  • f0e73a7 docs: Reflect changes in ALTO analyzer
  • 2ec5e88 Add repository url
  • 6132eb3 docs: Add explanation of default delimiter
  • 1cfb167 docs: Add instructions on how to compile
  • 382a62c docs: More editing
  • e3ca283 Add script to perform ASCII-encoding/XML-escaping
  • 27f64cd Fix license link in README
  • 9b31918 docs: Fix markdown typo
  • 27d890a docs: Fix format links on index page
  • 9a7dd2c More work on docs
  • 712a004 Add first draft of documentation, move a lot of stuff out of the README
  • 375903f Minor file fixes
  • dca3cfb Fix bugs in ALTO handling, thanks @mbennett-uoe for the helpful discussion
  • 7a18cbc Add test for entity removal bug fix + test data
  • da4b25c Fix ALTO regexp to correctly match TextBlock/Page/etc entities in Passage Formatter
  • bb1ab7e example: Fix regular field highlighting bug in frontend
  • 8a813f5 Make sure that closing highlighting tags come before any other closing tags
  • f40b948 example: Also include metadata in index to showcase mixed highlighting
  • 1e4a0ce Add unit test for mixed regular and OCR highlighting
  • 2cd10e7 Add a description for the plugin
  • 0d9d49c Don't provide a default summary if no matches were found in the text
  • 0bdf4b3 Add missing docstring
  • f9c601e Refactor block-limitation logic to fix bugs and inconsistencies
  • 8b2d5fb hocr: Allow multiple implementations of generic block types (fixes #20)
  • 2fdd55f README: State correct default value for hl.ocr.limitBlock
  • 1b8bbae Use Integer.parseInt instead of Integer.valueOf
  • 4c06bdd Refactor snippet parsing across formats to increase code reuse
  • 47f8938 Update READMEs (fixes #17)
  • 670eb11 Code style fixes
  • 9f24bac Add support for hl.absoluteHighlights option (implements #6)
  • 8d7c22b example: Offload image serving to remote server
  • d9d15cc Refactor fragment truncation logic for more code-reuse
  • c767949 Fix multiple highlighting bugs (#11) [ #9 ]
  • 9a907a3 Fix bug that caused page numbers to be missed (#10)
  • 322efa9 Bump assertj-core from 3.11.1 to 3.12.2
  • 0b903a5 Bump version.junit from 5.3.2 to 5.4.2
  • 4adcefc Bump slf4j-nop from 1.7.25 to 1.7.26
  • 0223846 Bump maven-shade-plugin from 3.2.0 to 3.2.1
  • 7de1983 Merge pull request #8 from dbmdz/solr8-compatibility
  • b7830ee Add compatibility with Solr 8.0...
  • ec180b2 Bump version.solr from 7.6.0 to 8.0.0
  • 11ac53c ci: use generic CI configuration (production mode)
  • 3962564 pom: fix distribution management section
  • c225fc4 mvn: add settings.xml
  • d59969b ci: use generic CI configuration (currently in testing mode)
  • 146f270 Add .gitattributes to exclude test resources from linguist
  • 433d0da Update README.md
  • 540d151 Update README.md
  • 94958d6 Update README, rename to solr-ocrhighlighting, make example scripts Python 3.6 compatible
  • f45def1 Change license to MIT
  • c2237d6 Merge branch 'alto-highlighting' into 'master'
  • 76c57e5 Resolve a lot of TODOs in the README
  • da1db0f Add test for masked indexing
  • 7944cb5 Update README: Formats, Indexing of sub-structures
  • 004f74b Add support for indexing escaped ASCII ALTO content.
  • ca73ddb Merge branch 'iiif-example' into 'master'
  • b25183f Fix bug where the end of a match could be outside of its passage
  • 27d4dac example: Fix IIIF response when no matches were found in volume
  • 3575dd2 Don't batch in example ingest script
  • b32d121 Update IIIF example
  • 18ccac6 Fix UV integration in nginx config
  • fb71ee7 Add UV to example, speed up manifest generation
  • 3626328 iiif-example: Update to new response format
  • 4b38b6b example-iiif: Make all addresses configurable via CFG_ env variables
  • 1625bce Add small IIIF content search implementation to example
  • 3971d4d Merge branch 'fix-scorer' into 'master' [ #22 ]
  • 313f900 Fix tests that were affected by scoring change
  • 6d3356c Allow customization of passage scoring boost (closes #22)
  • 0e6209d Remove redundant method from OcrSnippet
  • 03c197c Rename snippetCount to numTotal
  • e41757d Fix bug in distributed search, don't handle regular fields
  • 0d3b0fa Update README
  • daeafa7 Fix for interpolation
  • 16cba03 Move Solr highlighting logic to its own component, don't merge with regular highlighting
  • 11e9c2e...