Commit
add simple lemmatizer token filter, move versions to gradle properties, clean docs dir
Showing 122 changed files with 1,233 additions and 59,324 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,28 @@ | ||
Thanks to David Weiss for | ||
The plugin bundle wouldn't be possible without the hard work of many authors | ||
who generously published their work under an open source license. | ||
|
||
https://github.com/dweiss/compound-splitter | ||
This file should contain all the credits to them. If you miss a credit, please | ||
notify me about it and it will be added as soon as possible. | ||
|
||
The ICU analysis is heavily based on Apache Lucene ICU | ||
|
||
https://github.com/apache/lucene-solr/tree/master/lucene/analysis/icu | ||
|
||
The AutoPhraseTokenFilter is derived from | ||
|
||
https://github.com/lucidworks/auto-phrase-tokenfilter | ||
|
||
The ConcatTokenFilter is authored by Sujit Pal and was taken from | ||
|
||
http://sujitpal.blogspot.de/2011/07/lucene-token-concatenating-tokenfilter_30.html | ||
|
||
The Decompound token filter is a reworked implementation of the | ||
link:http://wortschatz.uni-leipzig.de/~cbiemann/software/toolbox/Baseforms%20Tool.htm[Baseforms Tool] | ||
found in the http://wortschatz.uni-leipzig.de/~cbiemann/software/toolbox/index.htm[ASV toolbox] | ||
of http://asv.informatik.uni-leipzig.de/staff/Chris_Biemann[Chris Biemann], | ||
Automatische Sprachverarbeitung of Leipzig University. | ||
|
||
The FSA in package org.xbib.elastixsearch.common.fsa which provides the dictionary structure for | ||
the baseform tokenizer is a derived version of | ||
|
||
https://github.com/morfologik/morfologik-stemming/tree/master/morfologik-fsa/src/main/java/morfologik/fsa |