This repository has been archived by the owner on Jan 22, 2021. It is now read-only.

Release 0.6.0
komu committed Sep 18, 2019
1 parent 7c21cce commit 2d368e5
Showing 3 changed files with 129 additions and 124 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,7 @@
## 0.6.0 (2019-09-18)

- New version that is compatible with ES 7.3.2.

## 0.5.0 (2017-01-16)

- New version that is compatible with ES 5.1.1.
247 changes: 124 additions & 123 deletions README.md
@@ -1,123 +1,124 @@
# Voikko Analysis for Elasticsearch

The Voikko Analysis plugin provides Finnish language analysis using [Voikko](http://voikko.puimula.org/).

## Supported versions

| Plugin version | Elasticsearch version |
| -------------- | ----------------------|
| [0.5.0](https://github.com/EvidentSolutions/elasticsearch-analysis-voikko/blob/v0.5.0/README.md) | 5.1.1 |
| [0.4.0](https://github.com/EvidentSolutions/elasticsearch-analysis-voikko/blob/v0.4.0/README.md) | 2.2.1 |
| [0.3.0](https://github.com/EvidentSolutions/elasticsearch-analysis-voikko/blob/v0.3.0/README.md) | 1.5.2 |

If you are not installing the latest version, follow the links in the table to see installation instructions for the old version.

## Installing

### Installing Voikko

The plugin needs the `libvoikko` shared library to work. Details of installing the library vary
by operating system. On Debian-based systems, `apt-get install libvoikko1` should work.

Next, you'll need to download the [morpho dictionary](http://www.puimula.org/htp/testing/voikko-snapshot/dict-morpho.zip)
(for libvoikko version 4.0+, use [morpho dict v5](http://www.puimula.org/htp/testing/voikko-snapshot-v5/dict-morpho.zip)
instead).
Unzip it into Voikko's dictionary directory (e.g. `/usr/lib/voikko` on Debian) or into a directory you specify with
the `dictionaryPath` configuration property.

### Installing the plugin

Finally, to install the plugin, run:

```
bin/elasticsearch-plugin install https://github.com/EvidentSolutions/elasticsearch-analysis-voikko/releases/download/v0.5.0/elasticsearch-analysis-voikko-0.5.0.zip
```

### Security policy

Elasticsearch ships with a fairly restrictive security policy. Plugins can specify the permissions
that they need in `plugin-security.policy`. However, elasticsearch-analysis-voikko uses the
[JNA library](https://github.com/java-native-access/jna), which is already distributed with Elasticsearch
and therefore can't be included in the plugin zip. This means that the security policy bundled with the
plugin does not apply to JNA, yet JNA must be able to load `libvoikko` from the system.

Therefore you need to create a custom security policy, granting Elasticsearch itself the permission
to load `libvoikko`:

```
grant {
  permission java.io.FilePermission "<<ALL FILES>>", "read";
  permission java.lang.reflect.ReflectPermission "newProxyInPackage.org.puimula.libvoikko";
};
```

(You don't really need to grant read access to `<<ALL FILES>>`; you can pass the location
of `libvoikko` instead.)

Save this as `custom-elasticsearch.policy` and tell Elasticsearch to load it:

```
export ES_JAVA_OPTS=-Djava.security.policy=file:/path/to/custom-elasticsearch.policy
```

### Verify installation

After installing the plugin, you can quickly verify that it works by executing:

```
curl -XGET 'localhost:9200/_analyze' -d '
{
  "tokenizer" : "finnish",
  "filter" : [{"type": "voikko", "libraryPath": "/directory/of/libvoikko", "dictionaryPath": "/directory/of/voikko/dictionaries"}],
  "text" : "Testataan voikon analyysiä tällä tavalla yksinkertaisesti."
}'
```

If this works without error messages, you can proceed to configure the plugin for your index.

## Configuring

Include the `finnish` tokenizer and the `voikko` filter in your analyzer, for example:

```json
{
  "index": {
    "analysis": {
      "analyzer": {
        "default": {
          "tokenizer": "finnish",
          "filter": ["lowercase", "voikkoFilter"]
        }
      },
      "filter": {
        "voikkoFilter": {
          "type": "voikko"
        }
      }
    }
  }
}
```

You can use the following filter options to customize the behaviour of the filter:

| Parameter | Default value | Description |
|-------------------|------------------|--------------------------------------------------|
| language          | fi_FI            | Language to use                                  |
| dictionaryPath    | system dependent | Path to Voikko dictionaries                      |
| analyzeAll        | false            | Use all analysis possibilities or just the first |
| minimumWordSize   | 3                | Minimum length of words to analyze               |
| maximumWordSize   | 100              | Maximum length of words to analyze               |
| libraryPath       | system dependent | Path to directory containing libvoikko           |
| poolMaxSize       | 10               | Maximum number of Voikko instances to pool       |
| analysisCacheSize | 1024             | Number of analysis results to cache              |

## Development

To run the tests, you need to set the `voikko.home` system property, which should point to
a directory containing the libvoikko shared library and a subdirectory `dicts` that contains
the [morpho dictionary](http://www.puimula.org/htp/testing/voikko-snapshot/dict-morpho.zip).

## License

This library is released under the [Apache License, Version 2.0](http://apache.org/licenses/LICENSE-2.0).
# Voikko Analysis for Elasticsearch

The Voikko Analysis plugin provides Finnish language analysis using [Voikko](http://voikko.puimula.org/).

## Supported versions

| Plugin version | Elasticsearch version |
| -------------- | ----------------------|
| [0.6.0](https://github.com/EvidentSolutions/elasticsearch-analysis-voikko/blob/v0.6.0/README.md) | 7.3.2 |
| [0.5.0](https://github.com/EvidentSolutions/elasticsearch-analysis-voikko/blob/v0.5.0/README.md) | 5.1.1 |
| [0.4.0](https://github.com/EvidentSolutions/elasticsearch-analysis-voikko/blob/v0.4.0/README.md) | 2.2.1 |
| [0.3.0](https://github.com/EvidentSolutions/elasticsearch-analysis-voikko/blob/v0.3.0/README.md) | 1.5.2 |

If you are not installing the latest version, follow the links in the table to see installation instructions for the old version.

## Installing

### Installing Voikko

The plugin needs the `libvoikko` shared library to work. Details of installing the library vary
by operating system. On Debian-based systems, `apt-get install libvoikko1` should work.

Next, you'll need to download the [morpho dictionary](http://www.puimula.org/htp/testing/voikko-snapshot/dict-morpho.zip)
(for libvoikko version 4.0+, use [morpho dict v5](http://www.puimula.org/htp/testing/voikko-snapshot-v5/dict-morpho.zip)
instead).
Unzip it into Voikko's dictionary directory (e.g. `/usr/lib/voikko` on Debian) or into a directory you specify with
the `dictionaryPath` configuration property.
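
The download-and-unzip step could be scripted roughly as follows. This is only a sketch: the `DICT_DIR` default assumes the Debian location mentioned above, the function name `install_voikko_dict` is made up for illustration, and `curl` and `unzip` are assumed to be installed.

```shell
#!/bin/sh
# Sketch: fetch the morpho dictionary and unpack it into a dictionary directory.
# DICT_URL is the v5 dictionary (for libvoikko 4.0+); swap in the other URL for older versions.
DICT_URL="http://www.puimula.org/htp/testing/voikko-snapshot-v5/dict-morpho.zip"
# Assumption: Debian's dictionary directory; any directory works if you set dictionaryPath.
DICT_DIR="${DICT_DIR:-/usr/lib/voikko}"

# Hypothetical helper; nothing is downloaded until you call it.
install_voikko_dict() {
    tmp_zip="$(mktemp)" || return 1
    # -f makes curl fail on HTTP errors instead of saving an error page.
    curl -fsSL -o "$tmp_zip" "$DICT_URL" || { rm -f "$tmp_zip"; return 1; }
    mkdir -p "$DICT_DIR" && unzip -o "$tmp_zip" -d "$DICT_DIR"
    status=$?
    rm -f "$tmp_zip"
    return $status
}
```

Call `install_voikko_dict` to run it. Writing to a system directory such as `/usr/lib/voikko` will usually require root privileges.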

### Installing the plugin

Finally, to install the plugin, run:

```
bin/elasticsearch-plugin install https://github.com/EvidentSolutions/elasticsearch-analysis-voikko/releases/download/v0.6.0/elasticsearch-analysis-voikko-0.6.0.zip
```

### Security policy

Elasticsearch ships with a fairly restrictive security policy. Plugins can specify the permissions
that they need in `plugin-security.policy`. However, elasticsearch-analysis-voikko uses the
[JNA library](https://github.com/java-native-access/jna), which is already distributed with Elasticsearch
and therefore can't be included in the plugin zip. This means that the security policy bundled with the
plugin does not apply to JNA, yet JNA must be able to load `libvoikko` from the system.

Therefore you need to create a custom security policy, granting Elasticsearch itself the permission
to load `libvoikko`:

```
grant {
  permission java.io.FilePermission "<<ALL FILES>>", "read";
  permission java.lang.reflect.ReflectPermission "newProxyInPackage.org.puimula.libvoikko";
};
```

(You don't really need to grant read access to `<<ALL FILES>>`; you can pass the location
of `libvoikko` instead.)

Save this as `custom-elasticsearch.policy` and tell Elasticsearch to load it:

```
export ES_JAVA_OPTS=-Djava.security.policy=file:/path/to/custom-elasticsearch.policy
```
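
A narrower policy that grants read access only to the location of `libvoikko`, as mentioned above, might be generated like this. It is a sketch under assumptions: `LIBVOIKKO_DIR` is a hypothetical location that you must adjust, and depending on how the library is looked up at load time the grant may need to be widened.

```shell
#!/bin/sh
# Assumption: directory that contains the libvoikko shared library on your system.
LIBVOIKKO_DIR="${LIBVOIKKO_DIR:-/usr/lib/x86_64-linux-gnu}"
POLICY_FILE="${POLICY_FILE:-./custom-elasticsearch.policy}"

# In Java policy syntax, a path ending in "/-" covers all files below that
# directory, recursively. This replaces the <<ALL FILES>> grant.
cat > "$POLICY_FILE" <<EOF
grant {
  permission java.io.FilePermission "${LIBVOIKKO_DIR}/-", "read";
  permission java.lang.reflect.ReflectPermission "newProxyInPackage.org.puimula.libvoikko";
};
EOF
```

If Elasticsearch then reports an access-control error while loading the library, compare the path in the error message with `LIBVOIKKO_DIR` and extend the grant accordingly.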

### Verify installation

After installing the plugin, you can quickly verify that it works by executing:

```
curl -XGET 'localhost:9200/_analyze' -H 'Content-Type: application/json' -d '
{
  "tokenizer" : "finnish",
  "filter" : [{"type": "voikko", "libraryPath": "/directory/of/libvoikko", "dictionaryPath": "/directory/of/voikko/dictionaries"}],
  "text" : "Testataan voikon analyysiä tällä tavalla yksinkertaisesti."
}'
```

If this works without error messages, you can proceed to configure the plugin for your index.

## Configuring

Include the `finnish` tokenizer and the `voikko` filter in your analyzer, for example:

```json
{
  "index": {
    "analysis": {
      "analyzer": {
        "default": {
          "tokenizer": "finnish",
          "filter": ["lowercase", "voikkoFilter"]
        }
      },
      "filter": {
        "voikkoFilter": {
          "type": "voikko"
        }
      }
    }
  }
}
```
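
One way to apply these settings is to pass them under `settings` when creating an index. The sketch below assumes Elasticsearch is listening on `localhost:9200`. The index name `finnish_docs` and the helper functions are hypothetical.

```shell
#!/bin/sh
# Assumptions: local Elasticsearch node and an illustrative index name.
ES_URL="${ES_URL:-localhost:9200}"
INDEX_NAME="${INDEX_NAME:-finnish_docs}"

# The analysis configuration from above, wrapped in a "settings" object.
INDEX_SETTINGS='{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "default": {
            "tokenizer": "finnish",
            "filter": ["lowercase", "voikkoFilter"]
          }
        },
        "filter": {
          "voikkoFilter": { "type": "voikko" }
        }
      }
    }
  }
}'

# Hypothetical helper: create the index with the settings above.
create_index() {
    curl -XPUT "$ES_URL/$INDEX_NAME" -H 'Content-Type: application/json' -d "$INDEX_SETTINGS"
}

# Hypothetical helper: run a sample text through the index's default analyzer.
analyze_sample() {
    curl -XGET "$ES_URL/$INDEX_NAME/_analyze" -H 'Content-Type: application/json' \
        -d '{"analyzer": "default", "text": "Testataan voikon analyysiä"}'
}
```

Run `create_index` once, then `analyze_sample` to inspect the tokens the analyzer produces.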

You can use the following filter options to customize the behaviour of the filter:

| Parameter | Default value | Description |
|-------------------|------------------|--------------------------------------------------|
| language          | fi_FI            | Language to use                                  |
| dictionaryPath    | system dependent | Path to Voikko dictionaries                      |
| analyzeAll        | false            | Use all analysis possibilities or just the first |
| minimumWordSize   | 3                | Minimum length of words to analyze               |
| maximumWordSize   | 100              | Maximum length of words to analyze               |
| libraryPath       | system dependent | Path to directory containing libvoikko           |
| poolMaxSize       | 10               | Maximum number of Voikko instances to pool       |
| analysisCacheSize | 1024             | Number of analysis results to cache              |

## Development

To run the tests, you need to set the `voikko.home` system property, which should point to
a directory containing the libvoikko shared library and a subdirectory `dicts` that contains
the [morpho dictionary](http://www.puimula.org/htp/testing/voikko-snapshot/dict-morpho.zip).
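
The directory layout described above could be prepared along these lines. This is a sketch, not part of the project's build: `VOIKKO_HOME`, `LIBVOIKKO_SO` and `DICT_ZIP` are hypothetical locations, and how the `voikko.home` property is passed to the test JVM depends on the build setup.

```shell
#!/bin/sh
# Assumptions: all three paths are placeholders to adjust for your system.
VOIKKO_HOME="${VOIKKO_HOME:-$PWD/voikko-home}"
LIBVOIKKO_SO="${LIBVOIKKO_SO:-/usr/lib/x86_64-linux-gnu/libvoikko.so.1}"
DICT_ZIP="${DICT_ZIP:-$PWD/dict-morpho.zip}"

# voikko.home must contain the shared library and a "dicts" subdirectory.
mkdir -p "$VOIKKO_HOME/dicts"

# Copy the library and unpack the dictionary only if they are present.
if [ -f "$LIBVOIKKO_SO" ]; then
    cp "$LIBVOIKKO_SO" "$VOIKKO_HOME/"
fi
if [ -f "$DICT_ZIP" ]; then
    unzip -o "$DICT_ZIP" -d "$VOIKKO_HOME/dicts"
fi

echo "voikko.home=$VOIKKO_HOME"
```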

## License

This library is released under the [Apache License, Version 2.0](http://apache.org/licenses/LICENSE-2.0).
2 changes: 1 addition & 1 deletion build.gradle
@@ -21,7 +21,7 @@ apply plugin: 'elasticsearch.esplugin'
apply plugin: 'nebula.maven-publish' // esplugin needs nebula
apply plugin: 'maven-publish'

version = '0.5.0'
version = '0.6.0'

javadoc {
options.encoding = 'UTF-8'