sherlok/sherlastic

Sherlastic Elasticsearch Plugin

Semantic enrichment for Elasticsearch using Sherlok.

Overview

Sherlastic is an Elasticsearch plugin that enriches an index with semantic information from Sherlok. Text mining is applied to every document being indexed and to every incoming search query, and the extracted semantic information can then be used to improve the relevance of search results.

Version

Version    Elasticsearch
master     1.6.X

Issues/Questions

Please file an issue.

Installation

Install Sherlastic Plugin

$ $ES_HOME/bin/plugin --install io.sherlok/sherlastic/1.6.0
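
To check that the installation worked, the plugin can be listed with the standard Elasticsearch 1.x tooling. This is a minimal sketch: the fallback path for ES_HOME and the localhost:9200 address are assumptions, so adjust them to your setup. Elasticsearch loads plugins at startup, so restart the node before running the second command.

```shell
# Assumption: fall back to a common install path if ES_HOME is not set.
ES_HOME="${ES_HOME:-/usr/share/elasticsearch}"

# List the plugins found in the local plugins directory.
"$ES_HOME/bin/plugin" --list || echo "could not list plugins under $ES_HOME" >&2

# Ask the running node which plugins it has loaded (requires a restart after installing).
curl -XGET 'localhost:9200/_cat/plugins?v' || echo "request failed; is Elasticsearch running?" >&2
```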

Getting Started

Add Sherlok Analyzer

First, you need to add a sherlok analyzer when creating your index:

$ curl -XPUT 'localhost:9200/my_index' -d '{
  "index":{
    "analysis":{
      "analyzer":{
        "sherlok_analyzer":{
          "type":"custom",
          "tokenizer":"standard",
          "filter":["sherlok"],
          "pipeline": "02.ruta.annotate.countries",
          "version" : "1",
          "mapping" : {
            "org.sherlok.ruta.Country:iso" : "country_iso"
          }
        }
      }
    }
  }
}'

You can change the tokenizer, char_filter and filter settings, but the sherlok filter must be the last filter in the chain.
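
As an illustration of the ordering rule, here is a sketch of the same analyzer with an additional token filter placed before sherlok. The index name my_index_lowercase is hypothetical, and lowercase is a built-in Elasticsearch filter chosen only as an example; whether normalizing tokens this way changes what the Sherlok pipeline detects is not covered here. The second request uses the standard analyze API to inspect what the analyzer emits for a sample sentence.

```shell
# Assumption: Elasticsearch 1.6.x is reachable on localhost:9200.
ES_URL="localhost:9200"

# Hypothetical index name, used only for this sketch.
INDEX="my_index_lowercase"

# Same analyzer as above, with "lowercase" added; "sherlok" stays last.
SETTINGS='{
  "index":{
    "analysis":{
      "analyzer":{
        "sherlok_analyzer":{
          "type":"custom",
          "tokenizer":"standard",
          "filter":["lowercase","sherlok"],
          "pipeline":"02.ruta.annotate.countries",
          "version":"1",
          "mapping":{
            "org.sherlok.ruta.Country:iso":"country_iso"
          }
        }
      }
    }
  }
}'

# Create the index with these analysis settings.
curl -XPUT "$ES_URL/$INDEX" -d "$SETTINGS" || echo "index creation failed" >&2

# Run a sample sentence through the analyzer and print the resulting tokens.
curl -XGET "$ES_URL/$INDEX/_analyze?analyzer=sherlok_analyzer&pretty" \
  -d 'Zurich is a city in Switzerland.' || echo "analyze request failed" >&2
```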

Add Sherlok field

Add a sherlok field to the index mapping:

$ curl -XPUT "localhost:9200/my_index/my_type/_mapping" -d '{
  "my_type":{
    "properties":{
      "message":{
        "type":"string",
        "copy_to":"sherlok_value"
      },
      "sherlok_value":{
        "type":"sherlok",
        "sherlok_analyzer":"sherlok_analyzer"
      }
    }
  }
}'
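
To confirm that the mapping was stored, it can be read back with the standard get-mapping API. This is a generic Elasticsearch call rather than anything specific to Sherlastic, and the localhost:9200 address is the same assumption used in the examples above. If the plugin is loaded, the response should list both message and sherlok_value under my_type.

```shell
# Assumption: Elasticsearch 1.6.x is reachable on localhost:9200.
ES_URL="localhost:9200"

# Mapping endpoint for the index and type used in this README.
MAPPING_URL="$ES_URL/my_index/my_type/_mapping?pretty"

# Fetch the stored mapping for my_type.
curl -XGET "$MAPPING_URL" || echo "mapping request failed" >&2
```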

Get Content created by Sherlok

Add the following document:

$ curl -XPUT "localhost:9200/my_index/my_type/1" -d '{
  "message":"Fess is Java based full text search server provided as OSS product."
}'

The sherlok value is computed automatically when the document is added. You can check it as follows:

$ curl -XGET "localhost:9200/my_index/my_type/1?pretty&fields=sherlok_value,_source" 

The response is:

{
  "_index" : "my_index",
  "_type" : "my_type",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "message" : "Fess is Java based full text search server provided as OSS product."
  },
  "fields" : {
    "sherlok_value" : [ "TODO" ]
  }
}
