Creating an ElasticSearch index from UIMA analysis results
We use the semedico-app UIMA pipeline(s) to create an ElasticSearch index from the results of UIMA NLP analysis.
The semedico-app uses the internal JULIE Lab database reader to retrieve the actual NLP analysis results from our PostgreSQL database. The database is filled using the jules-preprocessing-pipelines project.
The core mechanics for creating an ElasticSearch index are provided by the jules-cas-to-elasticsearch-consumer. This project offers code for creating Document objects with relatively little effort. Such documents are modelled closely on ElasticSearch / Lucene documents in that each document is essentially a collection of named fields. As such, a document can be thought of as a row in a conventional database table.
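To make the "collection of named fields" idea concrete, here is a minimal sketch in plain Java. NamedFieldsDocument and NamedFieldsDemo are hypothetical stand-ins written for this page, not the Document class from jules-cas-to-elasticsearch-consumer, and the field names and values are made up for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: NOT the Document class of jules-cas-to-elasticsearch-consumer.
// It only illustrates "a document is a collection of named fields".
final class NamedFieldsDocument {
    private final String id;
    // LinkedHashMap keeps the fields in insertion order, like columns in a table row.
    private final Map<String, Object> fields = new LinkedHashMap<>();

    NamedFieldsDocument(String id) {
        this.id = id;
    }

    // Adds or overwrites a field; returns this so calls can be chained.
    NamedFieldsDocument put(String name, Object value) {
        fields.put(name, value);
        return this;
    }

    String id() {
        return id;
    }

    Object get(String name) {
        return fields.get(name);
    }

    Map<String, Object> fields() {
        return Collections.unmodifiableMap(fields);
    }
}

final class NamedFieldsDemo {
    // Builds one example document; all values are illustrative only.
    static NamedFieldsDocument example() {
        return new NamedFieldsDocument("doc-1")
                .put("title", "An example title")
                // Unlike a strict table cell, a field may hold several values.
                .put("genes", List.of("BRCA1", "TP53"));
    }

    public static void main(String[] args) {
        NamedFieldsDocument doc = example();
        doc.fields().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

The row analogy holds for single-valued fields; multi-valued fields such as the list above are where an index document and a table row part ways.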
Documents are created by writing extensions of the FieldsGenerator class and using them with either JsonWriter (which creates JSON files and is well suited to developing FieldsGenerators) or ElasticSearchConsumer.
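The extension pattern described above might look roughly like the following sketch. Every type here (AnalyzedText, SketchFieldsGenerator, GeneFieldsGenerator, SketchJsonWriter) is a hypothetical stand-in and does not reproduce the real FieldsGenerator or JsonWriter signatures; in particular, the real classes presumably operate on a UIMA CAS rather than on a plain input object. The example data is made up.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for one analysed text; the real pipeline would read a UIMA CAS.
final class AnalyzedText {
    final String docId;
    final String text;
    final List<String> geneMentions;

    AnalyzedText(String docId, String text, List<String> geneMentions) {
        this.docId = docId;
        this.text = text;
        this.geneMentions = geneMentions;
    }
}

// Hypothetical base class: NOT the FieldsGenerator API of the consumer project.
abstract class SketchFieldsGenerator {
    // Adds this generator's fields to the document under construction.
    abstract void addFields(AnalyzedText input, Map<String, Object> document);
}

// One extension decides which fields a given index format contains.
final class GeneFieldsGenerator extends SketchFieldsGenerator {
    @Override
    void addFields(AnalyzedText input, Map<String, Object> document) {
        document.put("text", input.text);
        document.put("genes", input.geneMentions);
    }
}

// Hypothetical development helper that renders a document as a JSON string.
// Escaping is minimal (quotes and backslashes only); use a JSON library in real code.
final class SketchJsonWriter {
    static String toJson(Map<String, Object> document) {
        return document.entrySet().stream()
                .map(e -> quote(e.getKey()) + ":" + value(e.getValue()))
                .collect(Collectors.joining(",", "{", "}"));
    }

    private static String value(Object v) {
        if (v instanceof List) {
            return ((List<?>) v).stream()
                    .map(SketchJsonWriter::value)
                    .collect(Collectors.joining(",", "[", "]"));
        }
        return quote(String.valueOf(v));
    }

    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}

final class FieldsGeneratorDemo {
    static String run() {
        AnalyzedText input = new AnalyzedText(
                "doc-1", "BRCA1 interacts with TP53.", List.of("BRCA1", "TP53"));
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("id", input.docId);
        new GeneFieldsGenerator().addFields(input, document);
        return SketchJsonWriter.toJson(document);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Writing the output as JSON first, as this sketch does, lets the field layout be inspected before anything is sent to an index, which is the development use the JsonWriter is described as serving.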
Currently, we need to write an appropriate FieldsGenerator for the index format that GePi will use.