Elasticsearch 5.x (update the ES client) #74
Comments
There's no plan in place yet, and I'm still not sure what the best way is to keep compatibility with both ES 2.x and 5.x. We could start a branch for version 5.x for now, and maybe switch the client to use the REST API so it's not tied to any specific dependency version.
I can help with this change. Even if you use the REST API, there are some differences in the mapping, e.g. the string type was replaced by the text and keyword types. It may be necessary to create a property to specify the ES version used and implement both mappings. Regarding ES 5.x, in the future the crawler could have a new feature that uses the ingest node to improve indexing performance in some cases.
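To make the mapping difference concrete, here is a minimal sketch of how a version property could select the field type. The class `FieldMappings` and its methods are hypothetical names for illustration, not part of the crawler; the only facts relied on are that ES 1.x/2.x use `string` (with `index: not_analyzed` for exact values) and that ES 5.x uses `text` and `keyword`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: picks field mapping properties by Elasticsearch major version. */
final class FieldMappings {

    /**
     * Returns the mapping properties for one field.
     * analyzed = true means a full-text field; false means an exact-value field.
     */
    static Map<String, String> forField(int esMajorVersion, boolean analyzed) {
        Map<String, String> props = new LinkedHashMap<>();
        if (esMajorVersion >= 5) {
            // ES 5.x split "string" into "text" (analyzed) and "keyword" (exact value).
            props.put("type", analyzed ? "text" : "keyword");
        } else {
            // ES 1.x/2.x use "string"; exact-value fields are marked not_analyzed.
            props.put("type", "string");
            if (!analyzed) {
                props.put("index", "not_analyzed");
            }
        }
        return props;
    }

    /** Renders the properties as a compact JSON object. */
    static String toJson(Map<String, String> props) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (sb.length() > 1) {
                sb.append(',');
            }
            sb.append('"').append(e.getKey()).append("\":\"").append(e.getValue()).append('"');
        }
        return sb.append('}').toString();
    }
}
```

With something like this, `FieldMappings.forField(5, false)` would yield a `keyword` field, while the same call with version 2 would yield a `string` field marked `not_analyzed`. A real implementation would likely keep two full mapping templates rather than build them field by field.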
Right, the REST API would only allow us to remove Elasticsearch version-specific dependencies from the classpath. We would still require the changes you mentioned. We could also have a plugin architecture in place, so that we can have different plugins for different ES versions. But that will probably take more time and will happen only after we solve issue #70. Once we support ES 5.x, we can also include support for the ingest node feature. Any help is welcome!
Apparently the official ES REST client is compatible with all versions: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/_maven_repository.html |
Great! |
We use a code style derived from the Google Style Guide, but with 4 spaces for tabs. An Eclipse Code Formatter configuration file is available in the repository. Regarding the REST client, I still haven't had enough time to finish anything, but it looks like it's just a wrapper around Apache's HTTP async client, so it doesn't do anything special to keep compatibility with multiple ES versions. We need to handle changes in the API ourselves.
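Since the REST client does not hide version differences, one way to handle them is to ask the cluster which version it runs: the root endpoint (`GET /`) returns a JSON body with a `version.number` field. The sketch below only extracts the major version from such a body. `EsVersion` is a hypothetical name, and the regular expression is a shortcut that assumes the usual response shape; a JSON parser would be more robust.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reads the major version from the body returned by GET /. */
final class EsVersion {

    // Matches the leading digits of the version number, e.g. "number" : "5.1.1"
    private static final Pattern NUMBER =
            Pattern.compile("\"number\"\\s*:\\s*\"(\\d+)\\.");

    /** Returns the major version (e.g. 5 for "5.1.1"), or throws if none is found. */
    static int majorVersion(String rootResponseJson) {
        Matcher m = NUMBER.matcher(rootResponseJson);
        if (!m.find()) {
            throw new IllegalArgumentException("no version number found in response");
        }
        return Integer.parseInt(m.group(1));
    }
}
```

The result could then drive the choice between the 1.x/2.x and 5.x mappings, as an alternative to a manually configured version property.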
I've just pushed an initial working version that works with 5.x (also tested with 1.x) using the REST client in branch
target_storage.data_format.elasticsearch.rest.hosts:
- http://localhost:9200
target_storage.data_format.elasticsearch.rest.connect_timeout: 30000
target_storage.data_format.elasticsearch.rest.socket_timeout: 30000
target_storage.data_format.elasticsearch.rest.max_retry_timeout_millis: 90000
When any host is configured via It still lacks some more testing and I'm seeing timeout exceptions sometimes.
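For readers unsure what the timeout settings above control, here is a rough illustration using only the JDK's `HttpURLConnection`, not the actual ES REST client. It assumes `connect_timeout` bounds connection setup and `socket_timeout` bounds waiting for response data, both in milliseconds; `EsHttp` is a hypothetical name, and the retry timeout has no equivalent in this sketch.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical helper: prepares a GET request to an ES host with explicit timeouts. */
final class EsHttp {

    /**
     * Builds (but does not yet send) a GET request.
     * connectTimeoutMs: assumed counterpart of rest.connect_timeout.
     * socketTimeoutMs: assumed counterpart of rest.socket_timeout.
     */
    static HttpURLConnection open(String host, String path,
                                  int connectTimeoutMs, int socketTimeoutMs)
            throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(host + path).toURL().openConnection();
        conn.setConnectTimeout(connectTimeoutMs); // max time to establish the connection
        conn.setReadTimeout(socketTimeoutMs);     // max time to wait for response data
        conn.setRequestMethod("GET");
        return conn;
    }
}
```

Calling `EsHttp.open("http://localhost:9200", "/", 30000, 30000)` mirrors the values in the configuration; the request is only sent once the response is read, for example via `getResponseCode()`.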
Thank you, |
Thanks for testing! Please let me know if I can close this issue and merge it into branch
Hello @aecio! We would like to run a few more tests until Monday and then, if everything is working, we are ready to close the issue. Is that fine with you? Thanks!
Sure. I'm planning to release version 0.8 once this issue is closed. |
Hello @aecio, |
Hello,
I ran some tests and the crawler does not work with Elasticsearch 5.x.
The ES client is out of date. Is there any plan to support ES 5.x?
It would be great to support version 5.x and to have options for using some of the features provided by the ingest node.