Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Elasticsearch 5.x (update the ES client) #74

Closed
samuelzvir opened this issue Apr 12, 2017 · 12 comments
Closed

Elasticsearch 5.x (update the ES client) #74

samuelzvir opened this issue Apr 12, 2017 · 12 comments
Milestone

Comments

@samuelzvir
Copy link

Hello,

I made some tests and the crawler is not working with the version 5.x of the Elasticsearch.
The ES client is out of date. Is there any plan to support the ES version 5.x?
Would be great support the version 5.x and have some options to use some features provided by the ingestion node.

@aecio
Copy link
Member

aecio commented Apr 12, 2017

There's no plan in place yet, and I'm still not sure what's the best way to keep compatibility with both ES 2.x and 5.x. We could start a branch for version 5.x for now, and maybe switch the client to use the REST API so it's not tied to any specific dependency version.

@samuelzvir
Copy link
Author

I can help with this change, even if you use the REST API there some differences in the mapping i.e string type replaced by text and keywords. Maybe is necessary create a property to specify the ES version used and implement the two mappings. Regarding the ES 5.x, in the future, the crawler could have a new feature to use the ingest node to improve the indexing performance in some cases.

@aecio
Copy link
Member

aecio commented Apr 13, 2017

Right, the REST API would only allow to remove elasticsearch specific versions from the classpath. We still would require such changes you mentioned.

We could also have an plugin architecture in place, so that we can have different plugins for different ES versions. But that will probably take more time and will happen only after we solve issue #70.

Once we support ES 5.x, we can also include support for the ingest node feature. Any help is welcome!

@aecio
Copy link
Member

aecio commented Apr 14, 2017

Apparently the official ES REST client is compatible with all versions: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/_maven_repository.html
I'm gonna give it a try!

@samuelzvir
Copy link
Author

Great!
I made some tests with the Java client (current approach with version 5). But I agree with you, the es rest client has a better approach to keep compatibility.
Do you have a guide to contributions? i.e to follow the project patterns...
Have you had success with the es rest?

@aecio
Copy link
Member

aecio commented Apr 17, 2017

We use a code style derived from the Google Style Guide, but with 4 spaces for tabs. A Eclipse Code Formatter configuration file is available in the repository.

Regarding the REST client, I still haven't had enough time to finish anything, but it looks like it's just a wrapper around Apache's HTTP async client, so it doesn't do anything special to keep compatibility with multiple ES versions. We need to handle changes in the API ourselves.

@aecio
Copy link
Member

aecio commented Apr 18, 2017

I've just pushed an initial working version that works with 5.x (also tested with 1.x) using the REST client in branch aecio/issue-74 (176a6cd). Currently, it can be configured using the following parameters in ache.yml:

target_storage.data_format.elasticsearch.rest.hosts:
  - http://localhost:9200
target_storage.data_format.elasticsearch.rest.connect_timeout: 30000
target_storage.data_format.elasticsearch.rest.socket_timeout: 30000
target_storage.data_format.elasticsearch.rest.max_retry_timeout_millis: 90000

When any host is configured via target_storage.data_format.elasticsearch.rest.hosts, it switches from the transport client (v1.x) to the REST client that is compatible with version v5.x.

It still lacks some more testing and I'm seeing timeout exceptions sometimes.

@samuelzvir
Copy link
Author

Thank you,
I am indexing with this configuration and is working. Also, @jefersonbraun can give a feedback if we get some exceptions.

@aecio aecio added this to the 0.8 milestone Apr 19, 2017
@aecio
Copy link
Member

aecio commented Apr 20, 2017

Thanks for testing! Please, let met know if I can close this issue and merge it into branch master.

@jefersonbraun
Copy link

Hello @aecio ! We would like to make a few more tests until Monday and then, if everything is working, we are ready to close the issue. Is that fine for you? Thanks!

@aecio
Copy link
Member

aecio commented Apr 20, 2017

Sure. I'm planning to release version 0.8 once this issue is closed.

@samuelzvir
Copy link
Author

Hello @aecio,
Ready to merge.
Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants