-
-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[timeseries] Add initial support for elasticsearch #99 #164
base: master
Are you sure you want to change the base?
[timeseries] Add initial support for elasticsearch #99 #164
Conversation
be49865
to
184f7a6
Compare
184f7a6
to
d383f17
Compare
6205616
to
de11205
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good progress @nepython, see my comments below.
@@ -385,7 +415,12 @@ call in your custom code (eg: a custom check class), you can do so as follows: | |||
"MEAN(buffered) AS buffered FROM {key} WHERE time >= '{time}' AND " | |||
"content_type = '{content_type}' AND object_id = '{object_id}' " | |||
"GROUP BY time(1d)" | |||
) | |||
), | |||
'elasticsearch': _make_query({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't you change the code so that calling this functon from the configuration is not necessary?
You can loop over the data structure and call it when it's initialized, this way we make things easy for users and we avoid them come to complain to us in the support channels 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that can be done and I have done the same for built-in charts. _make_query
is just a utility function which will update the aggregation for a default_chart_query
defined in openwisp_monitoring.db.backends.elasticsearch.queries. So queries returned via _make_query
will always retain the structure of default_chart_query
.
I wanted to leave the option of directly using a dsl query with timeseries_db.query
, exactly like how we can query InfluxDB directly using the same function.
A full dsl query will look like this,
{'query': {'nested': {'path': 'tags',
'query': {'bool': {'must': [{'match': {'tags.object_id': {'query': '9a39a5ae-146b-4a50-b113-f9381b8c1721'}}},
{'match': {'tags.content_type': {'query': 'config.device'}}}]}}}},
'_source': False,
'size': 0,
'aggs': {'GroupByTime': {'nested': {'path': 'points',
'aggs': {'set_range': {'filter': {'range': {'points.time': {'from': 'now-1d/d',
'to': 'now/d'}}},
'aggs': {'time': {'date_histogram': {'field': 'points.time',
'fixed_interval': '10m',
'format': 'date_time_no_millis',
'order': {'_key': 'desc'},
'time_zone': 'Asia/Kolkata'},
'aggs': {'nest': {'nested': {'path': 'points.fields',
'aggs': {'CPU_load': {'avg': {'field': 'points.fields.cpu_usage'}}}}}}}}}}}}}}
So, if we make _make_query
as a compulsion, we might be cutting down a user's freedom to query via DSL. Personally, I would like to give user this freedom (this would enable him to just put a query like above in chart configuration and it will work) 😄.
c789856
to
e07bc63
Compare
@@ -22,6 +24,45 @@ services: | |||
INFLUXDB_DB: openwisp2 | |||
INFLUXDB_USER: openwisp | |||
INFLUXDB_USER_PASSWORD: openwisp | |||
# clustered version of elasticsearch is used as that might be used in production | |||
es01: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to run Elasticsearch in a High Available environment. Testing HA capabilities is ElasticSearch's job 😃 . We can simply make sure that setting up a multi-nodes cluster works but IMHO it is enough for us to run tests in only one instance.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree though there are some problems that I was facing with elasticsearch docker due to which too I am using two nodes 😅. Can you please check out if it's possible to run elasticsearch on a single port (I am not sure about this as I could not :/ ) and then I can adapt. Thanks!
7bad1b0
to
e98535a
Compare
Currently, things to do:~
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, things to do:~
- Fix cardinality aggregation returning an extra value for
wifi_clients
causingtest_get_device_metrics_csv
to fail.
Please resolve this asap.
- Elasticsearch tests are currently consuming
~120s
to run. This can be reduced to75s
by retaining indices. There are other optimizations too which shall be implemented together in [timeseries] Optimize elasticsearch #168.
We may do this in a second step. We make it work first, then we optimize.
e98535a
to
9c53057
Compare
I need help, mentioned in opening comment 😬. I am still trying to resolve but couldn't find any fix, will question this on
Ok |
I see that you mentioned it, can you please write a detailed explanation of what is going on and provide links to the external info you mention? |
b5a9d0d
to
9a80da6
Compare
9a80da6
to
fa5759a
Compare
fa5759a
to
319ec3d
Compare
93039cf
to
91a648f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nepython how's it going with fixing the build here?
Please focus on completing this
@nemesisdesign, definitely I am working to fix the build, working mostly on https://github.com/nepython/openwisp-monitoring/tree/issues/99-add-initial-support-second-timeseries_db to try out minor changes but so far no good. I asked a query on S/O in the morning (due to no response on elasticsearch discussion forum 😶), it can be found at https://stackoverflow.com/questions/63590693/elasticsearch-cardinality-aggregation-returning-one-extra-count. As far as I can visualize, the clients Chart sometimes returns an extra count and sometimes it doesn't. In the current travis build one test is failing due to this. Locally, that one test is always passing now (but it was failing earlier 5-6 days ago just don't know why it is passing now :/). What I am trying to figure out right now is why sometimes the clients charts returns an extra count and correct count the other times. Is this even an issue on our end or on elasticsearch's (unsure)? I posted a detailed status on the PR and a request too to provide me with some JSON data or access to any instance in which the user is ready to experiment with elasticsearch (as you had mentioned in our last call) few days back on the IM channel https://gitter.im/openwisp/openwisp-monitoring?at=5f3f709f49148b41c9619185 and reminded the same personally but didn't receive any response 😓. Right now if we are in a hurry to merge this PR, we can do two things:~
This is a big add-on and something I have worked really hard upon so would really like to see this getting merged though even after a month I am not able to solve this one single issue (which is blocking everything else) :(. I am determined to fix this issue even after GSoC and see this module getting released and be helpful to others 😃. |
1a5f91b
to
624987b
Compare
624987b
to
044a29f
Compare
Checks:
Closes #99, #68
Need Help
There is one test (test_get_device_metrics_csv) failing on travis (it's probably caused by
cardinality aggregation
countingNone
fromdate_histogram
as a value). I am unable to find a solution to this. In real case this should be rare as can be confirmed by the passing oftest_wifi_hostapd
.PS: Query posted on https://discuss.elastic.co/t/cardinality-aggregation-always-returning-one-extra-count/243462