Spike - Indexer performance on different Data Persistence Model designs #255
Comments
OpenSearch Benchmark

Currently reading the docs to understand how OSB works. Some notes:

I managed to run OSB locally using Pyenv. Details:
(.venv) @alex-GL66 ➜ opensearch-benchmark python3 -m venv .venv; source .venv/bin/activate
(.venv) @alex-GL66 ➜ opensearch-benchmark pip install opensearch-benchmark
(.venv) @alex-GL66 ➜ opensearch-benchmark export JAVA17_HOME=/usr/lib/jvm/temurin-17-jdk-amd64
(.venv) @alex-GL66 ➜ opensearch-benchmark opensearch-benchmark execute-test --distribution-version=2.13.0 --workload percolator --test-mode
____ _____ __ ____ __ __
/ __ \____ ___ ____ / ___/___ ____ ___________/ /_ / __ )___ ____ _____/ /_ ____ ___ ____ ______/ /__
/ / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \ / __ / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ / __/ / / /__/ / __/ /_/ / / / /__/ / / / / /_/ / __/ / / / /__/ / / / / / / / / /_/ / / / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/ \___/_/ /_/ /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/ /_/|_|
/_/
[INFO] [Test Execution ID]: cf58479a-77c9-4694-8b88-bfee848cdfa6
[INFO] Preparing for test execution ...
[INFO] Downloading OpenSearch 2.13.0 (844.4 MB total size) [100%]
[INFO] Downloading workload data (191 bytes total size) [100.0%]
[INFO] Decompressing workload data from [/home/alex/wazuh/opensearch-benchmark/.benchmark/benchmarks/data/percolator/queries-2-1k.json.bz2] to [/home/alex/wazuh/opensearch-benchmark/.benchmark/benchmarks/data/percolator/queries-2-1k.json] ... [OK]
[INFO] Preparing file offset table for [/home/alex/wazuh/opensearch-benchmark/.benchmark/benchmarks/data/percolator/queries-2-1k.json] ... [OK]
[INFO] Executing test with workload [percolator], test_procedure [append-no-conflicts] and provision_config_instance ['defaults'] with version [2.13.0].
Running delete-index [100% done]
Running create-index [100% done]
Running check-cluster-health [100% done]
Running index [100% done]
Running refresh-after-index [100% done]
Running force-merge [100% done]
Running refresh-after-force-merge [100% done]
Running wait-until-merges-finish [100% done]
Running percolator_with_content_president_bush [100% done]
Running percolator_with_content_saddam_hussein [100% done]
Running percolator_with_content_hurricane_katrina [100% done]
Running percolator_with_content_google [100% done]
Running percolator_no_score_with_content_google [100% done]
Running percolator_with_highlighting [100% done]
Running percolator_with_content_ignore_me [100% done]
Running percolator_no_score_with_content_ignore_me [100% done]
------------------------------------------------------
_______ __ _____
/ ____(_)___ ____ _/ / / ___/_________ ________
/ /_ / / __ \/ __ `/ / \__ \/ ___/ __ \/ ___/ _ \
/ __/ / / / / / /_/ / / ___/ / /__/ /_/ / / / __/
/_/ /_/_/ /_/\__,_/_/ /____/\___/\____/_/ \___/
------------------------------------------------------
| Metric | Task | Value | Unit |
|---------------------------------------------------------------:|-------------------------------------------:|------------:|-------:|
| Cumulative indexing time of primary shards | | 0.0122667 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 0.00209167 | min |
| Max cumulative indexing time across primary shards | | 0.004 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 0 | min |
| Cumulative merge count of primary shards | | 0 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0 | min |
| Max cumulative merge time across primary shards | | 0 | min |
| Cumulative merge throttle time of primary shards | | 0 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 0 | min |
| Cumulative refresh time of primary shards | | 0.00226667 | min |
| Cumulative refresh count of primary shards | | 30 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 0.000358333 | min |
| Max cumulative refresh time across primary shards | | 0.0007 | min |
| Cumulative flush time of primary shards | | 0 | min |
| Cumulative flush count of primary shards | | 0 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 0 | min |
| Max cumulative flush time across primary shards | | 0 | min |
| Total Young Gen GC time | | 0 | s |
| Total Young Gen GC count | | 0 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 4.31528e-05 | GB |
| Translog size | | 3.07336e-07 | GB |
| Heap used for segments | | 0 | MB |
| Heap used for doc values | | 0 | MB |
| Heap used for terms | | 0 | MB |
| Heap used for norms | | 0 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0 | MB |
| Segment count | | 22 | |
| Min Throughput | index | 10299.1 | docs/s |
| Mean Throughput | index | 10299.1 | docs/s |
| Median Throughput | index | 10299.1 | docs/s |
| Max Throughput | index | 10299.1 | docs/s |
| 50th percentile latency | index | 81.7099 | ms |
| 100th percentile latency | index | 91.5731 | ms |
| 50th percentile service time | index | 81.7099 | ms |
| 100th percentile service time | index | 91.5731 | ms |
| error rate | index | 0 | % |
| Min Throughput | wait-until-merges-finish | 72.58 | ops/s |
| Mean Throughput | wait-until-merges-finish | 72.58 | ops/s |
| Median Throughput | wait-until-merges-finish | 72.58 | ops/s |
| Max Throughput | wait-until-merges-finish | 72.58 | ops/s |
| 100th percentile latency | wait-until-merges-finish | 13.1389 | ms |
| 100th percentile service time | wait-until-merges-finish | 13.1389 | ms |
| error rate | wait-until-merges-finish | 0 | % |
| Min Throughput | percolator_with_content_president_bush | 32.24 | ops/s |
| Mean Throughput | percolator_with_content_president_bush | 32.24 | ops/s |
| Median Throughput | percolator_with_content_president_bush | 32.24 | ops/s |
| Max Throughput | percolator_with_content_president_bush | 32.24 | ops/s |
| 100th percentile latency | percolator_with_content_president_bush | 37.6739 | ms |
| 100th percentile service time | percolator_with_content_president_bush | 6.40732 | ms |
| error rate | percolator_with_content_president_bush | 0 | % |
| Min Throughput | percolator_with_content_saddam_hussein | 115.68 | ops/s |
| Mean Throughput | percolator_with_content_saddam_hussein | 115.68 | ops/s |
| Median Throughput | percolator_with_content_saddam_hussein | 115.68 | ops/s |
| Max Throughput | percolator_with_content_saddam_hussein | 115.68 | ops/s |
| 100th percentile latency | percolator_with_content_saddam_hussein | 14.9318 | ms |
| 100th percentile service time | percolator_with_content_saddam_hussein | 5.95973 | ms |
| error rate | percolator_with_content_saddam_hussein | 0 | % |
| Min Throughput | percolator_with_content_hurricane_katrina | 84.38 | ops/s |
| Mean Throughput | percolator_with_content_hurricane_katrina | 84.38 | ops/s |
| Median Throughput | percolator_with_content_hurricane_katrina | 84.38 | ops/s |
| Max Throughput | percolator_with_content_hurricane_katrina | 84.38 | ops/s |
| 100th percentile latency | percolator_with_content_hurricane_katrina | 18.1493 | ms |
| 100th percentile service time | percolator_with_content_hurricane_katrina | 5.96843 | ms |
| error rate | percolator_with_content_hurricane_katrina | 0 | % |
| Min Throughput | percolator_with_content_google | 47.06 | ops/s |
| Mean Throughput | percolator_with_content_google | 47.06 | ops/s |
| Median Throughput | percolator_with_content_google | 47.06 | ops/s |
| Max Throughput | percolator_with_content_google | 47.06 | ops/s |
| 100th percentile latency | percolator_with_content_google | 27.8973 | ms |
| 100th percentile service time | percolator_with_content_google | 6.37702 | ms |
| error rate | percolator_with_content_google | 0 | % |
| Min Throughput | percolator_no_score_with_content_google | 101.72 | ops/s |
| Mean Throughput | percolator_no_score_with_content_google | 101.72 | ops/s |
| Median Throughput | percolator_no_score_with_content_google | 101.72 | ops/s |
| Max Throughput | percolator_no_score_with_content_google | 101.72 | ops/s |
| 100th percentile latency | percolator_no_score_with_content_google | 17.8059 | ms |
| 100th percentile service time | percolator_no_score_with_content_google | 7.73091 | ms |
| error rate | percolator_no_score_with_content_google | 0 | % |
| Min Throughput | percolator_with_highlighting | 81.3 | ops/s |
| Mean Throughput | percolator_with_highlighting | 81.3 | ops/s |
| Median Throughput | percolator_with_highlighting | 81.3 | ops/s |
| Max Throughput | percolator_with_highlighting | 81.3 | ops/s |
| 100th percentile latency | percolator_with_highlighting | 20.5377 | ms |
| 100th percentile service time | percolator_with_highlighting | 7.81483 | ms |
| error rate | percolator_with_highlighting | 0 | % |
| Min Throughput | percolator_with_content_ignore_me | 17.47 | ops/s |
| Mean Throughput | percolator_with_content_ignore_me | 17.47 | ops/s |
| Median Throughput | percolator_with_content_ignore_me | 17.47 | ops/s |
| Max Throughput | percolator_with_content_ignore_me | 17.47 | ops/s |
| 100th percentile latency | percolator_with_content_ignore_me | 85.7778 | ms |
| 100th percentile service time | percolator_with_content_ignore_me | 28.0983 | ms |
| error rate | percolator_with_content_ignore_me | 0 | % |
| Min Throughput | percolator_no_score_with_content_ignore_me | 54.39 | ops/s |
| Mean Throughput | percolator_no_score_with_content_ignore_me | 54.39 | ops/s |
| Median Throughput | percolator_no_score_with_content_ignore_me | 54.39 | ops/s |
| Max Throughput | percolator_no_score_with_content_ignore_me | 54.39 | ops/s |
| 100th percentile latency | percolator_no_score_with_content_ignore_me | 26.549 | ms |
| 100th percentile service time | percolator_no_score_with_content_ignore_me | 7.92226 | ms |
| error rate | percolator_no_score_with_content_ignore_me | 0 | % |
--------------------------------
[INFO] SUCCESS (took 89 seconds)
--------------------------------

We'll work on creating a Vagrant environment with 3 OpenSearch nodes and OpenSearch Benchmark installed on each of them to perform the tests. See Running distributed loads.
Update

With the following command, we can run the default http_logs workload against a running cluster. Note: this operation is time consuming.

opensearch-benchmark execute-test --pipeline=benchmark-only --workload=http_logs --target-host=https://localhost:9200 --client-options=basic_auth_user:admin,basic_auth_password:"${OPENSEARCH_INITIAL_ADMIN_PASSWORD}",verify_certs:false

There are 2 ways of creating custom workloads: generating one from an existing cluster with the create-workload command, or writing the workload files from scratch.
I tested creating a workload from an existing cluster, for which I used a test AIO deployment with real-world data. I used this command:

opensearch-benchmark create-workload \
--workload="wazuh-test" \
--target-hosts="https://localhost:9200" \
--client-options="basic_auth_user:'admin',basic_auth_password:'admin',verify_certs:false" \
--indices="wazuh-alerts-4.x-2024.04.22" \
--output-path="./wazuh-workload"

I then ran the test:

opensearch-benchmark execute-test \
--pipeline="benchmark-only" \
--workload-path="./wazuh-workload/wazuh-test" \
--target-host="https://localhost:9200" \
--client-options="basic_auth_user:'admin',basic_auth_password:'admin',verify_certs:false"

Below is the result:

# ./run_custom_workload.sh
____ _____ __ ____ __ __
/ __ \____ ___ ____ / ___/___ ____ ___________/ /_ / __ )___ ____ _____/ /_ ____ ___ ____ ______/ /__
/ / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \ / __ / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ / __/ / / /__/ / __/ /_/ / / / /__/ / / / / /_/ / __/ / / / /__/ / / / / / / / / /_/ / / / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/ \___/_/ /_/ /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/ /_/|_|
/_/
[INFO] [Test Execution ID]: 586e8225-db0d-4f26-bcb8-ce616f6b8ec6
[INFO] You did not provide an explicit timeout in the client options. Assuming default of 10 seconds.
[INFO] Executing test with workload [wazuh-test], test_procedure [default-test-procedure] and provision_config_instance ['external'] with version [7.10.2].
[WARNING] merges_total_time is 93391 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] indexing_total_time is 59258 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] refresh_total_time is 559684 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] flush_total_time is 21547 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
Running delete-index [100% done]
Running create-index [100% done]
Running cluster-health [100% done]
Running index-append [100% done]
Running refresh-after-index [100% done]
Running force-merge [100% done]
Running refresh-after-force-merge [100% done]
Running wait-until-merges-finish [100% done]
Running match-all [100% done]
------------------------------------------------------
_______ __ _____
/ ____(_)___ ____ _/ / / ___/_________ ________
/ /_ / / __ \/ __ `/ / \__ \/ ___/ __ \/ ___/ _ \
/ __/ / / / / / /_/ / / ___/ / /__/ /_/ / / / __/
/_/ /_/_/ /_/\__,_/_/ /____/\___/\____/_/ \___/
------------------------------------------------------
| Metric | Task | Value | Unit |
|---------------------------------------------------------------:|-------------------------:|-----------:|-------:|
| Cumulative indexing time of primary shards | | 0.985733 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 0 | min |
| Max cumulative indexing time across primary shards | | 0.142783 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 1.56205 | min |
| Cumulative merge count of primary shards | | 6115 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0 | min |
| Max cumulative merge time across primary shards | | 0.178017 | min |
| Cumulative merge throttle time of primary shards | | 0 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 0 | min |
| Cumulative refresh time of primary shards | | 9.35108 | min |
| Cumulative refresh count of primary shards | | 57555 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 0 | min |
| Max cumulative refresh time across primary shards | | 1.29032 | min |
| Cumulative flush time of primary shards | | 0.3596 | min |
| Cumulative flush count of primary shards | | 726 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 0 | min |
| Max cumulative flush time across primary shards | | 0.122833 | min |
| Total Young Gen GC time | | 0.014 | s |
| Total Young Gen GC count | | 1 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 0.116531 | GB |
| Translog size | | 0.00978717 | GB |
| Heap used for segments | | 0 | MB |
| Heap used for doc values | | 0 | MB |
| Heap used for terms | | 0 | MB |
| Heap used for norms | | 0 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0 | MB |
| Segment count | | 722 | |
| Min Throughput | index-append | 6846.26 | docs/s |
| Mean Throughput | index-append | 6846.26 | docs/s |
| Median Throughput | index-append | 6846.26 | docs/s |
| Max Throughput | index-append | 6846.26 | docs/s |
| 50th percentile latency | index-append | 240.359 | ms |
| 100th percentile latency | index-append | 243.049 | ms |
| 50th percentile service time | index-append | 240.359 | ms |
| 100th percentile service time | index-append | 243.049 | ms |
| error rate | index-append | 0 | % |
| Min Throughput | wait-until-merges-finish | 24.51 | ops/s |
| Mean Throughput | wait-until-merges-finish | 24.51 | ops/s |
| Median Throughput | wait-until-merges-finish | 24.51 | ops/s |
| Max Throughput | wait-until-merges-finish | 24.51 | ops/s |
| 100th percentile latency | wait-until-merges-finish | 37.1372 | ms |
| 100th percentile service time | wait-until-merges-finish | 37.1372 | ms |
| error rate | wait-until-merges-finish | 0 | % |
| Min Throughput | match-all | 3.02 | ops/s |
| Mean Throughput | match-all | 3.03 | ops/s |
| Median Throughput | match-all | 3.03 | ops/s |
| Max Throughput | match-all | 3.05 | ops/s |
| 50th percentile latency | match-all | 6.85162 | ms |
| 90th percentile latency | match-all | 7.55348 | ms |
| 99th percentile latency | match-all | 8.55737 | ms |
| 100th percentile latency | match-all | 9.84485 | ms |
| 50th percentile service time | match-all | 4.84304 | ms |
| 90th percentile service time | match-all | 5.46714 | ms |
| 99th percentile service time | match-all | 6.36302 | ms |
| 100th percentile service time | match-all | 7.92025 | ms |
| error rate | match-all | 0 | % |
---------------------------------
[INFO] SUCCESS (took 109 seconds)
---------------------------------
It looks like we can indeed run tasks concurrently, using the parallel element in a test procedure's schedule.

Using the method to create a custom workload described above, I created a workload with the following test procedures:

root@os-benchmarks:~# cat benchmarks/wazuh-alerts/test_procedures/default.json
{
"name": "parallel-any",
"description": "Workload completed-by property",
"schedule": [
{
"parallel": {
"tasks": [
{
"name": "parellel-task-1",
"operation": {
"operation-type": "bulk",
"bulk-size": 1000
},
"clients": 100
},
{
"name": "parellel-task-2",
"operation": {
"operation-type": "bulk",
"bulk-size": 1000
},
"clients": 100
}
]
}
}
]
}

This was run with the following Docker environment:

services:
opensearch-benchmark:
image: opensearchproject/opensearch-benchmark:1.6.0
hostname: opensearch-benchmark
depends_on:
opensearch-node1:
condition: service_healthy
permissions-setter:
condition: service_completed_successfully
container_name: opensearch-benchmark
volumes:
- ./benchmarks:/opensearch-benchmark/.benchmark
environment:
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD}
#command: execute-test --target-hosts https://opensearch-node1:9200 --pipeline benchmark-only --workload geonames --client-options basic_auth_user:admin,basic_auth_password:${OPENSEARCH_INITIAL_ADMIN_PASSWORD},verify_certs:false --test-mode
command: execute-test --pipeline="benchmark-only" --workload-path="/opensearch-benchmark/.benchmark/wazuh-alerts" --target-host="https://opensearch-node1:9200" --client-options="basic_auth_user:admin,basic_auth_password:${OPENSEARCH_INITIAL_ADMIN_PASSWORD},verify_certs:false"
networks:
- opensearch-net
opensearch-node1: # This is also the hostname of the container within the Docker network (i.e. https://opensearch-node1/)
image: opensearchproject/opensearch:2.14.0
container_name: opensearch-node1
hostname: opensearch-node1
environment:
- cluster.name=opensearch-cluster # Name the cluster
- node.name=opensearch-node1 # Name the node that will run in this container
- discovery.seed_hosts=opensearch-node1,opensearch-node2 # Nodes to look for when discovering the cluster
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2 # Nodes eligible to serve as cluster manager
- bootstrap.memory_lock=true # Disable JVM heap memory swapping
- "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # Set min and max JVM heap sizes to at least 50% of system RAM
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD} # Sets the demo admin user password when using demo configuration (for OpenSearch 2.12 and later)
ulimits:
memlock:
soft: -1 # Set memlock to unlimited (no soft or hard limit)
hard: -1
nofile:
soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536
hard: 65536
volumes:
- opensearch-data1:/usr/share/opensearch/data # Creates volume called opensearch-data1 and mounts it to the container
healthcheck:
test: curl -sku admin:${OPENSEARCH_INITIAL_ADMIN_PASSWORD} https://localhost:9200/_cat/health | grep -q opensearch-cluster
start_period: 10s
start_interval: 3s
ports:
- 9200:9200 # REST API
- 9600:9600 # Performance Analyzer
networks:
- opensearch-net # All of the containers will join the same Docker bridge network
opensearch-node2:
image: opensearchproject/opensearch:2.14.0 # This should be the same image used for opensearch-node1 to avoid issues
container_name: opensearch-node2
hostname: opensearch-node2
environment:
- cluster.name=opensearch-cluster
- node.name=opensearch-node2
- discovery.seed_hosts=opensearch-node1,opensearch-node2
- cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2
- bootstrap.memory_lock=true
- "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD}
ulimits:
memlock:
soft: -1
hard: -1
nofile:
soft: 65536
hard: 65536
volumes:
- opensearch-data2:/usr/share/opensearch/data
networks:
- opensearch-net
opensearch-dashboards:
image: opensearchproject/opensearch-dashboards:2.14.0 # Make sure the version of opensearch-dashboards matches the version of opensearch installed on other nodes
container_name: opensearch-dashboards
depends_on:
opensearch-node1:
condition: service_healthy
ports:
- 5601:5601 # Map host port 5601 to container port 5601
expose:
- "5601" # Expose port 5601 for web access to OpenSearch Dashboards
environment:
OPENSEARCH_HOSTS: '["https://opensearch-node1:9200","https://opensearch-node2:9200"]' # Define the OpenSearch nodes that OpenSearch Dashboards will query
networks:
- opensearch-net
permissions-setter:
image: alpine:3.14
container_name: permissions-setter
volumes:
- ./benchmarks:/benchmark
entrypoint: /bin/sh
command: >
-c '
chmod -R a+rw /benchmark
'
volumes:
opensearch-data1:
opensearch-data2:
networks:
  opensearch-net:

Below are the results of the test:

/ __ \____ ___ ____ / ___/___ ____ ___________/ /_ / __ )___ ____ _____/ /_ ____ ___ ____ ______/ /__
/ / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \ / __ / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ / __/ / / /__/ / __/ /_/ / / / /__/ / / / / /_/ / __/ / / / /__/ / / / / / / / / /_/ / / / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/ \___/_/ /_/ /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/ /_/|_|
/_/
[INFO] [Test Execution ID]: 41430721-7ced-41b4-b363-8eaf19f73221
[INFO] You did not provide an explicit timeout in the client options. Assuming default of 10 seconds.
[WARNING] refresh_total_time is 6 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
Running parellel-task-2,parellel-task-1 [100% done]
[INFO] Executing test with workload [wazuh-alerts], test_procedure [parallel-any] and provision_config_instance ['external'] with version [2.14.0].
------------------------------------------------------
_______ __ _____
/ ____(_)___ ____ _/ / / ___/_________ ________
/ /_ / / __ \/ __ `/ / \__ \/ ___/ __ \/ ___/ _ \
/ __/ / / / / / /_/ / / ___/ / /__/ /_/ / / / __/
/_/ /_/_/ /_/\__,_/_/ /____/\___/\____/_/ \___/
------------------------------------------------------
| Metric | Task | Value | Unit |
|---------------------------------------------------------------:|----------------:|------------:|-------:|
| Cumulative indexing time of primary shards | | 0.0863 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 0 | min |
| Max cumulative indexing time across primary shards | | 0.0863 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 0 | min |
| Cumulative merge count of primary shards | | 0 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0 | min |
| Max cumulative merge time across primary shards | | 0 | min |
| Cumulative merge throttle time of primary shards | | 0 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 0 | min |
| Cumulative refresh time of primary shards | | 0.0001 | min |
| Cumulative refresh count of primary shards | | 77 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 0 | min |
| Max cumulative refresh time across primary shards | | 8.33333e-05 | min |
| Cumulative flush time of primary shards | | 0 | min |
| Cumulative flush count of primary shards | | 0 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 0 | min |
| Max cumulative flush time across primary shards | | 0 | min |
| Total Young Gen GC time | | 0.086 | s |
| Total Young Gen GC count | | 7 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 0.0571623 | GB |
| Translog size | | 0.0342067 | GB |
| Heap used for segments | | 0 | MB |
| Heap used for doc values | | 0 | MB |
| Heap used for terms | | 0 | MB |
| Heap used for norms | | 0 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0 | MB |
| Segment count | | 73 | |
| Min Throughput | parellel-task-1 | 6930.8 | docs/s |
| Mean Throughput | parellel-task-1 | 6930.8 | docs/s |
| Median Throughput | parellel-task-1 | 6930.8 | docs/s |
| Max Throughput | parellel-task-1 | 6930.8 | docs/s |
| 50th percentile latency | parellel-task-1 | 963.212 | ms |
| 100th percentile latency | parellel-task-1 | 1050.9 | ms |
| 50th percentile service time | parellel-task-1 | 963.212 | ms |
| 100th percentile service time | parellel-task-1 | 1050.9 | ms |
| error rate | parellel-task-1 | 0 | % |
| Min Throughput | parellel-task-2 | 752.03 | docs/s |
| Mean Throughput | parellel-task-2 | 752.03 | docs/s |
| Median Throughput | parellel-task-2 | 752.03 | docs/s |
| Max Throughput | parellel-task-2 | 752.03 | docs/s |
| 50th percentile latency | parellel-task-2 | 991.137 | ms |
| 100th percentile latency | parellel-task-2 | 1094.09 | ms |
| 50th percentile service time | parellel-task-2 | 991.137 | ms |
| 100th percentile service time | parellel-task-2 | 1094.09 | ms |
| error rate | parellel-task-2 | 0 | % |
--------------------------------
[INFO] SUCCESS (took 16 seconds)
--------------------------------

The indices on the cluster after the test:

root@os-benchmarks:~# curl -ku admin:Secret.Password.1234 https://localhost:9200/_cat/indices?s=store.size
green open .opensearch-observability VEodRP5XRWCaUTyIxp947g 1 1 0 0 416b 208b
green open .ql-datasources kKh4Hp4HQeaM17jF-h4ZFg 1 1 0 0 416b 208b
green open .plugins-ml-config KqwdywM0QpGBHGDBgxTsPA 1 1 1 0 7.8kb 3.9kb
green open .kibana_92668751_admin_1 75yGMhz_S42b_lxukdV5zA 1 1 1 0 10.3kb 5.1kb
green open .kibana_1 ehjT55saT0S_2ragBN9O_g 1 1 1 0 10.3kb 5.1kb
green open .opendistro_security tHeF1aZ6SImXyyB_TGVeDA 1 1 10 0 97.8kb 48.9kb
green open security-auditlog-2024.06.12 iGBt812KR4CMYZR0WAlprA 1 1 55 0 143.3kb 63.9kb
green open queries WaVFFYN-QDyU4WCO-kDdPA 5 0 1000 0 196.2kb 196.2kb
green open security-auditlog-2024.06.25 9QI-2iC7QM2ocX6LvfHHBw 1 1 263 0 530.9kb 274.2kb
green open wazuh-alerts-4.x-2024.05 ji1Q8AvcQHePD2LeSSbRDg 1 1 31480 0 47.6mb 22.9mb
An OpenSearch Benchmark workload can run various types of operations. The bulk operation seems to be the only one at the document level, and I cannot find any mention of it being capable of bulk-updating or bulk-deleting documents. There still might be a way to achieve this through other operation types or custom operations.
The nyc taxis sample workload seems to include an update operation.
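As a rough sketch of what an update-style bulk task could look like in a test procedure schedule: the conflicts, on-conflict and conflict-probability parameters below are assumptions based on the bulk operation options documented for OSB and would need to be verified; the values are only illustrative.

{
  "name": "bulk-update-sketch",
  "description": "Illustrative only: conflicts/on-conflict/conflict-probability are assumed bulk parameters, to be checked against the OSB docs.",
  "schedule": [
    {
      "operation": {
        "operation-type": "bulk",
        "bulk-size": 1000,
        "conflicts": "random",
        "on-conflict": "update",
        "conflict-probability": 25
      },
      "clients": 1
    }
  ]
}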
Here is an example of a reindex operation being referenced in a test procedure, and the corresponding definition of the custom operation:
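Roughly, the mechanism is that a schedule entry references the operation by name, and the operation itself is defined separately (e.g. under the operations/ directory, assuming the workload template collects operations/*.json the same way it collects test procedures). A custom operation-type such as reindex would additionally need a matching custom runner registered by the workload. A minimal sketch with hypothetical names and parameters, not taken from the referenced example:

test_procedures/reindex.json

{
  "name": "reindex-alerts-procedure",
  "description": "Sketch only: references the hypothetical reindex-alerts operation defined in operations/default.json.",
  "schedule": [
    {
      "operation": "reindex-alerts",
      "clients": 1
    }
  ]
}

operations/default.json

{
  "name": "reindex-alerts",
  "operation-type": "reindex",
  "source-index": "wazuh-alerts-4.x-2024.05",
  "dest-index": "wazuh-alerts-reindexed"
}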
I tried creating a workload that uses the
My corpora looks as follows:

test.json
workload.json
test_procedures/default.json

I set the
This resulted in the "metadata" lines of the
After closer inspection, I realized that the
We need to determine whether we are to:
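If the underlying problem is that the exported source file still carries the bulk action-and-metadata lines (so they end up counted or indexed as documents), one possible direction, assuming OSB supports the includes-action-and-meta-data corpus flag (to be verified against the docs), is to declare it in the corpus definition:

{
  "corpora": [
    {
      "name": "wazuh-alerts-benchmark-data",
      "documents": [
        {
          "source-file": "wazuh-alerts-benchmark-data-documents.json",
          "document-count": 10000,
          "includes-action-and-meta-data": true
        }
      ]
    }
  ]
}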
I ran two benchmarks on indexing operations only. The two test procedures used are the following:

{
"name": "single-bulk",
"description": "Customized test procedure with a single bulk request indexing 10k wazuh-alerts documents.",
"schedule": [
{
"operation": {
"name": "single-bulk-index-task",
"operation-type": "bulk",
"bulk-size": 10000
}
}
]
}

{
"name": "parallel-any",
"description": "Customized test procedure with a parallel bulk requests indexing 5k, 3k, 1.5k and 0.5k wazuh-alerts documents in parallel bulks.",
"schedule": [
{
"parallel": {
"tasks": [
{
"name": "5k-events-task",
"operation": {
"operation-type": "bulk",
"bulk-size": 5000
},
"clients": 1
},
{
"name": "3k-events-task",
"operation": {
"operation-type": "bulk",
"bulk-size": 3000
},
"clients": 1
},
{
"name": "1.5k-events-task",
"operation": {
"operation-type": "bulk",
"bulk-size": 1500
},
"clients": 1
},
{
"name": "0.5k-events-task",
"operation": {
"operation-type": "bulk",
"bulk-size": 500
},
"clients": 1
}
]
}
}
]
}

Results

Single 10k bulk:

____ _____ __ ____ __ __
/ __ \____ ___ ____ / ___/___ ____ ___________/ /_ / __ )___ ____ _____/ /_ ____ ___ ____ ______/ /__
/ / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \ / __ / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ / __/ / / /__/ / __/ /_/ / / / /__/ / / / / /_/ / __/ / / / /__/ / / / / / / / / /_/ / / / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/ \___/_/ /_/ /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/ /_/|_|
/_/
[INFO] [Test Execution ID]: 6e80dbbd-f96b-4fe9-b685-0b63710abb0e
[INFO] You did not provide an explicit timeout in the client options. Assuming default of 10 seconds.
[WARNING] indexing_total_time is 42 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] refresh_total_time is 339 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
Running single-bulk-index-task [100% done]
[INFO] Executing test with workload [wazuh-alerts-single-bulk], test_procedure [default-test-procedure] and provision_config_instance ['external'] with version [2.14.0].
------------------------------------------------------
_______ __ _____
/ ____(_)___ ____ _/ / / ___/_________ ________
/ /_ / / __ \/ __ `/ / \__ \/ ___/ __ \/ ___/ _ \
/ __/ / / / / / /_/ / / ___/ / /__/ /_/ / / / __/
/_/ /_/_/ /_/\__,_/_/ /____/\___/\____/_/ \___/
------------------------------------------------------
| Metric | Task | Value | Unit |
|---------------------------------------------------------------:|-----------------------:|-----------:|-------:|
| Cumulative indexing time of primary shards | | 0.03075 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 0.00035 | min |
| Max cumulative indexing time across primary shards | | 0.02785 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 0.00353333 | min |
| Cumulative merge count of primary shards | | 5 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0 | min |
| Max cumulative merge time across primary shards | | 0.00353333 | min |
| Cumulative merge throttle time of primary shards | | 0 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 0 | min |
| Cumulative refresh time of primary shards | | 0.0314 | min |
| Cumulative refresh count of primary shards | | 127 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 0.002825 | min |
| Max cumulative refresh time across primary shards | | 0.0139333 | min |
| Cumulative flush time of primary shards | | 0 | min |
| Cumulative flush count of primary shards | | 0 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 0 | min |
| Max cumulative flush time across primary shards | | 0 | min |
| Total Young Gen GC time | | 0.162 | s |
| Total Young Gen GC count | | 17 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 0.0294357 | GB |
| Translog size | | 0.0379527 | GB |
| Heap used for segments | | 0 | MB |
| Heap used for doc values | | 0 | MB |
| Heap used for terms | | 0 | MB |
| Heap used for norms | | 0 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0 | MB |
| Segment count | | 26 | |
| Min Throughput | single-bulk-index-task | 940.61 | docs/s |
| Mean Throughput | single-bulk-index-task | 1205.66 | docs/s |
| Median Throughput | single-bulk-index-task | 1205.66 | docs/s |
| Max Throughput | single-bulk-index-task | 1470.72 | docs/s |
| 50th percentile latency | single-bulk-index-task | 3858.7 | ms |
| 100th percentile latency | single-bulk-index-task | 6775.95 | ms |
| 50th percentile service time | single-bulk-index-task | 3858.7 | ms |
| 100th percentile service time | single-bulk-index-task | 6775.95 | ms |
| error rate | single-bulk-index-task | 0 | % |
--------------------------------
[INFO] SUCCESS (took 21 seconds)
--------------------------------

Parallel indexing

____ _____ __ ____ __ __
/ __ \____ ___ ____ / ___/___ ____ ___________/ /_ / __ )___ ____ _____/ /_ ____ ___ ____ ______/ /__
/ / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \ / __ / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ / __/ / / /__/ / __/ /_/ / / / /__/ / / / / /_/ / __/ / / / /__/ / / / / / / / / /_/ / / / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/ \___/_/ /_/ /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/ /_/|_|
/_/
[INFO] [Test Execution ID]: 63d78463-b816-44de-9a5f-16a08084a061
[INFO] You did not provide an explicit timeout in the client options. Assuming default of 10 seconds.
[INFO] Preparing file offset table for [/opensearch-benchmark/.benchmark/wazuh-alerts-parallelized/wazuh-alerts-benchmark-data-documents.json] ... [OK]
[WARNING] indexing_total_time is 36 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] refresh_total_time is 328 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
Running 3k-events-task,0.5k-events-task,5k-events-task,1.5k-events-task [100% done]
[INFO] Executing test with workload [wazuh-alerts-parallelized], test_procedure [parallel-any] and provision_config_instance ['external'] with version [2.14.0].
------------------------------------------------------
_______ __ _____
/ ____(_)___ ____ _/ / / ___/_________ ________
/ /_ / / __ \/ __ `/ / \__ \/ ___/ __ \/ ___/ _ \
/ __/ / / / / / /_/ / / ___/ / /__/ /_/ / / / __/
/_/ /_/_/ /_/\__,_/_/ /____/\___/\____/_/ \___/
------------------------------------------------------
| Metric | Task | Value | Unit |
|---------------------------------------------------------------:|-----------------:|-----------:|-------:|
| Cumulative indexing time of primary shards | | 0.270033 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 0.0003 | min |
| Max cumulative indexing time across primary shards | | 0.262817 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 0.0474 | min |
| Cumulative merge count of primary shards | | 16 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0 | min |
| Max cumulative merge time across primary shards | | 0.0384833 | min |
| Cumulative merge throttle time of primary shards | | 0 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 0 | min |
| Cumulative refresh time of primary shards | | 0.0934167 | min |
| Cumulative refresh count of primary shards | | 215 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 0.00273333 | min |
| Max cumulative refresh time across primary shards | | 0.0502833 | min |
| Cumulative flush time of primary shards | | 0 | min |
| Cumulative flush count of primary shards | | 0 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 0 | min |
| Max cumulative flush time across primary shards | | 0 | min |
| Total Young Gen GC time | | 0.654 | s |
| Total Young Gen GC count | | 65 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 0.111155 | GB |
| Translog size | | 0.153145 | GB |
| Heap used for segments | | 0 | MB |
| Heap used for doc values | | 0 | MB |
| Heap used for terms | | 0 | MB |
| Heap used for norms | | 0 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0 | MB |
| Segment count | | 24 | |
| Min Throughput | 5k-events-task | 544 | docs/s |
| Mean Throughput | 5k-events-task | 567.73 | docs/s |
| Median Throughput | 5k-events-task | 544.24 | docs/s |
| Max Throughput | 5k-events-task | 614.94 | docs/s |
| 50th percentile latency | 5k-events-task | 2398.31 | ms |
| 100th percentile latency | 5k-events-task | 9173.81 | ms |
| 50th percentile service time | 5k-events-task | 2398.31 | ms |
| 100th percentile service time | 5k-events-task | 9173.81 | ms |
| error rate | 5k-events-task | 0 | % |
| Min Throughput | 3k-events-task | 516.46 | docs/s |
| Mean Throughput | 3k-events-task | 619 | docs/s |
| Median Throughput | 3k-events-task | 608.69 | docs/s |
| Max Throughput | 3k-events-task | 732.44 | docs/s |
| 50th percentile latency | 3k-events-task | 1587.65 | ms |
| 100th percentile latency | 3k-events-task | 5794.57 | ms |
| 50th percentile service time | 3k-events-task | 1587.65 | ms |
| 100th percentile service time | 3k-events-task | 5794.57 | ms |
| error rate | 3k-events-task | 0 | % |
| Min Throughput | 1.5k-events-task | 326.99 | docs/s |
| Mean Throughput | 1.5k-events-task | 564.57 | docs/s |
| Median Throughput | 1.5k-events-task | 585.95 | docs/s |
| Max Throughput | 1.5k-events-task | 780.17 | docs/s |
| 50th percentile latency | 1.5k-events-task | 1036.56 | ms |
| 100th percentile latency | 1.5k-events-task | 4576.54 | ms |
| 50th percentile service time | 1.5k-events-task | 1036.56 | ms |
| 100th percentile service time | 1.5k-events-task | 4576.54 | ms |
| error rate | 1.5k-events-task | 0 | % |
| Min Throughput | 0.5k-events-task | 180.57 | docs/s |
| Mean Throughput | 0.5k-events-task | 484.79 | docs/s |
| Median Throughput | 0.5k-events-task | 520.7 | docs/s |
| Max Throughput | 0.5k-events-task | 851.66 | docs/s |
| 50th percentile latency | 0.5k-events-task | 364.129 | ms |
| 90th percentile latency | 0.5k-events-task | 725.394 | ms |
| 100th percentile latency | 0.5k-events-task | 2762.02 | ms |
| 50th percentile service time | 0.5k-events-task | 364.129 | ms |
| 90th percentile service time | 0.5k-events-task | 725.394 | ms |
| 100th percentile service time | 0.5k-events-task | 2762.02 | ms |
| error rate | 0.5k-events-task | 0 | % |
--------------------------------
[INFO] SUCCESS (took 26 seconds)
--------------------------------
It was determined that we need to test the optimal ingest bulk size in the 10-100MB range. I'm currently working on setting up a benchmark like the one above on a 3-node cluster on top of 3 EC2 instances. The benchmark can be run locally from any terminal.
I created a number of wazuh-alerts JSON files with file sizes ranging from 5MB through 100MB in 5MB increments.

root@os-benchmarks:~/benchmarks/wazuh-alerts# ls -lh
total 1.1G
-rwxrwxrwx 1 root root 1.1K Jul 1 20:05 generate_config.sh
-rwxrwxrwx 1 root root 270 Jul 1 19:14 generate_files.sh
drwxrwxrwx 2 root root 4.0K Jul 1 20:16 operations
drwxrwxrwx 2 root root 4.0K Jul 1 20:31 test_procedures
-rw-rw-rw- 1 root root 10M Jul 1 19:14 wazuh-alerts-10.json
-rw-rw-rw- 1 root root 100M Jul 1 19:06 wazuh-alerts-100.json
-rw-rw-rw- 1 root root 15M Jul 1 19:14 wazuh-alerts-15.json
-rw-rw-rw- 1 root root 20M Jul 1 19:14 wazuh-alerts-20.json
-rw-rw-rw- 1 root root 25M Jul 1 19:14 wazuh-alerts-25.json
-rw-rw-rw- 1 root root 30M Jul 1 19:14 wazuh-alerts-30.json
-rw-rw-rw- 1 root root 35M Jul 1 19:14 wazuh-alerts-35.json
-rw-rw-rw- 1 root root 40M Jul 1 19:14 wazuh-alerts-40.json
-rw-rw-rw- 1 root root 45M Jul 1 19:14 wazuh-alerts-45.json
-rw-rw-rw- 1 root root 5.0M Jul 1 19:14 wazuh-alerts-5.json
-rw-rw-rw- 1 root root 50M Jul 1 19:14 wazuh-alerts-50.json
-rw-rw-rw- 1 root root 55M Jul 1 19:14 wazuh-alerts-55.json
-rw-rw-rw- 1 root root 60M Jul 1 19:14 wazuh-alerts-60.json
-rw-rw-rw- 1 root root 65M Jul 1 19:14 wazuh-alerts-65.json
-rw-rw-rw- 1 root root 70M Jul 1 19:14 wazuh-alerts-70.json
-rw-rw-rw- 1 root root 75M Jul 1 19:14 wazuh-alerts-75.json
-rw-rw-rw- 1 root root 80M Jul 1 19:14 wazuh-alerts-80.json
-rw-rw-rw- 1 root root 85M Jul 1 19:14 wazuh-alerts-85.json
-rw-rw-rw- 1 root root 90M Jul 1 19:14 wazuh-alerts-90.json
-rw-rw-rw- 1 root root 95M Jul 1 19:14 wazuh-alerts-95.json
-rw-rw-rw- 1 root root 141K Jun 27 11:28 wazuh-alerts.json
-rw-rw-rw- 1 root root 6.2K Jul 1 20:36 workload.json

The workload.json:

{% import "benchmark.helpers" as benchmark with context %}
{
"version": 2,
"description": "Tracker-generated workload for wazuh-alerts",
"indices": [
{
"name": "wazuh-alerts-5",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-10",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-15",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-20",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-25",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-30",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-35",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-40",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-45",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-50",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-55",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-60",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-65",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-70",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-75",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-80",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-85",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-90",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-95",
"body": "wazuh-alerts.json"
},
{
"name": "wazuh-alerts-100",
"body": "wazuh-alerts.json"
}
],
"corpora": [
{
"name": "wazuh-alerts-5",
"documents": [
{
"target-index": "wazuh-alerts-5",
"source-file": "wazuh-alerts-5.json",
"document-count": 3426
}
]
},
{
"name": "wazuh-alerts-10",
"documents": [
{
"target-index": "wazuh-alerts-10",
"source-file": "wazuh-alerts-10.json",
"document-count": 6616
}
]
},
{
"name": "wazuh-alerts-15",
"documents": [
{
"target-index": "wazuh-alerts-15",
"source-file": "wazuh-alerts-15.json",
"document-count": 9958
}
]
},
{
"name": "wazuh-alerts-20",
"documents": [
{
"target-index": "wazuh-alerts-20",
"source-file": "wazuh-alerts-20.json",
"document-count": 13933
}
]
},
{
"name": "wazuh-alerts-25",
"documents": [
{
"target-index": "wazuh-alerts-25",
"source-file": "wazuh-alerts-25.json",
"document-count": 17180
}
]
},
{
"name": "wazuh-alerts-30",
"documents": [
{
"target-index": "wazuh-alerts-30",
"source-file": "wazuh-alerts-30.json",
"document-count": 20404
}
]
},
{
"name": "wazuh-alerts-35",
"documents": [
{
"target-index": "wazuh-alerts-35",
"source-file": "wazuh-alerts-35.json",
"document-count": 23737
}
]
},
{
"name": "wazuh-alerts-40",
"documents": [
{
"target-index": "wazuh-alerts-40",
"source-file": "wazuh-alerts-40.json",
"document-count": 27706
}
]
},
{
"name": "wazuh-alerts-45",
"documents": [
{
"target-index": "wazuh-alerts-45",
"source-file": "wazuh-alerts-45.json",
"document-count": 30998
}
]
},
{
"name": "wazuh-alerts-50",
"documents": [
{
"target-index": "wazuh-alerts-50",
"source-file": "wazuh-alerts-50.json",
"document-count": 34187
}
]
},
{
"name": "wazuh-alerts-55",
"documents": [
{
"target-index": "wazuh-alerts-55",
"source-file": "wazuh-alerts-55.json",
"document-count": 37774
}
]
},
{
"name": "wazuh-alerts-60",
"documents": [
{
"target-index": "wazuh-alerts-60",
"source-file": "wazuh-alerts-60.json",
"document-count": 41473
}
]
},
{
"name": "wazuh-alerts-65",
"documents": [
{
"target-index": "wazuh-alerts-65",
"source-file": "wazuh-alerts-65.json",
"document-count": 44729
}
]
},
{
"name": "wazuh-alerts-70",
"documents": [
{
"target-index": "wazuh-alerts-70",
"source-file": "wazuh-alerts-70.json",
"document-count": 47947
}
]
},
{
"name": "wazuh-alerts-75",
"documents": [
{
"target-index": "wazuh-alerts-75",
"source-file": "wazuh-alerts-75.json",
"document-count": 51993
}
]
},
{
"name": "wazuh-alerts-80",
"documents": [
{
"target-index": "wazuh-alerts-80",
"source-file": "wazuh-alerts-80.json",
"document-count": 55225
}
]
},
{
"name": "wazuh-alerts-85",
"documents": [
{
"target-index": "wazuh-alerts-85",
"source-file": "wazuh-alerts-85.json",
"document-count": 58442
}
]
},
{
"name": "wazuh-alerts-90",
"documents": [
{
"target-index": "wazuh-alerts-90",
"source-file": "wazuh-alerts-90.json",
"document-count": 61854
}
]
},
{
"name": "wazuh-alerts-95",
"documents": [
{
"target-index": "wazuh-alerts-95",
"source-file": "wazuh-alerts-95.json",
"document-count": 65786
}
]
},
{
"name": "wazuh-alerts-100",
"documents": [
{
"target-index": "wazuh-alerts-100",
"source-file": "wazuh-alerts-100.json",
"document-count": 69053
}
]
}
],
"test_procedures": [
{{ benchmark.collect(parts="test_procedures/*.json") }}
]
}
test_procedures/default.json

{
"name": "Wazuh Alerts Ingestion Test",
"description": "Test ingestion in 5MB increments",
"default": true,
"schedule": [
{
"operation": {
"name": "bulk-index-5-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-5",
"bulk-size": 3426
}
},
{
"operation": {
"name": "bulk-index-10-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-10",
"bulk-size": 6616
}
},
{
"operation": {
"name": "bulk-index-15-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-15",
"bulk-size": 9958
}
},
{
"operation": {
"name": "bulk-index-20-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-20",
"bulk-size": 13933
}
},
{
"operation": {
"name": "bulk-index-25-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-25",
"bulk-size": 17180
}
},
{
"operation": {
"name": "bulk-index-30-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-30",
"bulk-size": 20404
}
},
{
"operation": {
"name": "bulk-index-35-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-35",
"bulk-size": 23737
}
},
{
"operation": {
"name": "bulk-index-40-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-40",
"bulk-size": 27706
}
},
{
"operation": {
"name": "bulk-index-45-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-45",
"bulk-size": 30998
}
},
{
"operation": {
"name": "bulk-index-50-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-50",
"bulk-size": 34187
}
},
{
"operation": {
"name": "bulk-index-55-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-55",
"bulk-size": 37774
}
},
{
"operation": {
"name": "bulk-index-60-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-60",
"bulk-size": 41473
}
},
{
"operation": {
"name": "bulk-index-65-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-65",
"bulk-size": 44729
}
},
{
"operation": {
"name": "bulk-index-70-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-70",
"bulk-size": 47947
}
},
{
"operation": {
"name": "bulk-index-75-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-75",
"bulk-size": 51993
}
},
{
"operation": {
"name": "bulk-index-80-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-80",
"bulk-size": 55225
}
},
{
"operation": {
"name": "bulk-index-85-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-85",
"bulk-size": 58442
}
},
{
"operation": {
"name": "bulk-index-90-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-90",
"bulk-size": 61854
}
},
{
"operation": {
"name": "bulk-index-95-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-95",
"bulk-size": 65786
}
},
{
"operation": {
"name": "bulk-index-100-mb",
"operation-type": "bulk",
"corpora": "wazuh-alerts-100",
"bulk-size": 69053
}
}
]
}
Local results

root@os-benchmarks:~# docker compose up opensearch-benchmark
[+] Running 3/0
✔ Container opensearch-node1 Running 0.0s
✔ Container permissions-setter Created 0.0s
✔ Container opensearch-benchmark Created 0.0s
Attaching to opensearch-benchmark
opensearch-benchmark |
opensearch-benchmark | ____ _____ __ ____ __ __
opensearch-benchmark | / __ \____ ___ ____ / ___/___ ____ ___________/ /_ / __ )___ ____ _____/ /_ ____ ___ ____ ______/ /__
opensearch-benchmark | / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \ / __ / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
opensearch-benchmark | / /_/ / /_/ / __/ / / /__/ / __/ /_/ / / / /__/ / / / / /_/ / __/ / / / /__/ / / / / / / / / /_/ / / / ,<
opensearch-benchmark | \____/ .___/\___/_/ /_/____/\___/\__,_/_/ \___/_/ /_/ /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/ /_/|_|
opensearch-benchmark | /_/
opensearch-benchmark |
opensearch-benchmark | [INFO] [Test Execution ID]: 83bb247f-44e8-46f0-a763-73f10b6d4577
opensearch-benchmark | [INFO] You did not provide an explicit timeout in the client options. Assuming default of 10 seconds.
opensearch-benchmark | [WARNING] merges_total_time is 1059 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
opensearch-benchmark | [WARNING] indexing_total_time is 16708 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
opensearch-benchmark | [WARNING] refresh_total_time is 11434 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
opensearch-benchmark | [WARNING] flush_total_time is 38 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
opensearch-benchmark | Running bulk-index-5-mb [100% done]
Running bulk-index-10-mb [100% done]
Running bulk-index-15-mb [100% done]
Running bulk-index-20-mb [100% done]
Running bulk-index-25-mb [100% done]
Running bulk-index-30-mb [100% done]
Running bulk-index-35-mb [100% done]
Running bulk-index-40-mb [100% done]
Running bulk-index-45-mb [100% done]
opensearch-benchmark | [ERROR] rejected_execution_exception ({'error': {'root_cause': [{'type': 'rejected_execution_exception', 'reason': 'rejected execution of coordinating operation [coordinating_and_primary_bytes=0, replica_bytes=0, all_bytes=0, coordinating_operation_bytes=56497001, max_coordinating_and_primary_bytes=53687091]'}], 'type': 'rejected_execution_exception', 'reason': 'rejected execution of coordinating operation [coordinating_and_primary_bytes=0, replica_bytes=0, all_bytes=0, coordinating_operation_bytes=56497001, max_coordinating_and_primary_bytes=53687091]'}, 'status': 429})
Running bulk-index-50-mb [100% done]
opensearch-benchmark | [ERROR] rejected_execution_exception ({'error': {'root_cause': [{'type': 'rejected_execution_exception', 'reason': 'rejected execution of coordinating operation [coordinating_and_primary_bytes=0, replica_bytes=0, all_bytes=0, coordinating_operation_bytes=62166679, max_coordinating_and_primary_bytes=53687091]'}], 'type': 'rejected_execution_exception', 'reason': 'rejected execution of coordinating operation [coordinating_and_primary_bytes=0, replica_bytes=0, all_bytes=0, coordinating_operation_bytes=62166679, max_coordinating_and_primary_bytes=53687091]'}, 'status': 429})
Running bulk-index-55-mb [100% done]
opensearch-benchmark | [ERROR] circuit_breaking_exception ({'error': {'root_cause': [{'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [<http_request>] would be [516691048/492.7mb], which is larger than the limit of [510027366/486.3mb], real usage: [387461696/369.5mb], new bytes reserved: [129229352/123.2mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=129229352/123.2mb]', 'bytes_wanted': 516691048, 'bytes_limit': 510027366, 'durability': 'TRANSIENT'}], 'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [<http_request>] would be [516691048/492.7mb], which is larger than the limit of [510027366/486.3mb], real usage: [387461696/369.5mb], new bytes reserved: [129229352/123.2mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=129229352/123.2mb]', 'bytes_wanted': 516691048, 'bytes_limit': 510027366, 'durability': 'TRANSIENT'}, 'status': 429})
Running bulk-index-60-mb [100% done]
opensearch-benchmark | [ERROR] rejected_execution_exception ({'error': {'root_cause': [{'type': 'rejected_execution_exception', 'reason': 'rejected execution of coordinating operation [coordinating_and_primary_bytes=0, replica_bytes=0, all_bytes=0, coordinating_operation_bytes=73480223, max_coordinating_and_primary_bytes=53687091]'}], 'type': 'rejected_execution_exception', 'reason': 'rejected execution of coordinating operation [coordinating_and_primary_bytes=0, replica_bytes=0, all_bytes=0, coordinating_operation_bytes=73480223, max_coordinating_and_primary_bytes=53687091]'}, 'status': 429})
Running bulk-index-65-mb [100% done]
opensearch-benchmark | [ERROR] circuit_breaking_exception ({'error': {'root_cause': [{'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [<http_request>] would be [541932828/516.8mb], which is larger than the limit of [510027366/486.3mb], real usage: [391200848/373mb], new bytes reserved: [150731980/143.7mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=150731980/143.7mb]', 'bytes_wanted': 541932828, 'bytes_limit': 510027366, 'durability': 'TRANSIENT'}], 'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [<http_request>] would be [541932828/516.8mb], which is larger than the limit of [510027366/486.3mb], real usage: [391200848/373mb], new bytes reserved: [150731980/143.7mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=150731980/143.7mb]', 'bytes_wanted': 541932828, 'bytes_limit': 510027366, 'durability': 'TRANSIENT'}, 'status': 429})
Running bulk-index-70-mb [100% done]
opensearch-benchmark | [ERROR] circuit_breaking_exception ({'error': {'root_cause': [{'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [<http_request>] would be [555845794/530mb], which is larger than the limit of [510027366/486.3mb], real usage: [394297376/376mb], new bytes reserved: [161548418/154mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=161548418/154mb]', 'bytes_wanted': 555845794, 'bytes_limit': 510027366, 'durability': 'TRANSIENT'}], 'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [<http_request>] would be [555845794/530mb], which is larger than the limit of [510027366/486.3mb], real usage: [394297376/376mb], new bytes reserved: [161548418/154mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=161548418/154mb]', 'bytes_wanted': 555845794, 'bytes_limit': 510027366, 'durability': 'TRANSIENT'}, 'status': 429})
Running bulk-index-75-mb [100% done]
opensearch-benchmark | [ERROR] circuit_breaking_exception ({'error': {'root_cause': [{'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [<http_request>] would be [572270300/545.7mb], which is larger than the limit of [510027366/486.3mb], real usage: [399970840/381.4mb], new bytes reserved: [172299460/164.3mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=172299460/164.3mb]', 'bytes_wanted': 572270300, 'bytes_limit': 510027366, 'durability': 'TRANSIENT'}], 'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [<http_request>] would be [572270300/545.7mb], which is larger than the limit of [510027366/486.3mb], real usage: [399970840/381.4mb], new bytes reserved: [172299460/164.3mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=172299460/164.3mb]', 'bytes_wanted': 572270300, 'bytes_limit': 510027366, 'durability': 'TRANSIENT'}, 'status': 429})
Running bulk-index-80-mb [100% done]
opensearch-benchmark | [ERROR] circuit_breaking_exception ({'error': {'root_cause': [{'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [<http_request>] would be [591658024/564.2mb], which is larger than the limit of [510027366/486.3mb], real usage: [408608040/389.6mb], new bytes reserved: [183049984/174.5mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=183049984/174.5mb]', 'bytes_wanted': 591658024, 'bytes_limit': 510027366, 'durability': 'TRANSIENT'}], 'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [<http_request>] would be [591658024/564.2mb], which is larger than the limit of [510027366/486.3mb], real usage: [408608040/389.6mb], new bytes reserved: [183049984/174.5mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=183049984/174.5mb]', 'bytes_wanted': 591658024, 'bytes_limit': 510027366, 'durability': 'TRANSIENT'}, 'status': 429})
Running bulk-index-85-mb [100% done]
opensearch-benchmark | [ERROR] rejected_execution_exception ({'error': {'root_cause': [{'type': 'rejected_execution_exception', 'reason': 'rejected execution of coordinating operation [coordinating_and_primary_bytes=0, replica_bytes=0, all_bytes=0, coordinating_operation_bytes=101732377, max_coordinating_and_primary_bytes=53687091]'}], 'type': 'rejected_execution_exception', 'reason': 'rejected execution of coordinating operation [coordinating_and_primary_bytes=0, replica_bytes=0, all_bytes=0, coordinating_operation_bytes=101732377, max_coordinating_and_primary_bytes=53687091]'}, 'status': 429})
Running bulk-index-90-mb [100% done]
opensearch-benchmark | [ERROR] circuit_breaking_exception ({'error': {'root_cause': [{'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [<http_request>] would be [526842176/502.4mb], which is larger than the limit of [510027366/486.3mb], real usage: [322218352/307.2mb], new bytes reserved: [204623824/195.1mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=204623824/195.1mb]', 'bytes_wanted': 526842176, 'bytes_limit': 510027366, 'durability': 'TRANSIENT'}], 'type': 'circuit_breaking_exception', 'reason': '[parent] Data too large, data for [<http_request>] would be [526842176/502.4mb], which is larger than the limit of [510027366/486.3mb], real usage: [322218352/307.2mb], new bytes reserved: [204623824/195.1mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=204623824/195.1mb]', 'bytes_wanted': 526842176, 'bytes_limit': 510027366, 'durability': 'TRANSIENT'}, 'status': 429})
Running bulk-index-95-mb [100% done]
opensearch-benchmark | [ERROR]
Running bulk-index-100-mb [100% done]
[INFO] Executing test with workload [wazuh-alerts], test_procedure [Wazuh Alerts Ingestion Test] and provision_config_instance ['external'] with version [2.14.0].
opensearch-benchmark |
opensearch-benchmark |
opensearch-benchmark | ------------------------------------------------------
opensearch-benchmark | _______ __ _____
opensearch-benchmark | / ____(_)___ ____ _/ / / ___/_________ ________
opensearch-benchmark | / /_ / / __ \/ __ `/ / \__ \/ ___/ __ \/ ___/ _ \
opensearch-benchmark | / __/ / / / / / /_/ / / ___/ / /__/ /_/ / / / __/
opensearch-benchmark | /_/ /_/_/ /_/\__,_/_/ /____/\___/\____/_/ \___/
opensearch-benchmark | ------------------------------------------------------
opensearch-benchmark |
opensearch-benchmark | | Metric | Task | Value | Unit |
opensearch-benchmark | |---------------------------------------------------------------:|------------------:|------------:|-------:|
opensearch-benchmark | | Cumulative indexing time of primary shards | | 0.512283 | min |
opensearch-benchmark | | Min cumulative indexing time across primary shards | | 0 | min |
opensearch-benchmark | | Median cumulative indexing time across primary shards | | 0.0343667 | min |
opensearch-benchmark | | Max cumulative indexing time across primary shards | | 0.0954833 | min |
opensearch-benchmark | | Cumulative indexing throttle time of primary shards | | 0 | min |
opensearch-benchmark | | Min cumulative indexing throttle time across primary shards | | 0 | min |
opensearch-benchmark | | Median cumulative indexing throttle time across primary shards | | 0 | min |
opensearch-benchmark | | Max cumulative indexing throttle time across primary shards | | 0 | min |
opensearch-benchmark | | Cumulative merge time of primary shards | | 0.01765 | min |
opensearch-benchmark | | Cumulative merge count of primary shards | | 74 | |
opensearch-benchmark | | Min cumulative merge time across primary shards | | 0 | min |
opensearch-benchmark | | Median cumulative merge time across primary shards | | 0 | min |
opensearch-benchmark | | Max cumulative merge time across primary shards | | 0.01765 | min |
opensearch-benchmark | | Cumulative merge throttle time of primary shards | | 0 | min |
opensearch-benchmark | | Min cumulative merge throttle time across primary shards | | 0 | min |
opensearch-benchmark | | Median cumulative merge throttle time across primary shards | | 0 | min |
opensearch-benchmark | | Max cumulative merge throttle time across primary shards | | 0 | min |
opensearch-benchmark | | Cumulative refresh time of primary shards | | 0.223817 | min |
opensearch-benchmark | | Cumulative refresh count of primary shards | | 787 | |
opensearch-benchmark | | Min cumulative refresh time across primary shards | | 0 | min |
opensearch-benchmark | | Median cumulative refresh time across primary shards | | 0.008775 | min |
opensearch-benchmark | | Max cumulative refresh time across primary shards | | 0.103083 | min |
opensearch-benchmark | | Cumulative flush time of primary shards | | 0.00116667 | min |
opensearch-benchmark | | Cumulative flush count of primary shards | | 5 | |
opensearch-benchmark | | Min cumulative flush time across primary shards | | 0 | min |
opensearch-benchmark | | Median cumulative flush time across primary shards | | 0 | min |
opensearch-benchmark | | Max cumulative flush time across primary shards | | 0.000533333 | min |
opensearch-benchmark | | Total Young Gen GC time | | 0.502 | s |
opensearch-benchmark | | Total Young Gen GC count | | 116 | |
opensearch-benchmark | | Total Old Gen GC time | | 0 | s |
opensearch-benchmark | | Total Old Gen GC count | | 0 | |
opensearch-benchmark | | Store size | | 0.268493 | GB |
opensearch-benchmark | | Translog size | | 0.463153 | GB |
opensearch-benchmark | | Heap used for segments | | 0 | MB |
opensearch-benchmark | | Heap used for doc values | | 0 | MB |
opensearch-benchmark | | Heap used for terms | | 0 | MB |
opensearch-benchmark | | Heap used for norms | | 0 | MB |
opensearch-benchmark | | Heap used for points | | 0 | MB |
opensearch-benchmark | | Heap used for stored fields | | 0 | MB |
opensearch-benchmark | | Segment count | | 58 | |
opensearch-benchmark | | Min Throughput | bulk-index-5-mb | 7803.9 | docs/s |
opensearch-benchmark | | Mean Throughput | bulk-index-5-mb | 7803.9 | docs/s |
opensearch-benchmark | | Median Throughput | bulk-index-5-mb | 7803.9 | docs/s |
opensearch-benchmark | | Max Throughput | bulk-index-5-mb | 7803.9 | docs/s |
opensearch-benchmark | | 100th percentile latency | bulk-index-5-mb | 429.577 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-5-mb | 429.577 | ms |
opensearch-benchmark | | error rate | bulk-index-5-mb | 0 | % |
opensearch-benchmark | | Min Throughput | bulk-index-10-mb | 8445.34 | docs/s |
opensearch-benchmark | | Mean Throughput | bulk-index-10-mb | 8445.34 | docs/s |
opensearch-benchmark | | Median Throughput | bulk-index-10-mb | 8445.34 | docs/s |
opensearch-benchmark | | Max Throughput | bulk-index-10-mb | 8445.34 | docs/s |
opensearch-benchmark | | 100th percentile latency | bulk-index-10-mb | 774.252 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-10-mb | 774.252 | ms |
opensearch-benchmark | | error rate | bulk-index-10-mb | 0 | % |
opensearch-benchmark | | Min Throughput | bulk-index-15-mb | 9598 | docs/s |
opensearch-benchmark | | Mean Throughput | bulk-index-15-mb | 9598 | docs/s |
opensearch-benchmark | | Median Throughput | bulk-index-15-mb | 9598 | docs/s |
opensearch-benchmark | | Max Throughput | bulk-index-15-mb | 9598 | docs/s |
opensearch-benchmark | | 100th percentile latency | bulk-index-15-mb | 1028.43 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-15-mb | 1028.43 | ms |
opensearch-benchmark | | error rate | bulk-index-15-mb | 0 | % |
opensearch-benchmark | | Min Throughput | bulk-index-20-mb | 9030.53 | docs/s |
opensearch-benchmark | | Mean Throughput | bulk-index-20-mb | 9030.53 | docs/s |
opensearch-benchmark | | Median Throughput | bulk-index-20-mb | 9030.53 | docs/s |
opensearch-benchmark | | Max Throughput | bulk-index-20-mb | 9030.53 | docs/s |
opensearch-benchmark | | 100th percentile latency | bulk-index-20-mb | 1530.53 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-20-mb | 1530.53 | ms |
opensearch-benchmark | | error rate | bulk-index-20-mb | 0 | % |
opensearch-benchmark | | Min Throughput | bulk-index-25-mb | 9354.05 | docs/s |
opensearch-benchmark | | Mean Throughput | bulk-index-25-mb | 9354.05 | docs/s |
opensearch-benchmark | | Median Throughput | bulk-index-25-mb | 9354.05 | docs/s |
opensearch-benchmark | | Max Throughput | bulk-index-25-mb | 9354.05 | docs/s |
opensearch-benchmark | | 100th percentile latency | bulk-index-25-mb | 1821.08 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-25-mb | 1821.08 | ms |
opensearch-benchmark | | error rate | bulk-index-25-mb | 0 | % |
opensearch-benchmark | | Min Throughput | bulk-index-30-mb | 10155.6 | docs/s |
opensearch-benchmark | | Mean Throughput | bulk-index-30-mb | 10155.6 | docs/s |
opensearch-benchmark | | Median Throughput | bulk-index-30-mb | 10155.6 | docs/s |
opensearch-benchmark | | Max Throughput | bulk-index-30-mb | 10155.6 | docs/s |
opensearch-benchmark | | 100th percentile latency | bulk-index-30-mb | 1992.57 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-30-mb | 1992.57 | ms |
opensearch-benchmark | | error rate | bulk-index-30-mb | 0 | % |
opensearch-benchmark | | Min Throughput | bulk-index-35-mb | 9319.89 | docs/s |
opensearch-benchmark | | Mean Throughput | bulk-index-35-mb | 9319.89 | docs/s |
opensearch-benchmark | | Median Throughput | bulk-index-35-mb | 9319.89 | docs/s |
opensearch-benchmark | | Max Throughput | bulk-index-35-mb | 9319.89 | docs/s |
opensearch-benchmark | | 100th percentile latency | bulk-index-35-mb | 2526.72 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-35-mb | 2526.72 | ms |
opensearch-benchmark | | error rate | bulk-index-35-mb | 0 | % |
opensearch-benchmark | | Min Throughput | bulk-index-40-mb | 8431.44 | docs/s |
opensearch-benchmark | | Mean Throughput | bulk-index-40-mb | 8431.44 | docs/s |
opensearch-benchmark | | Median Throughput | bulk-index-40-mb | 8431.44 | docs/s |
opensearch-benchmark | | Max Throughput | bulk-index-40-mb | 8431.44 | docs/s |
opensearch-benchmark | | 100th percentile latency | bulk-index-40-mb | 3258.53 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-40-mb | 3258.53 | ms |
opensearch-benchmark | | error rate | bulk-index-40-mb | 0 | % |
opensearch-benchmark | | Min Throughput | bulk-index-45-mb | 8907.38 | docs/s |
opensearch-benchmark | | Mean Throughput | bulk-index-45-mb | 8907.38 | docs/s |
opensearch-benchmark | | Median Throughput | bulk-index-45-mb | 8907.38 | docs/s |
opensearch-benchmark | | Max Throughput | bulk-index-45-mb | 8907.38 | docs/s |
opensearch-benchmark | | 100th percentile latency | bulk-index-45-mb | 3453.96 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-45-mb | 3453.96 | ms |
opensearch-benchmark | | error rate | bulk-index-45-mb | 0 | % |
opensearch-benchmark | | 100th percentile latency | bulk-index-50-mb | 225.545 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-50-mb | 225.545 | ms |
opensearch-benchmark | | error rate | bulk-index-50-mb | 100 | % |
opensearch-benchmark | | 100th percentile latency | bulk-index-55-mb | 210.582 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-55-mb | 210.582 | ms |
opensearch-benchmark | | error rate | bulk-index-55-mb | 100 | % |
opensearch-benchmark | | 100th percentile latency | bulk-index-60-mb | 194.882 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-60-mb | 194.882 | ms |
opensearch-benchmark | | error rate | bulk-index-60-mb | 100 | % |
opensearch-benchmark | | 100th percentile latency | bulk-index-65-mb | 243.262 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-65-mb | 243.262 | ms |
opensearch-benchmark | | error rate | bulk-index-65-mb | 100 | % |
opensearch-benchmark | | 100th percentile latency | bulk-index-70-mb | 209.641 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-70-mb | 209.641 | ms |
opensearch-benchmark | | error rate | bulk-index-70-mb | 100 | % |
opensearch-benchmark | | 100th percentile latency | bulk-index-75-mb | 262.526 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-75-mb | 262.526 | ms |
opensearch-benchmark | | error rate | bulk-index-75-mb | 100 | % |
opensearch-benchmark | | 100th percentile latency | bulk-index-80-mb | 252.767 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-80-mb | 252.767 | ms |
opensearch-benchmark | | error rate | bulk-index-80-mb | 100 | % |
opensearch-benchmark | | 100th percentile latency | bulk-index-85-mb | 262.211 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-85-mb | 262.211 | ms |
opensearch-benchmark | | error rate | bulk-index-85-mb | 100 | % |
opensearch-benchmark | | 100th percentile latency | bulk-index-90-mb | 320.98 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-90-mb | 320.98 | ms |
opensearch-benchmark | | error rate | bulk-index-90-mb | 100 | % |
opensearch-benchmark | | 100th percentile latency | bulk-index-95-mb | 300.169 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-95-mb | 300.169 | ms |
opensearch-benchmark | | error rate | bulk-index-95-mb | 100 | % |
opensearch-benchmark | | 100th percentile latency | bulk-index-100-mb | 165.695 | ms |
opensearch-benchmark | | 100th percentile service time | bulk-index-100-mb | 165.695 | ms |
opensearch-benchmark | | error rate | bulk-index-100-mb | 100 | % |
opensearch-benchmark |
opensearch-benchmark |
opensearch-benchmark | [WARNING] Error rate is 100.0 for operation 'bulk-index-50-mb'. Please check the logs.
opensearch-benchmark | [WARNING] No throughput metrics available for [bulk-index-50-mb]. Likely cause: Error rate is 100.0%. Please check the logs.
opensearch-benchmark | [WARNING] Error rate is 100.0 for operation 'bulk-index-55-mb'. Please check the logs.
opensearch-benchmark | [WARNING] No throughput metrics available for [bulk-index-55-mb]. Likely cause: Error rate is 100.0%. Please check the logs.
opensearch-benchmark | [WARNING] Error rate is 100.0 for operation 'bulk-index-60-mb'. Please check the logs.
opensearch-benchmark | [WARNING] No throughput metrics available for [bulk-index-60-mb]. Likely cause: Error rate is 100.0%. Please check the logs.
opensearch-benchmark | [WARNING] Error rate is 100.0 for operation 'bulk-index-65-mb'. Please check the logs.
opensearch-benchmark | [WARNING] No throughput metrics available for [bulk-index-65-mb]. Likely cause: Error rate is 100.0%. Please check the logs.
opensearch-benchmark | [WARNING] Error rate is 100.0 for operation 'bulk-index-70-mb'. Please check the logs.
opensearch-benchmark | [WARNING] No throughput metrics available for [bulk-index-70-mb]. Likely cause: Error rate is 100.0%. Please check the logs.
opensearch-benchmark | [WARNING] Error rate is 100.0 for operation 'bulk-index-75-mb'. Please check the logs.
opensearch-benchmark | [WARNING] No throughput metrics available for [bulk-index-75-mb]. Likely cause: Error rate is 100.0%. Please check the logs.
opensearch-benchmark | [WARNING] Error rate is 100.0 for operation 'bulk-index-80-mb'. Please check the logs.
opensearch-benchmark | [WARNING] No throughput metrics available for [bulk-index-80-mb]. Likely cause: Error rate is 100.0%. Please check the logs.
opensearch-benchmark | [WARNING] Error rate is 100.0 for operation 'bulk-index-85-mb'. Please check the logs.
opensearch-benchmark | [WARNING] No throughput metrics available for [bulk-index-85-mb]. Likely cause: Error rate is 100.0%. Please check the logs.
opensearch-benchmark | [WARNING] Error rate is 100.0 for operation 'bulk-index-90-mb'. Please check the logs.
opensearch-benchmark | [WARNING] No throughput metrics available for [bulk-index-90-mb]. Likely cause: Error rate is 100.0%. Please check the logs.
opensearch-benchmark | [WARNING] Error rate is 100.0 for operation 'bulk-index-95-mb'. Please check the logs.
opensearch-benchmark | [WARNING] No throughput metrics available for [bulk-index-95-mb]. Likely cause: Error rate is 100.0%. Please check the logs.
opensearch-benchmark | [WARNING] Error rate is 100.0 for operation 'bulk-index-100-mb'. Please check the logs.
opensearch-benchmark | [WARNING] No throughput metrics available for [bulk-index-100-mb]. Likely cause: Error rate is 100.0%. Please check the logs.
opensearch-benchmark |
opensearch-benchmark | ---------------------------------
opensearch-benchmark | [INFO] SUCCESS (took 136 seconds)
opensearch-benchmark | ---------------------------------
opensearch-benchmark exited with code 0
Bulk operations above 50 MB return a 429 error code (too many requests), so I need to tweak this a little further to make sure I'm giving the cluster enough time to process each request. So far, I've only tested this on my local dockerized environment, but the EC2 infrastructure is ready to run the tests as soon as I've refined the workloads to include shard allocation and proper warm-up and clean-up stages.
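For context, the limits in those errors line up with a 512 MB JVM heap: the coordinating rejection threshold (53687091 bytes ≈ 51.2 MB) is 10% of 512 MB, and the parent circuit breaker limit (510027366 bytes ≈ 486.3 MB) is 95% of it. Below is a minimal sketch of the two knobs that could be raised for local testing; the indexing_pressure.memory.limit setting name is an assumption on my part, and the simplest fix is just a bigger heap:
sketch (compose environment / opensearch.yml)
# A larger heap raises both ceilings proportionally (10% and 95% of the heap):
environment:
  - "OPENSEARCH_JAVA_OPTS=-Xms4g -Xmx4g"
# Assumed static node setting (opensearch.yml) to give coordinating/primary
# indexing pressure a bigger budget than the default 10% of the heap:
# indexing_pressure.memory.limit: 20%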
Setup:
The benchmarks were run on top of 3 EC2 instances with 16 GB of RAM and an 8-core, 2200 MHz AMD EPYC 7571 processor.
docker context ls
$ docker context ls
NAME DESCRIPTION DOCKER ENDPOINT ERROR
benchmark ssh://root@benchmark
default * Current DOCKER_HOST based configuration unix:///var/run/docker.sock
node-1 ssh://root@benchmark-node1
node-2 ssh://root@benchmark-node2
node-3                              ssh://root@benchmark-node3
Each node had its own docker compose:
node-1.yml
services:
opensearch-node1: # This is also the hostname of the container within the Docker network (i.e. https://opensearch-node1/)
image: opensearchproject/opensearch:2.14.0
container_name: opensearch-node1
hostname: opensearch-node1
environment:
- NODE1_LOCAL_IP=${NODE1_LOCAL_IP}
- NODE2_LOCAL_IP=${NODE2_LOCAL_IP}
- NODE3_LOCAL_IP=${NODE3_LOCAL_IP}
- cluster.name=opensearch-cluster # Name the cluster
- network.publish_host=${NODE1_LOCAL_IP}
- http.publish_host=${NODE1_LOCAL_IP}
- transport.publish_host=${NODE1_LOCAL_IP}
- node.name=opensearch-node1 # Name the node that will run in this container
- discovery.seed_hosts=${NODE1_LOCAL_IP},${NODE2_LOCAL_IP},${NODE3_LOCAL_IP}, # Nodes to look for when discovering the cluster
- cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2,opensearch-node3 # Nodes eligible to serve as cluster manager
- bootstrap.memory_lock=true # Disable JVM heap memory swapping
- "OPENSEARCH_JAVA_OPTS=-Xms8g -Xmx8g" # Set min and max JVM heap sizes to at least 50% of system RAM
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD} # Sets the demo admin user password when using demo configuration (for OpenSearch 2.12 and later)
ulimits:
memlock:
soft: -1 # Set memlock to unlimited (no soft or hard limit)
hard: -1
nofile:
soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536
hard: 65536
volumes:
- opensearch-data1:/usr/share/opensearch/data # Creates volume called opensearch-data1 and mounts it to the container
#healthcheck:
# test: curl -sku admin:${OPENSEARCH_INITIAL_ADMIN_PASSWORD} https://opensearch-node1:9200/_cat/health | grep -q opensearch-cluster
# start_period: 10s
# start_interval: 3s
ports:
- 9200:9200 # REST API
- 9300:9300 # Transport layer (node-to-node communication)
- 9600:9600 # Performance Analyzer
networks:
- opensearch-net # All of the containers will join the same Docker bridge network
opensearch-dashboards:
image: opensearchproject/opensearch-dashboards:2.14.0 # Make sure the version of opensearch-dashboards matches the version of opensearch installed on other nodes
container_name: opensearch-dashboards
#depends_on:
# opensearch-node1:
# condition: service_healthy
ports:
- 5601:5601 # Map host port 5601 to container port 5601
expose:
- "5601" # Expose port 5601 for web access to OpenSearch Dashboards
environment:
- NODE1_LOCAL_IP=${NODE1_LOCAL_IP}
- NODE2_LOCAL_IP=${NODE2_LOCAL_IP}
- NODE3_LOCAL_IP=${NODE3_LOCAL_IP}
- OPENSEARCH_HOSTS=["https://${NODE1_LOCAL_IP}:9200","https://${NODE2_LOCAL_IP}:9200","https://${NODE3_LOCAL_IP}:9200"]
networks:
- opensearch-net
volumes:
opensearch-data1:
networks:
opensearch-net:
node-2.yml
services:
opensearch-node2:
image: opensearchproject/opensearch:2.14.0 # This should be the same image used for opensearch-node1 to avoid issues
container_name: opensearch-node2
hostname: opensearch-node2
environment:
- NODE1_LOCAL_IP=${NODE1_LOCAL_IP}
- NODE2_LOCAL_IP=${NODE2_LOCAL_IP}
- NODE3_LOCAL_IP=${NODE3_LOCAL_IP}
- cluster.name=opensearch-cluster
- network.publish_host=${NODE2_LOCAL_IP}
- http.publish_host=${NODE2_LOCAL_IP}
- transport.publish_host=${NODE2_LOCAL_IP}
- node.name=opensearch-node2
- discovery.seed_hosts=${NODE1_LOCAL_IP},${NODE2_LOCAL_IP},${NODE3_LOCAL_IP}, # Nodes to look for when discovering the cluster
- cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2,opensearch-node3
- bootstrap.memory_lock=true
- "OPENSEARCH_JAVA_OPTS=-Xms8g -Xmx8g"
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD}
ports:
- 9200:9200
- 9300:9300
- 9600:9600
ulimits:
memlock:
soft: -1
hard: -1
nofile:
soft: 65536
hard: 65536
volumes:
- opensearch-data2:/usr/share/opensearch/data
networks:
- opensearch-net
volumes:
opensearch-data2:
networks:
opensearch-net:
node-3.yml
services:
opensearch-node3:
image: opensearchproject/opensearch:2.14.0 # This should be the same image used for opensearch-node1 to avoid issues
container_name: opensearch-node3
hostname: opensearch-node3
environment:
- NODE1_LOCAL_IP=${NODE1_LOCAL_IP}
- NODE2_LOCAL_IP=${NODE2_LOCAL_IP}
- NODE3_LOCAL_IP=${NODE3_LOCAL_IP}
- network.publish_host=${NODE3_LOCAL_IP}
- http.publish_host=${NODE3_LOCAL_IP}
- transport.publish_host=${NODE3_LOCAL_IP}
- cluster.name=opensearch-cluster
- node.name=opensearch-node3
- discovery.seed_hosts=${NODE1_LOCAL_IP},${NODE2_LOCAL_IP},${NODE3_LOCAL_IP}, # Nodes to look for when discovering the cluster
- cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2,opensearch-node3
- bootstrap.memory_lock=true
- "OPENSEARCH_JAVA_OPTS=-Xms8g -Xmx8g"
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD}
ports:
- 9200:9200
- 9300:9300
- 9600:9600
ulimits:
memlock:
soft: -1
hard: -1
nofile:
soft: 65536
hard: 65536
volumes:
- opensearch-data3:/usr/share/opensearch/data
networks:
- opensearch-net
volumes:
opensearch-data3:
networks:
opensearch-net:
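The compose files above rely on a few environment variables. A hypothetical .env file sitting next to each compose file would look like this (all values are placeholders):
.env
NODE1_LOCAL_IP=10.0.0.1
NODE2_LOCAL_IP=10.0.0.2
NODE3_LOCAL_IP=10.0.0.3
OPENSEARCH_INITIAL_ADMIN_PASSWORD=SomeStrongPassword.1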
The cluster itself was brought up by a script on my local machine (making use of the remote contexts) for convenience:
cluster.sh
#!/bin/bash
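# Helper to drive the 3-node cluster and the benchmark host through the remote
# Docker contexts: up | down | logs <n> | ps <n> | run | results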
case $1 in
down)
for i in {1..3}
do
echo "Bringing node-$i down"
docker --context=node-$i compose -f node-$i.yml down -v
done
;;
up)
for i in {1..3}
do
echo "Bringing node-$i up"
docker --context=node-$i compose -f node-$i.yml up -d
done
;;
logs)
docker --context=node-$2 logs opensearch-node$2
;;
ps)
docker --context=node-$2 ps -a
;;
run)
docker --context=benchmark compose -f benchmark.yml up -d
;;
results)
docker --context=benchmark logs opensearch-benchmark -f
;;
*)
echo "Unrecognized option"
;;
esac
exit 0
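The script assumes the remote Docker contexts listed earlier already exist. As an illustration, they can be created once over SSH, after which the whole cluster is driven from the workstation (hostnames as in the docker context ls output above):
# One-time creation of the remote contexts
docker context create node-1 --docker "host=ssh://root@benchmark-node1"
docker context create node-2 --docker "host=ssh://root@benchmark-node2"
docker context create node-3 --docker "host=ssh://root@benchmark-node3"
docker context create benchmark --docker "host=ssh://root@benchmark"

# Typical usage
./cluster.sh up        # start the 3 indexer nodes
./cluster.sh run       # launch the benchmark container
./cluster.sh results   # follow the benchmark logs
./cluster.sh down      # tear everything down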
Lastly, a 4th EC2 instance was used to run the actual benchmark from the following docker compose file:
docker-compose.yml
services:
opensearch-benchmark:
image: opensearchproject/opensearch-benchmark:1.6.0
hostname: opensearch-benchmark
container_name: opensearch-benchmark
volumes:
- /root/benchmarks:/opensearch-benchmark/.benchmark
environment:
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD}
- BENCHMARK_NAME=${BENCHMARK_NAME}
- NODE1_LOCAL_IP=${NODE1_LOCAL_IP}
- NODE2_LOCAL_IP=${NODE2_LOCAL_IP}
- NODE3_LOCAL_IP=${NODE3_LOCAL_IP}
#command: execute-test --target-hosts https://opensearch-node1:9200 --pipeline benchmark-only --workload geonames --client-options basic_auth_user:admin,basic_auth_password:${OPENSEARCH_INITIAL_ADMIN_PASSWORD},verify_certs:false --test-mode
command: execute-test --pipeline="benchmark-only" --workload-path="/opensearch-benchmark/.benchmark/${BENCHMARK_NAME}" --target-hosts="https://${NODE1_LOCAL_IP}:9200,https://${NODE2_LOCAL_IP}:9200,https://${NODE3_LOCAL_IP}:9200" --client-options="basic_auth_user:admin,basic_auth_password:${OPENSEARCH_INITIAL_ADMIN_PASSWORD},verify_certs:false"
networks:
- opensearch-net # All of the containers will join the same Docker bridge network
permissions-setter:
image: alpine:3.14
container_name: permissions-setter
volumes:
- /root/benchmarks:/benchmark
entrypoint: /bin/sh
command: >
-c '
chmod -R a+rw /benchmark
'
opensearch-node1: # This is also the hostname of the container within the Docker network (i.e. https://opensearch-node1/)
image: opensearchproject/opensearch:2.14.0
container_name: opensearch-node1
hostname: opensearch-node1
environment:
- cluster.name=opensearch-cluster # Name the cluster
- node.name=opensearch-node1 # Name the node that will run in this container
- cluster.initial_cluster_manager_nodes=opensearch-node1 # Nodes eligible to serve as cluster manager
- bootstrap.memory_lock=true # Disable JVM heap memory swapping
- "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # Set min and max JVM heap sizes to at least 50% of system RAM
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD} # Sets the demo admin user password when using demo configuration (for OpenSearch 2.12 and later)
ulimits:
memlock:
soft: -1 # Set memlock to unlimited (no soft or hard limit)
hard: -1
nofile:
soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536
hard: 65536
volumes:
- opensearch-data1:/usr/share/opensearch/data # Creates volume called opensearch-data1 and mounts it to the container
#healthcheck:
# test: curl -sku admin:${OPENSEARCH_INITIAL_ADMIN_PASSWORD} https://opensearch-node1:9200/_cat/health | grep -q opensearch-cluster
# start_period: 10s
# start_interval: 3s
ports:
- 9200:9200 # REST API
- 9300:9300 # Transport layer (node-to-node communication)
- 9600:9600 # Performance Analyzer
networks:
- opensearch-net # All of the containers will join the same Docker bridge network
opensearch-dashboards:
image: opensearchproject/opensearch-dashboards:2.14.0 # Make sure the version of opensearch-dashboards matches the version of opensearch installed on other nodes
container_name: opensearch-dashboards
#depends_on:
# opensearch-node1:
# condition: service_healthy
ports:
- 5601:5601 # Map host port 5601 to container port 5601
expose:
- "5601" # Expose port 5601 for web access to OpenSearch Dashboards
environment:
- OPENSEARCH_HOSTS=["https://opensearch-node1:9200"]
networks:
- opensearch-net
volumes:
opensearch-data1:
networks:
opensearch-net:
I initially also included an OpenSearch node in this machine because
Benchmark files:
In order to create the benchmark files, I downloaded my wazuh-alerts index data and split it into files of increasing size. This was done using a simple bash script:
generate_files.sh
#!/bin/bash
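# Carve fixed-size slices (1 MB to 20 MB) out of the source alerts file. The last,
# possibly truncated, line of each slice is dropped so every file only contains
# complete JSON documents.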
PREFIX="wazuh-alerts"
SOURCE_FILE="$PREFIX-100.json"
MB_TO_B=1048576
for i in {1..20}
do
MB_SIZE=$i
SIZE=$(( $MB_SIZE * $MB_TO_B ))
FILENAME="$PREFIX-$MB_SIZE.json"
head -c $SIZE $SOURCE_FILE > $FILENAME
sed -i '$ d' $FILENAME
done
The rest of the configuration files for the benchmark were created using the following bash script (which admittedly is a little rough around the edges, but still works):
generate_config.sh
#!/bin/bash
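# Assemble the OSB workload pieces for the 1-20 MB corpora: indices, corpora,
# bulk operations and the single/parallel test procedures, written out to
# operations/default.json, test_procedures/default.json and workload.json.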
PREFIX="wazuh-alerts"
SOURCE_FILE="$PREFIX-20.json"
MB_TO_B=1048576
CORPORAE=()
TEST_PROCEDURES=()
INDICES=()
OPERATIONS=()
PARALLEL_JOBS=4
SINGLE_BULK_TEST=()
PARALLEL_BULK_TEST=()
TASKS=()
CLIENTS=2
for i in {01..20}
do
MB_SIZE=$i
NAME="$PREFIX-$MB_SIZE"
FILENAME="$NAME.json"
DOCUMENT_COUNT=$(wc -l $FILENAME | cut -d' ' -f1)
OPERATION_NAME="${MB_SIZE}MB-bulk"
CORPORAE+="
{
\"name\": \"$NAME\",
\"documents\": [
{
\"target-index\": \"$NAME\",
\"source-file\": \"$FILENAME\",
\"document-count\": $DOCUMENT_COUNT
}
]
},"
SINGLE_BULK_TEST+="
{
\"operation\": \"$OPERATION_NAME\",
\"clients\": $CLIENTS
},"
OPERATIONS+="
{
\"name\": \"$OPERATION_NAME\",
\"operation-type\": \"bulk\",
\"corpora\": \"$NAME\",
\"bulk-size\": $DOCUMENT_COUNT
},"
INDICES+="
{
\"name\": \"$NAME\",
\"body\": \"${PREFIX}.json\"
},"
done
SINGLE_BULK_TEST=${SINGLE_BULK_TEST%%,}
TEST_PROCEDURES+="
{
\"name\": \"single-bulk-index-test\",
\"description\": \"Wazuh Alerts bulk index test\",
\"default\": true,
\"schedule\": [
${SINGLE_BULK_TEST}
]
},"
for i in {01..05}
do
MB_SIZE=$i
OPERATION_NAME="${MB_SIZE}MB-bulk"
TASKS=()
for j in $(seq --format="%02g" 1 ${PARALLEL_JOBS})
do
TASKS+="
{
\"name\": \"parallel-test-${i}-thread-${j}\",
\"operation\": \"$OPERATION_NAME\",
\"clients\": $CLIENTS
},"
done
TASKS=${TASKS%%,}
PARALLEL_BULK_TEST+="
{
\"parallel\": {
\"tasks\": [
${TASKS}
]
}
},"
done
PARALLEL_BULK_TEST=${PARALLEL_BULK_TEST%%,}
TEST_PROCEDURES+="
{
\"name\": \"parallel-bulk-index-test\",
\"description\": \"Test using ${PARALLEL_JOBS} parallel indexing operations\",
\"schedule\": [
${PARALLEL_BULK_TEST}
]
},"
CORPORAE=${CORPORAE%%,}
OPERATIONS=${OPERATIONS%%,}
TEST_PROCEDURES=${TEST_PROCEDURES%%,}
INDICES=${INDICES%%,}
OLDIFS=$IFS
IFS=$'`'
WORKLOAD="
{% import \"benchmark.helpers\" as benchmark with context %}
{
\"version\": 2,
\"description\": \"Wazuh Indexer Bulk Benchmarks\",
\"indices\": [
${INDICES[@]}
],
\"corpora\": [
${CORPORAE[@]}
],
\"operations\": [
{{ benchmark.collect(parts=\"operations/*.json\") }}
],
\"test_procedures\": [
{{ benchmark.collect(parts=\"test_procedures/*.json\") }}
]
}
"
mkdir -p ./operations
mkdir -p ./test_procedures
echo ${OPERATIONS[@]} > ./operations/default.json
echo ${TEST_PROCEDURES[@]} > ./test_procedures/default.json
echo ${WORKLOAD[@]} > ./workload.json
IFS=$OLDIFS
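For reference, here is a sketch of one of the entries the loop above appends to operations/default.json (the 05 MB operation; the bulk-size value is whatever wc -l reports for that file, 5000 being purely illustrative):
operations/default.json (excerpt)
{
  "name": "05MB-bulk",
  "operation-type": "bulk",
  "corpora": "wazuh-alerts-05",
  "bulk-size": 5000
}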
This script generates a complete OSB workload: workload.json, operations/default.json and test_procedures/default.json.
Tests
The nature of the benchmark itself can be assessed by looking at the generated test_procedures/default.json:
{
"name": "single-bulk-index-test",
"description": "Wazuh Alerts bulk index test",
"default": true,
"schedule": [
{
"operation": "01MB-bulk",
"clients": 2
},
{
"operation": "02MB-bulk",
"clients": 2
},
{
"operation": "03MB-bulk",
"clients": 2
},
{
"operation": "04MB-bulk",
"clients": 2
},
{
"operation": "05MB-bulk",
"clients": 2
},
{
"operation": "06MB-bulk",
"clients": 2
},
{
"operation": "07MB-bulk",
"clients": 2
},
{
"operation": "08MB-bulk",
"clients": 2
},
{
"operation": "09MB-bulk",
"clients": 2
},
{
"operation": "10MB-bulk",
"clients": 2
},
{
"operation": "11MB-bulk",
"clients": 2
},
{
"operation": "12MB-bulk",
"clients": 2
},
{
"operation": "13MB-bulk",
"clients": 2
},
{
"operation": "14MB-bulk",
"clients": 2
},
{
"operation": "15MB-bulk",
"clients": 2
},
{
"operation": "16MB-bulk",
"clients": 2
},
{
"operation": "17MB-bulk",
"clients": 2
},
{
"operation": "18MB-bulk",
"clients": 2
},
{
"operation": "19MB-bulk",
"clients": 2
},
{
"operation": "20MB-bulk",
"clients": 2
}
]
},
{
"name": "parallel-bulk-index-test",
"description": "Test using 4 parallel indexing operations",
"schedule": [
{
"parallel": {
"tasks": [
{
"name": "parallel-test-01-thread-01",
"operation": "01MB-bulk",
"clients": 2
},
{
"name": "parallel-test-01-thread-02",
"operation": "01MB-bulk",
"clients": 2
},
{
"name": "parallel-test-01-thread-03",
"operation": "01MB-bulk",
"clients": 2
},
{
"name": "parallel-test-01-thread-04",
"operation": "01MB-bulk",
"clients": 2
}
]
}
},
{
"parallel": {
"tasks": [
{
"name": "parallel-test-02-thread-01",
"operation": "02MB-bulk",
"clients": 2
},
{
"name": "parallel-test-02-thread-02",
"operation": "02MB-bulk",
"clients": 2
},
{
"name": "parallel-test-02-thread-03",
"operation": "02MB-bulk",
"clients": 2
},
{
"name": "parallel-test-02-thread-04",
"operation": "02MB-bulk",
"clients": 2
}
]
}
},
{
"parallel": {
"tasks": [
{
"name": "parallel-test-03-thread-01",
"operation": "03MB-bulk",
"clients": 2
},
{
"name": "parallel-test-03-thread-02",
"operation": "03MB-bulk",
"clients": 2
},
{
"name": "parallel-test-03-thread-03",
"operation": "03MB-bulk",
"clients": 2
},
{
"name": "parallel-test-03-thread-04",
"operation": "03MB-bulk",
"clients": 2
}
]
}
},
{
"parallel": {
"tasks": [
{
"name": "parallel-test-04-thread-01",
"operation": "04MB-bulk",
"clients": 2
},
{
"name": "parallel-test-04-thread-02",
"operation": "04MB-bulk",
"clients": 2
},
{
"name": "parallel-test-04-thread-03",
"operation": "04MB-bulk",
"clients": 2
},
{
"name": "parallel-test-04-thread-04",
"operation": "04MB-bulk",
"clients": 2
}
]
}
},
{
"parallel": {
"tasks": [
{
"name": "parallel-test-05-thread-01",
"operation": "05MB-bulk",
"clients": 2
},
{
"name": "parallel-test-05-thread-02",
"operation": "05MB-bulk",
"clients": 2
},
{
"name": "parallel-test-05-thread-03",
"operation": "05MB-bulk",
"clients": 2
},
{
"name": "parallel-test-05-thread-04",
"operation": "05MB-bulk",
"clients": 2
}
]
}
}
]
}
There are two tests:
The first sequentially indexes data in 1 MB through 20 MB bulks. The second runs 4 parallel bulk indexing operations at a time, increasing the bulk size in 1 MB increments.
Running the benchmark
In order to obtain a fair sample size from these tests, we considered using the iterations parameter, but later found out it only really applies to certain operations. For that reason, I opted to simply launch the test repeatedly from the simplest of bash scripts:
benchmark.sh
#!/bin/bash
TEST="parallel-bulk-index-test"
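# Start from a clean state: delete any leftover test indices and force-merge to
# purge deleted documents (the same clean-up is repeated after every pass).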
curl -sku admin:Password -XDELETE https://node-1:9200/wazuh-*
curl -sku admin:Password -XPOST https://node-1:9200/_forcemerge
for i in {1..100}
do
opensearch-benchmark execute-test --pipeline="benchmark-only" --workload-path="./benchmarks/wazuh-alerts" --target-hosts="https://node-1:9200,https://node-2:9200,https://node-3:9200" --client-options="basic_auth_user:admin,basic_auth_password:Password,verify_certs:false" --results-format csv --results-file ./${TEST}/results-$(date +%F-%T).csv --test-procedure=${TEST}
curl -sku admin:Password -XDELETE https://node-1:9200/wazuh-*
curl -sku admin:Password -XPOST https://node-1:9200/_forcemerge
done
TEST="single-bulk-index-test"
for i in {1..100}
do
opensearch-benchmark execute-test --pipeline="benchmark-only" --workload-path="./benchmarks/wazuh-alerts" --target-hosts="https://node-1:9200,https://node2:9200,https://node3:9200" --client-options="basic_auth_user:admin,basic_auth_password:Password,verify_certs:false" --results-format csv --results-file ./${TEST}/results-$(date +%F-%T).csv --test-procedure=${TEST}
curl -sku admin:Password -XDELETE https://node-1:9200/wazuh-*
curl -sku admin:Password -XPOST https://node-1:9200/_forcemerge
done
This script simply runs all the benchmarks in a loop and outputs their results to a CSV file per run.
Results
The results are dumped and plotted to the team's drive:
In the graphs above, the y-axis holds the number of indexed documents per second, and the x-axis the size of each bulk operation in MB for each test. These results are averaged from the output of running the tests 30 times; the results for each pass don't vary much. It seems that increasing the bulk size increases the throughput until we hit diminishing returns. We chose to run this up to 20 MB because it is recommended to keep bulk indexing operations below 15 MB.
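For anyone reproducing the aggregation: assuming the CSV files produced by --results-format csv use the Metric,Task,Value,Unit column layout (an assumption on my part), the per-task Mean Throughput can be averaged across runs with a few lines of shell:
# Average "Mean Throughput" per task over all result files of one test procedure
awk -F, '$1 == "Mean Throughput" { sum[$2] += $3; n[$2]++ }
         END { for (t in sum) printf "%s,%.2f\n", t, sum[t] / n[t] }' \
    ./single-bulk-index-test/results-*.csv | sort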
Results
We ran more benchmark tests for single and parallel bulks. The most representative data set runs an OpenSearch Benchmark workload using 1 client and 4 parallel bulk tasks, putting up to (bulk_size * threads) MB in flight concurrently, and measures the Mean Throughput averaged over 100 runs. The infrastructure uses 3 nodes of Wazuh Indexer in cluster mode, v4.8.0, with the default wazuh-alerts template: 3 primary shards and 1 replica shard. In the charts below, we can see a clear comparison between using a single bulk request vs parallel bulks.
Conclusions
The parallel bulk request scenario has proven to return considerably higher metrics. In the table below, we can see the performance boost in ingestion metrics (ingested documents per second) when parallelizing 4 bulk requests vs using a single bulk request. The difference is substantial, although the performance gain tends to drop as we increase the bulk size. On the other hand, the table shows that the trend line is strictly increasing, which demonstrates that the Indexer is able to ingest more documents per second by increasing the bulk size and/or the number of parallel requests. However, we decided to stop further analysis past the 20 MB bulk size, as it's above the settings recommended by Elastic and OpenSearch: using values higher than 15 MB is not recommended, as it can make the cluster unstable. A preliminary analysis showed that we can push this number up to 50 MB; at that point, the Indexer stops responding.
For the best tradeoff between performance and stability, we recommend not exceeding the 15 MB threshold per bulk request. It's also important to note that the bulk size depends on the number of documents and their size:
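As a rough worked example of that dependency (taking the ~1 KB average event size mentioned in the description below): bulk_size_bytes ≈ document_count × average_document_size, so a 15 MB bulk holds roughly 15,000 documents at 1 KB each, but only about 1,500 documents if the events average 10 KB.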
Also, the client should make sure that bulk requests are round-robined across all the data nodes, to prevent a single node from holding all the bulks in memory while processing.
References:
Description
As part of the new Data Persistence Model to be implemented across Wazuh, we need to carry out a performance analysis of the different designs, in order to see how the indexer behaves with each of them.
The objective of this issue is to measure the performance of bulk requests for:
on:
given the following scenarios:
The goal is to discover which design performs better on a well-configured indexer cluster.
For the tests, we are considering mocking events for 5K agents, generating events of 1 KB maximum. The EPS for each of the indices is defined by the formula below:
Functional requirements
Implementation restrictions
Both test scenarios must run on:
Plan