From fda59912149b0215463d3f5354830daaf369046a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=C3=81lex=20Ruiz?=
Date: Tue, 16 Apr 2024 17:26:29 +0200
Subject: [PATCH] Add documentation

---
 integrations/README.md                        | 66 ++++++++++++--------
 .../amazon-security-lake/src/invoke-lambda.sh |  6 +--
 2 files changed, 42 insertions(+), 30 deletions(-)

diff --git a/integrations/README.md b/integrations/README.md
index f2bdc9bc848ed..23ae595b74cdb 100644
--- a/integrations/README.md
+++ b/integrations/README.md
@@ -21,29 +21,55 @@ docker compose -f ./docker/amazon-security-lake.yml up -d
 ```
 
 This docker compose project will bring a *wazuh-indexer* node, a *wazuh-dashboard* node,
-a *logstash* node and our event generator. On the one hand, the event generator will push events
+a *logstash* node, our event generator, and an AWS Lambda Python container. On the one hand, the event generator will push events
 constantly to the indexer, on the `wazuh-alerts-4.x-sample` index by default (refer to the [events
-generator](./tools/events-generator/README.md) documentation for customization options).
-On the other hand, logstash will constantly query for new data and deliver it to the integration
-Python program, also present in that node. Finally, the integration module will prepare and send the
-data to the Amazon Security Lake's S3 bucket.
+generator](./tools/events-generator/README.md) documentation for customization options).
+On the other hand, logstash will constantly query for new data and deliver it to the output configured in the
+pipeline, which can be one of `indexer-to-s3`, `indexer-to-file`, or `indexer-to-integrator`.
+
+The `indexer-to-s3` pipeline is the one used by the integration. It delivers
+the data to an S3 bucket, from which it is processed by a Lambda function and finally
+sent to the Amazon Security Lake bucket in Parquet format.
 
 Attach a terminal to the container and start the integration by starting logstash, as follows:
 
 ```console
-/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-integrator.conf --path.settings /etc/logstash
+/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-s3.conf --path.settings /etc/logstash
 ```
 
-Unprocessed data can be sent to a file or to an S3 bucket.
-```console
-/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-file.conf --path.settings /etc/logstash
-/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-s3.conf --path.settings /etc/logstash
+After 5 minutes, the first batch of data will show up at http://localhost:9444/ui/wazuh-indexer-aux-bucket.
+You'll need to invoke the Lambda function manually, passing the key of the log file to process as an argument.
+
+```bash
+export AWS_BUCKET=wazuh-indexer-aux-bucket
+
+bash amazon-security-lake/src/invoke-lambda.sh <log-file-key>
 ```
 
-All three pipelines are configured to fetch the latest data from the *wazuh-indexer* every minute. In
-the case of `indexer-to-file`, the data is written at the same pace, whereas `indexer-to-s3`, data
-is uploaded every 5 minutes.
+Processed data will be uploaded to http://localhost:9444/ui/wazuh-indexer-amazon-security-lake-bucket. Click on any file to download it,
+and check its content using `parquet-tools`. Make sure to install the virtual environment first, using [requirements.txt](./amazon-security-lake/).
+
+```bash
+parquet-tools show <parquet-file>
+```
+
+Bucket names can be configured by editing the [amazon-security-lake.yml](./docker/amazon-security-lake.yml) file.
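+
+As a reference, here is a minimal sketch of the inspection workflow. It assumes that
+[requirements.txt](./amazon-security-lake/) provides `parquet-tools`, installed into a
+virtual environment, and that the Parquet file name below is illustrative.
+
+```bash
+# Set up a virtual environment (assumes requirements.txt provides parquet-tools)
+python3 -m venv venv
+source venv/bin/activate
+pip install -r amazon-security-lake/requirements.txt
+
+# Inspect a downloaded Parquet file (illustrative file name)
+parquet-tools show wazuh-example.parquet
+```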
+
 For development or debugging purposes, you may want to enable hot-reload, test or debug on these files,
 by using the `--config.reload.automatic`, `--config.test_and_exit` or `--debug` flags, respectively.
 
@@ -53,20 +79,6 @@ For production usage, follow the instructions in our documentation page about th
 As a last note, we would like to point out that we also use this Docker environment for development.
 
-###### Integration through an AWS Lambda function
-
-Start the integration by sending log data to an S3 bucket.
-
-```console
-/usr/share/logstash/bin/logstash -f /usr/share/logstash/pipeline/indexer-to-s3.conf --path.settings /etc/logstash
-```
-
-Once there is data in the source bucket, you can invoke the lambda function manually using an HTTP API request, as follows:
-
-```
-curl -X POST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{"Resources":"hello world!"}'
-```
-
 ### Other integrations
 
 TBD
 
diff --git a/integrations/amazon-security-lake/src/invoke-lambda.sh b/integrations/amazon-security-lake/src/invoke-lambda.sh
index 340025dbf207e..2a9b65dbbdfda 100644
--- a/integrations/amazon-security-lake/src/invoke-lambda.sh
+++ b/integrations/amazon-security-lake/src/invoke-lambda.sh
@@ -22,14 +22,14 @@ curl -X POST "http://localhost:9000/2015-03-31/functions/function/invocations" -
           "s3SchemaVersion": "1.0",
           "configurationId": "testConfigRule",
           "bucket": {
-            "name": "wazuh-indexer-aux-bucket",
+            "name": "'"${AWS_BUCKET}"'",
             "ownerIdentity": {
               "principalId":"A3NL1KOZZKExample"
             },
-            "arn": "arn:aws:s3:::wazuh-indexer-aux-bucket"
+            "arn": "'"arn:aws:s3:::${AWS_BUCKET}"'"
           },
           "object": {
-            "key": "2024/04/16/ls.s3.0906d6d6-e4ca-4db8-b445-b3c572425ee1.2024-04-16T09.15.part3.txt",
+            "key": "'"${1}"'",
             "size": 1024,
             "eTag":"d41d8cd98f00b204e9800998ecf8427e",
             "versionId":"096fKKXTRTtl3on89fVO.nfljtsv6qko"
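
For reference, the parametrized script above could be invoked as follows. This is a
hypothetical example: the bucket name is the compose default, and the object key reuses
the sample key that this patch removes from the script.

```bash
# Hypothetical values: the default aux bucket name and the sample
# object key previously hardcoded in the script.
export AWS_BUCKET=wazuh-indexer-aux-bucket
bash amazon-security-lake/src/invoke-lambda.sh "2024/04/16/ls.s3.0906d6d6-e4ca-4db8-b445-b3c572425ee1.2024-04-16T09.15.part3.txt"
```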