Fluent-bit in ECS: Domain name not found #9637

Open
neartik opened this issue Nov 24, 2024 · 0 comments
neartik commented Nov 24, 2024

Bug Report

Describe the bug
When running the image amazon/aws-for-fluent-bit:latest (or any earlier stable version) as a FireLens log router on ECS, the log router cannot reach the Elastic cluster once the task boots.

To Reproduce

Follow the tutorial from elastic: https://www.elastic.co/blog/elastic-cloud-with-aws-firelens-accelerate-time-to-insight-with-agentless-data-ingestion

For ECS, the task will look like this:

{
  "family": "firelens-fargate-elastic",
  "taskRoleArn": "**redacted**",
  "executionRoleArn": "**redacted**",
  "networkMode": "awsvpc",
  "cpu": "512",
  "memory": "1024",
  "requiresCompatibilities": ["FARGATE"],
  "containerDefinitions": [
    {
      "essential": true,
      "image": "public.ecr.aws/aws-observability/aws-for-fluent-bit:2.32.4",
      "name": "log_router",
      "firelensConfiguration": {
        "type": "fluentbit",
        "options": {
          "enable-ecs-log-metadata": "true"
        }
      },
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "firelens-container",
          "awslogs-region": "eu-west-1",
          "awslogs-create-group": "true",
          "awslogs-stream-prefix": "firelens"
        }
      },
      "memoryReservation": 50
    },
    {
      "essential": true,
      "image": "nginx",
      "name": "app",
      "logConfiguration": {
        "logDriver": "awsfirelens",
        "secretOptions": [
          { "valueFrom": "**redacted**:CLOUD_ID::", "name": "Cloud_ID" },
          { "valueFrom": "**redacted**:CLOUD_AUTH::", "name": "Cloud_Auth" }
        ],
        "options": {
          "Name": "es",
          "Port": "9243",
          "Tag_Key tags": "tags",
          "Include_Tag_Key": "true",
          "Index": "elastic_firelens",
          "tls": "On",
          "tls.verify": "Off"
        }
      },
      "memoryReservation": 100
    }
  ]
}
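One detail in the task definition that may or may not be related: the option key "Tag_Key tags" contains a space, which looks like a copy-paste slip rather than a plugin parameter name. A minimal sketch for spotting such keys before registering the task, assuming the task definition is available as JSON; the helper name `suspicious_option_keys` and the trimmed sample are hypothetical and only for illustration.

```python
import json

def suspicious_option_keys(task_def: dict) -> list:
    """Return (container name, option key) pairs where an awsfirelens
    option key contains whitespace. Hypothetical helper, not part of any AWS tool."""
    findings = []
    for container in task_def.get("containerDefinitions", []):
        log_conf = container.get("logConfiguration", {})
        # Only the awsfirelens driver forwards its options to the Fluent Bit output.
        if log_conf.get("logDriver") != "awsfirelens":
            continue
        for key in log_conf.get("options", {}):
            if any(ch.isspace() for ch in key):
                findings.append((container.get("name"), key))
    return findings

# Trimmed copy of the task definition above (illustrative only).
sample_task_def = json.loads("""
{
  "containerDefinitions": [
    {"name": "log_router",
     "logConfiguration": {"logDriver": "awslogs",
                          "options": {"awslogs-region": "eu-west-1"}}},
    {"name": "app",
     "logConfiguration": {"logDriver": "awsfirelens",
                          "options": {"Name": "es", "Port": "9243",
                                      "Tag_Key tags": "tags"}}}
  ]
}
""")

if __name__ == "__main__":
    for name, key in suspicious_option_keys(sample_task_def):
        print(f"container {name!r}: option key {key!r} contains whitespace")
```

This only flags keys for manual review; whether the stray key affects the connection problem is not established.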

When deploying the task, make sure it has a public IP and that requests to that IP reach the NGINX container.
The log router then logs:

24 November 2024 at 00:06 (UTC)
[2024/11/24 00:06:49] [ warn] [net] getaddrinfo(host='**redacted**.eu-west-1.aws.found.io:443', err=4): Domain name not found
[2024/11/24 00:06:49] [ warn] [engine] failed to flush chunk '1-1732406808.778340835.flb', retry in 7 seconds: task_id=0, input=forward.1 > output=es.1 (out_id=1)

Note that **redacted**.eu-west-1.aws.found.io:443 is reachable from a browser at the time this error occurs.
If the port is removed from Cloud_ID, the log output changes and instead suggests that an invalid argument was provided.
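The host in the getaddrinfo warning still carries ":443", while the configured Port option is 9243. That suggests (an assumption, not confirmed) that the host and port come from the decoded Cloud_ID and that the port is being handed to the resolver as part of the hostname. A minimal sketch for checking what a Cloud ID decodes to, assuming the commonly documented layout label:base64(domain$es_id$kibana_id); the function name `decode_cloud_id` and the sample value are made up for illustration.

```python
import base64

def decode_cloud_id(cloud_id: str) -> dict:
    """Split an Elastic Cloud ID into its parts.

    Assumed layout: "<label>:<base64 of 'domain[:port]$es_id$kibana_id'>".
    """
    # Base64 text never contains ':', so the last colon separates the label.
    label, _, encoded = cloud_id.rpartition(":")
    # Restore any padding that was stripped from the base64 part.
    padded = encoded + "=" * (-len(encoded) % 4)
    decoded = base64.b64decode(padded).decode("utf-8")
    parts = decoded.split("$")
    domain_with_port = parts[0]
    es_id = parts[1] if len(parts) > 1 else ""
    # Separate an optional ":port" suffix from the domain.
    domain, _, port = domain_with_port.partition(":")
    return {
        "label": label,
        "domain": domain,
        "port": port,  # empty string when the Cloud ID carries no port
        "es_host": f"{es_id}.{domain}" if es_id else domain,
    }

if __name__ == "__main__":
    # Made-up sample value, built here so nothing real is exposed.
    raw = "eu-west-1.aws.found.io:443$abc123$def456"
    sample = "my-deployment:" + base64.b64encode(raw.encode()).decode()
    print(decode_cloud_id(sample))
```

If the decoded domain includes a port, comparing `es_host` (without the port) against the host shown in the warning should show whether the ":443" suffix is the only difference.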

Expected behavior

The logs should go to Elastic.

Your Environment

  • Version used: latest and 2.32.4
  • Configuration: ECS Fargate on Linux ARM64
  • Filters and plugins: None

Additional context

The goal is to run Fluent Bit as a sidecar in each of the 200 running tasks and send all of their logs to Elastic. Running a separate agent instead is not manageable at that scale.
