Fluentbit config parsing logic for isolated region compatibility #94

Merged
merged 18 commits into from
Dec 3, 2024

@whoix whoix commented Sep 5, 2024

Issue #, if available:

Description of changes:

Isolated regions need the CloudWatch Logs endpoint specified in the Linux Fluent Bit ConfigMap in order to create log groups properly. The endpoints for isolated (ADC) regions differ from commercial ones and do not follow the conventional format. I have refactored the logic to detect which region the add-on is being applied to and to apply the matching Linux ConfigMap. These changes already work in isolated regions and are in the AWS code base.
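
The region check described above can be sketched roughly as follows. This is an illustration in Python, not the chart's Helm template code: the names `is_adc_region` and `cloudwatch_logs_endpoint` are hypothetical, the `us-iso-` marker is taken from the template condition in this PR's diff, and the `c2s.ic.gov` suffix is taken from the isolated-region config shown in this description. The region names in the usage note are examples only.

```python
from typing import Optional

# Marker taken from the template condition in this PR; the constant name is hypothetical.
ADC_REGION_MARKER = "us-iso-"


def is_adc_region(region: str) -> bool:
    """Return True when the region name contains the isolated-region marker."""
    return ADC_REGION_MARKER in region


def cloudwatch_logs_endpoint(region: str) -> Optional[str]:
    """Return an explicit endpoint for isolated regions, or None for commercial ones.

    None means "do not emit an endpoint line", which leaves the commercial
    config unchanged.
    """
    if is_adc_region(region):
        return f"logs.{region}.c2s.ic.gov"
    return None
```

Under these assumptions, `cloudwatch_logs_endpoint("us-iso-east-1")` yields an explicit endpoint while `cloudwatch_logs_endpoint("us-west-2")` yields `None`. Note that a substring check on `us-iso-` does not match region names that begin with `us-isob-`.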

### Testing
Images and add-on components are already onboarded to an internal ImageReplicationService, which automatically syncs CW images lowside and transfers them up to all supported EKS regions. We have worked with the EKS team to ensure that you are already onboarded.

I was able to confirm that metrics and log groups are created, and that my metrics for Application Signals and GPU Container Insights show up in Container Insights. I am not able to attach specific screenshots or logs due to security implications.

The changes only affect ADC regions, where the endpoint line is added to the ConfigMap. Commercial is unaffected, and I tested there too.

In commercial regions the ConfigMap remains the same:

[INPUT]
  Name                tail
  Tag                 application.*
  Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
  Path                /var/log/containers/*.log
  multiline.parser    docker, cri
  DB                  /var/fluent-bit/state/flb_container.db
  Mem_Buf_Limit       50MB
  Skip_Long_Lines     On
  Refresh_Interval    10
  Rotate_Wait         30
  storage.type        filesystem
  Read_from_Head      ${READ_FROM_HEAD}

[INPUT]
  Name                tail
  Tag                 application.*
  Path                /var/log/containers/fluent-bit*
  multiline.parser    docker, cri
  DB                  /var/fluent-bit/state/flb_log.db
  Mem_Buf_Limit       5MB
  Skip_Long_Lines     On
  Refresh_Interval    10
  Read_from_Head      ${READ_FROM_HEAD}

[INPUT]
  Name                tail
  Tag                 application.*
  Path                /var/log/containers/cloudwatch-agent*
  multiline.parser    docker, cri
  DB                  /var/fluent-bit/state/flb_cwagent.db
  Mem_Buf_Limit       5MB
  Skip_Long_Lines     On
  Refresh_Interval    10
  Read_from_Head      ${READ_FROM_HEAD}

[FILTER]
  Name                kubernetes
  Match               application.*
  Kube_URL            https://kubernetes.default.svc:443
  Kube_Tag_Prefix     application.var.log.containers.
  Merge_Log           On
  Merge_Log_Key       log_processed
  K8S-Logging.Parser  On
  K8S-Logging.Exclude Off
  Labels              Off
  Annotations         Off
  Use_Kubelet         On
  Kubelet_Port        10250
  Buffer_Size         0

[OUTPUT]
  Name                cloudwatch_logs
  Match               application.*
  region              ${AWS_REGION}
  log_group_name      /aws/containerinsights/${CLUSTER_NAME}/application
  log_stream_prefix   ${HOST_NAME}-
  auto_create_group   true
  extra_user_agent    container-insights

In isolated regions, the [OUTPUT] section now includes the newly added endpoint parameter:

[INPUT]
  Name                tail
  Tag                 application.*
  Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
  Path                /var/log/containers/*.log
  multiline.parser    docker, cri
  DB                  /var/fluent-bit/state/flb_container.db
  Mem_Buf_Limit       50MB
  Skip_Long_Lines     On
  Refresh_Interval    10
  Rotate_Wait         30
  storage.type        filesystem
  Read_from_Head      ${READ_FROM_HEAD}

[INPUT]
  Name                tail
  Tag                 application.*
  Path                /var/log/containers/fluent-bit*
  multiline.parser    docker, cri
  DB                  /var/fluent-bit/state/flb_log.db
  Mem_Buf_Limit       5MB
  Skip_Long_Lines     On
  Refresh_Interval    10
  Read_from_Head      ${READ_FROM_HEAD}

[INPUT]
  Name                tail
  Tag                 application.*
  Path                /var/log/containers/cloudwatch-agent*
  multiline.parser    docker, cri
  DB                  /var/fluent-bit/state/flb_cwagent.db
  Mem_Buf_Limit       5MB
  Skip_Long_Lines     On
  Refresh_Interval    10
  Read_from_Head      ${READ_FROM_HEAD}

[FILTER]
  Name                kubernetes
  Match               application.*
  Kube_URL            https://kubernetes.default.svc:443
  Kube_Tag_Prefix     application.var.log.containers.
  Merge_Log           On
  Merge_Log_Key       log_processed
  K8S-Logging.Parser  On
  K8S-Logging.Exclude Off
  Labels              Off
  Annotations         Off
  Use_Kubelet         On
  Kubelet_Port        10250
  Buffer_Size         0

[OUTPUT]
  Name                cloudwatch_logs
  Match               application.*
  region              ${AWS_REGION}
  log_group_name      /aws/containerinsights/${CLUSTER_NAME}/application
  log_stream_prefix   ${HOST_NAME}-
  auto_create_group   true
  endpoint            logs.${AWS_REGION}.c2s.ic.gov
  extra_user_agent    container-insights
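
As a quick sanity check on the claim that only the endpoint line differs, the two [OUTPUT] sections above can be compared mechanically. The sketch below uses Python's standard `difflib`; the helper names `added_keys` and `removed_keys` are hypothetical, and the two strings are copied from the configs in this description.

```python
import difflib
from typing import List

COMMERCIAL_OUTPUT = """\
[OUTPUT]
  Name                cloudwatch_logs
  Match               application.*
  region              ${AWS_REGION}
  log_group_name      /aws/containerinsights/${CLUSTER_NAME}/application
  log_stream_prefix   ${HOST_NAME}-
  auto_create_group   true
  extra_user_agent    container-insights
"""

ISOLATED_OUTPUT = """\
[OUTPUT]
  Name                cloudwatch_logs
  Match               application.*
  region              ${AWS_REGION}
  log_group_name      /aws/containerinsights/${CLUSTER_NAME}/application
  log_stream_prefix   ${HOST_NAME}-
  auto_create_group   true
  endpoint            logs.${AWS_REGION}.c2s.ic.gov
  extra_user_agent    container-insights
"""


def _keys_with_marker(before: str, after: str, marker: str) -> List[str]:
    """Collect the first token of every diff line that starts with `marker`."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    return [line[2:].split()[0] for line in diff if line.startswith(marker)]


def added_keys(before: str, after: str) -> List[str]:
    """Setting names present only in `after`."""
    return _keys_with_marker(before, after, "+ ")


def removed_keys(before: str, after: str) -> List[str]:
    """Setting names present only in `before`."""
    return _keys_with_marker(before, after, "- ")
```

With these inputs, `added_keys(COMMERCIAL_OUTPUT, ISOLATED_OUTPUT)` reports only `endpoint`, and `removed_keys` reports nothing.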

@whoix whoix marked this pull request as draft September 5, 2024 20:00
@whoix whoix marked this pull request as ready for review September 5, 2024 20:01
@whoix whoix marked this pull request as draft November 21, 2024 16:03
@whoix whoix marked this pull request as ready for review November 21, 2024 16:04
@@ -242,6 +242,318 @@ containerLogs:
log_stream_prefix ${HOST_NAME}.
auto_create_group true
extra_user_agent container-insights
adcIsoExtraFiles:
Member

If the only difference is the endpoint, could we figure out a way to template it out instead of duplicating each of the configs?
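
To make the suggestion concrete, here is a minimal sketch of what "templating out" the endpoint could mean: a single [OUTPUT] definition with the endpoint line emitted only when a value is supplied. It is written in Python purely for illustration and is not a Helm template; `render_output` is a hypothetical name, and whether an equivalent is practical inside the chart is exactly the open question in this thread.

```python
from typing import List, Optional, Tuple


def render_output(endpoint: Optional[str] = None) -> str:
    """Render the cloudwatch_logs [OUTPUT] section, adding `endpoint` only if given."""
    settings: List[Tuple[str, str]] = [
        ("Name", "cloudwatch_logs"),
        ("Match", "application.*"),
        ("region", "${AWS_REGION}"),
        ("log_group_name", "/aws/containerinsights/${CLUSTER_NAME}/application"),
        ("log_stream_prefix", "${HOST_NAME}-"),
        ("auto_create_group", "true"),
    ]
    if endpoint:
        # Only isolated regions pass a value here; commercial output is unchanged.
        settings.append(("endpoint", endpoint))
    settings.append(("extra_user_agent", "container-insights"))

    # Keys are padded to a 20-character column, matching the configs in this PR.
    lines = ["[OUTPUT]"] + [f"  {key:<20}{value}" for key, value in settings]
    return "\n".join(lines)
```

Calling `render_output()` produces the commercial section, and `render_output("logs.${AWS_REGION}.c2s.ic.gov")` produces the isolated one, so the shared settings live in one place.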

Contributor Author

The problem is that you would need to refactor the fluent-bit-configmap in your Linux Helm chart so that it targets only the OUTPUT sections, and only for ADC regions, which gets convoluted. It's about the same amount of refactoring as I have already done in the values. The endpoint only needs to be specified in the OUTPUT sections; otherwise it can cause Helm chart formatting errors, which make the pod unstable.

Open to suggestions, but this was the safest way I could think of that doesn't impact your Helm charts too much.

Member

Ideally, we would fix this in Fluent Bit so that the right endpoint is used without having to override it (https://github.com/fluent/fluent-bit/blob/master/src/aws/flb_aws_util.c#L75), but until that's done, I'm fine with doing it this way.
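
For illustration only, a partition-aware endpoint lookup of the kind such a fix might introduce could look like the sketch below. It is Python, not Fluent Bit's C code, and does not describe what `flb_aws_util.c` currently does. `resolve_endpoint` and the mapping table are hypothetical; the `c2s.ic.gov` suffix comes from the endpoint used in this PR, and the table is deliberately not a complete list of partitions.

```python
from typing import Tuple

# Hypothetical table: region-name prefix -> DNS suffix. Not exhaustive.
DNS_SUFFIX_BY_PREFIX: Tuple[Tuple[str, str], ...] = (
    ("us-iso-", "c2s.ic.gov"),     # suffix taken from the endpoint in this PR
    ("cn-", "amazonaws.com.cn"),   # China regions
)
DEFAULT_DNS_SUFFIX = "amazonaws.com"


def resolve_endpoint(service: str, region: str) -> str:
    """Build `<service>.<region>.<suffix>` using the first matching prefix."""
    for prefix, suffix in DNS_SUFFIX_BY_PREFIX:
        if region.startswith(prefix):
            return f"{service}.{region}.{suffix}"
    return f"{service}.{region}.{DEFAULT_DNS_SUFFIX}"
```

With a lookup like this in the plugin, the `endpoint` override in the ConfigMap would no longer be needed for the regions the table covers.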

@@ -14,8 +14,20 @@ data:
{{- end }}
parsers.conf: |
{{- .Values.containerLogs.fluentBit.config.customParsers | nindent 4 }}
{{- if contains "us-iso-" .Values.region }}
Member

Contributor Author

Ack. Updating.

@jefchien jefchien merged commit ba92eff into aws-observability:main Dec 3, 2024
0 of 3 checks passed
5 participants