out_opentelemetry: enhancements for log body and attributes handling (fix #8359) #8491

Merged: 3 commits, Feb 19, 2024

edsiper
Member

@edsiper edsiper commented Feb 16, 2024

This patch fixes and enhances how the OpenTelemetry output connector handles log records.

In the Fluent Bit world, we deal with tons of unstructured log records that come from a variety of sources, or simply from raw text files. When converting those lines to structured messages, there was no option to define what goes into the log body and what goes into the log attributes; everything was packaged inside the log body by default.

This patch enhances the previous behavior by allowing the following:

  • log body: optionally define multiple record accessor patterns that try to match a key or sub-key in the record structure. The value of the first matching key is used as the body content. If no pattern matches, the whole record is set as the body.

  • log attributes: if the log record contains native metadata, its keys are added as OpenTelemetry Attributes. If the log body was populated by a record accessor pattern as described above, the remaining unused keys can optionally be added as attributes too by setting the new option logs_body_key_attributes to true (default: false).

To get the new behavior, set the new configuration property logs_body_key, which can be repeated to define multiple record accessor patterns. For example:

pipeline:
  inputs:
    - name: dummy
      metadata: '{"meta": "data"}'
      dummy: '{"name": "bill", "lastname": "gates", "log": {"abc": {"def":123}, "saveme": true}}'
    
  outputs:
    - name: opentelemetry
      match: '*'
      host: localhost
      port: 4318
      logs_body_key: $name
      logs_body_key: $log['abc']['def']
      logs_body_key_attributes: true

In the example above, the dummy input plugin generates a record with metadata. On the output side, the plugin looks up $name first and then $log['abc']['def']. $name matches, so bill becomes the value of the body and the remaining content is added as attributes. Here is what a vanilla OTel Collector reports when inspecting the content it receives:

  Body: Str(bill)
  Attributes:
       -> meta: Str(data)
       -> lastname: Str(gates)
       -> log: Map({"abc":{"def":123},"saveme":true})
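
The selection logic described above can be sketched in Python. This is an illustrative model only, not the plugin's C implementation: the function names are made up for this example, the accessor parser handles only the $key['sub'] form, and removing just the matched leaf from the record before building attributes is an assumption.

```python
import copy
import re
from typing import Any

_MISSING = object()  # sentinel so that None and False count as real values


def parse_accessor(pattern: str) -> list[str]:
    """Split a pattern such as $log['abc']['def'] into ['log', 'abc', 'def']."""
    m = re.fullmatch(r"\$([A-Za-z0-9_.\-]+)((?:\['[^']*'\])*)", pattern)
    if m is None:
        raise ValueError(f"unsupported pattern: {pattern}")
    return [m.group(1)] + re.findall(r"\['([^']*)'\]", m.group(2))


def lookup(record: Any, path: list[str]) -> Any:
    """Walk nested dicts along path; return _MISSING if any step fails."""
    node = record
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def split_body_and_attributes(record: dict, metadata: dict,
                              body_keys: list[str],
                              body_key_attributes: bool = False):
    """Return (body, attributes) for one log record."""
    attributes = dict(metadata)          # metadata keys always become attributes
    for pattern in body_keys:            # patterns are tried in order
        path = parse_accessor(pattern)
        value = lookup(record, path)
        if value is _MISSING:
            continue
        if body_key_attributes:
            rest = copy.deepcopy(record)
            parent = lookup(rest, path[:-1])
            del parent[path[-1]]         # assumption: drop only the matched leaf
            attributes.update(rest)
        return value, attributes         # first match wins
    return record, attributes            # no match: whole record is the body


record = {"name": "bill", "lastname": "gates",
          "log": {"abc": {"def": 123}, "saveme": True}}
body, attrs = split_body_and_attributes(
    record, {"meta": "data"}, ["$name", "$log['abc']['def']"], True)
print(body)   # bill
print(attrs)
```

With body_key_attributes left at its default of False, the same call returns only the metadata key as attributes.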

updates

  • [02/18/2024] adjusted the description to include the new option logs_body_key_attributes.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

The mp_accessor API manipulates a msgpack object buffer by adding or
removing keys based on a list of record accessor patterns.

By default, mp_accessor iterates through the whole list of record
accessors, but in some cases it is useful to enable or disable only
some of them.

This patch extends the mp_accessor API so that specific record accessors
can be activated or deactivated on demand.

Signed-off-by: Eduardo Silva <[email protected]>
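
The idea in the commit message above, a list of record accessors where individual entries can be switched on or off, can be modeled with a short Python sketch. The class and method names are hypothetical and do not correspond to the real mp_accessor C API; only top-level keys are handled, and plain dicts stand in for msgpack buffers.

```python
from dataclasses import dataclass, field


@dataclass
class AccessorList:
    """Hypothetical stand-in for a list of record accessors with on/off flags."""
    keys: list[str]
    active: list[bool] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Every accessor starts out active, mirroring the default behavior.
        if not self.active:
            self.active = [True] * len(self.keys)

    def set_active(self, index: int, status: bool) -> None:
        """Activate or deactivate a single accessor by position."""
        self.active[index] = status

    def remove_matches(self, record: dict) -> dict:
        """Return a copy of record without the keys of active accessors."""
        drop = {k for k, on in zip(self.keys, self.active) if on}
        return {k: v for k, v in record.items() if k not in drop}


accessors = AccessorList(["name", "lastname"])
accessors.set_active(1, False)   # keep 'lastname', still remove 'name'
print(accessors.remove_matches({"name": "bill", "lastname": "gates"}))
```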
…ix #8359)

This patch fixes and enhances how the OpenTelemetry output connector handles
log records.

In the Fluent Bit world, we deal with tons of unstructured log records that
come from a variety of sources, or simply from raw text files. When converting
those lines to structured messages, there was no option to define what goes
into the log body and what goes into the log attributes; everything was
packaged inside the log body by default.

This patch enhances the previous behavior by allowing the following:

- log body: optionally define multiple record accessor patterns that try
            to match a key or sub-key in the record structure.
            The value of the first matching key is used as the body
            content.

            If no pattern matches, the whole record is set as the body.

- log attributes: if the log record contains native metadata, its keys
                  are added as OpenTelemetry Attributes.

                  If the log body was populated by a record accessor
                  pattern as described above, the remaining unused keys
                  are added as attributes.

To get the new behavior, set the new configuration property 'logs_body_key',
which can be repeated to define multiple record accessor patterns. For example:

  pipeline:
    inputs:
      - name: dummy
        metadata: '{"meta": "data"}'
        dummy: '{"name": "bill", "lastname": "gates", "log": {"abc": {"def":123}, "saveme": true}}'

    outputs:
      - name: opentelemetry
        match: '*'
        host: localhost
        port: 4318
        logs_body_key: $name
        logs_body_key: $log['abc']['def']

In the example above, the dummy input plugin generates a record with
metadata. On the output side, the plugin looks up $name first and then
$log['abc']['def']. $name matches, so 'bill' becomes the value of the body
and the remaining content is added as attributes. Here is what a vanilla
OTel Collector reports when inspecting the content it receives:

  Body: Str(bill)
  Attributes:
       -> meta: Str(data)
       -> lastname: Str(gates)
       -> log: Map({"abc":{"def":123},"saveme":true})

Signed-off-by: Eduardo Silva <[email protected]>
@nokute78
Collaborator

As I understand it, the first logs_body_key is for the log body and the second logs_body_key is for log attributes, right?
Why don't they have separate names?

@edsiper
Member Author

edsiper commented Feb 17, 2024

@nokute78 logs_body_key lets you define a list of candidate keys to populate the body; everything that was not included in the body gets added as an attribute. There is more context about this behavior in #8359.

@nokute78
Collaborator

@edsiper Thank you.

everything else that was not included in the body gets added as attribute.

I think it may affect user request #8205,
which is about forwarding OTLP using fluent-bit.

Expected behavior
The data I passed in with OTLP should, at minimum, be the same when it comes out if there is no processing done.

Once logs_body_key is set, Attributes will also be changed.
I think it is better not to touch Attributes, since other metadata may follow the Logs Data structure.
#8359 (comment)

@nokute78
Collaborator

I tested the following configuration and it enriches the Attributes field.
Users may not want that change.

The Metadata of in_dummy below is a pseudo OTel Logs Data structure.

[INPUT]
    name dummy
    Metadata {"ObservedTimestamp":1234, "Timestamp":3456, "SeverityText":"WARN", "SeverityNumber":5, "TraceFlags":13, "Attributes":{"log.file.name":"a.log", "test":"val"}}
    dummy {"name": "bill", "lastname": "gates", "log": {"abc": {"def":123}, "saveme": true}}

[OUTPUT]
    Name   stdout
    Match *

[OUTPUT]
    Name opentelemetry
    Host localhost
    Port 6969
    Match *
    logs_body_key $name

receivers:
  otlp:
    protocols:
      http:
        endpoint: localhost:6969

exporters:
  file:
    path: 2_out.json

service:
  telemetry:
    logs:
      level: debug
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [file]

Output of otel-collector

{
  "resourceLogs": [
    {
      "resource": {},
      "scopeLogs": [
        {
          "scope": {},
          "logRecords": [
            {
              "timeUnixNano": "1708134916475903956",
              "body": {
                "stringValue": "bill"
              },
              "attributes": [
                {
                  "key": "ObservedTimestamp",
                  "value": {
                    "intValue": "1234"
                  }
                },
                {
                  "key": "Timestamp",
                  "value": {
                    "intValue": "3456"
                  }
                },
                {
                  "key": "SeverityText",
                  "value": {
                    "stringValue": "WARN"
                  }
                },
                {
                  "key": "SeverityNumber",
                  "value": {
                    "intValue": "5"
                  }
                },
                {
                  "key": "TraceFlags",
                  "value": {
                    "intValue": "13"
                  }
                },
                {
                  "key": "Attributes",
                  "value": {
                    "kvlistValue": {
                      "values": [
                        {
                          "key": "log.file.name",
                          "value": {
                            "stringValue": "a.log"
                          }
                        },
                        {
                          "key": "test",
                          "value": {
                            "stringValue": "val"
                          }
                        }
                      ]
                    }
                  }
                },
                {
                  "key": "lastname",
                  "value": {
                    "stringValue": "gates"
                  }
                },
                {
                  "key": "log",
                  "value": {
                    "kvlistValue": {
                      "values": [
                        {
                          "key": "abc",
                          "value": {
                            "kvlistValue": {
                              "values": [
                                {
                                  "key": "def",
                                  "value": {
                                    "intValue": "123"
                                  }
                                }
                              ]
                            }
                          }
                        },
                        {
                          "key": "saveme",
                          "value": {
                            "boolValue": true
                          }
                        }
                      ]
                    }
                  }
                }
              ],
              "traceId": "",
              "spanId": ""
            }
          ]
        }
      ]
    }
  ]
}
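
Collector output like the dump above is verbose because every attribute is wrapped in an OTLP JSON AnyValue object. As a reading aid, here is a minimal Python sketch that flattens such a document into plain body and attribute values. The function names are made up for this example, and it assumes the input has already been parsed with json.loads.

```python
from typing import Any


def any_value(value: dict) -> Any:
    """Convert an OTLP JSON AnyValue object into a plain Python value."""
    if "stringValue" in value:
        return value["stringValue"]
    if "intValue" in value:
        return int(value["intValue"])    # 64-bit ints are encoded as strings
    if "boolValue" in value:
        return value["boolValue"]
    if "doubleValue" in value:
        return value["doubleValue"]
    if "kvlistValue" in value:
        pairs = value["kvlistValue"].get("values", [])
        return {kv["key"]: any_value(kv["value"]) for kv in pairs}
    if "arrayValue" in value:
        return [any_value(v) for v in value["arrayValue"].get("values", [])]
    return None                          # empty or unknown AnyValue


def summarize(otlp: dict) -> list[dict]:
    """Return one {'body', 'attributes'} dict per log record."""
    out = []
    for resource_logs in otlp.get("resourceLogs", []):
        for scope_logs in resource_logs.get("scopeLogs", []):
            for rec in scope_logs.get("logRecords", []):
                out.append({
                    "body": any_value(rec.get("body", {})),
                    "attributes": {a["key"]: any_value(a["value"])
                                   for a in rec.get("attributes", [])},
                })
    return out
```

Applied to the output above, summarize would show that SeverityText, SeverityNumber and the nested Attributes map all end up as ordinary attributes rather than top-level log record fields, which is the concern raised in this comment.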

@sudomateo
Contributor

Thank you for working on this. I was able to compile this branch locally and test it. Here are my results.

I started OpenTelemetry and Fluent Bit with the following respective configurations.

otel-config.yaml
---
receivers:
  otlp:
    protocols:
      grpc:
      http:
        endpoint: "0.0.0.0:4318"

processors:
  batch:

exporters:
  file:
    path: otel-output.json

service:
  telemetry:
    logs:
      level: debug
  pipelines:
    logs:
      receivers:
        - otlp
      processors:
        - batch
      exporters:
        - file

fluent-bit-config.yaml
---
pipeline:
  inputs:
    - name: dummy
      tag: dummy
      metadata: |
        {
          "resource-attr": "resource-attr-val-1"
        }
      # This log record is taken directly from OpenTelemetry's Examples page.
      # https://opentelemetry.io/docs/specs/otel/protocol/file-exporter/#examples
      dummy: |
        {
          "severityNumber": 9,
          "severityText": "Info",
          "name": "logA",
          "message": "This is a log message",
          "app": "server",
          "instance_num": 1,
          "traceId": "08040201000000000000000000000000",
          "spanId": "0102040800000000"
        }

  outputs:
    - name: opentelemetry
      match: "*"
      host: localhost
      port: 4318
      logs_body_key: $message

    # For live debugging on the Fluent Bit side.
    - name: stdout
      match: "*"
      format: json

Here's what I received in OpenTelemetry. This is a much better experience than before. However, it's still not exactly what I would expect when shipping logs to an OpenTelemetry endpoint.

actual-logs.json
{
  "resourceLogs": [
    {
      "resource": {},
      "scopeLogs": [
        {
          "scope": {},
          "logRecords": [
            {
              "timeUnixNano": "1708201530553920000",
              "body": {
                "stringValue": "This is a log message"
              },
              "attributes": [
                {
                  "key": "resource-attr",
                  "value": {
                    "stringValue": "resource-attr-val-1"
                  }
                },
                {
                  "key": "severityNumber",
                  "value": {
                    "intValue": "9"
                  }
                },
                {
                  "key": "severityText",
                  "value": {
                    "stringValue": "Info"
                  }
                },
                {
                  "key": "name",
                  "value": {
                    "stringValue": "logA"
                  }
                },
                {
                  "key": "app",
                  "value": {
                    "stringValue": "server"
                  }
                },
                {
                  "key": "instance_num",
                  "value": {
                    "intValue": "1"
                  }
                },
                {
                  "key": "traceId",
                  "value": {
                    "stringValue": "08040201000000000000000000000000"
                  }
                },
                {
                  "key": "spanId",
                  "value": {
                    "stringValue": "0102040800000000"
                  }
                }
              ],
              "traceId": "",
              "spanId": ""
            }
          ]
        }
      ]
    }
  ]
}

Instead, I would have expected to receive the following, where the metadata from Fluent Bit maps to the Resource field in OpenTelemetry instead of being bundled with the rest of the attributes under the Attributes field. I would also expect some mechanism to set the other top-level OpenTelemetry fields (e.g., TraceId, SpanId, SeverityNumber).

expected-logs.json
{
  "resourceLogs": [
    {
      "resource": {
        "attributes": [
          {
            "key": "resource-attr",
            "value": {
              "stringValue": "resource-attr-val-1"
            }
          }
        ]
      },
      "scopeLogs": [
        {
          "scope": {},
          "logRecords": [
            {
              "timeUnixNano": "1708201530553920000",
              "severityNumber": 9,
              "severityText": "Info",
              "name": "logA",
              "body": {
                "stringValue": "This is a log message"
              },
              "attributes": [
                {
                  "key": "app",
                  "value": {
                    "stringValue": "server"
                  }
                },
                {
                  "key": "instance_num",
                  "value": {
                    "intValue": "1"
                  }
                }
              ],
              "traceId": "08040201000000000000000000000000",
              "spanId": "0102040800000000"
            }
          ]
        }
      ]
    }
  ]
}

I believe we need to answer the following questions before the Fluent Bit opentelemetry output will be ready for mass adoption with the greatest compatibility.

  • How does Fluent Bit's event metadata relate to OpenTelemetry's Logs Data Model?
  • How does a Fluent Bit operator populate top-level fields (e.g., TraceId, SpanId, SeverityNumber) in OpenTelemetry's Logs Data Model? This pull request focuses on populating the Body and Attributes fields, but what about the rest?

@edsiper
Member Author

edsiper commented Feb 18, 2024

@sudomateo thanks for the feedback.

Regarding the other Log model fields that are not mapped yet, @nokute78 is working on #8475 to support those dynamically. The concept is as follows, using SeverityNumber as an example:

  • the plugin exposes a configuration option called logs_severity_number_key that defaults to $SeverityNumber (it starts with $ because it is a record accessor pattern). The plugin tries to find that key inside the record and, if found, populates the value in the right place.

Through #8475 the expectation is to do that for all the missing keys that are part of the Log data model.
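
To make the concept concrete, here is a minimal Python sketch of that lookup. It is purely illustrative: the function name is hypothetical, only top-level keys are supported, plain dicts stand in for the record and the outgoing log record, and the actual behavior is whatever #8475 ends up implementing.

```python
def populate_severity(record: dict, log_record: dict,
                      key_pattern: str = "$SeverityNumber") -> dict:
    """Copy the severity number from the record into a top-level field.

    key_pattern plays the role of the proposed logs_severity_number_key
    option; the leading '$' marks it as a record accessor pattern.
    """
    key = key_pattern.lstrip("$")         # assumption: top-level keys only
    out = dict(log_record)                # do not mutate the caller's dict
    if key in record:
        out["severityNumber"] = int(record[key])
    return out                            # unchanged copy if the key is absent


print(populate_severity({"SeverityNumber": 5, "msg": "hi"}, {"body": "hi"}))
```

Passing key_pattern="$severityNumber" would cover records that use the lowercase key shown in the test configuration earlier in this thread.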

Regarding the questions:

1. How does Fluent Bit's event metadata relate to OpenTelemetry's Logs Data Model?

Fluent Bit is schema-less, meaning that it can adapt to any type of schema; we provide configuration options like the one in this PR to solve specific use cases. In logging, we have to deal with tons of data that is structured but doesn't follow a schema, while OTel does define one. We just need to finish up these mapping details.

2. How does a Fluent Bit operator populate top-level fields (e.g., TraceId, SpanId, SeverityNumber) in OpenTelemetry's Logs Data Model? This pull request focuses on populating the Body and Attributes fields, but what about the rest?

This PR (#8491) only focuses on fixing how body and attributes are populated when the incoming data does not come from OTel instrumentation. The rest will be implemented in #8475. Note that we have other PRs to handle OTel on the input side; here we are focusing on the output.

For this PR, I will make some tweaks so that adding all the remaining keys not used in the body as attributes becomes optional, or so that specific attributes can be defined from the record.

Again, thanks for the feedback on this, to get it right we need this type of testing and interaction :)

…: false)

When logs_body_key was added, it added all the unmatched keys as attributes
by default; there are many cases where this is not desired.

To keep this flexible for the user, this patch adds a new config option
called 'logs_body_key_attributes' which takes a boolean value. When it's
enabled, the remaining keys are added as attributes; by default this is
disabled.

Signed-off-by: Eduardo Silva <[email protected]>
@edsiper
Member Author

edsiper commented Feb 18, 2024

@nokute78 I have added a new option called logs_body_key_attributes (bool / default: false) which adds the unmatched keys as attributes, so we can keep it clean, thanks for the feedback

@edsiper
Member Author

edsiper commented Feb 19, 2024

To move the OTel fixes forward ASAP, I am merging this PR; we will continue iterating in the others to fill the end-user gaps.

@edsiper edsiper merged commit 202da13 into master Feb 19, 2024
45 checks passed
@edsiper edsiper deleted the out_otel-fixes branch February 19, 2024 16:52
@sudomateo
Contributor

Thank you for working on this @edsiper! I agree with your assessment to merge this and continue iterating in the other pull requests.

@cb645j
Contributor

cb645j commented Mar 4, 2024

@edsiper This is good, but it still does not address the main concerns of my ticket (#8359 (comment)). The traceId and spanId are still not being set. It seems you're saying the rest will be fixed as part of #8475? However, that PR seems to have gone stale.

@saixiaohui

@sudomateo May I see what your pod YAML looks like? I always get a permission error when I try to write to a file from an AKS cluster. Thanks!

@cb645j
Contributor

cb645j commented Jul 8, 2024

If you look at my changes, they address your concerns and the rest of the fields.

#8644
