Add support for Emit and Loggers and more... #1

Merged 10 commits on Dec 15, 2024
17 changes: 17 additions & 0 deletions .github/workflows/checks.yaml
@@ -0,0 +1,17 @@
name: checks
on:
pull_request:
paths-ignore:
- ".github/**"

jobs:
check-changelog:
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v4
- name: Check Changelog modified
uses: dangoslen/changelog-enforcer@v3
with:
changeLogPath: "./CHANGELOG.md"
missingUpdateErrorMessage: |
Please include an entry into `CHANGELOG.md` to describe what happened in the PR
24 changes: 24 additions & 0 deletions .github/workflows/components-test.yaml
@@ -0,0 +1,24 @@
name: distro component test

on:
pull_request:
branches: [ main ]

env:
GO_VERSION: 1.22.x

jobs:
test:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4

- name: Set up Go ${{ env.GO_VERSION }}
uses: actions/setup-go@v4
with:
go-version: ${{ env.GO_VERSION }}

- name: test components
run: find . -name go.mod -execdir go test -v ./... \;

11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,11 @@
## Otelity

### v0.1.0 / 2024-12-15 (Breaking)
- [Feat] added `emit` function to starlark processor
- [Feat] added `log` module to starlark processor
- [Feat] added `entrypoint` option to starlark config, allowing you to specify the entry point of the starlark script/code
- [Fix] `json.decode` no longer required to load telemetry events in starlark processor

### v0.0.1 / 2024-12-07
- [Feat] initial release
- starlark processor
Binary file added misc/starlark-vs-transform.jpg
126 changes: 69 additions & 57 deletions processors/starlarkprocessor/README.md
@@ -1,17 +1,13 @@
# starlarktransform
# starlark

<!-- status autogenerated section -->
| Status | |
| ------------- |-----------|
| Stability | [alpha]: traces, metrics, logs |


[beta]: https://github.com/open-telemetry/opentelemetry-collector#beta
[sumo]: https://github.com/SumoLogic/sumologic-otel-collector
| Status | |
|-----------|--------------------------------|
| Stability | [alpha]: traces, metrics, logs |
<!-- end autogenerated section -->


The starlarktransform processor modifies telemetry based on configuration using Starlark code.
The starlark processor modifies telemetry based on configuration using Starlark code.

Starlark is a scripting language used for configuration that is designed to be similar to Python. It is designed to be fast, deterministic, and easily embedded into other software projects.

@@ -21,71 +17,70 @@ The processor leverages Starlark to modify telemetry data while using familiar,

While there are a number of transform processors, most notably, the main OTTL [transform processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/transformprocessor), this processor aims to grant users more flexibility by allowing them to manipulate telemetry data using a familiar syntax.

Python is a popular, well known language, even among non-developers. By allowing Starlark code to be used as an option to transform telemetry data, users can leverage their existing knowledge of Python.

Python is a popular, well known language, even among non-developers. By allowing Starlark code to be used as an option to transform telemetry data, users can leverage their existing knowledge of Python to transform or manipulate their telemetry data.

## Config

| Parameter | Desc |
| ------- | ---- |
| code | Add in-line starlark code directly to your config |
| script | Allows you to set the script source. Supports __File path__ or __HTTP URL__ |

To configure starlarktransform, you can add your code using the `code` option in the config.
| Parameter | Desc |
|------------|-----------------------------------------------------------------------------|
| entrypoint | The name of the function to call when executing the starlark script |
| code | Add in-line starlark code directly to your config |
| script | Allows you to set the script source. Supports **File path** or **HTTP URL** |

To configure starlark, you can add your code using the `code` option in the config and specify the name of the function to call using the `entrypoint` option.

```yaml
processors:
starlarktransform:
starlark:
entrypoint: transform
code: |
def transform(event):
event = json.decode(event)
<your starlark code>
return event
```


Alternatively, in cases where you prefer to not have starlark code present or visible in your Open Telemetry configuration, you can use the `script` parameter to pass your script from a file or http url.

```yaml
processors:
starlarktransform:
starlark:
entrypoint: transform
script: /path/to/script.star

# or
processors:
starlarktransform:
starlark:
entrypoint: transform
script: https://some.url.com/script.star

```



You must define a function called `transform` that accepts a single argument, `event`. This function is called by the processor and is passed the telemetry event. The function **must return** the modified, json decoded event.

Your `entrypoint` must name a function that accepts a single argument. The argument **does not** have to be named `event` as in the examples above. This function is called by the processor and is passed the telemetry event. The function **must return** the modified telemetry event.
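
As a minimal sketch of that contract, the function below could be selected with `entrypoint: enrich` in the config. The names `enrich` and `payload` and the `cluster: dev` attribute are hypothetical, and the sketch assumes the event arrives as an already decoded dict shaped like the log event payload example (as the v0.1.0 changelog entry suggests). It sticks to the Python-compatible subset of Starlark, so it can also be tried in plain Python.

```python
# Hypothetical entrypoint: select it with `entrypoint: enrich` in the config.
# The argument name is arbitrary; here it is `payload` instead of `event`.
def enrich(payload):
    # Assumes `payload` is a dict shaped like the log event payload example
    # ("resourceLogs" -> "resource" -> "attributes").
    for data in payload.get("resourceLogs", []):
        resource = data.setdefault("resource", {})
        attributes = resource.setdefault("attributes", [])
        attributes.append({"key": "cluster", "value": {"stringValue": "dev"}})
    # The processor expects the modified event to be returned.
    return payload
```

Using `get` and `setdefault` keeps the sketch tolerant of events that lack a resource or attribute list.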

## How it works

The processor uses the [Starlark-Go](https://github.com/google/starlark-go) interpreter, which allows you to run this processor without having to install a Starlark language interpreter on the host.

## Features

The starlarktransform processor gives you access to the full telemetry event payload. You are able to modify this payload using the Starklark code in any way you want. This allows you do various things such as:
The starlark processor gives you access to the full telemetry event payload. You can modify this payload with Starlark code in any way you want. This allows you to do various things such as:

- Filtering
- Adding/Removing attributes
- Modifying attributes
- Modifying telemetry data directly
- Telemetry injection based on existing values
- And more
- Live, detailed, telemetry data debugging and evaluation
- And much more
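
Two of the items above, filtering and removing attributes, could be sketched as follows. This is an illustration rather than the processor's own code: it assumes the event is an already decoded dict in the OTLP JSON shape (`resourceLogs` → `scopeLogs` → `logRecords`), and the attribute key `debug.id` is hypothetical.

```python
# Hypothetical entrypoint illustrating filtering and attribute removal.
def transform(event):
    for resource_logs in event.get("resourceLogs", []):
        for scope_logs in resource_logs.get("scopeLogs", []):
            kept = []
            for record in scope_logs.get("logRecords", []):
                body = record.get("body", {}).get("stringValue", "")
                # Filtering: skip records whose body mentions a password.
                if "password" in body:
                    continue
                # Removing attributes: drop the hypothetical "debug.id" key.
                record["attributes"] = [
                    a for a in record.get("attributes", [])
                    if a.get("key") != "debug.id"
                ]
                kept.append(record)
            # Replace the record list only after iterating over it has finished.
            scope_logs["logRecords"] = kept
    return event
```

Building a new `kept` list instead of deleting entries in place matters in Starlark, which does not allow a list to be mutated while it is being iterated.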

## Libs, Functions and Functionality
## Modules, Functions and Functionality

While similar in syntax to Python, Starlark does not have all the functionality associated with Python. This processor does not have access to Python standard libraries, and the implementation found in this processor is limited further to only the following libraries and functions:

- **json**

> The JSON library allows you to encode and decode JSON strings. The use of this library is mandatory as the telemetry data is passed to the processor as a JSON string. You must decode the JSON string to a Dict before you can modify it. **You must also return a JSON decoded Dict to the processor.**
The JSON library allows you to encode and decode JSON strings. The use of this library is mandatory as the telemetry data is passed to the processor as a JSON string. You must decode the JSON string to a Dict before you can modify it. **You must also return a JSON decoded Dict to the processor.**

```python
# encode dict string to json string
@@ -101,31 +96,43 @@ x = json.decode('{"foo": ["bar", "baz"]}')

You can read more on the JSON library [here](https://qri.io/docs/reference/starlark-packages/encoding/json)

- **print**
> You are able to use the print function to check outputs of your code. The output of the print function is sent to the Open Telemetry runtime log. Values printed by the Print function only show when running Open Telemetry in Debug mode.
- **log** module and **print** function

You are able to use the print function or log module to print output to the Open Telemetry runtime log. This is useful for debugging your Starlark code as well as general logging based on evaluation of the telemetry data. For example, you may want to print a log message if certain values or behaviors are identified in the telemetry data.

```python
def transform(event):
print("hello world")
return json.decode(event)
```

The print statement above would result in the following output in the Open Telemetry runtime log. Again, this output is only visible when running Open Telemetry in Debug mode.
The print statement above would result in the following output in the Open Telemetry runtime log when Open Telemetry is run in debug mode:

```log
2023-09-23T16:50:17.328+0200 debug traces/processor.go:25 hello world {"kind": "processor", "name": "starlarktransform/traces", "pipeline": "traces", "thread": "trace.processor", "source": "starlark/code"}
2023-09-23T16:50:17.328+0200 debug traces/processor.go:25 hello world {"kind": "processor", "name": "starlark/traces", "pipeline": "traces", "thread": "trace.processor", "source": "starlark/code"}
```

If you want to log messages outside of the debug mode, you can use the log module:

- **re** (regex)
> Support for Regular Expressions coming soon
```python
def transform(event):
log.info("hello world")
return json.decode(event)
```

There are 3 log levels available:

- `log.info`
- `log.warn`
- `log.error`

Again, note that the debug level is handled by the `print` function and is only available in debug mode.

Note that you can define your own functions within your Starlark code, however, there must be at least one function named `transform` that accepts a single argument `event` and returns a JSON decoded Dict, this function can call all your other functions as needed.
- **re** (regex)

> Support for Regular Expressions coming soon

Note that you can define your own functions within your Starlark code and call them as required. However, there must be at least one entrypoint function that accepts a single telemetry event argument and returns the modified event.
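
A minimal sketch of that layout, with two helper functions and one entrypoint, might look like this. The helper names `has_attribute` and `string_attribute` are hypothetical, and the event is assumed to be an already decoded dict in the trace payload shape (`resourceSpans` → `resource` → `attributes`).

```python
# Helper: return True if `key` is present in an OTLP-style attribute list.
def has_attribute(attributes, key):
    for attr in attributes:
        if attr.get("key") == key:
            return True
    return False

# Helper: build an attribute entry holding a string value.
def string_attribute(key, value):
    return {"key": key, "value": {"stringValue": value}}

# Entrypoint (selected with `entrypoint: transform`); it delegates to the helpers.
def transform(event):
    for td in event.get("resourceSpans", []):
        attributes = td.setdefault("resource", {}).setdefault("attributes", [])
        if not has_attribute(attributes, "source"):
            # print output is only visible in debug mode; inside the processor,
            # log.info / log.warn / log.error could be used here instead.
            print("adding missing source attribute")
            attributes.append(string_attribute("source", "starlark"))
    return event
```

Only the function named by `entrypoint` is called by the processor; the helpers are ordinary functions that it calls in turn.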

## Examples

This section contains examples of the event payloads that are sent to the starlarktransform processor from each telemetry type. These examples can help you understand the structure of the telemetry events and how to modify them.
This section contains examples of the event payloads that are sent to the starlark processor from each telemetry type. These examples can help you understand the structure of the telemetry events and how to modify them.

##### Log Event Payload Example:

@@ -211,6 +218,7 @@ View the log.proto type definition [here](https://github.com/open-telemetry/
}]
}
```

View the metric.proto type definition [here](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/metrics/v1/metrics.proto)

##### Trace Event Payload Example:
@@ -346,10 +354,9 @@ View the metric.proto type definition [here](https://github.com/open-telemetry/o

View the trace.proto type definition [here](https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/trace/v1/trace.proto).


## Full Configuration Example

For following configuration example demonstrates the starlarktransform processor telemetry events for logs, metrics and traces.
The following configuration example demonstrates the starlark processor handling telemetry events for logs, metrics and traces.

```yaml
receivers:
@@ -381,10 +388,9 @@ processors:
# - add resource attribute cluster: dev
# - filter out any logs that contain the word password
# - add an attribute to each log: language: golang
starlarktransform/logs:
starlark/logs:
code: |
def transform(event):
event = json.decode(event)
# edit resource attributes
for data in event['resourceLogs']:
for attr in data['resource']['attributes']:
@@ -405,13 +411,12 @@ processors:

return event
# - print event received to otel runtime log
# - if there are no resources, add a resource attribute source starlarktransform
# - prefix each metric name with starlarktransform
starlarktransform/metrics:
# - if there are no resources, add a resource attribute source starlark
# - prefix each metric name with starlark
starlark/metrics:
code: |
def transform(event):
print("received event", event)
event = json.decode(event)
for md in event['resourceMetrics']:
# if resources are empty
if not md['resource']:
@@ -420,31 +425,30 @@ processors:
{
"key": "source",
"value": {
"stringValue": "starlarktransform"
"stringValue": "starlark"
}
}
]
}

# prefix each metric name with starlarktransform
# prefix each metric name with starlark
for sm in md['scopeMetrics']:
for m in sm['metrics']:
m['name'] = 'starlarktransform.' + m['name']
m['name'] = 'starlark.' + m['name']

return event

# - add resource attribute source starlarktransform
# - add resource attribute source starlark
# - filter out any spans with http.target /roll attribute
starlarktransform/traces:
starlark/traces:
code: |
def transform(event):
event = json.decode(event)
for td in event['resourceSpans']:
# add resource attribute
td['resource']['attributes'].append({
'key': 'source',
'value': {
'stringValue': 'starlarktransform'
'stringValue': 'starlark'
}
})

@@ -467,29 +471,37 @@ service:
receivers:
- filelog
processors:
- starlarktransform/logs
- starlark/logs
exporters:
- logging

metrics:
receivers:
- otlp
processors:
- starlarktransform/metrics
- starlark/metrics
exporters:
- logging

traces:
receivers:
- otlp
processors:
- starlarktransform/traces
- starlark/traces
exporters:
- logging
```

### Performance

### Warnings
The graph below shows a comparison of the starlark processor vs the transform processor. While the starlark processor is far from slow, the transform processor is twice as fast overall when performing the same type of transformation. If speed is a concern, you may want to consider using the transform processor instead.

The starlarktransform processor allows you to modify all aspects of your telemetry data. This can result in invalid or bad data being propogated if you are not careful. It is your responsibility to inspect the data and ensure it is valid.
<p align="center">
<img src="../../misc/starlark-vs-transform.jpg">
</p>

The starlark processor is a great component for highly complex or specialised types of transformations that simply can't be performed using the transform processor.

### Warnings

The starlark processor allows you to modify all aspects of your telemetry data. This can result in invalid or bad data being propagated if you are not careful. It is your responsibility to inspect the data and ensure it is valid.
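
One way to guard against this is to inspect the event inside the script before returning it. The sketch below is an illustration under assumptions: the helper `attribute_problems` is hypothetical, it checks only that each resource attribute has a key and a value, and it assumes an already decoded log event. What counts as valid for your pipeline may well be stricter.

```python
# Hypothetical helper: describe structural problems in an attribute list.
def attribute_problems(attributes):
    problems = []
    for index, attr in enumerate(attributes):
        if not attr.get("key"):
            problems.append("attribute %d has no key" % index)
        if "value" not in attr:
            problems.append("attribute %d has no value" % index)
    return problems

# Entrypoint: report suspicious resource attributes, then pass the event on.
def transform(event):
    for data in event.get("resourceLogs", []):
        attributes = data.get("resource", {}).get("attributes", [])
        for problem in attribute_problems(attributes):
            # print output only appears in debug mode.
            print("invalid resource attribute:", problem)
    return event
```

This version only reports problems and leaves the event unchanged; repairing or dropping the offending entries is a separate decision.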
7 changes: 7 additions & 0 deletions processors/starlarkprocessor/config.go
@@ -17,13 +17,20 @@ type Config struct {
// source of starlark script
// accepts both file path and url
Script string `mapstructure:"script"`

// entrypoint - the name of the function to be called
EntryPoint string `mapstructure:"entrypoint"`
}

func (c *Config) validate() error {
if c.Code == "" && c.Script == "" {
return errors.New("a value must be given for at least one of [code] or [script]")
}

if c.EntryPoint == "" {
return errors.New("[entrypoint] must be provided")
}

return nil
}
