Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Omit specific headers by default #92

Merged
merged 18 commits into from
Oct 9, 2023
Merged

Conversation

ConradLang
Copy link
Contributor

@ConradLang ConradLang commented Oct 4, 2023

Purpose

Alternative to #84

Pino has a limitation where you can either redact or remove specified redact.paths by setting redact.remove to true or false.

In some cases, it's reasonable to want to redact some properties e.g. headers.authorization so you know that you have an authorization header but in other cases, the header, e.g. x-envoy-peer-metadata only contributes to increasing the log message size and therefore cost.

To reduce costs, we should be able to remove the properties that are not relevant for diagnostic purposes.

Approach

  • Properties containing headers that should be omitted were identified via this Splunk query over 1 business day:

    index="*" 
    | fields *x-envoy-*
    | fieldsummary
    | rex mode=sed field=values "s~\[?\{\"value\":\"~~g
      s~\",\"count\":\d+\}[,\]]*~~g"
    | eval avgBytes=round(len(values)/distinct_count,0), totalMiB=avgBytes*count/1024/1024
    | table field, count, distinct_count, avgBytes, totalMiB
    | sort 0 -totalMiB
    
    • The headers are typically under req, headers, and requestHeader properties.
  • Added defaultOmitHeaderNames which is a list of header names to omit.

    export const defaultOmitHeaderNames = [
      'x-envoy-attempt-count',
      'x-envoy-decorator-operation',
      'x-envoy-expected-rq-timeout-ms',
      'x-envoy-external-address',
      'x-envoy-internal',
      'x-envoy-peer-metadata',
      'x-envoy-peer-metadata-id',
    ];
  • Add a omitHeaderNames logger option that defaults to the defaultOmitHeaderNames. If a consumer wants to opt out, they can supply an empty list or their own list instead.

  • A headers custom serializer omits the specified headers from the headers property of the logged object.

  • The existing req custom serializer also omits the specified headers from the req.headers property of the logged object.

Notes

💡 It may be helpful to review commit-by-commit.

Answers from discussion with @72636c

  1. ✅ What's the risk classification for this change since it affects much of SEEK?
    Minimal Risk based on this being "...business critical given it ladders up to a citizenship ask (cost savings), has limited risk (only removes headers from logs that you’ve de-risked, pending point 3), and it’s still up to individual team discretion to review & choose to merge the version bump".
  2. ✅ There is no reqHeaders custom serializer so some guidance around this is needed. We could add this as it would be a clone of the headers custom serializer.
    Since reqHeaders is produced by two services of one team, we will not be adding a specific custom serializer.
  3. ✅ Still following up with cloud platform team on these headers.
    Cloud platform team is ok with this change and has had a look at this PR.
  4. ✅ Is it ok to remove the default export from serializers since it isn't used internally and I don't think it's accessible via e.g. import serializers from '@seek/logger/lib-commonjs/serializers'.
    Removed default export from serializers.

Issues

Relates to #83 but does not resolve it.

Function for omitting a list of property names from the input object.
A higher order function for creating a serializer that omits properties
from the object assigned to the top-level property.
@ConradLang ConradLang force-pushed the add-remove-headers-serializer branch 2 times, most recently from ed2af31 to 5394b1d Compare October 5, 2023 03:19
@ConradLang ConradLang force-pushed the add-remove-headers-serializer branch from 5394b1d to cea3d28 Compare October 5, 2023 06:54
@ConradLang ConradLang marked this pull request as ready for review October 5, 2023 07:15
@ConradLang ConradLang requested a review from a team October 5, 2023 07:15
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the release summary format ok?
I'm not sure what it will look like in the release notes.

These headers amount to logging noise today, and so the new version will not require code changes or produce unexpected behaviour across our applications.
@72636c 72636c self-assigned this Oct 6, 2023
72636c added a commit that referenced this pull request Oct 6, 2023
`toMatchObject` accepts an object subset, which makes it hard to
reliably test property removal. For example, #92 adds a happy-path test
case "should omit defaultOmitHeaderNames by default" which unexpectedly
passes even if the omission logic is removed.

These tests aren't great overall and I'd prefer them to use a more
typical Jest structure so we can use `.only`s and the like, but this
should be enough to get #92 through with confidence around regression
testing.
Comment on lines +9 to +17
// We can get away with a shallow clone as we only touch top-level properties.
const output: Record<PropertyKey, unknown> = {
...input,
};

for (const property of properties) {
// Remove the property from our shallow clone.
delete output[property];
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My assumption is that this is overall cheaper than re-spreading the object on every iteration

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete seems the fastest; thanks for showing me this.

Tested in TS Playground

Comment on lines 22 to 29
const propertyNames = [...new Set(options.omitPropertyNames ?? [])].filter(
(propertyName) => typeof propertyName === 'string',
);

if (
!topLevelPropertyName ||
typeof topLevelPropertyName !== 'string' ||
topLevelPropertyName.length === 0 ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking is that we don't need to be quite this defensive, because the only variable input to createOmitPropertiesSerializer is omitHeaderNames, and that should be statically defined and type checked in the consumer repo.

By contrast it makes sense to be defensive when serializing an arbitrary logging object as we don't have any type guarantees around that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind this change since as you say they're meant to be statically defined and type-checked.

My thoughts were that it's done on initialisation and cleansing external inputs is just automatic (since you can cast as unknown as ExpectedType) so wasn't expensive anyway.

Copy link
Member

@72636c 72636c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a bunch for your patience on this one, it looks great 🙌

I've made a few hopefully uncontroversial tweaks to the implementation. Let me know if you'd like me to run through any of these or if you feel any are regressions.

Otherwise I'll aim to merge this in and cut a release tomorrow.

Comment on lines +23 to +25
* Matching is currently case sensitive.
* You will typically express the header names in lowercase,
* as server frameworks normalise incoming headers.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting this limitation here; we can also surface in the changeset or README if you think it necessary.

I assume we are comfortable with this for now; we can weigh up the cost of lowercasing the property names if it turns out to be an issue.

@ConradLang
Copy link
Contributor Author

ConradLang commented Oct 9, 2023

@72636c : I'm happy with the changes and learnt a few things too.

Thanks for your review and updates!

@72636c 72636c merged commit 06aa31a into master Oct 9, 2023
2 checks passed
@72636c 72636c deleted the add-remove-headers-serializer branch October 9, 2023 03:07
@seek-oss-ci seek-oss-ci mentioned this pull request Oct 9, 2023
72636c added a commit that referenced this pull request Oct 9, 2023
`toMatchObject` accepts an object subset, which makes it hard to
reliably test property removal. For example, #92 adds a happy-path test
case "should omit defaultOmitHeaderNames by default" which unexpectedly
passes even if the omission logic is removed.

These tests aren't great overall and I'd prefer them to use a more
typical Jest structure so we can use `.only`s and the like, but this
should be enough to get #92 through with confidence around regression
testing.

---------

Co-authored-by: Conrad Lang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants