out_forward: fix memory leak during connection loss #8399

Merged
merged 1 commit into from
Jan 22, 2024

Garfield96
Contributor

@Garfield96 Garfield96 commented Jan 21, 2024

I observed that memory consumption grew constantly while the data sink of the forward plugin was unavailable. Valgrind confirmed a leak in the forward output plugin:

==8== 385,024 bytes in 47 blocks are definitely lost in loss record 126 of 126
==8==    at 0x4E055B4: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8==    by 0x4E0A81C: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8==    by 0x739B7B: msgpack_sbuffer_write (sbuffer.h:81)
==8==    by 0x5D5262: msgpack_pack_map (pack_template.h:753)
==8==    by 0x5D8F8A: flb_mp_map_header_init (flb_mp.c:321)
==8==    by 0x73A026: append_options (forward_format.c:103)
==8==    by 0x73AE2F: flb_forward_format_forward_mode (forward_format.c:442)
==8==    by 0x73B3A4: flb_forward_format (forward_format.c:633)
==8==    by 0x73566E: cb_forward_flush (forward.c:1559)
==8==    by 0x51F033: output_pre_cb_flush (flb_output.h:597)
==8==    by 0xA45126: co_init (amd64.c:117)

Further analysis showed that during connection loss out_buf is only freed if time_as_integer is set to true:

        if (!u_conn) {
            flb_plg_error(ctx->ins, "no upstream connections available");
            msgpack_sbuffer_destroy(&mp_sbuf);
            if (fc->time_as_integer == FLB_TRUE) {
                flb_free(out_buf);
            }

Source: https://github.com/fluent/fluent-bit/blob/master/plugins/out_forward/forward.c#L1574-L1579

However, the code suggests that out_buf is also used to write metadata in append_options (https://github.com/fluent/fluent-bit/blob/master/plugins/out_forward/forward_format.c#L85). To verify this assumption, I enabled Time_as_Integer in the forward configuration; afterwards, valgrind no longer found a leak. Therefore, I removed the conditional. This is safe because out_buf is already freed unconditionally in other places (e.g. https://github.com/fluent/fluent-bit/blob/master/plugins/out_forward/forward.c#L1644), and even if out_buf is not used, it would be NULL, and freeing NULL is valid.

I ran the patched version for 12 hours and the memory consumption was perfectly constant.
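
To illustrate the pattern described above, here is a minimal, self-contained sketch. It is not Fluent Bit code: `failed_flush`, `count_leaked`, and the 64-byte buffer are hypothetical stand-ins, and plain `malloc`/`free` replace the plugin's allocator. It only models the error path in which a formatter has allocated `out_buf` and no upstream connection is available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical model of one flush that fails because no connection is
 * available. Returns the buffer if the error path did NOT free it (i.e. the
 * real code would have leaked it), or NULL if it was released.
 */
static void *failed_flush(int time_as_integer, int always_free)
{
    /* Stand-in for the buffer the formatter allocates (e.g. options metadata). */
    void *out_buf = malloc(64);

    if (out_buf == NULL) {
        return NULL;
    }

    /* always_free == 0 models the old conditional, always_free == 1 the fix. */
    if (always_free || time_as_integer) {
        free(out_buf);   /* free(NULL) would also be a valid no-op here */
        out_buf = NULL;
    }
    return out_buf;
}

/* Counts how many buffers the error path left behind over several flushes. */
int count_leaked(int flushes, int time_as_integer, int always_free)
{
    int leaked = 0;
    int i;

    assert(flushes >= 0);
    for (i = 0; i < flushes; i++) {
        void *p = failed_flush(time_as_integer, always_free);
        if (p != NULL) {
            leaked++;
            free(p);     /* clean up so the sketch itself does not leak */
        }
    }
    return leaked;
}
```

With the old conditional and `time_as_integer` disabled, every failed flush in this model leaves one buffer behind; with the conditional removed, none do, regardless of the setting.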


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • [N/A] Example configuration file for the change -> Can be reproduced with any forward configuration which doesn't set Time_as_Integer to true
  • [N/A] Debug log output from testing the change -> Change not visible in log
  • Attached Valgrind output that shows no leaks or memory corruption was found -> Nothing to attach, since valgrind found only known issues

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [N/A] Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@edsiper
Member

edsiper commented Jan 22, 2024

good catch! thanks for the PR

@edsiper edsiper merged commit 7218316 into fluent:master Jan 22, 2024
57 checks passed