Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

es_out: support Upstream Servers with configuration overriding #7608

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

mabrarov
Copy link
Contributor

@mabrarov mabrarov commented Jun 25, 2023

Implementation of Upstream feature for the Elasticsearch output plugin.

This pull request is based on pull request #1560 and Forward output plugin.

It was tested in a local setup with:

  1. Fluent Bit without Upstream feature connected to a single node of Elasticsearch cluster consisting of 3 master-eligible/data and 1 coordinating nodes.

    Refer to elastic-cluster directory of mabrarov/elastic-stack repository for Docker Compose project used to create target Elasticsearch cluster and Kibana.

    fluent-bit.conf Fluent Bit configuration file used for the test - refer to fluent-bit-es/fluent-bit.conf in mabrarov/elastic-stack repository.

    Debug log is available at flb_es.log.

  2. Fluent Bit with Upstream feature connected to all Elasticsearch data nodes of Elasticsearch cluster consisting of 3 master-eligible/data and 1 coordinating nodes.

    Refer to elastic-cluster directory of mabrarov/elastic-stack repository for Docker Compose project used to create target Elasticsearch cluster and Kibana.

    fluent-bit.conf Fluent Bit configuration file used for the test - refer to fluent-bit-es-cluster/fluent-bit.conf in mabrarov/elastic-stack repository.

    Debug log is available at flb_es_upstream.log.

Testing

  • Example configuration files for the change can be found in mabrarov/elastic-stack repository under fluent-bit-es-cluster directory.
  • Debug log output from testing the change - see above.
  • Attached Valgrind output that shows no leaks or memory corruption was found - refer to flb_run_code_analysis.log for the output of command
    TEST_PRESET=valgrind ./run_code_analysis.sh
  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

Backporting

  • [N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

return 0;
}

toks = flb_utils_split((const char *)cloud_auth, ':', -1);
Copy link
Contributor Author

@mabrarov mabrarov Jun 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not:

Suggested change
toks = flb_utils_split((const char *)cloud_auth, ':', -1);
toks = flb_utils_split((const char *) cloud_auth, ':', -1);

because https://github.com/fluent/fluent-bit/blob/master/CONTRIBUTING.md requires to follow https://httpd.apache.org/dev/styleguide.html which requires to not put a space in case of type cast:

The Guidelines
...
There is no whitespace between a cast and the item modified (e.g., "(int)j" and not "(int) j").

@mabrarov
Copy link
Contributor Author

Hi reviewers,

Is it possible to approve only workflow for this pull request, so that automated checks and build can start?

Thank you.

@mabrarov mabrarov temporarily deployed to pr June 28, 2023 17:55 — with GitHub Actions Inactive
@mabrarov mabrarov temporarily deployed to pr June 28, 2023 17:55 — with GitHub Actions Inactive
@mabrarov mabrarov temporarily deployed to pr June 28, 2023 17:55 — with GitHub Actions Inactive
@PettitWesley
Copy link
Contributor

@mabrarov sure

@mabrarov mabrarov temporarily deployed to pr June 28, 2023 18:22 — with GitHub Actions Inactive
@mabrarov
Copy link
Contributor Author

mabrarov commented Jun 29, 2023

Hi @PettitWesley,

It looks like all failed checks are around run-macos-unit-tests jobs and caused by the following failed unit tests:

  1. flb-rt-in_event_test
  2. flb-rt-out_tcp

I feel like other pull requests have the same issues, i.e. it doesn't seem that the failed checks are caused by this pull request changes.

Help of maintainers is appreciated.

Thank you.

}
tmp = flb_upstream_node_get_property(FLB_ES_CONFIG_PROPERTY_LOGSTASH_PREFIX, node);
if (tmp) {
ec->logstash_prefix = (char *)tmp;
Copy link
Contributor Author

@mabrarov mabrarov Jul 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type cast is illegal. Should use flb_sds_create instead. Working on fix in feature/out_es_upstream_support_extended_test branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@mabrarov mabrarov force-pushed the feature/out_es_upstream_support_extended branch from ba3382a to b7cd81b Compare July 8, 2023 10:22
@mabrarov
Copy link
Contributor Author

Hi @PettitWesley,

Is it possible to trigger automated workflow (build) for this pull request one more time? I found & fixed one issue and added tests for the new code since last build happened.

Thank you.

@mabrarov mabrarov temporarily deployed to pr July 10, 2023 12:05 — with GitHub Actions Inactive
@mabrarov mabrarov temporarily deployed to pr July 10, 2023 12:05 — with GitHub Actions Inactive
@mabrarov mabrarov temporarily deployed to pr July 10, 2023 12:05 — with GitHub Actions Inactive
@mabrarov mabrarov temporarily deployed to pr July 10, 2023 12:28 — with GitHub Actions Inactive
@mabrarov
Copy link
Contributor Author

Hi dear reviewers,

Is it possible to get this pull request reviewed / accepted sooner? Is there something pending / waiting from my side to start review?

Thank you.

@mabrarov mabrarov force-pushed the feature/out_es_upstream_support_extended branch from b7cd81b to b81d3f7 Compare July 20, 2023 19:38
@mabrarov
Copy link
Contributor Author

Hi @PettitWesley and @edsiper,

It feels like you are code owners for Elasticsearch output plugin. Is there something pending / waiting from my side to start review of this pull request? This new feature was requested 4 years ago and I feel it is something which multiple users of Fluent Bit (not just my team) would like to have.

Thank you.

@mabrarov mabrarov force-pushed the feature/out_es_upstream_support_extended branch from b81d3f7 to f6431c2 Compare September 30, 2023 13:32
…wn to parser of Upstream node configuration section are implemented, e.g. "host" and "port"

Signed-off-by: Marat Abrarov <[email protected]>
…o the test callback based on configuration of Fluent Bit and based on configuration of plugin

Signed-off-by: Marat Abrarov <[email protected]>
…with Upstream node configuration

For Elastic cloud authentication these parameters are always taken from plugin configuration and never from Upstream node configuration: cloud_id.

For AWS authentication these parameters are always taken from plugin configuration and never from Upstream node configuration: http_proxy, no_proxy, tls*.

Signed-off-by: Marat Abrarov <[email protected]>
…o the test callback based on configuration of Fluent Bit and based on configuration of plugin

Signed-off-by: Marat Abrarov <[email protected]>
Copy link
Contributor

@cosmo0920 cosmo0920 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking whether initialized or not with own_xxx is bad idea and it's too defensive patterns for our code style.
Initializing planning to be allocated variable should be initialized with NULL and checking whether it was allocated or not with if (ec->xxx) { ... } pattern.
NULL checking pattern should be enough to release allocated resources. This pattern could not be working properly when lacking of initializing with NULL on initialize.

plugins/out_es/es.h Show resolved Hide resolved
plugins/out_es/es_conf.c Show resolved Hide resolved
plugins/out_es/es_conf.c Show resolved Hide resolved
@mabrarov
Copy link
Contributor Author

mabrarov commented Dec 8, 2024

Hi @cosmo0920,

Checking whether initialized or not with own_xxx is bad idea

own_xxx is not about initialization, but is about ownership. Some data (to simplify the code and to reduce memory footprint) is shared b/w output plugin configuration and upstream node configuration, so it is required to know what entity owns the data to deallocate that data just once.

Initializing planning to be allocated variable should be initialized with NULL and checking whether it was allocated or not with if (ec->xxx) { ... } pattern. NULL checking pattern should be enough to release allocated resources.

NULL check is not required for flb_free function, because it uses C standard library free which accepts NULL:

The function accepts (and does nothing with) the null pointer to reduce the amount of special-casing.

Thank you.

@mabrarov mabrarov requested a review from cosmo0920 December 8, 2024 17:59
plugins/out_es/es_conf.c Show resolved Hide resolved
@mabrarov
Copy link
Contributor Author

mabrarov commented Dec 9, 2024

Hi @cosmo0920,

Regarding your comment - the ownership concept have no relationship with language. If we remove all own_xxx fields and run Fluent Bit with upstream servers (refer to fluent-bit-es-cluster/fluent-bit.conf in description of this pull request) and Valgrind, then we will get double deallocation error, because the same value (same address) is stored in multiple pointers. Please check the code one more time and thank you for the time you spend on review of these changes.

@mabrarov mabrarov requested a review from cosmo0920 December 9, 2024 08:42
@cosmo0920
Copy link
Contributor

Hi @cosmo0920,

Regarding your comment - the ownership concept have no relationship with language. If we remove all own_xxx fields and run Fluent Bit with upstream servers (refer to fluent-bit-es-cluster/fluent-bit.conf in description of this pull request) and Valgrind, then we will get double deallocation error, because the same value (same address) is stored in multiple pointers. Please check the code one more time and thank you for the time you spend on review of these changes.

In our code base, there is no own_xxx fields even if flb_upstream using plugins such as out_forward.
This indicates that your code style does not fit for ours. Thanks.

@mabrarov
Copy link
Contributor Author

mabrarov commented Dec 9, 2024

Hi @cosmo0920,

Regarding your comment.

The only Fluent Bit output plugin supporting upstream servers (before this pull request) is Forward output plugin and that plugin seems to not support overriding common configuration of plugin per each upstream node. That's the reason Forward output plugin doesn't need to care about ownership of configuration data - each upstream node just use its own set of configuration data. User of Forward output plugin has to specify (duplicate) plugin options at upstream node level otherwise default value will be used (e.g. shared_key, username and password options), i.e there is no place to specify configuration which is common for all upstream nodes of specific Forward output plugin instance. This approach is not going to work well with Elasticsearch output plugin, because it has too many options to duplicate them per each upstream node.

Refer to the linked fluent/fluent-bit-docs#1143 for the changes in documentation for Elasticsearch output plugin which clearly state what options can be overridden at upstream node level (or configuration of output plugin is used otherwise).

Thank you.

@cosmo0920
Copy link
Contributor

Hi @cosmo0920,

Regarding your comment.

The only Fluent Bit output plugin supporting upstream servers (before this pull request) is Forward output plugin and that plugin seems to not support overriding common configuration of plugin per each upstream node. That's the reason Forward output plugin doesn't need to care about ownership of configuration data - each upstream node just use its own set of configuration data. User of Forward output plugin has to specify (duplicate) plugin options at upstream node level otherwise default value will be used (e.g. shared_key, username and password options), i.e there is no place to specify configuration which is common for all upstream nodes of specific Forward output plugin instance. This approach is not going to work well with Elasticsearch output plugin, because it has too many options to duplicate them per each upstream node.

Refer to the linked fluent/fluent-bit-docs#1143 for the changes in documentation for Elasticsearch output plugin which clearly state what options can be overridden at upstream node level (or configuration of output plugin is used otherwise).

Thank you.

Hi,
If your suggestion is true, we need to implement overriding mechanism on configuration on upstream nodes.
With the current implementation, it is too deviated from our coding rules.
From the own_xxx pattern, there is not good smells.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants