
out_azure_logs_ingestion: make the stream_name explicit #8472

Open

wants to merge 5 commits into base: master

Conversation

@jlaundry commented Feb 9, 2024

While the stream name for a Custom log table DCR created through the Azure Portal GUI is predictable (i.e., if the Custom Log table is Example_CL, the stream name will be Custom-Example_CL), this is not always the case.

This is purely a convention for DCRs created through the Azure Portal GUI; using the API or ARM templates, it's possible to create DCRs with completely arbitrary stream names, which is necessary if you're sending data to built-in tables (i.e., Syslog or CommonSecurityLog) and/or have multiple transformation streams defined in the same DCR.

This PR creates the stream_name parameter, so that it can be set explicitly.
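For illustration, a DCR created via ARM/API can declare an arbitrary stream name and map it onto a built-in table. A minimal, hypothetical dataFlows fragment (resource names are placeholders, not from this PR) might look like:

```json
{
  "streamDeclarations": {
    "Custom-SyslogStream": {
      "columns": [
        { "name": "TimeGenerated", "type": "datetime" },
        { "name": "Computer", "type": "string" },
        { "name": "Message", "type": "string" }
      ]
    }
  },
  "dataFlows": [
    {
      "streams": [ "Custom-SyslogStream" ],
      "destinations": [ "myWorkspace" ],
      "transformKql": "source",
      "outputStream": "Microsoft-Syslog"
    }
  ]
}
```

Here the input stream name (Custom-SyslogStream) bears no fixed relationship to the destination table (Syslog), which is why the plugin cannot always derive it from table_name.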

One thing I'm not sure about (and would appreciate feedback on) is whether stream_name should be Optional and assume the old behavior... while it would be nice for config backwards compatibility, the effect of getting it wrong is that logs ultimately won't end up in the right place, and other applications/SDKs require the stream name to be set explicitly.


Enter [N/A] in the box if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to the packaging of containers or native binaries, then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

Backporting

  • [N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@jlaundry (Author) commented Feb 9, 2024

Example config (you can use the instructions at https://learn.microsoft.com/en-us/azure/sentinel/connect-logstash-data-connection-rules#example-dcr-that-ingests-data-into-the-syslog-table to create an appropriate DCR):

[INPUT]
    Name dummy
    Dummy {"Computer": "dummy", "Message":"This is a test."}

[OUTPUT]
    Name stdout
    Format json_lines
    Match *

[OUTPUT]
    Name            azure_logs_ingestion
    Match           *
    client_id       00000000-0000-0000-0000-000000000000
    client_secret   00000~0000000000000.0000000000000000-000
    tenant_id       00000000-0000-0000-0000-000000000000
    dce_url         https://example-xxxx.westus2-1.ingest.monitor.azure.com
    dcr_id          dcr-00000000000000000000000000000000
    table_name      Syslog
    stream_name     Custom-SyslogStream
    time_generated  true
    time_key        TimeGenerated
    Compress        true

@jlaundry (Author) commented Feb 9, 2024

Debug log: (sensitive values zeroised)

debug.log

@jlaundry (Author) commented Feb 9, 2024

Valgrind output:

==41351== 
==41351== HEAP SUMMARY:
==41351==     in use at exit: 1,524 bytes in 4 blocks
==41351==   total heap usage: 26,563 allocs, 26,559 frees, 21,018,260 bytes allocated
==41351== 
==41351== LEAK SUMMARY:
==41351==    definitely lost: 0 bytes in 0 blocks
==41351==    indirectly lost: 0 bytes in 0 blocks
==41351==      possibly lost: 0 bytes in 0 blocks
==41351==    still reachable: 1,524 bytes in 4 blocks
==41351==         suppressed: 0 bytes in 0 blocks
==41351== Reachable blocks (those to which a pointer was found) are not shown.
==41351== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==41351== 
==41351== Use --track-origins=yes to see where uninitialised values come from
==41351== For lists of detected and suppressed errors, rerun with: -s
==41351== ERROR SUMMARY: 468226 errors from 1000 contexts (suppressed: 0 from 0)

@jlaundry (Author) commented
Just checking, was there anything else I need to do?

@kforeverisback (Contributor) commented
@jlaundry @edsiper and @patrick-stephens I originally developed this plugin. Will test it out today and post any change request if necessary.

@jlaundry (Author) commented
@kforeverisback just checking, did you manage to take a look?

@dneto82 commented Aug 29, 2024

This will be nice. Thanks @jlaundry for your article and work on this.

Waiting for this amazing fix

@jlaundry (Author) commented Nov 1, 2024

@edsiper @leonardo-albertovich @fujimotos @koleini just trying to revive this, have I missed a step?

@kforeverisback (Contributor) commented Nov 28, 2024

@jlaundry @dneto82 Sorry for the late reply. I was preoccupied with lots of stuff these past few months.
While looking at the code changes, I noticed a couple of things:

  1. The table_name property is replaced with stream_name, which leaves table_name as an essentially empty property with no function at this point. If we're focusing on using the new stream_name instead of table_name, we should remove the latter and properly document the newer one (with reference links).
  2. While I am okay with the replacement, the stream_name is not properly documented in your documentation PR (or even in the MSFT link provided in your doc PR). Check my comments: https://github.com/fluent/fluent-bit-docs/pull/1305/files#r1861378861.
  3. The stream_name looks similar to Logstash's microsoft-sentinel-log-analytics-logstash-output-plugin, which uses dcr_stream_name; the only documents I could find describing where to find this "stream" are here and here:
  4. Correct me if I am wrong on this: this PR seems to handle a specific use case where you can send logs (e.g., Syslog) to a "Standard" table using custom log ingestion. Even with the standard tables, based on your config example, the stream_name still has "Custom-" attached to it. In fact, all of the example configs in the Logstash microsoft-sentinel plugin have a "Custom-" prefix in stream_name. I didn't find an example of other prefixes (e.g., Microsoft-Syslog).

Also, I was trying to create a custom DCR, that forwards logs to a standard Syslog table with Microsoft-Syslog as stream name, but I was unable to do that. Hence, I couldn't test your changes.

Would you be able to provide me a sample bicep/arm where I can create resources and test out the custom stream name property?

@jlaundry (Author) commented
Hi @kforeverisback,

  1. Ya I was also thinking it'd be better to completely break backwards compatibility... will see if I can dust off the code and take table_name out.
  2. Looks like Microsoft have refactored their docs; I've updated the link in the doc PR.
  3. Yes, it's the same attribute.
  4. Not sure I understand this point? Stream names can currently start with Microsoft- or Custom-, but that may change in the future: https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/data-collection-rule-structure#data-flow-properties
5. Yes, I have an ARM template and instructions on my blog: https://www.jlaundry.nz/2024/using_fluent_bit_to_send_commonsecuritylog_data_to_sentinel/

@kforeverisback (Contributor) commented Nov 28, 2024


Thanks @jlaundry!
Your blog post seems elaborate! Will try to follow that and test it out this week.

EDIT: Rethinking backward compatibility:
IMO, if we can get by without breaking backward compatibility by documenting and reusing/adding properties, that would be best for Fluent Bit adoption and the dev experience.
If table_name, a required property, is suddenly removed, updating Fluent Bit (a minor version change) will break due to just a property name change, leaving a terrible user/dev experience.

To maintain backward compatibility, we have a couple of options; we can:

  1. Use table_name for both streams and tables, and properly document where and how the table/stream is used. We can differentiate between a table and a stream by looking for the _CL suffix for tables (this is established and unchanged ever since Log Analytics was introduced) and for a hyphen (-) between two strings for streams (table names don't support -).
  2. Keep both table_name and stream_name, with an either-or sanity check and a separate URI format for each, and document their constraints and properties.
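The heuristic in option 1 could be sketched as follows (illustrative Python, not plugin code; `classify_name` is a hypothetical helper):

```python
def classify_name(name: str) -> str:
    """Classify a configured name as a stream or a table.

    Stream names contain a hyphen between two non-empty parts
    (e.g. "Custom-Example_CL", "Microsoft-Syslog"), which table
    names cannot; custom tables carry the "_CL" suffix.
    """
    head, sep, tail = name.partition("-")
    if sep and head and tail:
        return "stream"
    if name.endswith("_CL"):
        return "custom_table"
    return "builtin_table"

print(classify_name("Microsoft-Syslog"))  # stream
print(classify_name("Example_CL"))        # custom_table
print(classify_name("Syslog"))            # builtin_table
```

Note the hyphen check must come first, since a stream name like Custom-Example_CL also ends in _CL.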

Let me know what you think of this.
