Skip to content

filter_ecs: Retrieve container ID from record field#9033

Open
RaJiska wants to merge 3 commits into
fluent:masterfrom
RaJiska:filter-ecs-container-id-field
Open

filter_ecs: Retrieve container ID from record field#9033
RaJiska wants to merge 3 commits into
fluent:masterfrom
RaJiska:filter-ecs-container-id-field

Conversation

@RaJiska
Copy link
Copy Markdown

@RaJiska RaJiska commented Jul 2, 2024

Adds container_id_field_name to retrieve the container ID instead of deducing it from the tags via ecs_tag_prefix which is too restrictive to use as too invasive over the logic the user wants to use.

Maintained the retrieval of the container ID to not introduce breaking change, but using container_id_field_name will override and retrieve container ID from the specified field.

Fixes the following issue: #9028


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

Backporting

  • [N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Added configurable option to extract container IDs from record fields as an alternative to tag-based identification in the ECS filter.
  • Tests

    • Added comprehensive test coverage for the new container ID extraction feature, including scenarios for valid values and error handling.

@edsiper
Copy link
Copy Markdown
Member

edsiper commented Jul 5, 2024

@RaJiska RaJiska force-pushed the filter-ecs-container-id-field branch from da51c32 to 59cd20b Compare July 6, 2024 08:50
@RaJiska
Copy link
Copy Markdown
Author

RaJiska commented Jul 6, 2024

@edsiper Added format checks as well as well as an extra test for invalid data.

@RaJiska
Copy link
Copy Markdown
Author

RaJiska commented Jul 16, 2024

@edsiper One of the test fails, though it doesn't seem related to the changes I made, is this a random failure or was some kind of side-effect introduced? I had all passing when running locally.

@RaJiska
Copy link
Copy Markdown
Author

RaJiska commented Aug 1, 2024

Bump @edsiper @PettitWesley

@RaJiska
Copy link
Copy Markdown
Author

RaJiska commented Aug 26, 2024

Bump @edsiper @PettitWesley 👀

@RaJiska
Copy link
Copy Markdown
Author

RaJiska commented Sep 30, 2024

@edsiper @PettitWesley

@RaJiska
Copy link
Copy Markdown
Author

RaJiska commented Dec 13, 2024

Bump @edsiper @PettitWesley

@github-actions
Copy link
Copy Markdown
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions Bot added the Stale label Mar 16, 2025
@RaJiska
Copy link
Copy Markdown
Author

RaJiska commented Mar 16, 2025

Bump @edsiper @PettitWesley

@github-actions github-actions Bot removed the Stale label Mar 23, 2025
@RaJiska
Copy link
Copy Markdown
Author

RaJiska commented Aug 7, 2025

Bump @edsiper @PettitWesley .

To give more context as to why this is useful, the current plugin retrieves the container ID through the tag, which can be overridden and will break this plugin. Why would anyone override the default tag you might ask? For routing purposes, which is currently impossible as the docker ID . Being able to have a custom tag would allow better routing capabilities.

@eschabell
Copy link
Copy Markdown
Contributor

@patrick-stephens would you have time to review this PR?

Copy link
Copy Markdown
Contributor

@patrick-stephens patrick-stephens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok to me but I am concerned about the lack of codeowners...

It needs rebasing as well.

Signed-off-by: Ra'Jiska <dodo.lasticot@gmail.com>
Signed-off-by: Ra'Jiska <dodo.lasticot@gmail.com>
Signed-off-by: Ra'Jiska <dodo.lasticot@gmail.com>
@RaJiska RaJiska force-pushed the filter-ecs-container-id-field branch from 59cd20b to 8c960c9 Compare November 9, 2025 08:43
@RaJiska RaJiska requested a review from a team as a code owner November 9, 2025 08:43
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Nov 9, 2025

Walkthrough

The ECS filter plugin now supports configurable container ID extraction. A new container_id_field_name configuration option allows extracting container IDs directly from record fields, superseding the tag-prefix-based approach when configured. New helper functions deduce short container IDs from either tags or records, and metadata retrieval logic adapts accordingly with appropriate fallbacks and logging.

Changes

Cohort / File(s) Summary
ECS Filter Core Implementation
plugins/filter_ecs/ecs.h, plugins/filter_ecs/ecs.c
Added container_id_field_name struct field to flb_filter_ecs. Introduced static functions deduce_short_container_id_from_tag() and deduce_short_container_id_from_record() to extract 12-char container short IDs. Modified get_metadata_by_id() signature to accept container_short_id parameter. Updated cb_ecs_filter() to conditionally use field-based or tag-based extraction based on configuration, with fallback to cluster metadata and enhanced debug/warn logging.
ECS Filter Tests
tests/runtime/filter_ecs.c
Added three test cases: flb_test_ecs_filter_containerid_field (valid extraction), flb_test_ecs_filter_containerid_field_error_missing (missing field scenario), and flb_test_ecs_filter_containerid_field_error_invalid (invalid container ID values). Tests verify resource population with cluster, task, and container metadata, and validate expected output record counts and patterns.

Sequence Diagram

sequenceDiagram
    participant Init as Initialization
    participant Config as Configuration
    participant Filter as Filter (cb_ecs_filter)
    participant Tag as Tag-based Path
    participant Field as Field-based Path
    participant Meta as Metadata Store

    Init->>Config: Load container_id_field_name setting
    Config-->>Init: Config loaded
    
    rect rgb(200, 220, 240)
    Note over Filter: Per log event
    Filter->>Filter: Check if container_id_field_name set
    
    alt Field-based extraction enabled
        Filter->>Field: deduce_short_container_id_from_record()
        Field-->>Filter: container_short_id (or error)
        Filter->>Meta: get_metadata_by_id(container_short_id)
        Meta-->>Filter: metadata (or use cluster fallback)
    else Tag-based extraction (default)
        Filter->>Tag: deduce_short_container_id_from_tag()
        Tag-->>Filter: container_short_id
        Filter->>Meta: get_metadata_by_id(container_short_id)
        Meta-->>Filter: metadata
    end
    
    Filter->>Filter: Populate resource with metadata
    end
Loading

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Areas requiring extra attention:
    • Logic flow in cb_ecs_filter() regarding conditional extraction method selection and fallback paths
    • Signature change to get_metadata_by_id() and all call sites to ensure correct parameter passing
    • Error handling and logging consistency between field-based and tag-based paths
    • Test coverage comprehensiveness for edge cases (malformed container IDs, concurrent scenarios)

Poem

🐰 A field in sight, or tag in hand,
Container IDs now help us stand.
With metadata flowing both ways round,
Configuration magic we have found! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 10.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: adding the ability to retrieve container ID from a record field in the ECS filter plugin.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a4c158d and 8c960c9.

📒 Files selected for processing (3)
  • plugins/filter_ecs/ecs.c (7 hunks)
  • plugins/filter_ecs/ecs.h (1 hunks)
  • tests/runtime/filter_ecs.c (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
plugins/filter_ecs/ecs.c (3)
src/flb_sds.c (2)
  • flb_sds_create_len (58-76)
  • flb_sds_destroy (389-399)
src/flb_record_accessor.c (3)
  • flb_ra_create (271-358)
  • flb_ra_get_value_object (803-814)
  • flb_ra_destroy (232-248)
src/flb_ra_key.c (1)
  • flb_ra_key_value_destroy (842-851)
tests/runtime/filter_ecs.c (2)
plugins/out_s3/s3.c (1)
  • setenv (57-60)
src/flb_lib.c (3)
  • flb_filter_set (618-649)
  • flb_start (983-994)
  • flb_lib_push (843-870)
🔇 Additional comments (11)
plugins/filter_ecs/ecs.h (1)

148-148: LGTM!

The new field container_id_field_name is appropriately placed in the struct and aligns with the feature requirements.

tests/runtime/filter_ecs.c (4)

435-482: LGTM!

The test correctly validates that container metadata is populated when a valid container_id field is present. The use of "randomtag" is appropriate since tag-based extraction is not used with the field-based approach.


484-531: LGTM!

The test correctly validates fallback behavior when the configured field name doesn't exist in the record. The expected pattern "cluster_name.." confirms that only cluster metadata is attached.


533-586: LGTM!

The test correctly validates handling of invalid container ID values (string too short and non-string type). Both records appropriately fall back to cluster metadata only.


596-598: LGTM!

All three new tests are properly registered in the TEST_LIST.

plugins/filter_ecs/ecs.c (6)

202-206: LGTM!

The debug logging appropriately informs which container ID retrieval method is in use based on the configuration.


1298-1328: LGTM!

The helper function correctly extracts the 12-character container short ID from the tag using the configured prefix. Error messages are clear and the memory management is sound.


1368-1402: LGTM!

The refactored function correctly accepts the pre-deduced container_short_id and provides appropriate logging based on the extraction method in use.


1532-1563: LGTM!

The tag-based extraction path correctly maintains backward compatibility, including the failure tracking mechanism to avoid repeated failed metadata requests. The fallback to cluster metadata is appropriate.


1593-1612: Verify interaction with the performance issue flagged earlier.

This code block calls deduce_short_container_id_from_record for every log event. As noted in the earlier comment (lines 1330-1366), creating a record accessor per event is expensive. Once that issue is addressed, this call site should be updated to pass the pre-created record accessor.

The logic here is sound—extracting container ID per record is necessary since different records in the same batch can have different container IDs. The fallback behavior and cleanup are correct.


1796-1801: LGTM!

The configuration map entry is complete with an appropriate description documenting that this option supersedes ecs_tag_prefix when set.

Comment thread plugins/filter_ecs/ecs.c
Comment on lines +1330 to +1366
static int deduce_short_container_id_from_record(struct flb_filter_ecs *ctx,
const msgpack_object *record,
flb_sds_t *container_short_id)
{
struct flb_record_accessor *ra_key;
struct flb_ra_value *rval;

ra_key = flb_ra_create(ctx->container_id_field_name, FLB_TRUE);
if (!ra_key) {
flb_errno();
return -1;
}
rval = flb_ra_get_value_object(ra_key, *record);
if (!rval) {
flb_plg_debug(ctx->ins, "Container field \"%s\" not found in record",
ctx->container_id_field_name);
flb_ra_destroy(ra_key);
return -1;
}
flb_ra_destroy(ra_key);
if (rval->type != FLB_RA_STRING || rval->o.via.str.size < 12) {
flb_plg_debug(ctx->ins, "Container field \"%s\" needs to represent a container "
"ID a string of at least 12 characters "
"representing the container ID",
ctx->container_id_field_name);
flb_ra_key_value_destroy(rval);
return -1;
}
*container_short_id = flb_sds_create_len(rval->val.string, 12);
if (!*container_short_id) {
flb_errno();
flb_ra_key_value_destroy(rval);
return -1;
}
flb_ra_key_value_destroy(rval);
return 0;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Performance: Record accessor should be created once during initialization.

This function is called for every log event when container_id_field_name is configured (see line 1595). Creating and destroying a record accessor per event is expensive and will degrade throughput with high-volume logs.

Consider this refactor:

  1. Add a field to struct flb_filter_ecs:
struct flb_record_accessor *container_id_ra;
  1. Initialize it once in cb_ecs_init (after line 206):
+    if (*ctx->container_id_field_name) {
+        ctx->container_id_ra = flb_ra_create(ctx->container_id_field_name, FLB_TRUE);
+        if (!ctx->container_id_ra) {
+            flb_plg_error(f_ins, "Could not create record accessor for container_id_field_name");
+            goto error;
+        }
+    }
  1. Simplify this function to accept and use the pre-created record accessor:
-static int deduce_short_container_id_from_record(struct flb_filter_ecs *ctx,
+static int deduce_short_container_id_from_record(struct flb_record_accessor *ra_key,
+                                                 struct flb_filter_ecs *ctx,
                                                  const msgpack_object *record,
                                                  flb_sds_t *container_short_id)
 {
-    struct flb_record_accessor *ra_key;
     struct flb_ra_value *rval;

-    ra_key = flb_ra_create(ctx->container_id_field_name, FLB_TRUE);
-    if (!ra_key) {
-        flb_errno();
-        return -1;
-    }
     rval = flb_ra_get_value_object(ra_key, *record);
     if (!rval) {
         flb_plg_debug(ctx->ins, "Container field \"%s\" not found in record",
                     ctx->container_id_field_name);
-        flb_ra_destroy(ra_key);
         return -1;
     }
-    flb_ra_destroy(ra_key);
  1. Clean up in flb_filter_ecs_destroy (after line 1763):
+        if (ctx->container_id_ra) {
+            flb_ra_destroy(ctx->container_id_ra);
+        }
  1. Update the call site at line 1595:
-            ret = deduce_short_container_id_from_record(ctx,
+            ret = deduce_short_container_id_from_record(ctx->container_id_ra, ctx,
                                                         log_event.body,
                                                         &container_short_id);
🤖 Prompt for AI Agents
In plugins/filter_ecs/ecs.c around lines 1330 to 1366, the function
creates/destroys a record accessor on every call which is costly; change it to
use a single pre-created accessor stored in ctx: add struct flb_record_accessor
*container_id_ra to struct flb_filter_ecs, create and assign
ctx->container_id_ra in cb_ecs_init (handle flb_ra_create failure), modify
deduce_short_container_id_from_record to use ctx->container_id_ra instead of
creating one (remove flb_ra_create/flb_ra_destroy calls and associated error
handling), ensure you still check rval for NULL and validate type/length as
before, free/ destroy ctx->container_id_ra in flb_filter_ecs_destroy, and update
the call site at line 1595 to rely on the ctx-held accessor (or pass it in) and
handle the case where the accessor is NULL.

Comment thread plugins/filter_ecs/ecs.c
Comment on lines +1351 to +1354
flb_plg_debug(ctx->ins, "Container field \"%s\" needs to represent a container "
"ID a string of at least 12 characters "
"representing the container ID",
ctx->container_id_field_name);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Minor grammar issue in error message.

The phrasing "needs to represent a container ID a string of at least 12 characters" is awkward.

Suggest:

-        flb_plg_debug(ctx->ins, "Container field \"%s\" needs to represent a container "
-                      "ID a string of at least 12 characters "
-                      "representing the container ID",
+        flb_plg_debug(ctx->ins, "Container field \"%s\" must be a string of at least "
+                      "12 characters representing the container ID",
                       ctx->container_id_field_name);
🤖 Prompt for AI Agents
In plugins/filter_ecs/ecs.c around lines 1351 to 1354, the log message text is
grammatically awkward; update the format string so it reads clearly (for
example: "Container field \"%s\" must be a string of at least 12 characters
representing the container ID") and keep using ctx->container_id_field_name as
the argument; ensure spacing and punctuation are correct and the message remains
passed to flb_plg_debug unchanged otherwise.

@sparrc
Copy link
Copy Markdown
Contributor

sparrc commented Nov 10, 2025

Hi @RaJiska, thanks for the contribution. I read thru the PR and the attached issue but Im having trouble fully understanding the use-case here. Can you describe in more detail why you need this? Can you please explain the end-to-end reason why you need this? You can include any necessary details like full fluent-bit config file, ecs task definition, awscli commands, any other relevant aws infra, etc etc. Please also indicate what about that fluent-bit config is broken and where this new argument would fit in.

Thank you.

I see this quote, but I don't fully understand what you mean by this:

the current plugin retrieves the container ID through the tag, which can be overridden and will break this plugin. Why would anyone override the default tag you might ask? For routing purposes, which is currently impossible as the docker ID

@github-actions
Copy link
Copy Markdown
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions Bot added the Stale label Feb 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants