input_chunk: configuration option for maximum chunk size #11373

Open
castorsky wants to merge 3 commits into fluent:master from castorsky:input_chunk_max_size_configurable

Conversation

@castorsky castorsky commented Jan 20, 2026

Introduces a new storage.max_chunk_size key for the service block of the configuration. The key sets the maximum size of a buffer chunk for input plugins that use the filesystem buffer.

The default value of 2048000 is preserved for compatibility with existing configurations.

A getter for the storage.max_chunk_size value is exposed for use by other plugins. The in_winevtlog plugin was patched to use this getter when calculating its read threshold size.

This PR addresses #10327.
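
For illustration only, the getter and its fallback could look roughly like the sketch below. This is not the code from this PR: `parse_size`, `get_max_chunk_size`, and `example_usage` are hypothetical names, the suffix handling is deliberately minimal, and the decimal multipliers (1 KB = 1000 bytes) are an assumption inferred from the "using maximum chunk size: 16000" line in the variant 2 log.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Default stated in the PR description. */
#define DEFAULT_MAX_CHUNK_SIZE ((size_t) 2048000)

/*
 * Hypothetical parser: turns "4096", "16KB" or "2M" into a byte count.
 * Assumes decimal multipliers. Returns 0 when the text cannot be parsed.
 * Overflow is not handled in this sketch.
 */
static size_t parse_size(const char *text)
{
    char *end = NULL;
    unsigned long long value;
    size_t mult;

    if (text == NULL || !isdigit((unsigned char) text[0])) {
        return 0;
    }

    value = strtoull(text, &end, 10);
    assert(end != NULL);

    switch (toupper((unsigned char) end[0])) {
    case '\0': return (size_t) value;          /* plain number of bytes */
    case 'K':  mult = 1000; break;
    case 'M':  mult = 1000 * 1000; break;
    case 'G':  mult = 1000 * 1000 * 1000; break;
    default:   return 0;                       /* unknown suffix */
    }

    /* Accept an optional trailing 'B' (as in "KB"), then require the end. */
    end++;
    if (toupper((unsigned char) end[0]) == 'B') {
        end++;
    }
    if (end[0] != '\0') {
        return 0;
    }
    return (size_t) value * mult;
}

/* Hypothetical getter: configured value if it parses, otherwise the default. */
static size_t get_max_chunk_size(const char *configured)
{
    size_t parsed = parse_size(configured);

    return parsed > 0 ? parsed : DEFAULT_MAX_CHUNK_SIZE;
}

/* Usage: an explicit value is honored, a missing one falls back. */
void example_usage(void)
{
    assert(get_max_chunk_size("16KB") == 16000);
    assert(get_max_chunk_size(NULL) == DEFAULT_MAX_CHUNK_SIZE);
}
```

Treating an unset key and an unparsable value the same way keeps older configurations working without changes, which is the compatibility goal stated above.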


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change

Variant 1 (default value):

bin/fluent-bit -i tail -p path=../test.log -o stdout -v

Variant 2 (max_chunk_size=16 KB):

service:
  flush: 1
  log_level: debug
  storage.max_chunk_size: 16KB

pipeline:
  inputs:
    - name: tail
      path: ../test.log

  outputs:
    - name: stdout
      match: '*'
  • Debug log output from testing the change
Log output for variant 1
Fluent Bit v5.0.0
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF graduated project under the Fluent organization
* https://fluentbit.io

______ _                  _    ______ _ _           _____  _____           _            
|  ___| |                | |   | ___ (_) |         |  ___||  _  |         | |           
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   _|___ \ | |/' |______ __| | _____   __
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \|  /| |______/ _` |/ _ \ \ / /
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V //\__/ /\ |_/ /     | (_| |  __/\ V / 
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)\___/       \__,_|\___| \_/


[2026/01/20 00:41:51.653100951] [ info] Configuration:
[2026/01/20 00:41:51.653201708] [ info]  flush time     | 1.000000 seconds
[2026/01/20 00:41:51.653211382] [ info]  grace          | 5 seconds
[2026/01/20 00:41:51.653222881] [ info]  daemon         | 0
[2026/01/20 00:41:51.653229900] [ info] ___________
[2026/01/20 00:41:51.653238740] [ info]  inputs:
[2026/01/20 00:41:51.653245854] [ info]      tail
[2026/01/20 00:41:51.653252957] [ info] ___________
[2026/01/20 00:41:51.653260515] [ info]  filters:
[2026/01/20 00:41:51.653272466] [ info] ___________
[2026/01/20 00:41:51.653281744] [ info]  outputs:
[2026/01/20 00:41:51.653290761] [ info]      stdout.0
[2026/01/20 00:41:51.653299858] [ info] ___________
[2026/01/20 00:41:51.653308603] [ info]  collectors:
[2026/01/20 00:41:51.654220525] [ info] [fluent bit] version=5.0.0, commit=70b94ff0ad, pid=486702
[2026/01/20 00:41:51.654295200] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2026/01/20 00:41:51.654396819] [ info] [storage] ver=1.5.4, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2026/01/20 00:41:51.654442422] [ info] [simd    ] disabled
[2026/01/20 00:41:51.654462908] [ info] [cmetrics] version=1.0.6
[2026/01/20 00:41:51.654494218] [ info] [ctraces ] version=0.6.6
[2026/01/20 00:41:51.654763953] [ info] [input:tail:tail.0] initializing
[2026/01/20 00:41:51.654805973] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2026/01/20 00:41:51.654865100] [debug] [tail:tail.0] created event channels: read=28 write=29
[2026/01/20 00:41:51.655226412] [debug] [input:tail:tail.0] flb_tail_fs_inotify_init() initializing inotify tail input
[2026/01/20 00:41:51.655282897] [debug] [input:tail:tail.0] inotify watch fd=34
[2026/01/20 00:41:51.655322899] [debug] [input:tail:tail.0] scanning path ../test.log
[2026/01/20 00:41:51.655420050] [debug] [input:tail:tail.0] file will be read in POSIX_FADV_DONTNEED mode ../test.log
[2026/01/20 00:41:51.655609024] [debug] [input:tail:tail.0] inode=796614 with offset=909 appended as ../test.log
[2026/01/20 00:41:51.655648763] [debug] [input:tail:tail.0] scan_glob add(): ../test.log, inode 796614
[2026/01/20 00:41:51.655671165] [debug] [input:tail:tail.0] 1 new files found on path '../test.log'
[2026/01/20 00:41:51.655719720] [debug] [stdout:stdout.0] created event channels: read=36 write=37
[2026/01/20 00:41:51.656347185] [ info] [sp] stream processor started
[2026/01/20 00:41:51.656444674] [ info] [output:stdout:stdout.0] worker #0 started
[2026/01/20 00:41:51.656673628] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2
[2026/01/20 00:41:51.657166761] [debug] [input:tail:tail.0] inode=796614 file=../test.log promote to TAIL_EVENT
[2026/01/20 00:41:51.657333440] [ info] [input:tail:tail.0] inotify_fs_add(): inode=796614 watch_fd=1 name=../test.log
[2026/01/20 00:41:51.657387057] [debug] [input:tail:tail.0] [static files] processed 0b, done
[2026/01/20 00:42:01.458347644] [debug] [input:tail:tail.0] inode=796614, ../test.log, events: IN_MODIFY 
[2026/01/20 00:42:01.458682230] [debug] [input chunk] could not parse maximum chunk size, using the default value: 2048000
[2026/01/20 00:42:02.454180506] [debug] [task] created task=0x7f0dc40382e0 id=0 OK
[2026/01/20 00:42:02.454233071] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[0] tail.0: [[1768869721.458544686, {}], {"log"=>"TEST"}]
[2026/01/20 00:42:02.454539502] [debug] [out flush] cb_destroy coro_id=0
[2026/01/20 00:42:02.454644649] [debug] [task] destroy task=0x7f0dc40382e0 (task_id=0)
Log output for variant 2
Fluent Bit v5.0.0
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF graduated project under the Fluent organization
* https://fluentbit.io

______ _                  _    ______ _ _           _____  _____           _            
|  ___| |                | |   | ___ (_) |         |  ___||  _  |         | |           
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   _|___ \ | |/' |______ __| | _____   __
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \|  /| |______/ _` |/ _ \ \ / /
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V //\__/ /\ |_/ /     | (_| |  __/\ V / 
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)\___/       \__,_|\___| \_/


[2026/01/20 00:51:19.217684032] [ info] Configuration:
[2026/01/20 00:51:19.217785010] [ info]  flush time     | 1.000000 seconds
[2026/01/20 00:51:19.217798788] [ info]  grace          | 5 seconds
[2026/01/20 00:51:19.217815154] [ info]  daemon         | 0
[2026/01/20 00:51:19.217824769] [ info] ___________
[2026/01/20 00:51:19.217833763] [ info]  inputs:
[2026/01/20 00:51:19.217844619] [ info]      tail
[2026/01/20 00:51:19.217857143] [ info] ___________
[2026/01/20 00:51:19.217872612] [ info]  filters:
[2026/01/20 00:51:19.217883250] [ info] ___________
[2026/01/20 00:51:19.217899388] [ info]  outputs:
[2026/01/20 00:51:19.217912336] [ info]      stdout.0
[2026/01/20 00:51:19.217924923] [ info] ___________
[2026/01/20 00:51:19.217935031] [ info]  collectors:
[2026/01/20 00:51:19.218959450] [ info] [fluent bit] version=5.0.0, commit=70b94ff0ad, pid=488771
[2026/01/20 00:51:19.219030387] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2026/01/20 00:51:19.219153132] [ info] [storage] ver=1.5.4, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2026/01/20 00:51:19.219205863] [ info] [simd    ] disabled
[2026/01/20 00:51:19.219230922] [ info] [cmetrics] version=1.0.6
[2026/01/20 00:51:19.219268591] [ info] [ctraces ] version=0.6.6
[2026/01/20 00:51:19.219557226] [ info] [input:tail:tail.0] initializing
[2026/01/20 00:51:19.219599135] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2026/01/20 00:51:19.219649142] [debug] [tail:tail.0] created event channels: read=28 write=29
[2026/01/20 00:51:19.220005076] [debug] [input:tail:tail.0] flb_tail_fs_inotify_init() initializing inotify tail input
[2026/01/20 00:51:19.220068447] [debug] [input:tail:tail.0] inotify watch fd=34
[2026/01/20 00:51:19.220105361] [debug] [input:tail:tail.0] scanning path ../test.log
[2026/01/20 00:51:19.220182355] [debug] [input:tail:tail.0] file will be read in POSIX_FADV_DONTNEED mode ../test.log
[2026/01/20 00:51:19.220361757] [debug] [input:tail:tail.0] inode=796614 with offset=914 appended as ../test.log
[2026/01/20 00:51:19.220401043] [debug] [input:tail:tail.0] scan_glob add(): ../test.log, inode 796614
[2026/01/20 00:51:19.220430057] [debug] [input:tail:tail.0] 1 new files found on path '../test.log'
[2026/01/20 00:51:19.220490328] [debug] [stdout:stdout.0] created event channels: read=36 write=37
[2026/01/20 00:51:19.221142793] [ info] [sp] stream processor started
[2026/01/20 00:51:19.221260610] [ info] [output:stdout:stdout.0] worker #0 started
[2026/01/20 00:51:19.221378059] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2
[2026/01/20 00:51:19.221738511] [debug] [input:tail:tail.0] inode=796614 file=../test.log promote to TAIL_EVENT
[2026/01/20 00:51:19.221877524] [ info] [input:tail:tail.0] inotify_fs_add(): inode=796614 watch_fd=1 name=../test.log
[2026/01/20 00:51:19.221921952] [debug] [input:tail:tail.0] [static files] processed 0b, done
[2026/01/20 00:51:23.735527246] [debug] [input:tail:tail.0] inode=796614, ../test.log, events: IN_MODIFY 
[2026/01/20 00:51:23.735760022] [debug] [input chunk] using maximum chunk size: 16000
[2026/01/20 00:51:24.454219507] [debug] [task] created task=0x7f98c40382c0 id=0 OK
[2026/01/20 00:51:24.454287605] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[0] tail.0: [[1768870283.735683874, {}], {"log"=>"TEST"}]
[2026/01/20 00:51:24.454762340] [debug] [out flush] cb_destroy coro_id=0
[2026/01/20 00:51:24.454894806] [debug] [task] destroy task=0x7f98c40382c0 (task_id=0)
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

PR for documentation

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Added a configurable storage maximum chunk size so users can set custom chunk limits.
  • Chores

    • Updated core and plugin chunk-sizing logic to honor the new configuration and replace fixed-size behavior.
    • Improved threshold handling for chunk limits so displayed and enforced sizes reflect configuration and safe bounds.

coderabbitai bot commented Jan 20, 2026

📝 Walkthrough

A new configurable storage.max_chunk_size option was added to flb_config and wired into input chunk handling and the in_winevtlog plugin. Hard-coded chunk-size constants were replaced with a computed max size obtained via flb_input_chunk_get_max_size(), and config cleanup was added.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Config header: include/fluent-bit/flb_config.h | Added storage_max_chunk_size field to struct flb_config and defined FLB_CONF_STORAGE_MAX_CHUNK_SIZE macro. |
| Input chunk API: include/fluent-bit/flb_input_chunk.h | Changed FLB_INPUT_CHUNK_FS_MAX_SIZE to (size_t) cast; added declaration size_t flb_input_chunk_get_max_size(struct flb_config *config);. |
| Config implementation: src/flb_config.c | Registered storage.max_chunk_size config key and freed storage_max_chunk_size in flb_config_exit. |
| Input chunk logic: src/flb_input_chunk.c | Added flb_input_chunk_get_max_size() which parses storage_max_chunk_size (fallback to default), and replaced hard-coded max-size checks and 1% threshold calculations with config-derived values. |
| Plugin threshold handling: plugins/in_winevtlog/in_winevtlog.c | Removed fixed MAXIMUM_THRESHOLD_SIZE, added MAXIMUM_THRESHOLD_PERCENT and maximum_threshold_size computed from flb_input_chunk_get_max_size(config), and updated threshold boundary logic and messages. |

Sequence Diagram(s)

sequenceDiagram
    participant Config
    participant InputChunk
    participant WinEvtPlugin

    Config->>InputChunk: provide storage.max_chunk_size
    InputChunk->>InputChunk: flb_input_chunk_get_max_size(config) -> parse value or fallback
    WinEvtPlugin->>InputChunk: request max size (flb_input_chunk_get_max_size)
    InputChunk-->>WinEvtPlugin: return computed maximum_threshold_size
    WinEvtPlugin->>WinEvtPlugin: compute thresholds and apply caps/warnings
    InputChunk->>InputChunk: apply max size and 1% threshold during append

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • edsiper

Poem

🐰 I hopped through config, nibbling bytes so fine,
I found a size that now can grow or shrink on time,
No more rigid bounds to make me pout,
Chunks flex and dance as bytes whirl about,
Hooray — the rabbit's patch feels just sublime!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding a configuration option for maximum chunk size in input_chunk functionality. |

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 70b94ff0ad

Added a configuration parameter to the 'service' block to specify the
maximum chunk size for the 'input_chunk' module.

Signed-off-by: Castor Sky <csky57@gmail.com>

Added a new function 'flb_input_chunk_get_max_size' that retrieves the value
of the 'storage.max_chunk_size' parameter from the fluent-bit configuration
or falls back to the default FLB_INPUT_CHUNK_FS_MAX_SIZE (when the user has not
set the parameter or it cannot be parsed). The function is exposed to other
modules and can be used anywhere to get the 'storage.max_chunk_size' parameter.

Light optimization: validation of the available space in the buffer now uses
integer division instead of floating-point multiplication (should be faster).

Signed-off-by: Castor Sky <csky57@gmail.com>
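
As a rough sketch of the integer-division idea in the commit message above, the 1% threshold mentioned in the walkthrough could be evaluated as follows. The helper names (`one_percent_of`, `chunk_is_almost_full`, `example_usage`) and the exact meaning of "almost full" are assumptions for illustration; the actual check in src/flb_input_chunk.c may be shaped differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: 1% of the maximum, using integer division only. */
static size_t one_percent_of(size_t max_size)
{
    return max_size / 100;
}

/*
 * Hypothetical check: a chunk counts as "almost full" when the space left
 * before max_size is smaller than 1% of max_size.
 */
static int chunk_is_almost_full(size_t used, size_t max_size)
{
    size_t remaining;

    if (used >= max_size) {
        return 1;
    }
    remaining = max_size - used;
    return remaining < one_percent_of(max_size);
}

/* Usage with the two maximum sizes that appear in the test logs. */
void example_usage(void)
{
    assert(one_percent_of(2048000) == 20480);
    assert(one_percent_of(16000) == 160);
    assert(chunk_is_almost_full(2030000, 2048000) == 1);
}
```

Keeping every operand as size_t avoids a round trip through double, at the cost of truncating the threshold by less than one byte per hundred.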
…threshold

Used the configurable 'storage.max_chunk_size' parameter to calculate the read
size threshold instead of the fixed FLB_INPUT_CHUNK_FS_MAX_SIZE. Unnecessary type
conversions were removed (all related variables are `size_t`).

Introduced MAXIMUM_THRESHOLD_PERCENT to replace MAXIMUM_THRESHOLD_SIZE, so the
threshold is calculated as a percentage of the user-configured parameter.

Signed-off-by: Castor Sky <csky57@gmail.com>
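
A minimal sketch of a percentage-based threshold, assuming a made-up value of 90 for MAXIMUM_THRESHOLD_PERCENT (the real value is not stated in this conversation) and a hypothetical helper name `maximum_threshold_size_for`:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration only; not taken from the PR. */
#define MAXIMUM_THRESHOLD_PERCENT ((size_t) 90)

/*
 * Hypothetical helper: upper bound for the read threshold, expressed as a
 * percentage of the configured maximum chunk size. Dividing before
 * multiplying avoids overflow for very large sizes and drops at most
 * 99 bytes of precision.
 */
static size_t maximum_threshold_size_for(size_t max_chunk_size)
{
    return (max_chunk_size / 100) * MAXIMUM_THRESHOLD_PERCENT;
}

/* Usage with the default and the 16KB setting from the test runs. */
void example_usage(void)
{
    assert(maximum_threshold_size_for(2048000) == 1843200);
    assert(maximum_threshold_size_for(16000) == 14400);
}
```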
@castorsky castorsky force-pushed the input_chunk_max_size_configurable branch from 5c70876 to 827dc18 Compare March 16, 2026 12:00
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/in_winevtlog/in_winevtlog.c`:
- Around line 278-305: The clamp logic can produce contradictory behavior when
maximum_threshold_size is lower than MINIMUM_THRESHOLD_SIZE; change the flow to
first compute and enforce an effective_max = (maximum_threshold_size <
MINIMUM_THRESHOLD_SIZE) ? MINIMUM_THRESHOLD_SIZE : maximum_threshold_size
(logging a single warning via flb_plg_warn if you had to raise the max), then
perform a single clamp of ctx->total_size_threshold into the inclusive range
[MINIMUM_THRESHOLD_SIZE, effective_max], using
flb_utils_bytes_to_human_readable_size and flb_plg_warn/flb_plg_debug for
consistent messages and updating ctx->total_size_threshold accordingly;
reference variables/functions: maximum_threshold_size, MINIMUM_THRESHOLD_SIZE,
ctx->total_size_threshold, flb_utils_bytes_to_human_readable_size, flb_plg_warn,
flb_plg_debug.
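
The flow suggested in this comment can be sketched as below. It is an illustration, not the plugin's code: `clamp_threshold` is a hypothetical function, the value given to MINIMUM_THRESHOLD_SIZE here is made up, and the flb_plg_warn/flb_plg_debug logging is omitted so the block stands alone.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration; the real constant lives in in_winevtlog. */
#define MINIMUM_THRESHOLD_SIZE ((size_t) 1024)

/*
 * Hypothetical helper: clamp the requested threshold into the inclusive
 * range [MINIMUM_THRESHOLD_SIZE, effective_max].
 */
static size_t clamp_threshold(size_t requested, size_t maximum_threshold_size)
{
    size_t effective_max = maximum_threshold_size;

    /* Step 1: the upper bound may never fall below the lower bound. */
    if (effective_max < MINIMUM_THRESHOLD_SIZE) {
        effective_max = MINIMUM_THRESHOLD_SIZE;
    }

    /* Step 2: one clamp against both bounds. */
    if (requested < MINIMUM_THRESHOLD_SIZE) {
        return MINIMUM_THRESHOLD_SIZE;
    }
    if (requested > effective_max) {
        return effective_max;
    }
    return requested;
}

/* Usage: a tiny configured maximum no longer yields contradictory bounds. */
void example_usage(void)
{
    assert(clamp_threshold(9999, 100) == MINIMUM_THRESHOLD_SIZE);
    assert(clamp_threshold(2000, 5000) == 2000);
}
```

Raising the maximum first means the second step can never be asked to clamp into an empty range, which is the contradiction the comment describes.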

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b1e15ea6-e2c5-4762-9181-0718278d958f

📥 Commits

Reviewing files that changed from the base of the PR and between 70b94ff and 827dc18.

📒 Files selected for processing (3)
  • include/fluent-bit/flb_config.h
  • include/fluent-bit/flb_input_chunk.h
  • plugins/in_winevtlog/in_winevtlog.c
🚧 Files skipped from review as they are similar to previous changes (2)
  • include/fluent-bit/flb_config.h
  • include/fluent-bit/flb_input_chunk.h
