Skip to content

Comments

in_tail: Use temp table on overflowing the limit of SQL statement#11383

Merged
edsiper merged 1 commit intomasterfrom
cosmo0920-use-temp-table-on-overflowing-stmt-limit
Feb 20, 2026
Merged

in_tail: Use temp table on overflowing the limit of SQL statement#11383
edsiper merged 1 commit intomasterfrom
cosmo0920-use-temp-table-on-overflowing-stmt-limit

Conversation

@cosmo0920
Copy link
Contributor

@cosmo0920 cosmo0920 commented Jan 22, 2026

in_tail plugins will fail to start when exceeded the number of placeholders for deleting inodes on DB.
So, this PR introduces an offloading operation which creates temp table and runs deletions with it.

Closes #11272


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

Prepare massive amount of files.

mkdir -p /tmp/tailmass
for i in $(seq 1 55000); do
  echo "log $i" > /tmp/tailmass/file_$i.log
done
  • Example configuration file for the change
$ /path/to/fluent-bit \
  -i tail -p path=/tmp/tailmass/*.log -p db=/tmp/tail.db \
  -o null -v
  • Debug log output from testing the change

This change can detect this type of overflow and use temp database to drop the definitions beyond the number of limit of SQLite's placeholders.

<snip>
[2026/01/22 15:50:28.415179297] [ warn] [input:tail:tail.0] db: large file set detected (55000 files) exceeds SQLite variable limit (32766); using temp-table cleanup for stale inode deletion
[2026/01/22 15:50:28.507476246] [ info] [input:tail:tail.0] db: delete unmonitored stale inodes from the database: count=0
  • Attached Valgrind output that shows no leaks or memory corruption was found
==1535682== 
==1535682== HEAP SUMMARY:
==1535682==     in use at exit: 0 bytes in 0 blocks
==1535682==   total heap usage: 1,325,123 allocs, 1,325,123 frees, 2,591,568,841 bytes allocated
==1535682== 
==1535682== All heap blocks were freed -- no leaks are possible
==1535682== 
==1535682== For lists of detected and suppressed errors, rerun with: -s
==1535682== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Bug Fixes

    • Scalable stale-entry cleanup for very large numbers of monitored files to avoid DB variable limits.
    • Automatic fallback cleanup path when variable limits are exceeded, with a warning logged.
    • Correct handling when no files are monitored — deletes all stale entries and reports the result.
    • Fixed parameter-count handling to reduce risk of SQL over-allocation and cleanup failures.
  • Stability

    • Improved logging and error handling around stale-entry cleanup.

@coderabbitai
Copy link

coderabbitai bot commented Jan 22, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a temporary-table based stale-inode cleanup path in the tail input plugin to avoid SQLite variable limits, adjusts NOT IN parameter counting, fixes zero-file deletion behavior, and introduces related logging and error handling in the plugin's DB cleanup logic.

Changes

Cohort / File(s) Summary
Stale File Cleanup & DB logic
plugins/in_tail/tail_db.c
Add flb_tail_db_stale_file_delete_temp_table() to create a temp inode table, bulk-insert inodes inside a transaction, and delete stale rows via join; integrate this path into flb_tail_db_stale_file_delete() when SQLite variable limits are exceeded; fix NOT IN parameter counting, handle zero-file case, add inttypes.h for PRIu64, and update logging/error handling.

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • edsiper
  • koleini

Poem

🐇
I dug a tiny table in the dirt,
Counted inodes without getting hurt.
In one small tx I cleared the mess,
No more limits, no more stress.
Hooray — the startup hops, all fresh! 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and specifically describes the main technical change: using a temp table to handle SQLite variable limit overflow when deleting stale inodes in the in_tail plugin.
Linked Issues check ✅ Passed The PR successfully addresses issue #11272 by implementing a temporary table-based deletion mechanism to bypass SQLite variable limits when monitoring very large file sets (55,000+).
Out of Scope Changes check ✅ Passed All changes are directly scoped to solving the SQLite variable limit problem: new temp-table helper function, limit detection logic, parameter binding optimization, and supporting includes. No unrelated modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch cosmo0920-use-temp-table-on-overflowing-stmt-limit

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@cosmo0920 cosmo0920 force-pushed the cosmo0920-use-temp-table-on-overflowing-stmt-limit branch from 8476ca3 to 704ea89 Compare February 9, 2026 08:15
@cosmo0920 cosmo0920 marked this pull request as ready for review February 9, 2026 09:04
@cosmo0920 cosmo0920 requested a review from edsiper as a code owner February 9, 2026 09:04
@cosmo0920 cosmo0920 added this to the Fluent Bit v5.0 milestone Feb 9, 2026
@edsiper
Copy link
Member

edsiper commented Feb 17, 2026

@cosmo0920 we have a conflict on this one

@cosmo0920
Copy link
Contributor Author

Oh! I'll rebase off master.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
plugins/in_tail/tail_db.c (1)

377-386: sqlite3_changes may not reflect the DELETE after the COMMIT exec — consider reading it earlier.

sqlite3_changes() at line 384 is called after the COMMIT statement is executed via sqlite3_exec. While COMMIT is not a DML statement and should leave the change counter untouched per SQLite documentation, this ordering is fragile and differs from the legacy path (line 703) which reads sqlite3_changes immediately after sqlite3_step of the DELETE. Moving the read before the COMMIT is safer and removes any ambiguity.

Proposed fix
+    changes = sqlite3_changes(ctx->db->handler);
+
     ret = flb_sqldb_query(ctx->db, "COMMIT;", NULL, NULL);
     if (ret != FLB_OK) {
         flb_plg_error(ctx->ins, "db: cannot commit transaction for temp inode inserts");
         goto error;
     }
     txn_started = FLB_FALSE;
 
-    changes = sqlite3_changes(ctx->db->handler);
     flb_plg_info(ctx->ins, "db: delete unmonitored stale inodes from the database: count=%d",
                  changes);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/in_tail/tail_db.c` around lines 377 - 386, The sqlite3_changes() call
should be moved to read the change count immediately after the DELETE executes
and before committing the transaction to avoid relying on behavior after COMMIT;
update the block around flb_sqldb_query(ctx->db, "COMMIT;", ...) so that you
call sqlite3_changes(ctx->db->handler) and store it in changes right after the
DELETE path (before calling flb_sqldb_query for the COMMIT) and then log
flb_plg_info using that stored changes value, leaving txn_started handling
unchanged; refer to ctx->db, sqlite3_changes, flb_sqldb_query, changes and
txn_started to locate where to adjust the order.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/in_tail/tail_db.c`:
- Around line 285-400: The temp-table cleanup in
flb_tail_db_stale_file_delete_temp_table performs
CREATE/DELETE/BEGIN/INSERT/COMMIT/ROLLBACK calls without holding the global
tail_db_lock, causing races with other DB writers; fix by acquiring tail_db_lock
before any SQLite calls in flb_tail_db_stale_file_delete_temp_table (or ensure
the caller holds it) and release it on every return path (including
error/rollback), keeping the lock held across stmt_prepare/step/finalize and
flb_sqldb_query calls so concurrent writers that use tail_db_lock are properly
serialized.

---

Nitpick comments:
In `@plugins/in_tail/tail_db.c`:
- Around line 377-386: The sqlite3_changes() call should be moved to read the
change count immediately after the DELETE executes and before committing the
transaction to avoid relying on behavior after COMMIT; update the block around
flb_sqldb_query(ctx->db, "COMMIT;", ...) so that you call
sqlite3_changes(ctx->db->handler) and store it in changes right after the DELETE
path (before calling flb_sqldb_query for the COMMIT) and then log flb_plg_info
using that stored changes value, leaving txn_started handling unchanged; refer
to ctx->db, sqlite3_changes, flb_sqldb_query, changes and txn_started to locate
where to adjust the order.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
plugins/in_tail/tail_db.c (1)

594-601: Fragile lock ownership: caller acquires, callee releases.

The lock is acquired at line 579, but the return at line 600 transfers unlock responsibility to flb_tail_db_stale_file_delete_temp_table (lines 389/403). This split ownership is error-prone — the double-unlock bug at line 318 is a direct consequence.

Consider unlocking in the caller after the call returns, matching the pattern used by the legacy path (line 717). The callee would then not touch the lock at all.

Proposed refactor — keep lock ownership in the caller
     if (max_vars > 0 && file_count > (uint64_t) max_vars) {
         flb_plg_warn(ctx->ins,
                      "db: large file set detected (%" PRIu64 " files) exceeds SQLite variable limit (%d); "
                      "using temp-table cleanup for stale inode deletion",
                      file_count, max_vars);
-        return flb_tail_db_stale_file_delete_temp_table(ctx, file_count);
+        ret = flb_tail_db_stale_file_delete_temp_table(ctx, file_count);
+        tail_db_unlock(ctx);
+        return ret;
     }

Then remove all tail_db_unlock calls from inside flb_tail_db_stale_file_delete_temp_table (lines 318, 389, 403).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/in_tail/tail_db.c` around lines 594 - 601, The caller acquires the DB
lock but returns early into flb_tail_db_stale_file_delete_temp_table which
currently releases the lock, causing split ownership and the double-unlock bug;
change the flow so the caller retains lock ownership: remove all tail_db_unlock
calls from flb_tail_db_stale_file_delete_temp_table (so the callee never touches
the lock) and have the caller call flb_tail_db_stale_file_delete_temp_table
while still holding the lock, then perform tail_db_unlock in the caller after
the callee returns (matching the legacy-path pattern used elsewhere).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/in_tail/tail_db.c`:
- Around line 296-308: The zero-file branch currently performs the DELETE
(flb_sqldb_query) and logs the deleted rows (sqlite3_changes + flb_plg_info) but
then jumps to the error cleanup via "goto error" which returns -1; change this
path so after a successful DELETE you perform the proper DB unlock/cleanup (use
the same DB unlock/cleanup routine used elsewhere for ctx->db) and return a
success code (FLB_OK) instead of jumping to the error label; specifically modify
the block that checks file_count == 0 to remove the final "goto error" and
replace it with the correct unlock/cleanup call for ctx->db followed by "return
FLB_OK".
- Around line 316-320: In flb_tail_db_stale_file_delete remove the redundant
tail_db_unlock call before the goto error: the lock is acquired by the caller
and every other error path uses the single tail_db_unlock at the error label, so
delete the stray tail_db_unlock in the block that logs "db: cannot create temp
table for inode cleanup" (the branch that checks ret != FLB_OK) and let the goto
error reach the existing unlock at the error label to avoid double-unlocking the
mutex.

---

Nitpick comments:
In `@plugins/in_tail/tail_db.c`:
- Around line 594-601: The caller acquires the DB lock but returns early into
flb_tail_db_stale_file_delete_temp_table which currently releases the lock,
causing split ownership and the double-unlock bug; change the flow so the caller
retains lock ownership: remove all tail_db_unlock calls from
flb_tail_db_stale_file_delete_temp_table (so the callee never touches the lock)
and have the caller call flb_tail_db_stale_file_delete_temp_table while still
holding the lock, then perform tail_db_unlock in the caller after the callee
returns (matching the legacy-path pattern used elsewhere).

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
plugins/in_tail/tail_db.c (1)

285-407: Optional: Document or encapsulate the lock-ownership-transfer contract.

flb_tail_db_stale_file_delete_temp_table unconditionally releases the caller's lock on every exit path (lines 307, 390, 404) without acquiring it itself. This ownership-transfer contract is implicit and could lead to a double-unlock if the function is ever called directly in the future without the lock held. Consider at minimum a short comment at the function entry to document the contract, or invert the pattern so this function manages its own lock:

♻️ Option: self-contained locking
 static int flb_tail_db_stale_file_delete_temp_table(struct flb_tail_config *ctx,
                                                     uint64_t file_count)
 {
     int ret;
     int changes;
-    int txn_started = FLB_FALSE;
+    int txn_started  = FLB_FALSE;
+    int lock_held    = FLB_FALSE;
     sqlite3_stmt *stmt_insert_inode = NULL;
     ...

+    ret = tail_db_lock(ctx);
+    if (ret != 0) {
+        flb_plg_error(ctx->ins, "db: could not acquire lock");
+        return -1;
+    }
+    lock_held = FLB_TRUE;

And remove the three tail_db_unlock(ctx) calls inside the function body, replacing the error label's unlock with a conditional on lock_held. The caller in flb_tail_db_stale_file_delete would then call tail_db_unlock(ctx) normally after the dispatch returns, matching every other early-return path in that function.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/in_tail/tail_db.c` around lines 285 - 407,
flb_tail_db_stale_file_delete_temp_table unconditionally calls
tail_db_unlock(ctx) on all exit paths but does not acquire the lock itself;
either document this ownership-transfer contract or make the function
self-contained: add a short comment at the top of
flb_tail_db_stale_file_delete_temp_table stating "assumes caller holds DB lock
and this function will release it", or modify the function to call
tail_db_lock(ctx) at entry and track a lock_held boolean, remove the three
unconditional tail_db_unlock(ctx) calls and instead call tail_db_unlock(ctx)
once on EXIT only if lock_held is true (and update the caller
flb_tail_db_stale_file_delete to stop unlocking after dispatch). Use the symbols
flb_tail_db_stale_file_delete_temp_table, tail_db_lock, tail_db_unlock, and
flb_tail_db_stale_file_delete to locate and apply the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@plugins/in_tail/tail_db.c`:
- Around line 285-407: The earlier lock-was-bypassed concern is now resolved:
flb_tail_db_stale_file_delete_temp_table holds tail_db_lock for its entire body
(tail_db_lock/tail_db_unlock surround all exits) so remove the
duplicate/outdated review comment referencing the prior lock issue; update the
PR comment thread to state the lock concern is addressed and delete or mark the
duplicate comment as resolved, referencing
flb_tail_db_stale_file_delete_temp_table and tail_db_lock to point reviewers to
the validated locking behavior.

---

Nitpick comments:
In `@plugins/in_tail/tail_db.c`:
- Around line 285-407: flb_tail_db_stale_file_delete_temp_table unconditionally
calls tail_db_unlock(ctx) on all exit paths but does not acquire the lock
itself; either document this ownership-transfer contract or make the function
self-contained: add a short comment at the top of
flb_tail_db_stale_file_delete_temp_table stating "assumes caller holds DB lock
and this function will release it", or modify the function to call
tail_db_lock(ctx) at entry and track a lock_held boolean, remove the three
unconditional tail_db_unlock(ctx) calls and instead call tail_db_unlock(ctx)
once on EXIT only if lock_held is true (and update the caller
flb_tail_db_stale_file_delete to stop unlocking after dispatch). Use the symbols
flb_tail_db_stale_file_delete_temp_table, tail_db_lock, tail_db_unlock, and
flb_tail_db_stale_file_delete to locate and apply the change.

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
plugins/in_tail/tail_db.c (2)

633-756: db_locked guards in the legacy path are always true — optional cleanup.

After the early return at line 610 exits the function, the legacy path (line 613 onward) is only reachable with db_locked == FLB_TRUE. Every if (db_locked == FLB_TRUE) check from line 633 to line 752 is unconditionally true. The guards are safe but add noise; calling tail_db_unlock(ctx) directly on each error/success path would make the intent clearer. This is a cosmetic issue and can be deferred.

♻️ Example simplification (one error path shown; apply the same pattern to all guarded blocks)
-    if (db_locked == FLB_TRUE) {
-        tail_db_unlock(ctx);
-    }
-
-    return -1;
+    tail_db_unlock(ctx);
+    return -1;

And at the success path (lines 752–758):

-    if (db_locked == FLB_TRUE) {
-        tail_db_unlock(ctx);
-    }
-    db_locked = FLB_FALSE;
-
-    return 0;
+    tail_db_unlock(ctx);
+    return 0;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/in_tail/tail_db.c` around lines 633 - 756, The db_locked boolean is
always true for the legacy path, so remove the redundant guards and call
tail_db_unlock(ctx) directly where cleanup is needed; update all error and
success branches that currently do "if (db_locked == FLB_TRUE) {
tail_db_unlock(ctx); }" to simply "tail_db_unlock(ctx);" (affecting the blocks
around sqlite3_prepare_v2/stmt_delete_inodes binding/step/finalize and the final
success path), and leave db_locked logic unchanged elsewhere (keep the db_locked
= FLB_FALSE assignment at the end).

297-313: Zero-file branch is unreachable from the current call site.

The only caller (flb_tail_db_stale_file_delete, line 610) routes here only when file_count > max_vars > 0, so file_count == 0 on entry is impossible. The branch is currently dead code, though it doesn't cause harm. Consider adding an assert(file_count > 0) or a code comment to document the invariant and prevent confusion for future callers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/in_tail/tail_db.c` around lines 297 - 313, The zero-file branch in
the block inside flb_tail_db_stale_file_delete is unreachable because callers
only invoke this path when file_count > max_vars > 0; add a clear invariant to
the function by inserting either an assert(file_count > 0) (or flb_assert-like
macro used in this codebase) at the top of the function or a concise comment
documenting that file_count is guaranteed > 0 by callers, and remove or mark the
DELETE branch as unreachable to avoid confusion; reference the symbols
file_count, flb_tail_db_stale_file_delete, tail_db_unlock, ctx and db_locked so
the change is easy to locate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@plugins/in_tail/tail_db.c`:
- Around line 633-756: The db_locked boolean is always true for the legacy path,
so remove the redundant guards and call tail_db_unlock(ctx) directly where
cleanup is needed; update all error and success branches that currently do "if
(db_locked == FLB_TRUE) { tail_db_unlock(ctx); }" to simply
"tail_db_unlock(ctx);" (affecting the blocks around
sqlite3_prepare_v2/stmt_delete_inodes binding/step/finalize and the final
success path), and leave db_locked logic unchanged elsewhere (keep the db_locked
= FLB_FALSE assignment at the end).
- Around line 297-313: The zero-file branch in the block inside
flb_tail_db_stale_file_delete is unreachable because callers only invoke this
path when file_count > max_vars > 0; add a clear invariant to the function by
inserting either an assert(file_count > 0) (or flb_assert-like macro used in
this codebase) at the top of the function or a concise comment documenting that
file_count is guaranteed > 0 by callers, and remove or mark the DELETE branch as
unreachable to avoid confusion; reference the symbols file_count,
flb_tail_db_stale_file_delete, tail_db_unlock, ctx and db_locked so the change
is easy to locate.

@edsiper edsiper merged commit 931886e into master Feb 20, 2026
58 of 60 checks passed
@edsiper edsiper deleted the cosmo0920-use-temp-table-on-overflowing-stmt-limit branch February 20, 2026 18:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

in_tail: Fluent Bit fails to start when monitoring a large number of files

2 participants