fix(task): centralize token tracking and preserve usage during guardrail retries by Jacopos311 · Pull Request #6251 · crewAIInc/crewAI

Jacopos311 · 2026-06-19T22:52:52Z

This Pull Request addresses critical bugs and structural flaws within the token tracking lifecycle of Task executions, specifically optimizing metrics continuity across core execution boundaries and guardrail loops.

The implementation was highly challenging, refactoring the state management across both synchronous (_execute_core) and asynchronous (_aexecute_core) runtime flows. Ensuring state persistence during multi-layered asynchronous retry loops proved to be an incredibly complex task.

Key Fixes Implemented
Bug 1: Guardrail Token Loss Recovery

Issue: Token metrics were dropped or wiped clean whenever a guardrail failed validation and triggered a sub-task retry.

Fix: Introduced an internal token snapshot (current_token_usage) at the top of the retry loop within _ainvoke_guardrail_function. If a guardrail yields a new TaskOutput, the preceding metrics are explicitly preserved and restored.

Bug 2: Duplicate Core Tracking Redundancy

Issue: Decentralized and fragmented token calculations were being redundantly invoked inside background async workers and parallel threads, causing unstable delta spikes.

Fix: Centralized token delta computations exclusively inside the core pipeline. Wrapper functions now stream results without interfering with state counters.

Bug 3: Premature Usage Snapshots

Issue: The final snapshot of the LLM token usage was captured prior to exporting outputs into structured Pydantic models or JSON dictionaries. Any extra token overhead consumed by formatting models was entirely lost.

Fix: Shifted the tokens_after capture and _calculate_token_delta invocation structurally below the _export_output / _aexport_output blocks, ensuring 100% telemetry coverage.

Testing
Verified alignment across synchronous and asynchronous interfaces.

Ensured zero NameError bounds or uninitialized variable leaks during back-to-back guardrail re-evaluations.

Summary by CodeRabbit

New Features
- Added a Mimir Memory Backend for persistent memory, including configurable database path and example setup.
- Improved memory backend resolution to support selecting storage with optional configuration (including built-in Mimir support).
Bug Fixes
- Improved task LLM token accounting, ensuring token usage is consistently recorded in task outputs, including across guardrail retries.
Documentation
- Added a Mimir Memory Backend section covering prerequisites and a Python configuration example.

…protocol

…ping

… protocol and addressing CodeRabbit feedback

…eouts

…ail retries

coderabbitai · 2026-06-19T22:53:17Z

Important

Review skipped

No new commits to review since the last review.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 4c118d60-0d8a-478b-a9f9-f30affb53634

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

📝 Walkthrough

Walkthrough

This PR adds MimirStorage, a new persistent memory backend that delegates to the mimir CLI via subprocess for store and search operations. The storage factory gains a config parameter and a built-in mimir spec fallback. Separately, Task execution now computes and propagates token-usage deltas through both core execution and guardrail retry paths.

Changes

Mimir Memory Backend

Layer / File(s)	Summary
MimirStorage class `lib/crewai/src/crewai/memory/storage/mimir_storage.py`	New `StorageBackend` subclass that resolves `db_path`, finds the `mimir` binary at init, validates `category` and query inputs to prevent flag injection, stores `MemoryRecord`s via `mimir store` subprocess with MD5-derived keys, and searches via `mimir search` returning `(MemoryRecord, score)` tuples.
Factory wiring and README docs `lib/crewai/src/crewai/memory/storage/factory.py`, `lib/crewai/README.md`	`resolve_memory_storage` gains an optional `config` parameter and a built-in `mimir` spec fallback that constructs `MimirStorage(config=config)`. README documents the Mimir binary prerequisite, the `mcp` Python package dependency, and a `MimirStorage` configuration example.

Task Token Usage Tracking

Layer / File(s)	Summary
Token helpers and core execution capture `lib/crewai/src/crewai/task.py`	Adds `_get_agent_token_usage` and `_calculate_token_delta` helpers. Both sync (`_execute_core`) and async (`_aexecute_core`) paths snapshot `agent.llm.token_usage` before and after execution, compute a delta, and pass it into `TaskOutput.token_usage`.
Token preservation through guardrail retries `lib/crewai/src/crewai/task.py`	In both sync and async guardrail invocation, `current_token_usage` is captured at each attempt. It is backfilled onto a returned `TaskOutput` that lacks `token_usage`, and explicitly set when constructing a retry `TaskOutput`.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main changes: centralizing token tracking and preserving token usage during guardrail retries, which are core to all modifications across task.py and supporting memory infrastructure.
Docstring Coverage	✅ Passed	Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

corridor-security

Summary: This PR adds a Mimir-backed memory storage integration and adjusts task token tracking behavior around execution and guardrail retries; no exploitable security vulnerabilities were identified.

Risk: Low risk. The new external subprocess integration is not directly public-facing, uses argument-list subprocess execution without shell invocation, and validates CLI-sensitive inputs such as categories and hyphen-prefixed queries.

Recommendations:

Ensure deployments only enable the Mimir backend with a trusted installed mimir binary and trusted db_path configuration.

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

lib/crewai/src/crewai/task.py (1)

691-704: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Finalize token usage after guardrail retries finish.

token_delta is attached before guardrails run, then retry TaskOutputs copy the prior current_token_usage after executing the agent again. Tokens consumed by LLM-backed guardrails, retry execute_task/aexecute_task, and retry _export_output/_aexport_output are still omitted.

Suggested direction

-            tokens_after = self._get_agent_token_usage(agent)
-            token_delta = self._calculate_token_delta(tokens_before, tokens_after)
-
             task_output = TaskOutput(
                 name=self.name or self.description,
                 description=self.description,
                 expected_output=self.expected_output,
                 raw=raw,
@@
                 output_format=self._get_output_format(),
                 messages=agent.last_messages,  # type: ignore[attr-defined]
-                token_usage=token_delta,
+                token_usage=None,
             )

             if self._guardrails:
                 for idx, guardrail in enumerate(self._guardrails):
@@
             if self._guardrail:
                 task_output = self._invoke_guardrail_function(
                     task_output=task_output,
                     agent=agent,
                     tools=tools,
                     guardrail=self._guardrail,
                 )
+
+            tokens_after = self._get_agent_token_usage(agent)
+            task_output.token_usage = self._calculate_token_delta(
+                tokens_before, tokens_after
+            )

Apply the same placement in _aexecute_core, or alternatively accumulate each retry delta inside the guardrail helpers before constructing the retry TaskOutput.

Also applies to: 820-833, 1376-1376, 1489-1489

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/task.py` around lines 691 - 704, The token_delta
calculation is being performed and attached to TaskOutput before guardrails
execute their retries, which causes tokens consumed by guardrail execution,
retry task execution, and retry output export operations to be excluded from the
final token_usage. Move the token_delta calculation to after all guardrail retry
operations complete by either applying the same placement in the _aexecute_core
method (the async counterpart), or accumulate each retry delta within the
guardrail helper functions before constructing the retry TaskOutput objects.
This same fix should be applied at the additional locations mentioned (820-833,
1376-1376, 1489-1489) to ensure consistent token usage tracking across all code
paths.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/src/crewai/memory/storage/factory.py`:
- Around line 47-63: The `config` parameter in the `resolve_memory_storage`
function is unused because no callers pass it and it cannot be forwarded to
custom factories (which only accept `spec` per the `MemoryStorageFactory`
signature at line 27). Remove the `config` parameter from the
`resolve_memory_storage` function signature and its use in the MimirStorage
instantiation call (line 63), or alternatively, extend the
`MemoryStorageFactory` protocol/type signature to accept an optional `config`
parameter, update the custom factory invocation (line 56) to pass `config`, and
ensure all places that set memory storage factories are aware of the new
signature.

In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py`:
- Around line 89-93: The search method in MimirStorage has a signature mismatch
with the storage contract expected by shallow memory retrieval. The method
currently only accepts query as a string along with limit and scope_prefix, but
the shallow retrieval path invokes it with an embedding vector, categories
parameter, and min_score threshold. Update the search method signature to accept
all required parameters including categories and min_score, and add logic to
handle the incoming query appropriately whether it is a string or an embedding
vector. This will ensure the method aligns with the actual contract consumed by
the memory runtime and prevents runtime failures due to argument mismatches.

In `@lib/crewai/src/crewai/task.py`:
- Around line 1313-1314: The condition `and current_token_usage` performs a
truthiness check, but valid token usage objects can be falsy (zero or empty),
which causes the code to incorrectly skip backfilling token usage. Replace the
truthiness check with an explicit None check by changing `and
current_token_usage` to `and current_token_usage is not None` in both locations:
the code block shown around lines 1313-1314, and the identical pattern in the
`_ainvoke_guardrail_function` method around lines 1426-1427.
- Around line 1167-1174: The _calculate_token_delta method in the task.py file
attempts to subtract two values directly using the minus operator, but when
these values are dictionaries (as token_usage from agent.llm is dict[str, int]),
the subtraction fails silently in the exception handler and returns the
cumulative after value unchanged, causing duplicate token spikes. Fix this by
first adding explicit type checks to detect when both before and after are
dictionaries, and if they are, iterate through the keys in the after dictionary
to explicitly subtract the numeric values (int or float types) from their
corresponding before values for each key, building a delta dictionary with the
differences. Keep the fallback try-except logic for non-dict numeric types to
maintain backward compatibility.

---

Outside diff comments:
In `@lib/crewai/src/crewai/task.py`:
- Around line 691-704: The token_delta calculation is being performed and
attached to TaskOutput before guardrails execute their retries, which causes
tokens consumed by guardrail execution, retry task execution, and retry output
export operations to be excluded from the final token_usage. Move the
token_delta calculation to after all guardrail retry operations complete by
either applying the same placement in the _aexecute_core method (the async
counterpart), or accumulate each retry delta within the guardrail helper
functions before constructing the retry TaskOutput objects. This same fix should
be applied at the additional locations mentioned (820-833, 1376-1376, 1489-1489)
to ensure consistent token usage tracking across all code paths.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 905ee92f-52d4-49d2-b8c7-6d0bb921220c

📥 Commits

Reviewing files that changed from the base of the PR and between 9db2d44 and cd0fbaf.

📒 Files selected for processing (4)

lib/crewai/README.md
lib/crewai/src/crewai/memory/storage/factory.py
lib/crewai/src/crewai/memory/storage/mimir_storage.py
lib/crewai/src/crewai/task.py

…t guardrail none checks

…ayer signatures

Jacopos311 · 2026-06-19T23:16:38Z

@coderabbitai review

coderabbitai · 2026-06-19T23:16:45Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

lib/crewai/src/crewai/memory/storage/mimir_storage.py (1)

59-63: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix MemoryRecord field mismatch (value vs content) to avoid runtime failures.

Line 59 reads record.value, but MemoryRecord exposes content. Line 132 constructs MemoryRecord(value=...), which misses required content. This will break both save and search paths at runtime.

Proposed patch

-            value_str = str(record.value)
+            value_str = str(record.content)
@@
-                content_text = res.get("value", res.get("text", ""))
+                content_text = res.get("value", res.get("text", ""))
@@
-                record = MemoryRecord(value=content_text, metadata=meta)
+                record = MemoryRecord(content=content_text, metadata=meta)

Also applies to: 132-133

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py` around lines 59 - 63,
There is a field mismatch in the MemoryRecord class usage. The code at line 59
attempts to access record.value, but the MemoryRecord class exposes a content
field instead. Similarly, at line 132, MemoryRecord is being instantiated with a
value parameter, but it requires the content parameter. Replace all occurrences
of record.value with record.content and change the MemoryRecord constructor call
from using value= to using content= to match the actual field name exposed by
the MemoryRecord class.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py`:
- Around line 92-97: The search_memory method (and the related method around
lines 121-135) accept **kwargs but do not apply the metadata_filter parameter,
which violates the storage interface contract and can return unexpected records.
Additionally, the default values for limit (currently 3) and min_score
(currently None) do not match the protocol defaults. Fix this by extracting
metadata_filter from kwargs or adding it as an explicit parameter and applying
it during the search operation, and update the default values for limit and
min_score to align with the storage protocol specification. Apply these same
changes consistently to both affected methods.

---

Outside diff comments:
In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py`:
- Around line 59-63: There is a field mismatch in the MemoryRecord class usage.
The code at line 59 attempts to access record.value, but the MemoryRecord class
exposes a content field instead. Similarly, at line 132, MemoryRecord is being
instantiated with a value parameter, but it requires the content parameter.
Replace all occurrences of record.value with record.content and change the
MemoryRecord constructor call from using value= to using content= to match the
actual field name exposed by the MemoryRecord class.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 9d17a345-755b-4623-97bb-91b92621e6e1

📥 Commits

Reviewing files that changed from the base of the PR and between bba8015 and 539d2f9.

📒 Files selected for processing (2)

lib/crewai/src/crewai/memory/storage/factory.py
lib/crewai/src/crewai/memory/storage/mimir_storage.py

🚧 Files skipped from review as they are similar to previous changes (1)

lib/crewai/src/crewai/memory/storage/factory.py

…mir storage

Jacopos311 · 2026-06-19T23:23:38Z

@coderabbitai review

coderabbitai · 2026-06-19T23:24:02Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

SessantaNove and others added 14 commits June 18, 2026 00:54

feat: add Mimir as persistent cross-session memory backend

e9f44cb

feat: wire MimirStorage into factory resolution

01b3bb5

fix: address CodeRabbit review comments on MimirStorage integration

2d7815f

fix: adjust SDK calls to async format and fix return types

aa62068

fix: switch to MimirSyncClient to satisfy synchronous StorageBackend …

bd02734

…protocol

fix: sanitize MimirSyncClient config keys and fix comment typo

4fa46f7

refactor: upgrade Mimir memory backend to official MCP standard

21d7956

refactor: complete Mimir integration update in readme and factory

b9e4349

fix: address coderabbit recommendations on db flag, expansion and sco…

711e379

…ping

Merge branch 'main' into main

d6b95cf

Address maintainer feedback for Mimir storage module

67b8320

Fix Mimir storage implementation to fully conform with StorageBackend…

76dc276

… protocol and addressing CodeRabbit feedback

Enhance MimirStorage stability with direct imports and subprocess tim…

0b2c880

…eouts

fix(task): centralize token tracking and preserve usage during guardr…

cd0fbaf

…ail retries

corridor-security Bot reviewed Jun 19, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 19, 2026

View reviewed changes

Comment thread lib/crewai/src/crewai/memory/storage/factory.py

Comment thread lib/crewai/src/crewai/memory/storage/mimir_storage.py Outdated

Comment thread lib/crewai/src/crewai/task.py

Comment thread lib/crewai/src/crewai/task.py Outdated

SessantaNove added 2 commits June 20, 2026 01:09

fix(task): address token delta calculation for dict types and explici…

bba8015

…t guardrail none checks

fix(task,memory): address token tracking lifecycle and sync storage l…

539d2f9

…ayer signatures

coderabbitai Bot reviewed Jun 19, 2026

View reviewed changes

Comment thread lib/crewai/src/crewai/memory/storage/mimir_storage.py Outdated

fix(memory): align search limit and min_score protocol defaults in mi…

d5c3468

…mir storage

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(task): centralize token tracking and preserve usage during guardrail retries#6251

fix(task): centralize token tracking and preserve usage during guardrail retries#6251
Jacopos311 wants to merge 17 commits into
crewAIInc:mainfrom
Jacopos311:fix/task-token-tracking

Jacopos311 commented Jun 19, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading

Review skipped

Walkthrough

Changes

Estimated code review effort

Uh oh!

corridor-security Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Jacopos311 commented Jun 19, 2026

Uh oh!

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Jacopos311 commented Jun 19, 2026

Uh oh!

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Jacopos311 commented Jun 19, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Estimated code review effort

Uh oh!

corridor-security Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Jacopos311 commented Jun 19, 2026

Uh oh!

coderabbitai Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Jacopos311 commented Jun 19, 2026

Uh oh!

coderabbitai Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Jacopos311 commented Jun 19, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading