
Integrate long-term memory functionality and improve tests#54

Closed
prashant4654 wants to merge 6 commits into 10xHub:main from prashant4654:prashant-lib

Conversation

@prashant4654
Contributor

This pull request introduces long-term memory integration into AgentFlow, enabling LLMs to read, write, update, and delete persistent memories. The main changes include a new memory_tool for LLMs, preload node factory for injecting memory context, a write tracker for graceful shutdown, and supporting utilities. There are also enhancements to the LiteLLM converter to better extract reasoning content, and updates to ensure correct token accounting.

Long-term memory integration:

  • Added agentflow/store/long_term_memory.py with memory_tool, create_memory_preload_node, get_memory_system_prompt, and MemoryWriteTracker for async write tracking and graceful shutdown.

  • Updated agentflow/store/__init__.py to export new long-term memory features, making them available for import elsewhere in the codebase.

Graph shutdown improvements:

  • Modified agentflow/graph/compiled_graph.py to wait for pending memory writes on shutdown, ensuring all async writes complete before the process exits.

  • Added "task_manager" to TOOL_NODE_INJECTABLES in agentflow/graph/tool_node/constants.py to support background task management for memory writes.

LiteLLM converter enhancements:

  • Improved extraction of reasoning content from thinking_blocks in LiteLLM responses, ensuring reasoning is correctly parsed even if not directly present.

  • Corrected token accounting by switching from prompt_tokens_details to completion_tokens_details for reasoning token extraction.

Miscellaneous:

  • Minor fix to agentflow/store/embedding/google_embedding.py to initialize genai to None for compatibility.
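The MemoryWriteTracker itself is not shown in this summary; as a rough sketch of the shape such an async write tracker can take (the class name, `track`, `pending_count`, and `wait_for_pending` come from the PR description and review comments, the rest is an assumption, not AgentFlow's actual implementation):

```python
import asyncio


class MemoryWriteTracker:
    """Tracks in-flight memory-write tasks so shutdown can await them."""

    def __init__(self):
        self._tasks = set()

    def track(self, coro):
        # Schedule the write and hold a reference until the task finishes.
        task = asyncio.ensure_future(coro)
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return task

    @property
    def pending_count(self):
        return len(self._tasks)

    async def wait_for_pending(self, timeout=None):
        # Await all outstanding writes without cancelling them.
        tasks = list(self._tasks)
        if not tasks:
            return
        gathered = asyncio.gather(*tasks, return_exceptions=True)
        # `is not None`, so a caller passing timeout=0.0 still gets a timeout.
        if timeout is not None:
            await asyncio.wait_for(gathered, timeout=timeout)
        else:
            await gathered


async def demo():
    tracker = MemoryWriteTracker()

    async def fake_write():
        await asyncio.sleep(0.01)

    tracker.track(fake_write())
    before = tracker.pending_count
    await tracker.wait_for_pending(timeout=1.0)
    await asyncio.sleep(0)  # let done-callbacks flush
    return before, tracker.pending_count


# asyncio.run(demo()) returns (1, 0)
```

On graceful shutdown, `aclose()` can then call `wait_for_pending()` before tearing down the background task manager.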

prashant4654 and others added 5 commits March 2, 2026 16:47

…oryWriteTracker
fix: Correct token details key in LiteLLMConverter test
test: Add comprehensive tests for long-term memory functionality
…ndencies and adjust import handling in GoogleEmbedding
Signed-off-by: prashant4654 <ee23btech11218@iith.ac.in>

Copilot AI review requested due to automatic review settings March 2, 2026 12:25

Copilot AI left a comment


Pull request overview

Adds first-class long-term memory support to AgentFlow graphs, allowing LLMs to search/store/update/delete persistent memories and ensuring pending memory writes are handled during graph shutdown. Also improves LiteLLM response conversion to better capture provider-specific reasoning content and fixes token accounting for reasoning tokens.

Changes:

  • Introduces agentflow.store.long_term_memory (tool + preload node + prompts + async write tracking).
  • Updates graph shutdown to await pending memory writes and extends ToolNode injectables with task_manager.
  • Enhances LiteLLM converter to extract reasoning from thinking_blocks and adjusts reasoning token accounting; adds/updates tests.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.

Show a summary per file

| File | Description |
| --- | --- |
| agentflow/store/long_term_memory.py | New long-term memory tool, preload node factory, system prompt helpers, and async write tracker. |
| agentflow/store/__init__.py | Exposes new long-term memory APIs from the store package. |
| agentflow/graph/compiled_graph.py | Waits for pending memory writes during aclose() shutdown sequence. |
| agentflow/graph/tool_node/constants.py | Adds task_manager to injectable params excluded from tool schema generation. |
| agentflow/adapters/llm/litellm_converter.py | Extracts reasoning from thinking_blocks (response + streaming delta) and updates reasoning token source. |
| tests/store/test_long_term_memory.py | New test suite for the long-term memory module/tool behavior. |
| tests/adapters/test_litellm_converter.py | Updates expected usage field for reasoning token accounting. |
| agentflow/store/embedding/google_embedding.py | Initializes genai to None for compatibility when import fails. |

Comment on lines +101 to +105
# If reasoning_content is empty, extract from thinking_blocks (used by some providers)
thinking_blocks_data = message_data.get("thinking_blocks") or []
if not reasoning_content and thinking_blocks_data:
    thinking_parts = []
    for tb in thinking_blocks_data:

Copilot AI Mar 2, 2026


New thinking_blocks → reasoning_content fallback logic in convert_response isn't covered by tests. Add a unit test where reasoning_content is empty but thinking_blocks contains one or more thinking entries, and assert the produced Message includes a ReasoningBlock with the expected reasoning text.
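The fallback under review can also be exercised in isolation. A standalone sketch of the extraction step (the message dict shape is inferred from the diff; the surrounding converter and ReasoningBlock construction are omitted):

```python
def extract_reasoning(message_data: dict) -> str:
    """Prefer reasoning_content; fall back to joining thinking_blocks.

    Mirrors the diff's fallback logic in a self-contained form.
    """
    reasoning_content = message_data.get("reasoning_content") or ""
    thinking_blocks_data = message_data.get("thinking_blocks") or []
    if not reasoning_content and thinking_blocks_data:
        thinking_parts = []
        for tb in thinking_blocks_data:
            if isinstance(tb, dict) and tb.get("type") == "thinking":
                thinking_parts.append(tb.get("thinking", ""))
        reasoning_content = "\n".join(thinking_parts)
    return reasoning_content


# With reasoning_content empty, thinking_blocks supply the reasoning text:
msg = {
    "reasoning_content": "",
    "thinking_blocks": [{"type": "thinking", "thinking": "step 1"}],
}
# extract_reasoning(msg) -> "step 1"
```

A unit test for the converter would feed a response with this message shape and assert on the resulting ReasoningBlock.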

Comment on lines +247 to +252
# If reasoning_content is empty, extract from thinking_blocks in delta
if not reasoning_part:
    delta_thinking_blocks = getattr(delta, "thinking_blocks", None) or []
    for tb in delta_thinking_blocks:
        if isinstance(tb, dict) and tb.get("type") == "thinking":
            thinking_text = tb.get("thinking", "")

Copilot AI Mar 2, 2026


Streaming path: the new fallback that extracts reasoning from delta.thinking_blocks is untested. Add a streaming test chunk where delta.reasoning_content is empty but delta.thinking_blocks is present, and assert a ReasoningBlock is emitted.

Comment on lines +535 to +540
# 0. Wait for pending memory writes to complete (no cancellation)
try:
    from agentflow.store.long_term_memory import get_write_tracker

    tracker = get_write_tracker()
    if tracker.pending_count > 0:

Copilot AI Mar 2, 2026


CompiledGraph.aclose now waits for pending memory writes and populates stats["memory_writes"], but there’s no test asserting this new behavior. Consider extending tests/graph/test_graph.py::test_aclose to patch get_write_tracker() with a fake tracker (pending_count > 0) and verify wait_for_pending() is awaited and stats are recorded.
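One way to test this without real writes is a fake tracker. A minimal sketch of the pattern (the patch target `agentflow.store.long_term_memory.get_write_tracker` and the `stats["memory_writes"]` key come from this comment; `demo_shutdown` is a stand-in for the real `aclose()` step, not AgentFlow code):

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


def make_fake_tracker(pending: int) -> MagicMock:
    """Build a stand-in for the object get_write_tracker() returns."""
    tracker = MagicMock()
    tracker.pending_count = pending
    tracker.wait_for_pending = AsyncMock()
    return tracker


async def demo_shutdown(tracker) -> dict:
    # Mimics the aclose() step under review: await pending writes, record stats.
    stats = {}
    if tracker.pending_count > 0:
        await tracker.wait_for_pending()
        stats["memory_writes"] = tracker.pending_count
    return stats


tracker = make_fake_tracker(pending=2)
stats = asyncio.run(demo_shutdown(tracker))
tracker.wait_for_pending.assert_awaited_once()
# stats -> {"memory_writes": 2}
```

In the actual test you would patch get_write_tracker to return the fake before calling graph.aclose(), then make the same two assertions.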

Comment on lines +535 to +536
# 0. Wait for pending memory writes to complete (no cancellation)
try:

Copilot AI Mar 2, 2026


The shutdown sequence documentation for aclose is now out of date: implementation waits for pending memory writes before shutting down the background task manager. Update the docstring or step comments so the documented order matches behavior.

Comment on lines +70 to +74
if timeout:
    await asyncio.wait_for(
        asyncio.gather(*tasks, return_exceptions=True),
        timeout=timeout,
    )

Copilot AI Mar 2, 2026


MemoryWriteTracker.wait_for_pending uses if timeout: to decide whether to apply asyncio.wait_for. This treats timeout=0.0 as falsy and will wait indefinitely instead of timing out. Use an explicit timeout is not None check so callers can pass 0.0 reliably.
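The truthiness pitfall is easy to demonstrate. A minimal sketch of the corrected guard (this `wait_for_pending` is a free-standing illustration, not the tracker's real signature):

```python
import asyncio


async def wait_for_pending(tasks, timeout=None):
    """Await tasks, applying the timeout even when it is 0.0."""
    gathered = asyncio.gather(*tasks, return_exceptions=True)
    # `if timeout:` would treat 0.0 as falsy and wait forever;
    # `is not None` applies wait_for, which raises TimeoutError immediately.
    if timeout is not None:
        await asyncio.wait_for(gathered, timeout=timeout)
    else:
        await gathered


async def demo() -> bool:
    slow = asyncio.ensure_future(asyncio.sleep(1))
    try:
        # With the corrected check, timeout=0.0 times out right away.
        await wait_for_pending([slow], timeout=0.0)
        return False
    except asyncio.TimeoutError:
        return True
    finally:
        slow.cancel()


# asyncio.run(demo()) returns True
```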

}
if score_threshold > 0:
    search_kwargs["score_threshold"] = score_threshold
if memory_types:

Copilot AI Mar 2, 2026


create_memory_preload_node takes memory_types (plural) but only uses the first entry (memory_types[0]), silently ignoring additional types. Either change the API to accept a single memory_type, or implement/document how multiple types should be queried/combined.

Suggested change

if memory_types:
    if len(memory_types) > 1:
        logger.warning(
            "create_memory_preload_node received multiple memory_types (%s); "
            "only the first (%s) will be used for search.",
            memory_types,
            memory_types[0],
        )

@codecov

codecov bot commented Mar 2, 2026

Codecov Report

❌ Patch coverage is 81.81818% with 32 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| agentflow/adapters/llm/litellm_converter.py | 21.05% | 13 Missing and 2 partials ⚠️ |
| agentflow/store/long_term_memory.py | 93.05% | 5 Missing and 5 partials ⚠️ |
| agentflow/graph/compiled_graph.py | 41.66% | 6 Missing and 1 partial ⚠️ |


