Integrate long-term memory functionality and improve tests #54
prashant4654 wants to merge 6 commits into 10xHub:main from …
Conversation
…oryWriteTracker
fix: Correct token details key in LiteLLMConverter test
test: Add comprehensive tests for long-term memory functionality
Signed-off-by: prashant4654 <ee23btech11218@iith.ac.in>
…ndencies and adjust import handling in GoogleEmbedding
Pull request overview
Adds first-class long-term memory support to AgentFlow graphs, allowing LLMs to search/store/update/delete persistent memories and ensuring pending memory writes are handled during graph shutdown. Also improves LiteLLM response conversion to better capture provider-specific reasoning content and fixes token accounting for reasoning tokens.
Changes:
- Introduces `agentflow.store.long_term_memory` (tool + preload node + prompts + async write tracking).
- Updates graph shutdown to await pending memory writes and extends ToolNode injectables with `task_manager`.
- Enhances the LiteLLM converter to extract reasoning from `thinking_blocks` and adjusts reasoning token accounting; adds/updates tests.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `agentflow/store/long_term_memory.py` | New long-term memory tool, preload node factory, system prompt helpers, and async write tracker. |
| `agentflow/store/__init__.py` | Exposes new long-term memory APIs from the store package. |
| `agentflow/graph/compiled_graph.py` | Waits for pending memory writes during the `aclose()` shutdown sequence. |
| `agentflow/graph/tool_node/constants.py` | Adds `task_manager` to injectable params excluded from tool schema generation. |
| `agentflow/adapters/llm/litellm_converter.py` | Extracts reasoning from `thinking_blocks` (response + streaming delta) and updates the reasoning token source. |
| `tests/store/test_long_term_memory.py` | New test suite for the long-term memory module/tool behavior. |
| `tests/adapters/test_litellm_converter.py` | Updates the expected usage field for reasoning token accounting. |
| `agentflow/store/embedding/google_embedding.py` | Initializes `genai` to `None` for compatibility when the import fails. |
```python
# If reasoning_content is empty, extract from thinking_blocks (used by some providers)
thinking_blocks_data = message_data.get("thinking_blocks") or []
if not reasoning_content and thinking_blocks_data:
    thinking_parts = []
    for tb in thinking_blocks_data:
```
New thinking_blocks → reasoning_content fallback logic in convert_response isn’t covered by tests. Add a unit test where reasoning_content is empty but thinking_blocks contains one or more thinking entries, and assert the produced Message includes a ReasoningBlock with the expected reasoning text.
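The fallback can be exercised in isolation along these lines; `extract_reasoning` below is a hypothetical stand-in that mirrors the quoted logic, not the converter's actual API:

```python
def extract_reasoning(message_data: dict) -> str:
    """Hypothetical sketch of the thinking_blocks fallback under review."""
    reasoning_content = message_data.get("reasoning_content") or ""
    thinking_blocks = message_data.get("thinking_blocks") or []
    if not reasoning_content and thinking_blocks:
        # Collect text only from well-formed "thinking" entries.
        parts = [
            tb.get("thinking", "")
            for tb in thinking_blocks
            if isinstance(tb, dict) and tb.get("type") == "thinking"
        ]
        reasoning_content = "\n".join(p for p in parts if p)
    return reasoning_content


def test_thinking_blocks_fallback():
    message_data = {
        "reasoning_content": "",
        "thinking_blocks": [{"type": "thinking", "thinking": "step one"}],
    }
    assert extract_reasoning(message_data) == "step one"
```

A real test would instead drive `convert_response` with such a payload and assert on the resulting `ReasoningBlock`.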
```python
# If reasoning_content is empty, extract from thinking_blocks in delta
if not reasoning_part:
    delta_thinking_blocks = getattr(delta, "thinking_blocks", None) or []
    for tb in delta_thinking_blocks:
        if isinstance(tb, dict) and tb.get("type") == "thinking":
            thinking_text = tb.get("thinking", "")
```
Streaming path: the new fallback that extracts reasoning from delta.thinking_blocks is untested. Add a streaming test chunk where delta.reasoning_content is empty but delta.thinking_blocks is present, and assert a ReasoningBlock is emitted.
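The streaming variant differs mainly in that the delta exposes attributes rather than dict keys; a self-contained sketch (the helper name `reasoning_from_delta` is illustrative, not the converter's real method):

```python
from types import SimpleNamespace


def reasoning_from_delta(delta) -> str:
    """Hypothetical helper mirroring the quoted streaming fallback."""
    reasoning_part = getattr(delta, "reasoning_content", None) or ""
    if not reasoning_part:
        # Fall back to thinking_blocks carried on the streaming delta.
        for tb in getattr(delta, "thinking_blocks", None) or []:
            if isinstance(tb, dict) and tb.get("type") == "thinking":
                reasoning_part += tb.get("thinking", "")
    return reasoning_part


delta = SimpleNamespace(
    reasoning_content="",
    thinking_blocks=[{"type": "thinking", "thinking": "chunk reasoning"}],
)
assert reasoning_from_delta(delta) == "chunk reasoning"
```

A real streaming test would feed such a chunk through the converter and assert a `ReasoningBlock` is emitted.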
```python
# 0. Wait for pending memory writes to complete (no cancellation)
try:
    from agentflow.store.long_term_memory import get_write_tracker

    tracker = get_write_tracker()
    if tracker.pending_count > 0:
```
CompiledGraph.aclose now waits for pending memory writes and populates stats["memory_writes"], but there’s no test asserting this new behavior. Consider extending tests/graph/test_graph.py::test_aclose to patch get_write_tracker() with a fake tracker (pending_count > 0) and verify wait_for_pending() is awaited and stats are recorded.
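One possible shape for such a test, with all names (`FakeTracker`, `aclose_memory_step`, the `stats` key) as illustrative assumptions rather than the PR's real interfaces:

```python
import asyncio
from unittest.mock import AsyncMock


class FakeTracker:
    """Stand-in for the write tracker; only pending_count and
    wait_for_pending follow the names quoted in the diff."""
    pending_count = 2
    wait_for_pending = AsyncMock()


async def aclose_memory_step(stats: dict) -> None:
    # Simplified stand-in for the aclose() step: await pending writes,
    # then record how many were outstanding.
    tracker = FakeTracker()
    if tracker.pending_count > 0:
        await tracker.wait_for_pending()
        stats["memory_writes"] = tracker.pending_count


stats: dict = {}
asyncio.run(aclose_memory_step(stats))
FakeTracker.wait_for_pending.assert_awaited()
assert stats == {"memory_writes": 2}
```

In the real test, `get_write_tracker` would be patched to return the fake before calling `CompiledGraph.aclose()`.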
```python
# 0. Wait for pending memory writes to complete (no cancellation)
try:
```
The shutdown sequence documentation for aclose is now out of date: implementation waits for pending memory writes before shutting down the background task manager. Update the docstring or step comments so the documented order matches behavior.
```python
if timeout:
    await asyncio.wait_for(
        asyncio.gather(*tasks, return_exceptions=True),
        timeout=timeout,
    )
```
MemoryWriteTracker.wait_for_pending uses if timeout: to decide whether to apply asyncio.wait_for. This treats timeout=0.0 as falsy and will wait indefinitely instead of timing out. Use an explicit timeout is not None check so callers can pass 0.0 reliably.
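A minimal sketch of the suggested fix, assuming `wait_for_pending` simply gathers the tracked tasks (simplified from the real method):

```python
import asyncio


async def wait_for_pending(tasks, timeout=None):
    # Checking `timeout is not None` (rather than truthiness) lets callers
    # pass 0.0 as an immediate deadline instead of silently waiting forever.
    gathered = asyncio.gather(*tasks, return_exceptions=True)
    if timeout is not None:
        await asyncio.wait_for(gathered, timeout=timeout)
    else:
        await gathered


async def main():
    async def slow():
        await asyncio.sleep(10)

    task = asyncio.create_task(slow())
    try:
        await wait_for_pending([task], timeout=0.0)
    except asyncio.TimeoutError:
        print("timed out immediately, as expected")
    task.cancel()


asyncio.run(main())
```

With the original `if timeout:` check, the `timeout=0.0` call above would fall through to an unbounded `await`.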
```python
}
if score_threshold > 0:
    search_kwargs["score_threshold"] = score_threshold
if memory_types:
```
create_memory_preload_node takes memory_types (plural) but only uses the first entry (memory_types[0]), silently ignoring additional types. Either change the API to accept a single memory_type, or implement/document how multiple types should be queried/combined.
Suggested change:
```python
if memory_types:
    if len(memory_types) > 1:
        logger.warning(
            "create_memory_preload_node received multiple memory_types (%s); "
            "only the first (%s) will be used for search.",
            memory_types,
            memory_types[0],
        )
```
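Extracted as a standalone helper, the suggested guard behaves as follows; `select_memory_type` is a hypothetical name for illustration only:

```python
import logging

logger = logging.getLogger(__name__)


def select_memory_type(memory_types):
    """Hypothetical helper: use the first memory type, but warn instead of
    silently ignoring the extra entries."""
    if not memory_types:
        return None
    if len(memory_types) > 1:
        logger.warning(
            "create_memory_preload_node received multiple memory_types (%s); "
            "only the first (%s) will be used for search.",
            memory_types,
            memory_types[0],
        )
    return memory_types[0]
```

Alternatively, the API could take a single `memory_type` and make the one-type restriction explicit in the signature.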
Codecov Report: ❌ Patch coverage is …
This pull request introduces long-term memory integration into AgentFlow, enabling LLMs to read, write, update, and delete persistent memories. The main changes include a new `memory_tool` for LLMs, a preload node factory for injecting memory context, a write tracker for graceful shutdown, and supporting utilities. There are also enhancements to the LiteLLM converter to better extract reasoning content, and updates to ensure correct token accounting.

Long-term memory integration:
- Added `agentflow/store/long_term_memory.py` with `memory_tool`, `create_memory_preload_node`, `get_memory_system_prompt`, and `MemoryWriteTracker` for async write tracking and graceful shutdown.
- Updated `agentflow/store/__init__.py` to export the new long-term memory features, making them available for import elsewhere in the codebase.

Graph shutdown improvements:
- Modified `agentflow/graph/compiled_graph.py` to wait for pending memory writes on shutdown, ensuring all async writes complete before the process exits.
- Added `"task_manager"` to `TOOL_NODE_INJECTABLES` in `agentflow/graph/tool_node/constants.py` to support background task management for memory writes.

LiteLLM converter enhancements:
- Improved extraction of reasoning content from `thinking_blocks` in LiteLLM responses, ensuring reasoning is correctly parsed even when not directly present.
- Corrected token accounting by switching from `prompt_tokens_details` to `completion_tokens_details` for reasoning token extraction.

Miscellaneous:
- Minor fix to `agentflow/store/embedding/google_embedding.py` to initialize `genai` to `None` for compatibility when the import fails.