MFCS Memory is an intelligent conversation memory management system that helps AI assistants remember conversation history with users and dynamically adjust response strategies based on conversation content.
- Intelligent Conversation Memory: Automatically analyzes and summarizes user characteristics and preferences
- Vector Storage: Uses Qdrant for efficient similar conversation retrieval
- Session Management: Supports multi-user, multi-session management
- Automatic Chunking: Automatically creates chunks when conversation history exceeds threshold
- Async Support: All operations support asynchronous execution
- Extensibility: Modular design, easy to extend and customize
- Automatic LLM-based Analysis: User memory and conversation summary are updated automatically at configurable intervals
- user_memory/base.py: Base memory management, handles shared resources
- user_memory/memory_manager.py: Main entry for memory management, orchestrates all modules and async tasks
- user_memory/session_manager.py: Handles session creation, update, chunking, and analysis task management
- user_memory/conversation_analyzer.py: Analyzes conversation content and user profile using LLM (OpenAI API)
- user_memory/vector_store.py: Manages vector storage and retrieval for conversation history
- rag/rag_manager.py: Manages retrieval-augmented generation and vector-based information retrieval
- rag/vector_stores/base.py: Base vector store implementation
- rag/vector_stores/qdrant_store.py: Qdrant-specific vector store implementation
- utils/config.py: Loads and validates all configuration from environment variables
- get(memory_id: str, content: Optional[str] = None, top_k: int = 2) -> str
  - Get current session information for specified memory_id
  - Includes conversation summary and user memory summary
  - Supports content-based relevant historical conversation retrieval (vector search)
  - Returns formatted memory information
- update(memory_id: str, content: str, assistant_response: str) -> bool
  - Automatically gets or creates current session for memory_id
  - Updates conversation history
  - Automatically updates user memory summary every 3 rounds (LLM analysis)
  - Automatically updates session summary every 5 rounds (LLM analysis)
  - Automatically handles conversation chunking and vector storage
  - All analysis tasks run asynchronously and are recoverable on restart
- delete(memory_id: str) -> bool
  - Deletes all data for specified memory_id (session + vector store)
  - Returns whether operation was successful
- reset() -> bool
  - Resets all memory records (clears all session and vector data)
  - Returns whether operation was successful
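To make the update cadence concrete, the sketch below shows one way the per-round triggers described for update() could be decided. It is an illustration only: the function name `due_analyses` and the two constants are hypothetical, and the library's actual scheduling logic may differ.

```python
from typing import List

# Hypothetical constants mirroring the intervals described above.
USER_MEMORY_INTERVAL = 3      # rounds between user memory updates
SESSION_SUMMARY_INTERVAL = 5  # rounds between session summary updates


def due_analyses(round_count: int) -> List[str]:
    """Return the analyses that fall due once `round_count` rounds are stored."""
    due: List[str] = []
    if round_count > 0 and round_count % USER_MEMORY_INTERVAL == 0:
        due.append("user_memory")
    if round_count > 0 and round_count % SESSION_SUMMARY_INTERVAL == 0:
        due.append("session_summary")
    return due


# Show which analyses would be scheduled over the first 15 rounds.
for n in range(1, 16):
    print(n, due_analyses(n))
```

Under these assumptions, both analyses coincide every 15 rounds, which is one reason to run them as background tasks rather than inline with the reply.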
MFCS Memory supports multiple levels of context management to provide intelligent and adaptive conversation tracking:
- Short-term Memory (Session Context)
  - Maintains the immediate conversation history
  - Stores recent interactions within a single session
  - Configurable via MAX_RECENT_HISTORY (default: 20 interactions)
  - Enables quick retrieval of recent conversation context
- Long-term Memory (User Profile)
  - Builds a comprehensive user profile across multiple sessions
  - Analyzes user preferences, communication patterns, and key characteristics
  - Automatically updated every 3 conversation rounds using LLM analysis
  - Supports personalized and context-aware responses
- Vector-based Semantic Memory
  - Utilizes Qdrant vector database for semantic similarity search
  - Converts conversation chunks into high-dimensional embeddings
  - Enables intelligent retrieval of contextually relevant past conversations
  - Supports content-based memory lookup with configurable top_k results
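The content-based lookup can be pictured as a nearest-neighbour search over stored chunk embeddings. The following is a minimal in-memory sketch, not the Qdrant-backed implementation: the toy 3-dimensional vectors and the helper names `cosine_similarity` and `top_k_similar` are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: Sequence[float],
    stored: List[Tuple[str, Sequence[float]]],
    top_k: int = 2,
) -> List[str]:
    """Return the texts of the `top_k` stored chunks most similar to `query`."""
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


# Toy embeddings standing in for real model output.
stored_chunks = [
    ("Discussed Python basics", [0.9, 0.1, 0.0]),
    ("Talked about cooking pasta", [0.0, 0.2, 0.9]),
    ("Asked about Python libraries", [0.8, 0.3, 0.1]),
]
print(top_k_similar([1.0, 0.2, 0.0], stored_chunks, top_k=2))
```

A vector database performs the same ranking with an index instead of a full scan, which is what keeps lookups fast as the history grows.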
- Leverages Language Models (LLM) for intelligent context understanding
- Performs automatic summarization and key information extraction
- Asynchronous analysis to minimize performance overhead
- Automatically splits long conversation histories into manageable chunks
- Configurable CHUNK_SIZE (default: 100 conversations per chunk)
- Ensures efficient storage and retrieval of extensive conversation data
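As a rough illustration of fixed-size chunking, the sketch below splits a conversation history into consecutive chunks. The function name `split_into_chunks` is hypothetical, and how the library decides when to move turns out of the active session is not covered here.

```python
from typing import List, TypeVar

T = TypeVar("T")

CHUNK_SIZE = 100  # mirrors the documented default


def split_into_chunks(history: List[T], chunk_size: int = CHUNK_SIZE) -> List[List[T]]:
    """Split `history` into consecutive chunks of at most `chunk_size` items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    return [history[i:i + chunk_size] for i in range(0, len(history), chunk_size)]


# 250 placeholder turns yield two full chunks and one partial chunk.
turns = [f"turn-{i}" for i in range(250)]
chunks = split_into_chunks(turns)
print([len(chunk) for chunk in chunks])
```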
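- Uses advanced embedding models (default: BAAI/bge-large-zh-v1.5)
- Converts text into dense vectors of EMBEDDING_DIM dimensions (768 in the sample configuration; BAAI/bge-large-zh-v1.5 itself outputs 1024-dimensional vectors, so set EMBEDDING_DIM to match the model you load)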
- Enables semantic search and similarity-based memory retrieval
- Multi-session Tracking
  - Supports multiple user sessions with unique memory_id
  - Maintains isolated yet interconnected memory contexts
- Asynchronous Memory Operations
  - All memory-related tasks run asynchronously
  - Supports task recovery and restart
  - Minimizes performance impact during conversation
- Extensible Memory Backends
  - Modular design allows easy integration of different vector stores
  - Current implementation supports Qdrant
  - Flexible configuration for embedding models and storage backends
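One way to picture a pluggable backend is a small abstract interface with interchangeable implementations. The sketch below is an assumption for illustration: the `VectorStore` interface, its method names, and `InMemoryVectorStore` are hypothetical and are not taken from rag/vector_stores/base.py.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence


class VectorStore(ABC):
    """Hypothetical minimal interface a storage backend would implement."""

    @abstractmethod
    async def add(self, item_id: str, vector: Sequence[float]) -> None:
        """Store `vector` under `item_id`."""

    @abstractmethod
    async def search(self, query: Sequence[float], top_k: int) -> List[str]:
        """Return the ids of the `top_k` best-matching vectors."""


class InMemoryVectorStore(VectorStore):
    """Toy backend that keeps vectors in a dict and ranks by dot product."""

    def __init__(self) -> None:
        self._vectors: Dict[str, List[float]] = {}

    async def add(self, item_id: str, vector: Sequence[float]) -> None:
        self._vectors[item_id] = list(vector)

    async def search(self, query: Sequence[float], top_k: int) -> List[str]:
        def score(item_id: str) -> float:
            return sum(q * v for q, v in zip(query, self._vectors[item_id]))

        return sorted(self._vectors, key=score, reverse=True)[:top_k]


async def demo() -> List[str]:
    # Callers depend only on the interface, so the backend can be swapped.
    store: VectorStore = InMemoryVectorStore()
    await store.add("chunk-a", [1.0, 0.0])
    await store.add("chunk-b", [0.0, 1.0])
    return await store.search([0.9, 0.1], top_k=1)


print(asyncio.run(demo()))
```

A Qdrant-backed class would implement the same two methods against the database, leaving calling code unchanged.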
- Personalized AI assistants
- Contextual chatbots
- Intelligent customer support systems
- Adaptive learning platforms
- Configurable concurrent analysis tasks (MAX_CONCURRENT_ANALYSIS, default: 3)
- Efficient vector storage and retrieval
- Low-latency memory access
- Scalable architecture supporting multiple users and sessions
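A common way to cap concurrent background work in asyncio is a semaphore. The sketch below assumes that approach purely for illustration; whether the library uses `asyncio.Semaphore` internally is not stated here, and `run_bounded` is a hypothetical helper with a sleep standing in for an LLM call.

```python
import asyncio

MAX_CONCURRENT_ANALYSIS = 3  # mirrors the documented default


async def run_bounded(num_tasks: int, limit: int = MAX_CONCURRENT_ANALYSIS) -> int:
    """Run `num_tasks` dummy analysis jobs and return the peak concurrency seen."""
    semaphore = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def analyze(task_id: int) -> None:
        nonlocal running, peak
        async with semaphore:  # at most `limit` jobs pass this point at once
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for an LLM analysis call
            running -= 1

    await asyncio.gather(*(analyze(i) for i in range(num_tasks)))
    return peak


print(asyncio.run(run_bounded(10)))
```

The remaining jobs wait at the semaphore rather than failing, so raising the limit trades LLM load for faster catch-up after a burst of conversations.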
- Install the package:
    pip install mfcs-memory
- Install SentenceTransformer for text embedding:
    pip install sentence-transformers
  Note: The default embedding model is BAAI/bge-large-zh-v1.5. You can change it in the configuration.
- Create a .env file and configure necessary environment variables:
# MongoDB Configuration
MONGO_USER=your_username
MONGO_PASSWD=your_password
MONGO_HOST=localhost:27017
# Qdrant Configuration
QDRANT_URL=http://127.0.0.1:6333
# Model Configuration
EMBEDDING_MODEL_PATH=./model/BAAI/bge-large-zh-v1.5
EMBEDDING_DIM=768
LLM_MODEL=qwen-plus-latest # Default value
# OpenAI Configuration
OPENAI_API_KEY=your_api_key
OPENAI_API_BASE=your_api_base # Optional
# Other Configuration
MONGO_REPLSET='' # Optional, if using replica set
MAX_RECENT_HISTORY=20 # Default value
CHUNK_SIZE=100 # Default value
MAX_CONCURRENT_ANALYSIS=3 # Default value
- Usage Example:
import asyncio

from mfcs_memory.utils.config import Config
from mfcs_memory.user_memory.memory_manager import MemoryManager


async def main():
    # Load configuration
    config = Config.from_env()

    # Initialize memory manager
    memory_manager = MemoryManager(config)

    # Update conversation
    await memory_manager.update(
        "memory_123",
        "Hello, I want to learn about Python programming",
        "Python is a simple yet powerful programming language..."
    )

    # Get memory information
    memory_info = await memory_manager.get(
        "memory_123",
        content="How to start Python programming?",
        top_k=2
    )

    # Delete memory data
    await memory_manager.delete("memory_123")

    # Reset all data
    await memory_manager.reset()


if __name__ == "__main__":
    asyncio.run(main())

src/
├── mfcs_memory/
│   ├── user_memory/
│   │   ├── base.py                   # Base memory management
│   │   ├── memory_manager.py         # Memory manager (main logic)
│   │   ├── session_manager.py        # Session manager (session, chunk, task)
│   │   ├── conversation_analyzer.py  # Conversation analyzer (LLM)
│   │   ├── vector_store.py           # Vector store for conversation history
│   │   └── __init__.py
│   ├── rag/
│   │   ├── rag_manager.py            # Retrieval-Augmented Generation manager
│   │   └── vector_stores/
│   │       ├── base.py               # Base vector store
│   │       └── qdrant_store.py       # Qdrant-specific vector store
│   ├── utils/
│   │   ├── config.py                 # Configuration management
│   │   └── __init__.py
│   └── __init__.py
├── example/                          # Example code
├── model/                            # Model directory
├── setup.py                          # Installation config
├── .env.example                      # Environment file example
└── README.md                         # Project documentation
- MONGO_USER: MongoDB username
- MONGO_PASSWD: MongoDB password
- MONGO_HOST: MongoDB host address
- QDRANT_URL: Qdrant URL
- EMBEDDING_MODEL_PATH: Model path for generating text vectors
- EMBEDDING_DIM: Vector dimension (must match the output dimension of the embedding model)
- OPENAI_API_KEY: OpenAI API key
- OPENAI_API_BASE: OpenAI API base URL (Optional)
- LLM_MODEL: LLM model name

- MONGO_REPLSET: MongoDB replica set name (if using replica set)
- QDRANT_PORT: Qdrant port number (default: 6333)
- MAX_RECENT_HISTORY: Number of recent conversations kept in main table (default: 20)
- CHUNK_SIZE: Number of conversations stored in each chunk (default: 100)
- MAX_CONCURRENT_ANALYSIS: Maximum number of concurrent analysis tasks (default: 3)
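The split between required values and defaulted values can be sketched as a small loader. This is a hedged illustration only: `MemorySettings` and `load_settings` are hypothetical names covering a subset of the variables above, and the real `Config.from_env()` may validate different keys or behave differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed-required subset, chosen for illustration.
REQUIRED_KEYS = ("MONGO_HOST", "QDRANT_URL", "EMBEDDING_DIM")


@dataclass(frozen=True)
class MemorySettings:
    mongo_host: str
    qdrant_url: str
    embedding_dim: int
    max_recent_history: int = 20
    chunk_size: int = 100
    max_concurrent_analysis: int = 3


def load_settings(env: Optional[Mapping[str, str]] = None) -> MemorySettings:
    """Build settings from `env` (defaults to os.environ), applying documented defaults."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")
    return MemorySettings(
        mongo_host=env["MONGO_HOST"],
        qdrant_url=env["QDRANT_URL"],
        embedding_dim=int(env["EMBEDDING_DIM"]),
        max_recent_history=int(env.get("MAX_RECENT_HISTORY", 20)),
        chunk_size=int(env.get("CHUNK_SIZE", 100)),
        max_concurrent_analysis=int(env.get("MAX_CONCURRENT_ANALYSIS", 3)),
    )


# Passing a plain dict keeps the example independent of the real environment.
settings = load_settings({
    "MONGO_HOST": "localhost:27017",
    "QDRANT_URL": "http://127.0.0.1:6333",
    "EMBEDDING_DIM": "768",
})
print(settings)
```

Failing fast on missing required values surfaces configuration mistakes at startup instead of at the first database or embedding call.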
Issues and Pull Requests are welcome!
MIT License