---
title: Code Analysis RL Environment
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
---
A comprehensive Reinforcement Learning (RL) environment for analyzing code repositories, identifying issues, and generating meaningful fixes. Built as part of the OpenEnv Hackathon, this system uses LLM-powered agents to evaluate code quality and suggest improvements through a multi-step interactive process.
Purpose: Train and evaluate AI agents on their ability to:
- Detect code issues and vulnerabilities
- Suggest meaningful and actionable fixes
- Improve performance over multiple interaction steps
- Work with diverse programming languages and project types
- **Custom RL Environment** (`CodeAnalysisEnv`) - Gymnasium-compatible environment with observation/action spaces
- **Intelligent Reward System** - Semantic similarity-based evaluation using difflib matching
- **Multi-Step Interaction** - Max 3 steps per episode with cumulative reward tracking
- **LLM Integration** - Multiple agent types (analyzer, issue generator, fix generator, explanation agents)
- **Full Docker Support** - Containerized deployment with all dependencies
- **OpenEnv Compatible** - Standardized interface via `openenv.yml` configuration
- **Vector Database** - ChromaDB for semantic code analysis and retrieval
- **Web Interface** - React/Next.js frontend with real-time feedback
- **Secure API** - FastAPI backend with authentication and database persistence
```
Agentic_AI/
├── rl/                          # PRIMARY: Reinforcement Learning Core
│   ├── env.py                   # CodeAnalysisEnv - Main RL Environment
│   ├── reward.py                # Reward computation with semantic similarity
│   ├── test_env.py              # Environment testing utilities
│   ├── __init__.py              # Package initialization
│   ├── data/
│   │   └── jobs_data.json       # Task dataset for training
│   └── tasks/
│       └── tasks.py             # Task configuration and difficulty levels
│
├── backend/                     # Backend API & Services
│   ├── main.py                  # FastAPI application entry point
│   ├── worker.py                # Async job processing worker
│   ├── export_data.py           # Data export utilities
│   ├── requirements.txt         # Backend dependencies
│   ├── jobs_data.json           # Job repository data
│   │
│   ├── agents/                  # LLM-Powered Agents
│   │   ├── analyzer_agents.py         # Code analysis agents
│   │   ├── issue_generator_agent.py   # Issue identification
│   │   ├── fixed_generator_agent.py   # Fix generation
│   │   └── explanation_agent.py       # Solution explanations
│   │
│   ├── api/                     # REST API Endpoints
│   │   └── routes.py            # FastAPI route definitions
│   │
│   ├── services/                # Business Logic Services
│   │   ├── pipeline.py          # Main processing pipeline
│   │   ├── aggregator.py        # Response aggregation
│   │   ├── llm_aggregator.py    # LLM response handling
│   │   ├── github_service.py    # GitHub integration
│   │   └── vector_store.py      # Vector database operations
│   │
│   ├── config/                  # Configuration Management
│   │   ├── settings.py          # Application settings
│   │   ├── database.py          # Database configuration
│   │   ├── redis_client.py      # Redis cache setup
│   │   └── __init__.py
│   │
│   ├── models/                  # Data Models
│   │   ├── db_models.py         # SQLAlchemy models
│   │   ├── schemas.py           # Pydantic schemas
│   │   └── __init__.py
│   │
│   ├── repos/                   # Repository Storage
│   │   └── [Multiple project folders with source code]
│   │
│   ├── chroma_db/               # Vector Database Storage
│   │   ├── chroma.sqlite3
│   │   └── [Embedded knowledge]
│   │
│   └── utils/                   # Utility Functions
│       ├── auth.py              # Authentication utilities
│       ├── llm.py               # LLM integration helpers
│       └── __init__.py
│
├── frontend/                    # React/Next.js Web Interface
│   ├── app/
│   │   ├── page.tsx             # Main page component
│   │   ├── layout.tsx           # Root layout
│   │   ├── globals.css          # Global styles
│   │   └── HexGrid.tsx          # Interactive hex grid visualization
│   │
│   ├── components/              # Reusable React components
│   │   ├── GlowButton.tsx
│   │   └── HexGrid.tsx
│   │
│   ├── public/                  # Static assets
│   ├── package.json             # Frontend dependencies
│   ├── next.config.ts           # Next.js configuration
│   ├── tsconfig.json            # TypeScript configuration
│   └── README.md                # Frontend documentation
│
├── inference.py                 # Main inference entry point
├── openenv.yml                  # OpenEnv environment spec
├── Dockerfile                   # Docker container definition
├── requirements.txt             # Root dependencies
└── README.md                    # This file
```
The environment provides structured observations with the following components:

```python
{
    "task_id": str,              # Unique task identifier
    "repository_id": str,        # Repository being analyzed
    "problem_description": str,  # Issue description
    "code_input": str,           # Code snippet to analyze
    "difficulty_level": str,     # "easy" | "medium" | "hard"
    "steps_taken": int,          # Current step count (0-3)
    "previous_issues": list,     # Previously identified issues
    "previous_fixes": list,      # Previously suggested fixes
}
```

Agents must return structured JSON with identified issues and fixes:
```json
{
    "identified_issues": [
        "issue_1_description",
        "issue_2_description"
    ],
    "suggested_fixes": [
        "fix_1_description",
        "fix_2_description"
    ]
}
```

Rewards are computed using semantic similarity matching:
```
Issue Score  = (sum of best matches to expected issues) / (number of expected issues)
Fix Score    = (sum of best matches to expected fixes)  / (number of expected fixes)
Total Reward = (0.5 * Issue Score) + (0.5 * Fix Score)
Range: [0.0, 1.0]
```
The `compute_reward()` function in `rl/reward.py` implements:
- Token-level similarity using difflib
- Best-match selection per expected output
- Normalization for variable-length outputs
- Support for both string and list inputs
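A minimal sketch of this scoring scheme, using `difflib.SequenceMatcher` as described above (the real implementation lives in `rl/reward.py`; the exact tokenization and function names here are assumptions):

```python
from difflib import SequenceMatcher

def best_match(expected_item: str, candidates: list[str]) -> float:
    # Similarity of one expected output against the closest candidate
    return max((SequenceMatcher(None, expected_item, c).ratio() for c in candidates),
               default=0.0)

def sketch_reward(task: dict, action: dict) -> float:
    # Average the best match per expected issue/fix, then weight 50/50
    issue_score = sum(best_match(e, action["identified_issues"])
                      for e in task["expected_issues"]) / len(task["expected_issues"])
    fix_score = sum(best_match(e, action["suggested_fixes"])
                    for e in task["expected_fixes"]) / len(task["expected_fixes"])
    return 0.5 * issue_score + 0.5 * fix_score

task = {"expected_issues": ["null pointer"], "expected_fixes": ["add null check"]}
action = {"identified_issues": ["null pointer"], "suggested_fixes": ["add null check"]}
print(sketch_reward(task, action))  # identical strings score 1.0
```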
1. **Initialization**: Load tasks from `rl/data/jobs_data.json`
2. **Reset**: Select a random task and return the initial observation
3. **Step**: Agent submits an action → reward computed → next observation returned
4. **Done**: Episode terminates after 3 steps or when a perfect score is reached
5. **Evaluation**: Total episode reward accumulated across all steps
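Putting the flow together, a minimal episode loop might look like this (`FakeEnv` and the fixed 0.5 reward are illustrative stand-ins for `CodeAnalysisEnv` and a real agent policy):

```python
# Sketch of one training episode; FakeEnv stands in for CodeAnalysisEnv
class FakeEnv:
    def reset(self):
        self.steps = 0
        return {"code_input": "...", "steps_taken": 0}

    def step(self, action):
        self.steps += 1
        done = self.steps >= 3  # max 3 steps per episode
        return {"steps_taken": self.steps}, 0.5, done, {}

def run_episode(env, agent_policy, max_steps=3):
    observation = env.reset()
    total_reward, done, steps = 0.0, False, 0
    while not done and steps < max_steps:
        # agent_policy returns identified_issues / suggested_fixes
        action = agent_policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward

total = run_episode(FakeEnv(), lambda obs: {"identified_issues": [], "suggested_fixes": []})
print(total)  # 1.5 over three 0.5-reward steps
```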
- Python 3.9+
- Docker & Docker Compose (for containerized deployment)
- Redis (for caching, optional)
- PostgreSQL or SQLite (for data persistence)
- OpenAI API key (for LLM access)
```bash
# Clone the repository
git clone <repository-url>
cd Agentic_AI

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
pip install -r backend/requirements.txt

# Setup environment variables
cp .env.example .env
# Edit .env with your API keys and database settings

# Initialize database
python backend/main.py
```

```bash
# Build and run with Docker
docker build -t code-analysis-rl .
docker run -p 8000:8000 -p 3000:3000 code-analysis-rl
```

```python
from rl.env import CodeAnalysisEnv
from rl.reward import compute_reward

# Initialize environment
env = CodeAnalysisEnv(data_path="rl/data/jobs_data.json")

# Reset for new episode
observation = env.reset()

# Agent takes action
action = {
    "identified_issues": ["memory leak", "null pointer"],
    "suggested_fixes": ["use context manager", "add null check"]
}

# Step through environment
observation, reward, done, info = env.step(action)
print(f"Reward: {reward:.2f}")
print(f"Episode Complete: {done}")
```

```bash
cd rl/
python test_env.py
```

```bash
cd backend/
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

API endpoints:

- `GET /` - Health check
- `POST /api/analyze` - Analyze code
- `GET /api/tasks` - Fetch available tasks
- `POST /api/submit` - Submit a solution
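As a sketch, a client request to the analysis endpoint could be built like this with the standard library (the `code_input` field name and bearer-token auth are assumptions based on the action schema and auth sections of this README):

```python
import json
from urllib import request

def build_analyze_request(base_url: str, code: str, token: str) -> request.Request:
    # POST /api/analyze with a JSON body and bearer auth
    body = json.dumps({"code_input": code}).encode()
    return request.Request(
        f"{base_url}/api/analyze",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
        method="POST",
    )

req = build_analyze_request("http://localhost:8000", "def f(x):\n    return x", "token123")
print(req.full_url)  # http://localhost:8000/api/analyze
```

Send it with `urllib.request.urlopen(req)` once the backend is running.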
```bash
cd frontend/
npm install
npm run dev
```

The frontend will be available at http://localhost:3000.

```bash
python inference.py --task_id <task_id> --max_steps 3
```

**Analyzer Agents** (`backend/agents/analyzer_agents.py`):

- Perform static code analysis
- Detect potential issues and vulnerabilities
- Generate AST-based insights
- Support multiple programming languages
**Issue Generator Agent** (`backend/agents/issue_generator_agent.py`):

- Identifies specific code problems
- Categorizes issues by severity (low, medium, high, critical)
- Provides context-aware descriptions
- Uses pattern matching and heuristics
**Fix Generator Agent** (`backend/agents/fixed_generator_agent.py`):

- Generates concrete, implementable fixes
- Prioritizes fixes by impact and feasibility
- Includes code examples and explanations
- Provides refactoring suggestions
**Explanation Agent** (`backend/agents/explanation_agent.py`):

- Explains identified issues in plain language
- Provides best-practice recommendations
- Generates educational content
- Links to relevant documentation
Edit `rl/tasks/tasks.py`:

```python
def get_tasks():
    return [
        {
            "id": "task_1",
            "difficulty": "easy",
            "expected_issues": ["issue1", "issue2"],
            "expected_fixes": ["fix1", "fix2"]
        }
    ]
```

Edit the `compute_reward()` function in `rl/reward.py` to change the scoring logic:
```python
def compute_reward(task, action, config=None):
    # Customize reward calculation
    issue_score = calculate_issue_similarity(...)
    fix_score = calculate_fix_similarity(...)
    return (0.5 * issue_score) + (0.5 * fix_score)
```

To add a new agent:

- Create a new agent file in `backend/agents/`
- Implement the agent class with the standard interface
- Register it in `backend/services/pipeline.py`
- Add it to the agent initialization
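As a hypothetical sketch of such an agent (the class name, `run` method, and return shape here are assumptions; mirror the existing files in `backend/agents/` for the actual interface):

```python
class StyleCheckAgent:
    """Hypothetical new agent; returns the structured output the reward expects."""
    name = "style_checker"

    def run(self, code_input: str) -> dict:
        # Toy heuristic: flag lines longer than 99 characters
        issues = ["long line"] if any(len(line) > 99
                                      for line in code_input.splitlines()) else []
        return {
            "identified_issues": issues,
            "suggested_fixes": ["wrap long lines"] if issues else [],
        }

agent = StyleCheckAgent()
result = agent.run("x = 1\n" + "y = " + "1 + " * 40 + "1\n")
print(result["identified_issues"])  # ['long line']
```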
The environment tracks:
| Metric | Description |
|---|---|
| Issue Precision | % of identified issues that are correct |
| Issue Recall | % of expected issues that were found |
| Fix Quality | Semantic similarity to expected fixes |
| Episode Reward | Cumulative reward across all steps |
| Success Rate | % of episodes reaching perfect score |
| Efficiency | Average steps to solve per episode |
| Step Improvement | Reward increase across steps |
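For intuition, issue precision and recall can be computed over sets of issue labels as below (a simplified sketch using exact matching, whereas the environment itself scores by semantic similarity):

```python
def precision_recall(identified: set[str], expected: set[str]) -> tuple[float, float]:
    # Precision: fraction of identified issues that were expected
    # Recall: fraction of expected issues that were identified
    if not identified or not expected:
        return 0.0, 0.0
    hits = len(identified & expected)
    return hits / len(identified), hits / len(expected)

p, r = precision_recall({"null pointer", "sql injection"}, {"null pointer"})
print(p, r)  # 0.5 1.0
```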
The project includes a multi-stage Docker setup:

```bash
# Build image
docker build -t code-analysis-rl:latest .

# Run with environment variables
docker run \
  -e OPENAI_API_KEY=your_key \
  -e DATABASE_URL=postgresql://user:pass@db:5432/code_analysis \
  -p 8000:8000 \
  -p 3000:3000 \
  code-analysis-rl:latest

# Run with docker-compose
docker-compose up -d
```

Core dependencies (`requirements.txt`):

- `gymnasium>=0.27.0` - RL environment framework
- `openai>=1.0.0` - LLM integration

Backend dependencies (`backend/requirements.txt`):

- `fastapi==0.115.0` - Web framework
- `uvicorn==0.30.6` - ASGI server
- `sqlalchemy==2.0.32` - ORM
- `psycopg2-binary==2.9.9` - PostgreSQL adapter
- `chromadb==0.5.5` - Vector database
- `sentence-transformers==3.0.1` - Embeddings
- `redis==5.0.8` - Caching layer
- `gitpython==3.1.43` - Git operations
- `pydantic==2.8.2` - Data validation

Frontend dependencies:

- `next.js` - React framework
- `typescript` - Type safety
- `tailwindcss` - Styling (if configured)
See `requirements.txt` and `backend/requirements.txt` for complete dependency lists.
The backend uses JWT token-based authentication:

```
# Login endpoint returns a token
POST /api/auth/login
{
    "username": "user",
    "password": "pass"
}

# Use the token in the Authorization header
Authorization: Bearer <token>
```

Implemented in `backend/utils/auth.py` using:

- `passlib` for password hashing
- `python-jose` for JWT tokens
- Configurable token expiration
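For intuition, the HS256 signing that `python-jose` performs can be sketched with the standard library alone (illustrative only; use `python-jose` in production code):

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    # JWT uses unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_jwt(payload: dict, secret: str) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = f"{b64url(json.dumps(header).encode())}.{b64url(json.dumps(payload).encode())}"
    sig = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"

token = make_jwt({"sub": "user", "exp": int(time.time()) + 1800}, "change-me")
print(token.count("."))  # a JWT has three dot-separated segments, so this prints 2
```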
```bash
# LLM Configuration
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4

# Database
DATABASE_URL=postgresql://user:password@localhost/code_analysis
REDIS_URL=redis://localhost:6379/0

# Archive & Storage
GITHUB_TOKEN=ghp_...
CHROMA_PERSIST_DIR=./backend/chroma_db

# Application
DEBUG=false
LOG_LEVEL=INFO
MAX_STEPS=3
ENVIRONMENT=production
```

Performance tips:

- **Batch Processing**: Use `backend/worker.py` for async code analysis
- **Caching**: Enable Redis in `backend/config/redis_client.py` for frequent queries
- **Vector Search**: Pre-compute embeddings for repositories using ChromaDB
- Parallel Agents: Run multiple agents concurrently with async/await
- Model Optimization: Use quantized models for faster inference
- Database Indexing: Add indexes to frequently queried columns
- Request Pooling: Batch multiple analysis requests
```bash
cd rl/
python -m pytest test_env.py -v
python test_env.py  # Quick test
```

```python
from rl.reward import compute_reward

task = {
    "expected_issues": ["null pointer"],
    "expected_fixes": ["add null check"]
}
action = {
    "identified_issues": ["null pointer dereference"],
    "suggested_fixes": ["add null check before use"]
}

reward = compute_reward(task, action)
assert 0 <= reward <= 1
```

To contribute:

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
```bash
# Install dev dependencies
pip install -r requirements.txt pytest pytest-cov black flake8

# Run tests
pytest rl/test_env.py -v

# Format Python code
black backend/ rl/

# Lint
flake8 backend/ rl/
```

Code style:

- Python: PEP 8 (enforced by `black`)
- TypeScript: ESLint configuration in `frontend/eslint.config.mjs`
- Commit messages: Conventional Commits format
- Framework: Gymnasium RL + FastAPI
- LLM Provider: OpenAI GPT-4
- Vector DB: ChromaDB with Chroma SQLite
- Frontend: Next.js + React + TypeScript
- Backend: FastAPI + SQLAlchemy + Pydantic
- Deployment: Docker + OpenEnv
- License: MIT
- OpenEnv Documentation: See `openenv.yml` for the environment specification
- Frontend Details: See `frontend/README.md` for UI/UX information
- Agent Customization: Check individual agent files for configuration options
- Database Schema: Refer to `backend/models/db_models.py` for the data structure
- Documentation: See `frontend/README.md` for UI details
- Bug Reports: Open an issue with reproduction steps
- Feature Requests: Submit via discussions with use cases
- Questions: Check the documentation first, then create an issue
This project is licensed under the MIT License - see LICENSE file for details.
Built for the OpenEnv Hackathon
Last Updated: April 2026