
Agent Cloud Optimizer (ACO)

Governance-first, local-first optimization for JVM microservices. ACO combines metric-driven diagnosis, bounded recommendations, audit artifacts, confidence gates, rollback modeling, and benchmark scenarios in one Java codebase.



What ACO Is

ACO is an open-source Java Autonomic Reliability Governance (ARG) reference implementation for experimenting with governed optimization of JVM-backed services. ARG is the discipline of designing bounded, auditable, and reversible automation under explicit constraints — particularly where agents plan and act faster than human-in-the-loop validation can operate.

ACO watches runtime signals such as latency, heap, GC, CPU, and thread behavior; produces structured optimization plans; evaluates those plans against policy and actuation budgets; and validates whether changes helped or hurt.

This is not "YOLO let the LLM touch prod." That would be extremely innovative in the worst possible way.

ACO currently supports:

  • Local LLM analysis through Ollama
  • Deterministic fallback analysis through SimpleAgent
  • OptimizationPlan artifacts for auditability
  • Policy evaluation before actuation
  • Actuation budgets — rate-limiting change magnitude and frequency, not just traffic
  • Blast radius constraints through actuation scoping
  • Confidence gates and progressive autonomy modes (Advisory / Auto-Governed / Denied)
  • Eligibility tiers — action-based permission model separating observational, low-risk, and high-risk actions
  • Validation and rollback modeling
  • Deterministic benchmark scenarios for amplification testing

Why This Project Exists

JVM tuning is still too often a guessing game:

  • thread pools get bumped because latency is bad
  • heap gets inflated "just to be safe"
  • GC settings drift without a recorded reason
  • optimization decisions are made in war rooms and forgotten a week later

ACO turns that into a more disciplined loop:

  1. Observe runtime behavior
  2. Reason about likely bottlenecks
  3. Plan a structured optimization with evidence and rollback path
  4. Govern that plan with policy, budgets, and autonomy checks
  5. Validate the outcome
  6. Audit — preserve every decision for review and rollback

The point is not raw automation. The point is bounded, explainable optimization.
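The loop above can be sketched in plain Java. Everything here is illustrative shorthand, not ACO's actual classes or thresholds: `GovernedLoopSketch`, its `Metrics` and `Plan` records, and the 250 ms trigger are all hypothetical.

```java
// Illustrative sketch of the observe -> reason -> plan -> govern -> validate -> audit loop.
// Class, record, and threshold names are hypothetical, not ACO's real API.
import java.util.ArrayList;
import java.util.List;

public class GovernedLoopSketch {
    record Metrics(double p99Ms, double heapUsedPct) {}
    record Plan(String change, String rollback) {}

    static final List<String> audit = new ArrayList<>();

    static Plan plan(Metrics m) {
        // Reason about a likely bottleneck and propose one bounded change.
        if (m.p99Ms() > 250) return new Plan("threads+4", "threads-4");
        return null;
    }

    static boolean govern(Plan p) {
        // Policy stand-in: only changes that carry a rollback path may pass.
        return p != null && p.rollback() != null;
    }

    static void cycle(Metrics before, Metrics after) {
        Plan p = plan(before);
        if (p == null) { audit.add("no-op"); return; }
        if (!govern(p)) { audit.add("denied:" + p.change()); return; }
        // Validate: did p99 actually improve? If not, record the rollback.
        if (after.p99Ms() < before.p99Ms()) audit.add("applied:" + p.change());
        else audit.add("rolled-back:" + p.rollback());
    }

    public static void main(String[] args) {
        cycle(new Metrics(480, 70), new Metrics(120, 65)); // improvement: applied
        cycle(new Metrics(480, 70), new Metrics(600, 80)); // regression: rolled back
        System.out.println(audit);
    }
}
```

Note that the audit list is appended on every path, including the no-op and denied ones; that mirrors the "preserve every decision" goal of step 6.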


Architecture

Execution Pipeline

Every optimization cycle flows through a fixed governance pipeline. No step can be skipped.

flowchart TD
    WS[Workload Simulator] --> MC[Metrics Collection]
    MC --> SLO[SLO Detector]
    SLO --> AR[Agent Reasoning]
    AR --> PA[Plan Assembly]
    PA --> PE[Policy Evaluation]
    PE --> AB[Actuation Budget]
    AB --> AG{Autonomy Gate}
    AG -->|Advisory| ADV[Advisory Report]
    AG -->|Auto-Governed| VE[Validation]
    AG -->|Denied| RE[Rollback]
    VE --> AUD[Audit & Report]
    RE --> AUD
    ADV --> AUD

Component Layers

ACO is organized into six layers. Each layer has a single responsibility and depends only on the layers below it.

flowchart TD
    A["Infrastructure — Ollama · Load Runner · Workload Simulator"]
    B["Observation — Metrics Collection · SLO Detection"]
    C["Reasoning — LLM Agent · Simple Agent"]
    D["Planning — Plan Assembler · Optimization Plan · Rollback Recipe"]
    E["Governance — Policy Engine · Budget Ledger · Autonomy Gate"]
    F["Validation & Audit — Validation · Rollback · Report"]

    A --> B --> C --> D --> E --> F

Quick Start

Option 1: Docker with local LLM

git clone https://github.com/sibasispadhi/agentic-cloud-optimizer.git
cd agentic-cloud-optimizer
docker compose up --build

Open:

First run downloads the Ollama model, so yeah, give it a minute.

Option 2: SimpleAgent only (no LLM download)

git clone https://github.com/sibasispadhi/agentic-cloud-optimizer.git
cd agentic-cloud-optimizer
docker compose -f docker-compose.simple.yml up --build

Open:

Option 3: Local development

# Terminal 1
ollama serve
ollama pull llama3.2:3b

# Terminal 2
mvn spring-boot:run

Then open:

For a more guided local setup, use docs/START_HERE.md.


Core Workflow

The Architecture diagram above shows the full pipeline. Each stage maps to a layer:

| Pipeline stage | Layer | Detail |
| --- | --- | --- |
| Workload Simulator | Infrastructure | executes load; drives metric signals |
| Metrics Collection | Observation | latency, throughput, heap, CPU, GC, threads |
| SLO Detector | Observation | triggers optimization when Service Level Objective (SLO) thresholds are breached |
| Agent Reasoning | Reasoning | SpringAiLlmAgent or SimpleAgent |
| Plan Assembly | Planning | OptimizationPlan with evidence and rollback path |
| Policy Evaluation | Governance | PolicyEngine approves, warns, or denies |
| Actuation Budget | Governance | ActuationBudgetLedger enforces change limits |
| Autonomy Gate | Governance | Advisory / Auto-Governed / Denied |
| Validation | Validation & Audit | confirms improvement; triggers rollback if not |
| Audit & Report | Validation & Audit | persists every decision for review |
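In sketch form, the SLO Detector stage reduces to a threshold check over collected metrics. The budgets and method names below are illustrative, not ACO's actual configuration keys.

```java
// Minimal SLO-breach check in the spirit of the SLO Detector stage.
// Threshold values and the method name are illustrative, not ACO's config.
public class SloCheckSketch {
    static boolean breached(double p99Ms, double p99BudgetMs,
                            double heapUsedPct, double heapBudgetPct) {
        // An optimization cycle is triggered when any objective is exceeded.
        return p99Ms > p99BudgetMs || heapUsedPct > heapBudgetPct;
    }

    public static void main(String[] args) {
        System.out.println(breached(480, 250, 60, 85)); // latency SLO breached
        System.out.println(breached(120, 250, 60, 85)); // within budget
    }
}
```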

Implemented Phases

Phase 0 — Baseline optimizer stabilization

  • local LLM-backed reasoning with Ollama
  • deterministic fallback agent
  • externalized thresholds in configuration
  • Docker startup flows

Phase 1 — OptimizationPlan artifact

Implemented in src/main/java/com/cloudoptimizer/agent/artifact/

Key outputs:

  • OptimizationPlan
  • PlanChange
  • PlanEvidence
  • ValidationRecipe
  • RollbackRecipe
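A plausible shape for these artifacts, using records: a plan bundles its changes with the evidence that motivated them and a rollback recipe, which is what makes it auditable. Field names and the `server.tomcat.threads.max` example below are illustrative, not ACO's actual schema.

```java
// Hypothetical shape of the plan artifact: a change plus the evidence and
// rollback recipe that make it auditable. Field names are illustrative.
import java.util.List;

public class PlanArtifactSketch {
    record PlanChange(String parameter, String from, String to) {}
    record PlanEvidence(String metric, double observed, double threshold) {}
    record OptimizationPlan(String id, List<PlanChange> changes,
                            List<PlanEvidence> evidence, String rollbackRecipe) {}

    public static void main(String[] args) {
        var plan = new OptimizationPlan(
            "plan-001",
            List.of(new PlanChange("server.tomcat.threads.max", "50", "64")),
            List.of(new PlanEvidence("p99_latency_ms", 480, 250)),
            "restore server.tomcat.threads.max=50");
        System.out.println(plan.id() + " changes=" + plan.changes().size());
    }
}
```

Records fit this role well: the artifact is immutable once assembled, and Jackson (already in the stack) can serialize it directly.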

Phase 2 — Policy engine

Implemented in src/main/java/com/cloudoptimizer/agent/policy/

Key outputs:

  • PolicyEngine
  • DefaultPolicyEngine
  • ActuationPolicy
  • PolicyDecision
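The approve / warn / deny decision can be sketched as a bound check on change magnitude. The thresholds and names here are hypothetical; ACO's real rules live in the policy package above.

```java
// Sketch of a policy check that approves, warns, or denies a proposed change.
// Bounds and names are illustrative, not ACO's actual policy rules.
public class PolicySketch {
    enum Decision { APPROVE, WARN, DENY }

    static Decision evaluate(int deltaThreads, int maxDeltaThreads) {
        int magnitude = Math.abs(deltaThreads);
        if (magnitude > maxDeltaThreads) return Decision.DENY;      // over policy bound
        if (magnitude > maxDeltaThreads / 2) return Decision.WARN;  // large but allowed
        return Decision.APPROVE;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(4, 16));   // APPROVE
        System.out.println(evaluate(12, 16));  // WARN
        System.out.println(evaluate(32, 16));  // DENY
    }
}
```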

Phase 3 — Actuation budgets

Implemented in src/main/java/com/cloudoptimizer/agent/budget/

Key outputs:

  • ActuationBudget
  • ActuationBudgetLedger
  • BudgetConsumption
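The ledger idea can be sketched as a budget that rate-limits both how many changes happen and how large they are in total, which is the "magnitude and frequency, not just traffic" point from the feature list. The class below is a hypothetical stand-in, not ACO's ActuationBudgetLedger.

```java
// Sketch of an actuation budget ledger: each window allows a bounded number
// of changes and a bounded total magnitude. Names and units are illustrative.
public class BudgetLedgerSketch {
    private int changesLeft;
    private double magnitudeLeft;

    BudgetLedgerSketch(int maxChanges, double maxMagnitude) {
        this.changesLeft = maxChanges;
        this.magnitudeLeft = maxMagnitude;
    }

    // Returns true and consumes budget if the change fits; false otherwise.
    boolean tryConsume(double magnitude) {
        if (changesLeft <= 0 || magnitude > magnitudeLeft) return false;
        changesLeft--;
        magnitudeLeft -= magnitude;
        return true;
    }

    public static void main(String[] args) {
        var ledger = new BudgetLedgerSketch(2, 10.0);
        System.out.println(ledger.tryConsume(4.0)); // true
        System.out.println(ledger.tryConsume(8.0)); // false: magnitude exhausted
        System.out.println(ledger.tryConsume(5.0)); // true
        System.out.println(ledger.tryConsume(0.5)); // false: change count exhausted
    }
}
```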

Phase 4 — Confidence gates and progressive autonomy

Implemented in src/main/java/com/cloudoptimizer/agent/autonomy/

Key outputs:

  • AutonomyGate
  • AutonomyMode
  • AutonomyGateResult
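The gate maps the agent's confidence to one of the three autonomy modes. The 0.85 and 0.50 cut-offs below are invented for illustration; ACO's actual thresholds live in the autonomy package.

```java
// Sketch of a confidence gate mapping agent confidence to an autonomy mode.
// Thresholds are illustrative, not ACO's actual configuration.
public class AutonomyGateSketch {
    enum Mode { AUTO_GOVERNED, ADVISORY, DENIED }

    static Mode gate(double confidence) {
        if (confidence >= 0.85) return Mode.AUTO_GOVERNED; // act, then validate
        if (confidence >= 0.50) return Mode.ADVISORY;      // report only
        return Mode.DENIED;                                // too uncertain to act
    }

    public static void main(String[] args) {
        System.out.println(gate(0.92)); // AUTO_GOVERNED
        System.out.println(gate(0.60)); // ADVISORY
        System.out.println(gate(0.30)); // DENIED
    }
}
```

This is what makes autonomy progressive: low confidence degrades to a report or a denial rather than a smaller action.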

Phase 5 — Validation and rollback modeling

Implemented in the artifact and service layers

Key outputs:

  • ValidationExecutor
  • RollbackExecutor
  • ValidationResult
  • RollbackResult
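In essence, validation compares before/after metrics against a minimum-improvement bar and decides whether the change sticks or the rollback recipe fires. The 10% bar and method name below are illustrative, not what ValidationExecutor actually does.

```java
// Sketch of outcome validation: compare before/after p99 and decide whether
// to keep the change or roll it back. The improvement bar is illustrative.
public class ValidationSketch {
    // A change must improve p99 by at least minImprovement (a fraction) to stick.
    static String validate(double p99Before, double p99After, double minImprovement) {
        double improvement = (p99Before - p99After) / p99Before;
        return improvement >= minImprovement ? "KEEP" : "ROLLBACK";
    }

    public static void main(String[] args) {
        System.out.println(validate(480, 120, 0.10)); // KEEP: 75% improvement
        System.out.println(validate(480, 470, 0.10)); // ROLLBACK: ~2% only
    }
}
```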

Phase 6 — Benchmark scenarios

Implemented in src/main/java/com/cloudoptimizer/agent/benchmark/

Included scenarios:

  • retry storm
  • thread saturation
  • CPU throttling
  • heap overprovisioning
  • burst traffic

Benchmark / Amplification Evidence

ACO includes deterministic benchmark scenarios so governance claims can be tested without needing production data.

Example outcomes from the benchmark layer:

  • Retry storm: 3× amplification factor
  • Naive latency-reactive agent: p99 worsened by 58%
  • Governed agent: p99 recovered by 75% (480 ms → 120 ms); throughput restored from 40 RPS to 78 RPS

That is the whole point of the project: useful automation should dampen instability, not cosplay as a chaos monkey.
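As a back-of-envelope check on how a retry storm reaches roughly 3x load, consider the geometric model where a request with failure probability f and up to r retries generates about 1 + f + f^2 + ... + f^r attempts. This model and its inputs are illustrative, not the benchmark's exact mechanics.

```java
// Back-of-envelope retry amplification: with failure rate f and up to r
// retries, each request generates roughly sum over k of f^k attempts.
// This is a simplified model, not the benchmark's exact implementation.
public class RetryAmplificationSketch {
    static double amplification(double failureRate, int maxRetries) {
        double attempts = 0, p = 1;
        for (int k = 0; k <= maxRetries; k++) {
            attempts += p;      // expected attempts at retry depth k
            p *= failureRate;   // probability of reaching the next depth
        }
        return attempts;
    }

    public static void main(String[] args) {
        // A ~67% failure rate with aggressive retries roughly triples the load.
        System.out.printf("%.2f%n", amplification(2.0 / 3, 10));
    }
}
```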


Generated Outputs

ACO writes artifacts and reports under artifacts/.

Typical outputs include:

  • baseline.json
  • after.json
  • report.json
  • optimization-plan artifacts
  • reasoning traces
  • validation and rollback records

You can also inspect example outputs in examples/README.md.


Run Commands

Build

mvn clean package -DskipTests

CLI mode

./scripts/run-agent.sh
# Windows: scripts\run-agent.bat

Live web UI

./scripts/run-web-ui.sh
# Windows: scripts\run-web-ui.bat

Verify Ollama

./scripts/verify-ollama.sh

Tech Stack

  • Java 21
  • Spring Boot 3.2
  • Spring AI
  • Ollama for local LLM inference
  • Jackson for artifact serialization
  • JUnit for test coverage
  • Docker / Docker Compose for local execution

Public Docs


What Is Not Implemented Yet

These are roadmap items, not shipped capabilities:

  • GitOps PR generation
  • OpenSLO ingestion
  • external telemetry adapters beyond the current local flow
  • broader multi-service / multi-region rollout controls

If it is not in code and tested, it does not get to sit in the README pretending it exists.


License

MIT — see LICENSE.
