sdcalmes/java-codebase-onboarding-agent

Codebase Onboarding System

An AI-assisted codebase onboarding system for rapidly understanding large Java codebases (especially monoliths).

🎯 Purpose

Joining a team with a large, established codebase tends to raise the same problems:

  • There's no systematic approach to learning the code
  • Tribal knowledge is scattered or undocumented
  • It's hard to know what to explore first
  • Progress isn't tracked, leading to repeated exploration
  • Documentation generated during onboarding is often lost

This plugin solves these problems by providing structured guidance, automated analysis, and progress tracking.

✨ Features

  • Automated Analysis Scripts - Bash scripts that analyze Java codebases
  • AI Agents - Specialized agents for different exploration tasks
  • Structured Methodology - A six-phase onboarding process
  • Progress Tracking - YAML-based progress file
  • Documentation Templates - Standardized output formats
  • Cross-Platform - Works with Claude Code AND GitHub Copilot

📦 Installation

Claude Code

# From local path
/plugin install /path/to/codebase-onboarding

# From GitHub (once published)
/plugin install github:username/codebase-onboarding

GitHub Copilot

# Generate Copilot-compatible version
./scripts/build-copilot.sh

# Option 1: Copy everything to target repo (recommended)
cp -r dist/copilot/.github /path/to/your/repo/

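# Option 2: Global agents + per-repo skills
# Agents can be installed globally (available in all repos).
# Copy into ~/.copilot/ so the result is ~/.copilot/agents rather than a nested agents/agents:
mkdir -p ~/.copilot
cp -r dist/copilot/.github/agents ~/.copilot/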

# Skills MUST be copied to each repo you want to onboard:
cp -r dist/copilot/.github/skills /path/to/your/repo/.github/

Note: Skills contain the bash scripts and templates that agents use. They must exist in the repository's .github/skills/ directory for the agents to access them.
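
As a quick sanity check after copying, the sketch below tests whether a repository has a non-empty .github/skills/ directory. The function name has_repo_skills is hypothetical and is not part of this plugin; it only illustrates the requirement in the note above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the plugin): succeeds only if the
# repository contains a non-empty .github/skills/ directory.
has_repo_skills() {
  local repo="${1:-.}"   # default to the current directory
  [ -d "$repo/.github/skills" ] && [ -n "$(ls -A "$repo/.github/skills")" ]
}
```

For example, `has_repo_skills /path/to/your/repo && echo "skills installed"` prints a confirmation only when the directory exists and contains at least one entry.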

🚀 Quick Start

Start Onboarding

Claude Code:

/project:onboard

GitHub Copilot: Select onboarding-orchestrator from the agent dropdown, then:

start onboarding

This will guide you through the six-phase onboarding process:

  1. Setup - Build & run locally
  2. Cartography - High-level architecture map
  3. Domain - Business concepts & glossary
  4. Modules - Deep-dive key modules
  5. Cross-cutting - Auth, logging, config patterns
  6. Contribute - Ship first change

Explore a Codebase

Claude Code:

/project:explore              # Explore current directory
/project:explore ./submodule  # Explore specific path
/project:explore --quick      # Fast summary only

GitHub Copilot: Select codebase-cartographer from the agent dropdown, then:

map this codebase
explore ./submodule

Deep-Dive into a Module

Claude Code:

/project:dive auth                    # Dive into auth module
/project:dive com.example.payments    # Dive into package
/project:dive OrderService            # Dive into specific class
/project:dive User --vertical         # Trace entity from API to DB

GitHub Copilot: Select module-archaeologist from the agent dropdown, then:

dig into the auth module
trace User entity from API to database

📁 Generated Documentation

After onboarding completes, you'll have:

In the repository (shareable, PR-able):

docs/architecture/
├── overview.md           # High-level architecture
├── setup-notes.md        # Build & run instructions
├── domain-glossary.md    # Business terms & entities
├── cross-cutting.md      # Auth, logging, etc.
└── modules/
    ├── auth.md           # Per-module documentation
    ├── payments.md
    └── ...

Personal notes (local):

~/.claude/onboarding/{project}/
├── journal.md            # Learning log
├── questions.md          # Things to ask
└── scratch/              # Working notes
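
To see which of the repository-side documents listed above are still missing, something like the following could be used. It is a minimal sketch: the function name missing_onboarding_docs is hypothetical, and it checks only the four top-level files under docs/architecture/, not the per-module files.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the plugin): prints the name of each
# expected top-level architecture document that does not exist yet.
missing_onboarding_docs() {
  local repo="${1:-.}" name
  for name in overview.md setup-notes.md domain-glossary.md cross-cutting.md; do
    [ -f "$repo/docs/architecture/$name" ] || echo "$name"
  done
}
```

Empty output means all four files are present.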

🤖 Agents

| Agent | Purpose | Model |
| --- | --- | --- |
| onboarding-orchestrator | Drives the six-phase process, tracks progress | Opus |
| codebase-cartographer | High-level mapping and orientation | Sonnet |
| module-archaeologist | Deep-dive into specific modules | Sonnet |

🛠️ Analysis Scripts

| Script | Purpose |
| --- | --- |
| analyze-structure.sh | Module overview, LOC, tech detection |
| find-entrypoints.sh | REST controllers, main classes, consumers |
| dep-graph.sh | Maven/Gradle dependency analysis |
| hot-files.sh | Most-imported, largest, most-changed files |
| find-god-classes.sh | Oversized class detection |
| find-circular-deps.sh | Package cycle detection |
| map-vertical-slice.sh | API → Service → Repo → Entity tracing |
| find-seams.sh | Module boundary identification |
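
To give a feel for the kind of analysis involved, here is a minimal sketch of oversized-file detection by line count. It is not the implementation of find-god-classes.sh, which may use different criteria; the function name list_large_java_files and the 500-line default threshold are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: list Java files longer than a threshold, largest first.
# Output format: <line count><TAB><path>
list_large_java_files() {
  local repo="${1:-.}" threshold="${2:-500}" file lines
  find "$repo" -type f -name '*.java' -print0 |
    while IFS= read -r -d '' file; do
      # BSD wc pads its output with spaces, so strip them before comparing.
      lines=$(wc -l < "$file" | tr -d ' ')
      if [ "$lines" -gt "$threshold" ]; then
        printf '%d\t%s\n' "$lines" "$file"
      fi
    done | sort -rn
}
```

Line count is only a rough proxy for a god class; method count or number of dependencies would be a reasonable refinement.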

All scripts:

  • Written in Bash (zero dependencies)
  • Work on Mac and Linux
  • Auto-detect Maven vs Gradle
  • Accept repo path as argument (default: current directory)
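
The last two conventions can be sketched as follows. This is an illustration under assumptions, not code taken from the scripts: the function name detect_build_tool is hypothetical, and the detection simply looks for the standard build files (pom.xml for Maven, build.gradle or build.gradle.kts for Gradle) in the repository root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report the build tool for a repository path.
# The path is optional and defaults to the current directory.
detect_build_tool() {
  local repo="${1:-.}"
  if [ -f "$repo/pom.xml" ]; then
    echo "maven"
  elif [ -f "$repo/build.gradle" ] || [ -f "$repo/build.gradle.kts" ]; then
    echo "gradle"
  else
    echo "unknown"
  fi
}
```

A script following this pattern could branch on the result, for example `case "$(detect_build_tool "$1")" in maven) ... ;; gradle) ... ;; esac`.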

📋 Project Context

Add project-specific context that the orchestrator reads:

Location: .claude/onboarding-context.md

# Project Context

## Business Domain
This is a [describe what the product does]

## Key Stakeholders
- Product: @person
- Architecture: @person

## Important Links
- Confluence: [url]
- Runbooks: [url]

## Known Gotchas
- The "legacy" package is deprecated, ignore it
- Module X is being rewritten, ask @person

📊 Progress Tracking

Progress is stored in .claude/onboarding-progress.yml:

project: my-project
started: 2025-01-15
current_phase: 2

phases:
  setup:
    status: complete
    completed: 2025-01-15
  cartography:
    status: in_progress
    started: 2025-01-16
  # ... more phases
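
Because the file uses a simple key: value layout, it can also be read from the shell without a YAML parser. The sketch below assumes exactly the layout shown above (two-space indentation, one phase header per line); the function names current_phase and phase_status are hypothetical and not part of the plugin.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading the progress file with awk.
# They assume the flat layout shown above and are not a general YAML parser.

# Print the value of the top-level current_phase key.
current_phase() {
  local file="${1:-.claude/onboarding-progress.yml}"
  awk -F': *' '$1 == "current_phase" { print $2; exit }' "$file"
}

# Print the status of a named phase, e.g. "setup" or "cartography".
phase_status() {
  local phase="$1" file="${2:-.claude/onboarding-progress.yml}"
  awk -v phase="$phase" '
    $1 == (phase ":") && NF == 1 { in_phase = 1; next }
    in_phase && NF == 1          { exit }              # next phase header reached
    in_phase && $1 == "status:"  { print $2; exit }
  ' "$file"
}
```

With the example file above, `current_phase` prints 2 and `phase_status cartography` prints in_progress.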

Check your progress anytime:

Claude Code:

/project:onboard status

GitHub Copilot: Select onboarding-orchestrator, then:

what's my onboarding status?

🔄 Cross-Platform Support

This plugin works with both Claude Code and GitHub Copilot.

To generate the GitHub Copilot version:

./scripts/build-copilot.sh

This creates dist/copilot/.github/ with Copilot-compatible agents and skills.

📝 License

MIT

🤝 Contributing

Contributions welcome! Please read our contributing guidelines first.


Built with ❤️ for developers joining new teams
