app-classifier

Point it at a repo, get back what it is and how it works — deterministically, with optional LLM lift.

from app_classifier import classify_smart
result = classify_smart("./my-repo")
print(result.app_category, result.app_category_confidence)
print(result.functional_description)

e-commerce 0.95
ShopMax is an e-commerce application. Primary functionality: online shopping, internal admin.
The app routes traffic across 12 HTTP endpoints, including authentication and checkout.

Why it wins

Concern	app-classifier	Manual review	ChatGPT paste	GitHub Copilot Chat
Deterministic baseline	✅ pattern-based fingerprints	—	❌	❌
Works offline	✅ no network needed	✅	❌	❌
Zero runtime deps	✅ stdlib only	✅	n/a	n/a
Multi-language	✅ Python, JS/TS, Java, Go, Ruby, PHP, others	⚠️	⚠️	⚠️
Structured output	✅ dataclasses, JSON-serializable	❌	⚠️ free text	⚠️ free text
Programmatic API	✅ `classify()` / `classify_smart()` / `classify_agentic()` / `map_code()`	❌	❌	❌
LLM-pluggable	✅ 5 adapters (OpenAI / Anthropic / OpenRouter / Ollama / OpenAI-compat)	—	bound to one	bound to one
Auditable	✅ confidence scores + step-by-step trail	❌	❌	❌

If you've ever inherited a 200-file repo and spent an afternoon working out "wait, what does this thing actually do?" — that's the problem this solves.

Installation

pip install app-classifier

Zero runtime dependencies. The LLM adapters use stdlib urllib. If you'd rather use the official SDK clients:

pip install app-classifier[openai]        # adds openai SDK
pip install app-classifier[anthropic]     # adds anthropic SDK
pip install app-classifier[all]           # both

Requires Python 3.10+.

Quick start

1. Pure rule-based (no network, no deps, no setup)

from app_classifier import classify
result = classify("./my-repo")
print(result.app_category, result.app_category_confidence)
print(result.functional_description)
print([r.path for r in result.routes[:5]])

2. Smart — rule-based first, LLM-escalated when uncertain

import os
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."
from app_classifier import classify_smart
result = classify_smart("./my-repo")  # auto-detects provider from env
print(result.app_category, result.app_category_confidence)

High-confidence repos (≥0.75) return immediately with no LLM call. Only ambiguous ones get the agentic treatment.

3. Code mapping for impact analysis (no LLM)

from app_classifier import map_code
cm = map_code("./my-repo")
print("Entry points:", cm.entry_points)
print("If I change src/auth.py:", cm.impact_of("src/auth.py"))

impact_of() does BFS over the reverse dependency graph with cycle detection. Works across Python, JS/TS, Java, Go, Ruby, PHP at the file level; Python also gets function-level resolution.

Configuration

Option A: environment variables (simplest)

export APP_CLASSIFIER_LLM_PROVIDER=anthropic     # picks the provider
export ANTHROPIC_API_KEY=sk-ant-...               # per-provider key

classify_smart() will autodetect this. The provider name maps to one of: openai, anthropic, openrouter, ollama, openai_compat.

Option B: `~/.app-classifier/providers.json`

{
  "default": "anthropic",
  "providers": {
    "anthropic": {
      "type": "anthropic",
      "api_key": "${ANTHROPIC_API_KEY}",
      "model": "claude-haiku-4-5-20251001"
    },
    "local": {
      "type": "openai_compat",
      "base_url": "http://localhost:1234/v1",
      "model": "llama-3.2-3b"
    }
  }
}

${VAR} placeholders are interpolated from the environment at load time.

Option C: explicit instantiation

from app_classifier import classify_smart, OpenAIProvider
provider = OpenAIProvider(api_key="sk-...", model="gpt-4o-mini")
result = classify_smart("./my-repo", llm_provider=provider)

See the per-provider guide for full quickstarts (Groq, LM Studio, Ollama, vLLM, llama.cpp, Together, Fireworks).

API reference

Symbol	Module	Purpose
`classify(repo)`	`app_classifier`	Rule-based, deterministic, no network
`classify_smart(repo)`	`app_classifier`	Rule-first; LLM-escalated for low-confidence
`classify_smart_async(repo)`	`app_classifier`	Async variant; returns full audit trail
`classify_agentic(repo, llm_provider)`	`app_classifier`	LLM tool-loop (low level — `classify_smart` wraps this)
`map_code(repo)`	`app_classifier`	Build a `CodeMap` for impact analysis
`CodeMap.impact_of(target)`	`app_classifier`	BFS over reverse-dep graph
`OpenAIProvider(api_key, model, base_url=None)`	`app_classifier`	OpenAI + any OpenAI-compatible endpoint
`AnthropicProvider(api_key, model)`	`app_classifier`	Anthropic Messages API
`OpenRouterProvider(api_key, model)`	`app_classifier`	OpenRouter (auto-routed across providers)
`OllamaProvider(host, model)`	`app_classifier`	Local Ollama serving
`OpenAICompatProvider(base_url, model, api_key=None)`	`app_classifier`	LM Studio / vLLM / llama.cpp / Groq / Together
`load_provider(name=None)`	`app_classifier`	Resolve provider from env + config
`analyze_hosting_requirements(repo)`	`app_classifier`	Runtime / DB / port / env-var detection

Full dataclass shapes: AppDescription, RouteEntry, DataModel, CodeMap, FileNode, FunctionNode, AgentClassificationResult, AgentStep, SubappClassification, HostingReport, Signal.

CLI

app-classifier ./my-repo                   # human-readable summary
app-classifier ./my-repo --json            # JSON for piping

What it can't do (yet)

Java / JS / Go function-call graph (file-level dep only; v0.6.0+)
Streaming completion (provider layer always waits for the full response)
Cost / token caps per call
Tree-sitter-backed parsing (everything is stdlib + regex right now — fast, but imperfect)

See the CHANGELOG for what landed in each release. The code-mapping guide has impact-analysis recipes.

License

MIT. Contributions welcome.

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
docs		docs
examples		examples
src/app_classifier		src/app_classifier
tests		tests
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LAUNCH.md		LAUNCH.md
LICENSE		LICENSE
PUBLISHING.md		PUBLISHING.md
README.md		README.md
USAGE.md		USAGE.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

app-classifier

Why it wins

Installation

Quick start

1. Pure rule-based (no network, no deps, no setup)

2. Smart — rule-based first, LLM-escalated when uncertain

3. Code mapping for impact analysis (no LLM)

Configuration

Option A: environment variables (simplest)

Option B: `~/.app-classifier/providers.json`

Option C: explicit instantiation

API reference

CLI

What it can't do (yet)

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

app-classifier

Why it wins

Installation

Quick start

1. Pure rule-based (no network, no deps, no setup)

2. Smart — rule-based first, LLM-escalated when uncertain

3. Code mapping for impact analysis (no LLM)

Configuration

Option A: environment variables (simplest)

Option B: ~/.app-classifier/providers.json

Option C: explicit instantiation

API reference

CLI

What it can't do (yet)

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Option B: `~/.app-classifier/providers.json`

Packages