CSV-AI v2 is a layered Streamlit application designed so that the business logic is independent of the UI. The same services/ package could be served from FastAPI, a CLI, or a Discord bot tomorrow without changes.

┌───────────────────────────────────────────────────────────────────┐
│                         streamlit_app.py                          │
│                           (thin entry)                            │
└──────────────────────────┬────────────────────────────────────────┘
                           │
                  ┌────────▼────────┐
                  │     app.ui      │
                  │  pages + comps  │  ← Streamlit lives here only
                  └────────┬────────┘
                           │
                  ┌────────▼────────┐
                  │  app.services   │
                  │  Chat·Summary·  │  ← pure Python; UI-free
                  │    Analysis     │
                  └────────┬────────┘
            ┌──────────────┼──────────────┐
            ▼              ▼              ▼
      ┌────────────┐ ┌────────────┐ ┌─────────────┐
      │  app.data  │ │  app.llm   │ │ app.prompts │
      │ load·prof  │ │  provider  │ │  templates  │
      │   sample   │ │   layer    │ │             │
      └────────────┘ └─────┬──────┘ └─────────────┘
                           │
              ┌────────────┼────────────┐
              ▼            ▼            ▼
           OpenAI      Anthropic      Ollama

Configuration: a single source of truth for settings, built on pydantic-settings. It reads .env, OS environment variables, and (best-effort) Streamlit secrets. Anything that needs an API key or a default should call get_settings() rather than reading os.environ directly.
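
The "call get_settings(), never os.environ" rule can be illustrated with a small sketch. This one deliberately swaps pydantic-settings for the standard library so it runs anywhere; it reads OS environment variables only (no .env, no Streamlit secrets), and the field names and environment variable names are assumptions, not the project's actual schema.

```python
import os
from dataclasses import dataclass
from functools import lru_cache
from typing import Optional


@dataclass(frozen=True)
class Settings:
    # Illustrative fields only; the real Settings class may differ.
    openai_api_key: Optional[str] = None
    default_provider: str = "openai"


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    """Build Settings once; later calls return the same cached instance."""
    return Settings(
        openai_api_key=os.environ.get("OPENAI_API_KEY"),
        default_provider=os.environ.get("DEFAULT_PROVIDER", "openai"),
    )
```

Because every caller goes through one cached accessor, swapping the source of a value (say, from an environment variable to a secrets store) touches a single function.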
app.llm: a provider-agnostic interface (LLMProvider) with three concrete implementations (OpenAI, Anthropic, Ollama). Adding a new backend (Mistral API, Bedrock, …) is one file plus one factory entry. Keeping streaming in the interface forces every provider to behave consistently for the chat UI.
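
A minimal sketch of what such an interface and factory might look like follows. The method names complete and stream, the _PROVIDERS registry, and build_provider appear elsewhere in this document; the exact signatures and the EchoProvider backend are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Iterator


class LLMProvider(ABC):
    """Contract every backend implements; signatures here are assumed."""

    def __init__(self, model: str) -> None:
        self.model = model

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Return the full response in one piece."""

    @abstractmethod
    def stream(self, prompt: str) -> Iterator[str]:
        """Yield the response chunk by chunk."""


class EchoProvider(LLMProvider):
    """Hypothetical offline backend used only for this illustration."""

    def complete(self, prompt: str) -> str:
        return "".join(self.stream(prompt))

    def stream(self, prompt: str) -> Iterator[str]:
        for word in prompt.split():
            yield word + " "


# One registry entry per backend: the "one factory entry" mentioned above.
_PROVIDERS: Dict[str, Callable[..., LLMProvider]] = {"echo": EchoProvider}


def build_provider(name: str, model: str) -> LLMProvider:
    """Look up a backend by name and instantiate it for the given model."""
    if name not in _PROVIDERS:
        raise ValueError(f"unknown provider: {name!r}")
    return _PROVIDERS[name](model=model)
```

Since stream is abstract, a backend that forgets to implement it cannot even be instantiated, which is one way the interface enforces consistent streaming.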
app.data: pure pandas. Never imports Streamlit or any LLM SDK. Three concerns:
- loader.py: robust CSV reading (encoding fallback chain, delimiter sniffing).
- profiler.py: fast statistical profile (rows, dtypes, missing %, correlations, samples).
- sampler.py + context.py: turn a DataFrame into an LLM-friendly markdown context.
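
The loader's two techniques, an encoding fallback chain and delimiter sniffing, can be sketched with the standard library alone. The encoding order, the candidate delimiters, and both function names are assumptions; in the real loader the detected values would presumably be handed to pandas.read_csv via its encoding and sep parameters.

```python
import csv
from typing import Tuple

# Assumed fallback order; latin-1 accepts any byte sequence, so it goes last.
ENCODINGS = ("utf-8-sig", "cp1252", "latin-1")


def decode_with_fallback(raw: bytes) -> Tuple[str, str]:
    """Return (text, encoding) for the first encoding that decodes cleanly."""
    for encoding in ENCODINGS:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Not reachable while latin-1 is in the chain; kept as a safety net.
    raise ValueError("could not decode CSV bytes")


def sniff_delimiter(text: str, default: str = ",") -> str:
    """Guess the delimiter from the first few KB, falling back to a comma."""
    try:
        return csv.Sniffer().sniff(text[:4096], delimiters=",;\t|").delimiter
    except csv.Error:
        return default
```

Putting utf-8-sig first handles both plain UTF-8 and files with a byte-order mark, which spreadsheet exports often add.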
app.prompts: system prompts as constants. Versioning prompts in code (rather than as strings sprinkled through services) keeps prompt iteration reviewable in git.
app.services: three services (Chat, Summary, Analysis) map 1:1 to the three product workflows. Each takes a provider and a df, exposes a non-streaming and a streaming method, and never imports Streamlit, so each can be called from FastAPI later with little extra work.
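
The shape of one such service can be sketched as follows. ChatService and its ask method are named later in this document; the streaming method name ask_stream, the prompt layout, and the CHAT_SYSTEM_PROMPT constant are assumptions. The frame is accessed only through df.columns and len(df), so a pandas DataFrame fits without this sketch importing pandas.

```python
from typing import Any, Iterator, Protocol

# Hypothetical stand-in for a constant that would live in app.prompts.
CHAT_SYSTEM_PROMPT = "You answer questions about a CSV file."


class Provider(Protocol):
    """The two methods a service needs from any LLM backend."""

    def complete(self, prompt: str) -> str: ...
    def stream(self, prompt: str) -> Iterator[str]: ...


class ChatService:
    """UI-free chat workflow: nothing in this module imports Streamlit."""

    def __init__(self, provider: Provider, df: Any) -> None:
        self.provider = provider
        self.df = df

    def _prompt(self, question: str) -> str:
        # Describe the data by schema and size instead of sending every row.
        schema = ", ".join(str(c) for c in self.df.columns)
        return (
            f"{CHAT_SYSTEM_PROMPT}\n"
            f"Rows: {len(self.df)}\n"
            f"Columns: {schema}\n"
            f"Question: {question}"
        )

    def ask(self, question: str) -> str:
        """Non-streaming: return the whole answer at once."""
        return self.provider.complete(self._prompt(question))

    def ask_stream(self, question: str) -> Iterator[str]:
        """Streaming twin of ask(); the method name is assumed."""
        yield from self.provider.stream(self._prompt(question))
```

Both methods share one prompt builder, so the streaming and non-streaming paths cannot drift apart.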
app.ui: the only place we import Streamlit. Page modules render; component modules are reusable. Session state is centralized in state.py so key names stay consistent.
Utilities: cross-cutting helpers (logging, exception hierarchy, token counting). No business logic.
No FAISS / no retrieval for chat. The original CSV-AI embedded every row chunk into FAISS and retrieved on every question. For small files this was wasted compute; for analytical questions ("what's the average X?") it was actively worse than just showing the model the schema. v2 sends a structured schema + a smart sample. Result: lower latency, lower cost, fewer hallucinations.
Deterministic stats first, LLM second. The Analyze page computes the actual numbers with pandas, then asks the LLM to write the narrative around those numbers. The LLM never has to invent arithmetic, which is where pure-LLM dataframe agents tend to go wrong.
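
A rough sketch of the "stats first, narrative second" split is shown below, assuming pandas is installed. The function names, the chosen statistics, and the prompt wording are illustrative, not the Analyze page's actual code.

```python
import pandas as pd


def compute_facts(df: pd.DataFrame) -> dict:
    """Compute the numbers with pandas so the LLM never does arithmetic."""
    numeric = df.select_dtypes(include="number")
    return {
        "rows": int(len(df)),
        "means": {
            c: round(float(numeric[c].mean()), 2) for c in numeric.columns
        },
        "missing_pct": {
            c: round(float(df[c].isna().mean()) * 100, 1) for c in df.columns
        },
    }


def narrative_prompt(facts: dict) -> str:
    """Ask only for prose around numbers that are already fixed."""
    lines = [f"- rows: {facts['rows']}"]
    lines += [f"- mean of {c}: {v}" for c, v in facts["means"].items()]
    lines += [f"- missing in {c}: {v}%" for c, v in facts["missing_pct"].items()]
    return (
        "Write a short narrative using only these precomputed facts. "
        "Do not calculate new numbers.\n" + "\n".join(lines)
    )
```

The prompt carries finished values, so any figure in the model's narrative can be checked against the facts dictionary.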
Streaming everywhere. Every provider yields token chunks. The UI uses st.empty() plus st.markdown with a trailing cursor for the classic typewriter feel.
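
The typewriter effect can be separated from Streamlit itself: a generator produces the text to draw after each chunk, and the UI only repaints a placeholder. The sketch below assumes a block cursor glyph and a hypothetical helper name; the commented lines show roughly how the UI layer might consume it.

```python
from typing import Iterable, Iterator

CURSOR = "▌"  # assumed cursor glyph


def typewriter_frames(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the text to draw after each chunk, then a cursor-free final frame."""
    text = ""
    for chunk in chunks:
        text += chunk
        yield text + CURSOR
    yield text


# In the Streamlit layer this would be consumed roughly as:
#     placeholder = st.empty()
#     for frame in typewriter_frames(provider.stream(prompt)):
#         placeholder.markdown(frame)
```

Keeping the frame logic UI-free means it can be unit-tested without starting a Streamlit session.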
Custom-model escape hatch. The sidebar's model picker has a custom... option so new model ids (e.g. tomorrow's gpt-5) work without a code change.
Future API split. Because services don't import Streamlit, exposing an HTTP layer is mostly:

    @app.post("/chat")
    def chat(req: ChatReq) -> ChatResp:
        # Build the provider from the request, load the CSV, delegate to the service.
        provider = build_provider(req.provider, model=req.model, ...)
        df = pd.read_csv(req.csv_path)
        service = ChatService(provider=provider, df=df)
        return ChatResp(answer=service.ask(req.question))

To add a new LLM provider:
- Create app/llm/<name>_provider.py subclassing LLMProvider. Implement complete and stream.
- Register it in app/llm/factory.py (_PROVIDERS).
- Add suggested models to _MODEL_CATALOGUE in the same file.
- Extend Settings with any new credential fields.
- Update the sidebar's _render_provider_section if it needs a non-standard field (e.g. a region).

To add a new page:
- Create app/ui/pages/<name>.py with a render() function.
- Add it to PAGE_RENDERERS in app/main.py and PAGES in app/ui/components/sidebar.py.
That's it. No central router to wire up.
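
The "no central router" point amounts to a dictionary lookup, sketched below. PAGE_RENDERERS and PAGES are names from this document; the page functions, their string return values, and render_page are illustrative. A real render() would draw Streamlit widgets and return None, and the project keeps PAGES in sidebar.py rather than deriving it.

```python
from typing import Callable, Dict, List


def render_chat() -> str:
    # Stand-in for a page module's render(); returns text so it is testable.
    return "chat page"


def render_analyze() -> str:
    return "analyze page"


# Adding a page means adding one entry to this mapping.
PAGE_RENDERERS: Dict[str, Callable[[], str]] = {
    "Chat": render_chat,
    "Analyze": render_analyze,
}

# Derived here for brevity; the sidebar would list these names.
PAGES: List[str] = list(PAGE_RENDERERS)


def render_page(selected: str) -> str:
    """Dispatch to the selected page with a plain dict lookup."""
    return PAGE_RENDERERS[selected]()
```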