docs(ai): add AI orchestration stack + LLM observability pages #87
Open
JacobPEvans-personal
wants to merge
1
commit into
main
from
docs/ai-orchestration-stack
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,111 @@ | ||
| --- | ||
| title: "AI orchestration stack" | ||
| description: "Five self-hosted tools for building and running LLM workflows — n8n, Dify, LangFlow, CrewAI, and LangChain — and the blunt rule for which one to reach for." | ||
| tier: 2 | ||
| --- | ||
|
|
||
| > Three of these draw boxes and arrows. Two are Python libraries. Knowing which | ||
| > is which is most of the decision. | ||
|
|
||
| The homelab runs a self-hosted layer for building LLM workflows and agents on top | ||
| of its [private model serving](/infrastructure/local-llm). Five tools cover the | ||
| range from "connect an LLM to 400 other apps" to "write a multi-agent crew in | ||
| Python." They overlap on purpose at the edges; the trick is to avoid deploying | ||
| all five and then using none of them. | ||
|
|
||
| ## The five, at a glance | ||
|
|
||
| | Tool | What it is | Shape | Reach for it when | | ||
| | --- | --- | --- | --- | | ||
| | **n8n** | General workflow automation, 400+ integrations | Visual builder (service) | You need to wire an LLM into other systems — email, calendars, webhooks, databases, SaaS APIs | | ||
| | **Dify** | Full LLMOps platform — RAG, prompts, evals, agents, model routing | Visual builder (service) | You want a production AI app with retrieval, prompt versioning, and evaluation, low-code | | ||
| | **LangFlow** | Visual node editor for LangChain graphs | Visual builder (service) | You want to prototype a chain by dragging nodes, then export it to Python | | ||
| | **LangChain** | Library of composable LLM building blocks | Python library | You're writing code and want chains, tools, memory, and retrievers as primitives | | ||
| | **CrewAI** | Framework for role-playing multi-agent "crews" | Python library | You're writing code and want agents with roles collaborating on a task | | ||
|
|
||
| ## Services vs libraries | ||
|
|
||
| The single most useful distinction: | ||
|
|
||
| - **n8n, Dify, and LangFlow are services.** They run as containers, expose a web | ||
| UI that you build inside, and each has an HTTPS front door. | ||
| - **LangChain and CrewAI are libraries.** You do not "deploy" them — you `pip | ||
| install` them into application code. In this homelab they live together in one | ||
| Python execution box that runs the agent code people write against them. | ||
|
|
||
| A common misread is treating LangChain/CrewAI as servers to stand up, or treating | ||
| the three visual builders as interchangeable. They are not. | ||
|
|
||
| ## How they fit together | ||
|
|
||
| {/* Shape: layered. Builders + libs sit on serving, all trace to observability. */} | ||
|
|
||
| ```mermaid | ||
| %%{init: {'theme':'base','look':'handDrawn','themeVariables':{'fontFamily':'Geist','fontSize':'14px','primaryColor':'#102937','primaryTextColor':'#F4EFE6','primaryBorderColor':'#4FB3A9','lineColor':'#4FB3A9','secondaryColor':'#0B1D2A','tertiaryColor':'#1A2A38','clusterBkg':'rgba(79,179,169,0.08)','clusterBorder':'#4FB3A9'}}}%% | ||
| flowchart TB | ||
| subgraph build [Build & run] | ||
| N8N([n8n]) | ||
| DIFY([Dify]) | ||
| LF([LangFlow]) | ||
| AX([agent code<br/>LangChain · CrewAI]) | ||
| end | ||
| MODELS([Model providers<br/>local Ollama · external APIs]) | ||
| OBS([Observability<br/>OTEL traces]) | ||
|
|
||
| N8N --> MODELS | ||
| DIFY --> MODELS | ||
| LF --> MODELS | ||
| AX --> MODELS | ||
| N8N -.-> OBS | ||
| DIFY -.-> OBS | ||
| LF -.-> OBS | ||
| AX -.-> OBS | ||
|
|
||
| classDef svc fill:#102937,stroke:#4FB3A9,stroke-width:2px,color:#F4EFE6; | ||
| classDef core fill:#102937,stroke:#E06B4A,stroke-width:2px,color:#F4EFE6; | ||
| classDef obs fill:#102937,stroke:#F4EFE6,stroke-width:1.5px,color:#F4EFE6; | ||
|
|
||
| class N8N,DIFY,LF,AX svc | ||
| class MODELS core | ||
| class OBS obs | ||
|
|
||
| linkStyle 4,5,6,7 stroke:#E06B4A,stroke-width:2px,stroke-dasharray:4 3; | ||
| ``` | ||
|
|
||
| Solid edges are model calls; coral dashed edges are telemetry. Every tool is | ||
| configured with its **own** model provider per its standard install (the local | ||
| OpenAI-compatible endpoint, an external API, or both) and is never forced onto a | ||
| shared backend that belongs to another stack. Every model call is instrumented, | ||
| and the traces flow to the [LLM observability](/observability/llm-observability) | ||
| pipeline. | ||
|
|
||
| ## Picking one — the blunt version | ||
|
|
||
| - Room for only one tool → **Dify**. It covers RAG, prompts, evals, and agents | ||
| without a separate automation tool. | ||
| - Already automating with workflows → keep **n8n** and add **Dify** as the AI | ||
| layer beside it. | ||
| - Want to sketch a chain visually and walk away with Python → **LangFlow**. | ||
| - Writing application code → **LangChain** for primitives, **CrewAI** for | ||
| role-based agent teams. Both are imported into your code, not run as services. | ||
|
|
||
| LangFlow overlaps Dify's visual builder; it earns its place only for | ||
| lightweight, Python-export prototyping. If that workflow isn't yours, you can run | ||
| the other four and never miss it. | ||
|
|
||
| ## Where to go next | ||
|
|
||
| <CardGroup cols={2}> | ||
| <Card title="Local LLM" icon="microchip" href="/infrastructure/local-llm"> | ||
| The private GPU model serving these tools call. | ||
| </Card> | ||
| <Card title="LLM observability" icon="chart-line" href="/observability/llm-observability"> | ||
| How every LLM call gets traced, costed, and evaluated. | ||
| </Card> | ||
| <Card title="LXC vs Docker" icon="boxes-stacked" href="/infrastructure/lxc-vs-docker"> | ||
| Why the compose-based tools run as Docker-in-LXC. | ||
| </Card> | ||
| <Card title="ansible-proxmox-apps" icon="screwdriver-wrench" href="/infrastructure/repos/ansible-proxmox-apps"> | ||
| The configuration tier that deploys these app payloads. | ||
| </Card> | ||
| </CardGroup> |
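The selection rule in the "Picking one" section of the page above can be sketched as a small function. This is only an illustration, not part of the PR: the `Need` fields, the `pick_tools` name, and the order in which the checks are applied are all assumptions made for the example, since the page lists the cases without ranking them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Need:
    """Hypothetical summary of a project's needs (field names are assumptions)."""

    writing_code: bool = False        # building in Python rather than in a web UI
    multi_agent_roles: bool = False   # agents with roles collaborating on a task
    export_to_python: bool = False    # sketch a chain visually, leave with Python
    existing_workflows: bool = False  # already automating with workflows in n8n


def pick_tools(need: Need) -> list[str]:
    """Map a Need onto the tools the page recommends.

    The precedence (code first, then prototyping, then existing workflows)
    is an assumption of this sketch; the page does not rank the cases.
    """
    if need.writing_code:
        # Libraries: imported into application code, never stood up as servers.
        return ["CrewAI"] if need.multi_agent_roles else ["LangChain"]
    if need.export_to_python:
        return ["LangFlow"]
    if need.existing_workflows:
        # Keep n8n for automation and add Dify beside it as the AI layer.
        return ["n8n", "Dify"]
    # Default when there is room for only one tool.
    return ["Dify"]
```

Calling `pick_tools(Need(existing_workflows=True))` returns both n8n and Dify, which mirrors the page's advice to keep the automation tool and add the AI layer next to it rather than replace one with the other.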
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,105 @@ | ||
| --- | ||
| title: "LLM observability" | ||
| description: "Every LLM call from the orchestration stack emits OpenTelemetry, routes through Cribl, and lands in both Langfuse (trace UX) and Splunk (archival + SIEM)." | ||
| tier: 1 | ||
| --- | ||
|
|
||
| > If a model was called, there's a trace — and you can see what it cost. | ||
|
|
||
| The [AI-coding-tool pipeline](/observability/overview) traces the IDEs. This is | ||
| its sibling for the [AI orchestration stack](/ai-development/ai-orchestration-stack): | ||
| n8n, Dify, LangFlow, and the agent code emit OpenTelemetry for every LLM call, | ||
| and the same Cribl tier routes it — this time to two sinks. | ||
|
|
||
| ## Emitting traces — OpenLLMetry + OTEL GenAI | ||
|
|
||
| Apps are instrumented with [OpenLLMetry](https://github.com/traceloop/openllmetry) | ||
| (the Traceloop SDK), which wraps LLM providers, vector stores, and frameworks | ||
| (LangChain, CrewAI) and emits spans following OpenTelemetry's | ||
| [GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/). | ||
| Those conventions matured in 2026, so framework-native spans and SDK-emitted | ||
| spans now line up on the same schema — prompt, completion, model, token counts, | ||
| latency, cost. | ||
|
|
||
| The spans leave the app over OTLP (gRPC `4317` / HTTP `4318`) pointed at the | ||
| collector, **not** at any one backend. Keeping the emit target on the pipeline — | ||
| not the trace store — is what lets the same telemetry reach more than one place. | ||
|
|
||
| ## Cribl is the hub | ||
|
|
||
| A single collector tier owns ingest and fan-out. Cribl Edge runs **native | ||
| OpenTelemetry sources**, one per signal type on its own port, so it can route by | ||
| type without parsing payloads. From there it forks: | ||
|
|
||
| {/* Shape: fan-out. Apps -> Cribl -> two sinks. */} | ||
|
|
||
| ```mermaid | ||
| %%{init: {'theme':'base','look':'handDrawn','themeVariables':{'fontFamily':'Geist','fontSize':'14px','primaryColor':'#102937','primaryTextColor':'#F4EFE6','primaryBorderColor':'#4FB3A9','lineColor':'#4FB3A9','secondaryColor':'#0B1D2A','tertiaryColor':'#1A2A38','clusterBkg':'rgba(79,179,169,0.08)','clusterBorder':'#4FB3A9'}}}%% | ||
| flowchart LR | ||
| Apps([Orchestration stack<br/>OpenLLMetry]) | ||
| Cribl([Cribl Edge<br/>native OTEL sources]) | ||
| LF([Langfuse<br/>trace · cost · eval]) | ||
| SP[(Splunk<br/>archival · SIEM)] | ||
|
|
||
| Apps -->|OTLP per type| Cribl | ||
| Cribl -->|traces| LF | ||
| Cribl -->|all signals| SP | ||
|
|
||
| classDef app fill:#102937,stroke:#E06B4A,stroke-width:2px,color:#F4EFE6; | ||
| classDef hub fill:#102937,stroke:#4FB3A9,stroke-width:2px,color:#F4EFE6; | ||
| classDef sink fill:#102937,stroke:#F4EFE6,stroke-width:1.5px,color:#F4EFE6; | ||
|
|
||
| class Apps app | ||
| class Cribl hub | ||
| class LF,SP sink | ||
|
|
||
| linkStyle 0 stroke:#E06B4A,stroke-width:2px,stroke-dasharray:4 3; | ||
| linkStyle 1,2 stroke:#4FB3A9,stroke-width:2px; | ||
| ``` | ||
|
|
||
| - **Langfuse** gets the traces. It is the LLM-native view: trace waterfalls per | ||
| request, token cost, prompt and completion inspection, plus datasets, evals, | ||
| and prompt versioning. | ||
| - **Splunk** gets everything, for archival and correlation with the rest of the | ||
| homelab's telemetry — the same indexer the AI-coding pipeline already feeds. | ||
|
|
||
| Apps never talk to a trace store directly, and they never reach across into the | ||
| monitoring tier — they emit to the collector, and the collector decides where it | ||
| goes. One ingest point, two sinks, no second collector to run. | ||
|
|
||
| ## Why Langfuse | ||
|
|
||
| | Criterion | Langfuse | | ||
| | --- | --- | | ||
| | License | MIT — self-host with no feature gates | | ||
| | Ingestion | Native OTLP, GenAI-convention aware | | ||
| | Built for | LLM apps — traces, cost, evals, prompt management | | ||
| | Footprint | Web + worker + Postgres + ClickHouse + Redis + object storage | | ||
|
|
||
| [Laminar](https://laminar.sh/) (Apache-2.0) is the runner-up: lighter, tilted | ||
| toward long-running agent debugging. Arize Phoenix is capable but ships under the | ||
| Elastic License 2.0, which is source-available rather than open source and | ||
| restricts offering the software as a managed service. | ||
|
JacobPEvans-personal marked this conversation as resolved.
|
||
|
|
||
| <Note> | ||
| Langfuse keeps its trace-of-record (relational + analytical) on durable local | ||
| storage; its blob store points at the homelab object store. Backend choices like | ||
| the vector store and model provider are made **per tool, per that tool's own | ||
| standard** — never by forcing a shared backend across unrelated stacks. | ||
| </Note> | ||
|
|
||
| ## Where to go next | ||
|
|
||
| <CardGroup cols={2}> | ||
| <Card title="AI orchestration stack" icon="diagram-project" href="/ai-development/ai-orchestration-stack"> | ||
| The tools whose calls this pipeline traces. | ||
| </Card> | ||
| <Card title="Observability overview" icon="chart-line" href="/observability/overview"> | ||
| The AI-coding-tool side of the same Cribl → Splunk spine. | ||
| </Card> | ||
| <Card title="ansible-proxmox-apps" icon="screwdriver-wrench" href="/infrastructure/repos/ansible-proxmox-apps"> | ||
| Deploys Langfuse and the Cribl OTEL sources. | ||
| </Card> | ||
| <Card title="Local LLM" icon="microchip" href="/infrastructure/local-llm"> | ||
| The models being traced. | ||
| </Card> | ||
| </CardGroup> | ||
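The emit-to-the-collector and fan-out behaviour described in the page above can be sketched in a few lines of Python. This is a hedged illustration, not the Cribl configuration from the PR: the function names, the sink labels, and the hostname in the usage note are hypothetical. The environment variable names are the standard OpenTelemetry OTLP exporter settings, and the ports are the two the page names. The page describes one Cribl source per signal type on its own port; those per-signal ports are not given, so the sketch uses a single endpoint.

```python
# Sinks per signal type, as the page describes them: Langfuse receives traces,
# Splunk receives everything. The sink labels are illustrative only.
SINKS_BY_SIGNAL = {
    "traces": ["langfuse", "splunk"],
    "metrics": ["splunk"],
    "logs": ["splunk"],
}

# Default OTLP ports named in the page, keyed by the standard protocol values.
OTLP_PORTS = {"grpc": 4317, "http/protobuf": 4318}


def route(signal: str) -> list[str]:
    """Return the sinks a signal type is forwarded to by the collector tier."""
    if signal not in SINKS_BY_SIGNAL:
        raise ValueError(f"unknown signal type: {signal!r}")
    return list(SINKS_BY_SIGNAL[signal])


def otlp_env(collector_host: str, protocol: str = "grpc") -> dict[str, str]:
    """Build exporter environment variables that point an app at the collector.

    The app is aimed at the collector, never at Langfuse or Splunk directly.
    Plain http:// is an assumption of this sketch; the page does not say
    whether the collector hop uses TLS.
    """
    port = OTLP_PORTS[protocol]
    return {
        "OTEL_EXPORTER_OTLP_ENDPOINT": f"http://{collector_host}:{port}",
        "OTEL_EXPORTER_OTLP_PROTOCOL": protocol,
    }
```

For example, `otlp_env("cribl-edge.example.internal")` (a made-up hostname) yields settings that send spans to port 4317 on the collector, and `route("traces")` shows why the app never needs to know that two sinks exist.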