From dad7dffa408eda6dcdb2f76020ad81c8ae6a4072 Mon Sep 17 00:00:00 2001
From: JacobPEvans <20714140+JacobPEvans-personal@users.noreply.github.com>
Date: Fri, 26 Jun 2026 12:09:38 -0400
Subject: [PATCH] docs(ai): add AI orchestration stack + LLM observability
 pages
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Document the self-hosted AI orchestration layer (n8n, Dify, LangFlow,
CrewAI, LangChain) and its LLM observability pipeline, generically and
ahead of implementation.

- ai-development/ai-orchestration-stack: the five tools, the services-vs-
  libraries distinction, and a blunt "which to reach for" guide.
- observability/llm-observability: OpenLLMetry + OTEL GenAI emission, Cribl
  as the single ingest hub fanning out to Langfuse (trace/cost/eval) and
  Splunk (archival/SIEM); why Langfuse.
- Wire both into docs.json nav.

No homelab topology, VLANs, or addresses — concept and tool guidance only.

Assisted-by: Claude:claude-opus-4-8
Claude-Session: https://claude.ai/code/session_013KC8izFrMx32DVFduQp2tU
---
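Reviewer note (sits below the `---`, so it is not part of the commit message): a
minimal Python sketch of the fan-out rule that observability/llm-observability.mdx
describes, where traces reach Langfuse and every signal reaches Splunk. The sink
labels, the `route` helper, and the three signal names (the standard OpenTelemetry
traces, metrics, and logs) are assumptions for illustration only; this is not
Cribl configuration.

```python
# Hypothetical model of the collector's fan-out rule; names are illustrative.
SIGNAL_TYPES = ("traces", "metrics", "logs")

# Splunk receives every signal type; Langfuse receives traces only.
SINKS_BY_SIGNAL = {
    "traces": ("langfuse", "splunk"),
    "metrics": ("splunk",),
    "logs": ("splunk",),
}


def route(signal_type: str) -> tuple[str, ...]:
    """Return the sinks a signal of this type is forwarded to."""
    try:
        return SINKS_BY_SIGNAL[signal_type]
    except KeyError:
        # Unknown types are rejected rather than silently dropped.
        raise ValueError(f"unknown signal type: {signal_type!r}") from None


if __name__ == "__main__":
    for signal in SIGNAL_TYPES:
        print(f"{signal} -> {', '.join(route(signal))}")
```

Keeping the rule as one lookup table mirrors the page's point that a single ingest
tier owns routing: adding a sink would be a one-line change there, not in each app.
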
 ai-development/ai-orchestration-stack.mdx | 111 ++++++++++++++++++++++
 docs.json                                 |   2 +
 observability/llm-observability.mdx       | 105 ++++++++++++++++++++
 3 files changed, 218 insertions(+)
 create mode 100644 ai-development/ai-orchestration-stack.mdx
 create mode 100644 observability/llm-observability.mdx

diff --git a/ai-development/ai-orchestration-stack.mdx b/ai-development/ai-orchestration-stack.mdx
new file mode 100644
index 0000000..3f2b4d3
--- /dev/null
+++ b/ai-development/ai-orchestration-stack.mdx
@@ -0,0 +1,111 @@
+---
+title: "AI orchestration stack"
+description: "Five self-hosted tools for building and running LLM workflows — n8n, Dify, LangFlow, CrewAI, and LangChain — and the blunt rule for which one to reach for."
+tier: 2
+---
+
+> Three of these draw boxes and arrows. Two are Python libraries. Knowing which
+> is which is most of the decision.
+
+The homelab runs a self-hosted layer for building LLM workflows and agents on top
+of its [private model serving](/infrastructure/local-llm). Five tools cover the
+range from "connect an LLM to 400 other apps" to "write a multi-agent crew in
+Python." They overlap on purpose at the edges; the trap to avoid is deploying
+all five and then using none of them.
+
+## The five, at a glance
+
+| Tool | What it is | Shape | Reach for it when |
+| --- | --- | --- | --- |
+| **n8n** | General workflow automation, 400+ integrations | Visual builder (service) | You need to wire an LLM into other systems — email, calendars, webhooks, databases, SaaS APIs |
+| **Dify** | Full LLMOps platform — RAG, prompts, evals, agents, model routing | Visual builder (service) | You want a production AI app with retrieval, prompt versioning, and evaluation, low-code |
+| **LangFlow** | Visual node editor for LangChain graphs | Visual builder (service) | You want to prototype a chain by dragging nodes, then export it to Python |
+| **LangChain** | Library of composable LLM building blocks | Python library | You're writing code and want chains, tools, memory, and retrievers as primitives |
+| **CrewAI** | Framework for role-playing multi-agent "crews" | Python library | You're writing code and want agents with roles collaborating on a task |
+
+## Services vs libraries
+
+The single most useful distinction:
+
+- **n8n, Dify, and LangFlow are services.** They run as containers and expose a
+  web UI that you build inside. Each is deployed behind an HTTPS front door.
+- **LangChain and CrewAI are libraries.** You do not "deploy" them — you `pip
+  install` them into application code. In this homelab they live together in one
+  Python execution box that runs the agent code people write against them.
+
+A common misread is treating LangChain/CrewAI as servers to stand up, or treating
+the three visual builders as interchangeable. They are not.
+
+## How they fit together
+
+{/* Shape: layered. Builders + libs sit on serving, all trace to observability. */}
+
+```mermaid
+%%{init: {'theme':'base','look':'handDrawn','themeVariables':{'fontFamily':'Geist','fontSize':'14px','primaryColor':'#102937','primaryTextColor':'#F4EFE6','primaryBorderColor':'#4FB3A9','lineColor':'#4FB3A9','secondaryColor':'#0B1D2A','tertiaryColor':'#1A2A38','clusterBkg':'rgba(79,179,169,0.08)','clusterBorder':'#4FB3A9'}}}%%
+flowchart TB
+  subgraph build [Build & run]
+    N8N([n8n])
+    DIFY([Dify])
+    LF([LangFlow])
+    AX([agent code<br/>LangChain · CrewAI])
+  end
+  MODELS([Model providers<br/>local Ollama · external APIs])
+  OBS([Observability<br/>OTEL traces])
+
+  N8N --> MODELS
+  DIFY --> MODELS
+  LF --> MODELS
+  AX --> MODELS
+  N8N -.-> OBS
+  DIFY -.-> OBS
+  LF -.-> OBS
+  AX -.-> OBS
+
+  classDef svc fill:#102937,stroke:#4FB3A9,stroke-width:2px,color:#F4EFE6;
+  classDef core fill:#102937,stroke:#E06B4A,stroke-width:2px,color:#F4EFE6;
+  classDef obs fill:#102937,stroke:#F4EFE6,stroke-width:1.5px,color:#F4EFE6;
+
+  class N8N,DIFY,LF,AX svc
+  class MODELS core
+  class OBS obs
+
+  linkStyle 4,5,6,7 stroke:#E06B4A,stroke-width:2px,stroke-dasharray:4 3;
+```
+
+Solid edges are model calls; coral dashed edges are telemetry. Every tool is
+configured with its **own** model provider per its standard install (the local
+OpenAI-compatible endpoint, an external API, or both) and is never forced onto a
+shared backend that belongs to another stack. Each tool's calls are instrumented,
+and the traces flow to the [LLM observability](/observability/llm-observability)
+pipeline.
+
+## Picking one — the blunt version
+
+- Room for exactly one tool → **Dify**. It covers RAG, prompts, evals, and
+  agents without a separate automation tool.
+- Already automating with workflows → keep **n8n** and add **Dify** as the AI
+  layer beside it.
+- Want to sketch a chain visually and walk away with Python → **LangFlow**.
+- Writing application code → **LangChain** for primitives, **CrewAI** for
+  role-based agent teams. Both are imports, not installs-as-a-service.
+
+LangFlow overlaps Dify's visual builder; it earns its place only for
+lightweight, Python-export prototyping. If that workflow isn't yours, you can run
+the other four and never miss it.
+
+## Where to go next
+
+<CardGroup cols={2}>
+  <Card title="Local LLM" icon="microchip" href="/infrastructure/local-llm">
+    The private GPU model serving these tools call.
+  </Card>
+  <Card title="LLM observability" icon="chart-line" href="/observability/llm-observability">
+    How every LLM call gets traced, costed, and evaluated.
+  </Card>
+  <Card title="LXC vs Docker" icon="boxes-stacked" href="/infrastructure/lxc-vs-docker">
+    Why the compose-based tools run as Docker-in-LXC.
+  </Card>
+  <Card title="ansible-proxmox-apps" icon="screwdriver-wrench" href="/infrastructure/repos/ansible-proxmox-apps">
+    The configuration tier that deploys these app payloads.
+  </Card>
+</CardGroup>
diff --git a/docs.json b/docs.json
index 7814ddf..0f85ffa 100644
--- a/docs.json
+++ b/docs.json
@@ -171,6 +171,7 @@
             "group": "AI Development",
             "pages": [
               "ai-development/overview",
+              "ai-development/ai-orchestration-stack",
               "ai-development/repo-boundaries",
               "ai-development/ai-assistant-instructions",
               "ai-development/claude-code-plugins",
@@ -232,6 +233,7 @@
             "group": "Observability",
             "pages": [
               "observability/overview",
+              "observability/llm-observability",
               {
                 "group": "Repos",
                 "pages": [
diff --git a/observability/llm-observability.mdx b/observability/llm-observability.mdx
new file mode 100644
index 0000000..afa5ecb
--- /dev/null
+++ b/observability/llm-observability.mdx
@@ -0,0 +1,105 @@
+---
+title: "LLM observability"
+description: "Every LLM call from the orchestration stack emits OpenTelemetry, routes through Cribl, and lands in both Langfuse (trace UX) and Splunk (archival + SIEM)."
+tier: 1
+---
+
+> If a model was called, there's a trace — and you can see what it cost.
+
+The [AI-coding-tool pipeline](/observability/overview) traces the IDEs. This is
+its sibling for the [AI orchestration stack](/ai-development/ai-orchestration-stack):
+n8n, Dify, LangFlow, and the agent code emit OpenTelemetry for every LLM call,
+and the same Cribl tier routes it — this time to two sinks.
+
+## Emitting traces — OpenLLMetry + OTEL GenAI
+
+Apps are instrumented with [OpenLLMetry](https://github.com/traceloop/openllmetry)
+(the Traceloop SDK), which wraps LLM providers, vector stores, and frameworks
+(LangChain, CrewAI) and emits spans following OpenTelemetry's
+[GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/).
+Those conventions matured in 2026, so framework-native spans and SDK-emitted
+spans now line up on the same schema — prompt, completion, model, token counts,
+latency, cost.
+
+The spans leave the app over OTLP (gRPC `4317` / HTTP `4318`) pointed at the
+collector, **not** at any one backend. Keeping the emit target on the pipeline —
+not the trace store — is what lets the same telemetry reach more than one place.
+
+## Cribl is the hub
+
+A single collector tier owns ingest and fan-out. Cribl Edge runs **native
+OpenTelemetry sources**, one per signal type on its own port, so it can route by
+type without parsing payloads. From there it forks:
+
+{/* Shape: fan-out. Apps -> Cribl -> two sinks. */}
+
+```mermaid
+%%{init: {'theme':'base','look':'handDrawn','themeVariables':{'fontFamily':'Geist','fontSize':'14px','primaryColor':'#102937','primaryTextColor':'#F4EFE6','primaryBorderColor':'#4FB3A9','lineColor':'#4FB3A9','secondaryColor':'#0B1D2A','tertiaryColor':'#1A2A38','clusterBkg':'rgba(79,179,169,0.08)','clusterBorder':'#4FB3A9'}}}%%
+flowchart LR
+  Apps([Orchestration stack<br/>OpenLLMetry])
+  Cribl([Cribl Edge<br/>native OTEL sources])
+  LF([Langfuse<br/>trace · cost · eval])
+  SP[(Splunk<br/>archival · SIEM)]
+
+  Apps -->|OTLP per type| Cribl
+  Cribl -->|traces| LF
+  Cribl -->|all signals| SP
+
+  classDef app  fill:#102937,stroke:#E06B4A,stroke-width:2px,color:#F4EFE6;
+  classDef hub  fill:#102937,stroke:#4FB3A9,stroke-width:2px,color:#F4EFE6;
+  classDef sink fill:#102937,stroke:#F4EFE6,stroke-width:1.5px,color:#F4EFE6;
+
+  class Apps app
+  class Cribl hub
+  class LF,SP sink
+
+  linkStyle 0 stroke:#E06B4A,stroke-width:2px,stroke-dasharray:4 3;
+  linkStyle 1,2 stroke:#4FB3A9,stroke-width:2px;
+```
+
+- **Langfuse** gets the traces. It is the LLM-native view: trace waterfalls per
+  request, token cost, prompt and completion inspection, plus datasets, evals,
+  and prompt versioning.
+- **Splunk** gets everything, for archival and correlation with the rest of the
+  homelab's telemetry — the same indexer the AI-coding pipeline already feeds.
+
+Apps never talk to a trace store directly, and they never reach across into the
+monitoring tier — they emit to the collector, and the collector decides where it
+goes. One ingest point, two sinks, no second collector to run.
+
+## Why Langfuse
+
+| Criterion | Langfuse |
+| --- | --- |
+| License | MIT — self-host with no feature gates |
+| Ingestion | Native OTLP, GenAI-convention aware |
+| Built for | LLM apps — traces, cost, evals, prompt management |
+| Footprint | Web + worker + Postgres + ClickHouse + Redis + object storage |
+
+[Laminar](https://laminar.sh/) (Apache-2.0) is the runner-up — lighter, tilted
+toward long-running agent debugging. Arize Phoenix is capable but ships under the
+Elastic License 2.0, which is source-available and bars offering it as a managed service.
+
+<Note>
+Langfuse keeps its trace-of-record databases (relational and analytical) on
+durable local storage; its blob store points at the homelab object store. Backend
+choices such as the vector store and model provider are made **per tool, per that
+tool's own standard**, never by forcing a shared backend across unrelated stacks.
+</Note>
+
+## Where to go next
+
+<CardGroup cols={2}>
+  <Card title="AI orchestration stack" icon="diagram-project" href="/ai-development/ai-orchestration-stack">
+    The tools whose calls this pipeline traces.
+  </Card>
+  <Card title="Observability overview" icon="chart-line" href="/observability/overview">
+    The AI-coding-tool side of the same Cribl → Splunk spine.
+  </Card>
+  <Card title="ansible-proxmox-apps" icon="screwdriver-wrench" href="/infrastructure/repos/ansible-proxmox-apps">
+    Deploys Langfuse and the Cribl OTEL sources.
+  </Card>
+  <Card title="Local LLM" icon="microchip" href="/infrastructure/local-llm">
+    The models being traced.
+  </Card>
+</CardGroup>