From 36043a8c9184587579d97489ad3ad168125d7a37 Mon Sep 17 00:00:00 2001
From: TouqeerHamdani <touqeerhamdani26@gmail.com>
Date: Tue, 12 May 2026 21:03:23 +0530
Subject: [PATCH 1/2] level-5: Touqeer Hamdani

---
 submissions/Touqeer-Hamdani/level5/answers.md | 322 ++++++++++++++++++
 submissions/Touqeer-Hamdani/level5/schema.md  |  79 +++++
 2 files changed, 401 insertions(+)
 create mode 100644 submissions/Touqeer-Hamdani/level5/answers.md
 create mode 100644 submissions/Touqeer-Hamdani/level5/schema.md

diff --git a/submissions/Touqeer-Hamdani/level5/answers.md b/submissions/Touqeer-Hamdani/level5/answers.md
new file mode 100644
index 000000000..5ee08a4fc
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level5/answers.md
@@ -0,0 +1,322 @@
+# Level 5 — Graph Thinking
+
+**Author:** Touqeer Hamdani  
+**Date:** May 2026
+
+---
+
+## Q1. Model It (20 pts)
+
+### Graph Schema
+
+> Full diagram: [`schema.md`](./schema.md)
+
+The graph schema is designed around the 3 factory CSVs and captures the full production planning domain — projects, what they produce, where they're built, who builds them, and when.
+
+### Node Labels (8)
+
+| Label | Source | Key Properties | Count |
+|-------|--------|----------------|-------|
+| **Project** | production.csv → `project_id`, `project_number`, `project_name` | project_id, project_number, project_name | 8 |
+| **Product** | production.csv → `product_type`, `unit` | product_type, unit | 7 |
+| **Station** | production.csv → `station_code`, `station_name` | station_code, station_name | 10 |
+| **Worker** | workers.csv → `worker_id`, `name` | worker_id, name, role, hours_per_week, type | 14 |
+| **Week** | capacity.csv → `week` | week_id | 8 |
+| **Factory** | Implicit overall plant | factory_name | 1 |
+| **Certification** | workers.csv → `certifications` (split by comma) | cert_name | 23 unique |
+| **Etapp** | production.csv → `etapp` | etapp_name | 2 (ET1, ET2) |
+
+### Relationship Types (9)
+
+| Relationship | Direction | Properties |
+|-------------|-----------|------------|
+| **PRODUCES** | `(Project)→(Product)` | `quantity`, `unit_factor`, `unit` |
+| **SCHEDULED_AT** | `(Project)→(Station)` | `planned_hours`, `actual_hours`, `completed_units`, `week`, `etapp`, `bop`, `product_type`, `variance_pct` |
+| **ACTIVE_IN** | `(Project)→(Week)` | — |
+| **IN_PHASE** | `(Project)→(Etapp)` | — |
+| **WORKS_AT** | `(Worker)→(Station)` | — (primary station assignment) |
+| **CAN_COVER** | `(Worker)→(Station)` | — (cross-trained coverage) |
+| **HOLDS** | `(Worker)→(Certification)` | — |
+| **LOADED_IN** | `(Station)→(Week)` | `total_planned`, `total_actual` |
+| **HAS_CAPACITY** | `(Week)→(Factory)` | `own_hours`, `hired_hours`, `overtime_hours`, `total_planned`, `deficit` |
+
+### Data-Carrying Relationships (4)
+
+1. **PRODUCES** — Each project-to-product edge carries `{quantity: 600, unit_factor: 1.77, unit: "meter"}`, capturing the production spec. This lets you query things like "which projects produce more than 500 meters of IQB?" directly from the relationship.
+
+2. **SCHEDULED_AT** — The core operational edge. Each project-station-week combination carries `{planned_hours: 48.0, actual_hours: 45.2, completed_units: 28, week: "w1", etapp: "ET1", bop: "BOP1"}`. This is the richest relationship in the graph — it's where all the variance analysis lives, and where we track the phase (`etapp`, `bop`) of the work.
+
+3. **LOADED_IN** — Aggregated station load per week: `{total_planned: 393, total_actual: 410}`. Enables capacity-vs-demand queries at the station level without re-aggregating from SCHEDULED_AT every time. *(Note: these properties are calculated by aggregating `SCHEDULED_AT` edges during graph construction, as `factory_capacity.csv` only provides factory-wide totals).*
+
+4. **HAS_CAPACITY** — Links each week to the global factory, carrying the `{own_hours, hired_hours, overtime_hours, total_planned, deficit}` workforce metrics straight out of `factory_capacity.csv`. This perfectly mirrors the exact relationship pattern requested in the L6 instructions.
+
+### Design Decisions
+
+- **Certification as a node** (not a Worker property): Workers share certifications (e.g., multiple workers hold MIG/MAG). Modeling it as a node enables queries like "find all workers certified for TIG welding" with a single hop instead of string parsing.
+- **Etapp and BOP as properties on SCHEDULED_AT**: Since a single project can move through different BOPs (phases) across different stations and weeks, treating `bop` and `etapp` as edge properties accurately models *when and where* that phase occurs, rather than applying a blanket phase to the entire project.
+- **SCHEDULED_AT carries `week` as a property** rather than routing through Week nodes: This keeps the most queried relationship (planned vs actual hours) as a direct Project→Station edge. The separate ACTIVE_IN relationship to Week handles the temporal dimension when needed.
+- **Etapp as a node** (for L6 compliance): The L6 spec explicitly requires `Etapp` as a node label. From a pure design perspective, etapp works better as an edge property on SCHEDULED_AT (only 2 values, no properties of its own), and we keep it there for direct querying. The `Etapp` node + `IN_PHASE` relationship is included to meet the L6 minimum graph requirements.
+
+### Implementation Notes for L6
+
+- **SCHEDULED_AT creates parallel edges**: A single (Project, Station) pair can have multiple SCHEDULED_AT edges — one per week/etapp/bop/product combination. For example, P01→Station 011 appears in both w1 and w2. Additionally, P05→Station 018 has two rows in the same week (w1) with the same etapp/bop (ET2/BOP3) but different product types (SB and SD). In `seed_graph.py`, use `MERGE` with a composite key including `week`, `etapp`, `bop`, **and** `product_type` to ensure idempotency without data loss:
+  ```cypher
+  MERGE (p:Project {project_id: row.project_id})
+  MERGE (s:Station {station_code: row.station_code})
+  MERGE (p)-[r:SCHEDULED_AT {week: row.week, etapp: row.etapp, bop: row.bop, product_type: row.product_type}]->(s)
+  SET r.planned_hours = toFloat(row.planned_hours),
+      r.actual_hours = toFloat(row.actual_hours),
+      r.completed_units = toInteger(row.completed_units)
+  ```
+- **PRODUCES needs deduplication**: The same (Project, Product) pair appears across many CSV rows (different weeks/stations), but the production spec (`quantity`, `unit_factor`, `unit`) is constant per pair. Create **one** PRODUCES edge per unique `(project_id, product_type)` — either deduplicate in Python before loading, or use `MERGE` on the pair:
+  ```cypher
+  MERGE (p:Project {project_id: row.project_id})
+  MERGE (prod:Product {product_type: row.product_type})
+  MERGE (p)-[r:PRODUCES]->(prod)
+  SET r.quantity = toInteger(row.quantity),
+      r.unit_factor = toFloat(row.unit_factor),
+      r.unit = row.unit
+  ```
+
+---
+
+## Q2. Why Not Just SQL? (20 pts)
+
+*Prompt: "Which workers are certified to cover Station 016 (Gjutning) when Per Gustafsson is on vacation, and which projects would be affected?"*
+
+> **Data Reality Check:** In `factory_workers.csv`, the worker at Station 016 is actually named **Per Hansen** (W07), not Per Gustafsson. The queries below reflect the actual data.
+
+### 1. The SQL Query
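+Assuming a standard relational schema with normalized tables (`Workers`, `Worker_Coverage`, `Stations`, `Project_Schedules`, `Projects`), the query joins four of them to traverse the relationships; `Stations` can be skipped because `station_code` is filtered directly on the junction table. `GROUP_CONCAT` is MySQL/SQLite syntax: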
+
+```sql
+SELECT 
+    w.name AS CoveringWorker, 
+    GROUP_CONCAT(DISTINCT p.project_name) AS AffectedProjects
+FROM Workers w
+JOIN Worker_Coverage wc ON w.worker_id = wc.worker_id
+JOIN Project_Schedules ps ON wc.station_code = ps.station_code
+JOIN Projects p ON ps.project_id = p.project_id
+WHERE wc.station_code = '016' 
+  AND w.name != 'Per Hansen'
+GROUP BY w.name;
+```
+
+### 2. The Cypher Query
+Using our graph schema, the query becomes a visual representation of the path: `Worker → Station ← Project`:
+
+```cypher
+MATCH (w:Worker)-[:CAN_COVER]->(s:Station {station_code: "016"})<-[:SCHEDULED_AT]-(p:Project)
+WHERE w.name <> "Per Hansen"
+RETURN w.name AS CoveringWorker, collect(DISTINCT p.project_name) AS AffectedProjects
+```
+
+### 3. What the graph makes obvious that SQL hides
+SQL forces you to think about database mechanics: resolving foreign keys across intermediate junction tables just to traverse a simple real-world relationship. The Cypher version hides those storage mechanics and puts the topology in the query text itself, mirroring how a person pictures the factory floor: "Find workers who point to this station, and find projects that point to this station."
+
+---
+
+## Q3. Spot the Bottleneck (20 pts)
+
+### 1. Identifying the Overload
+
+From `factory_capacity.csv`, five out of eight weeks show capacity deficits:
+
+| Week | Total Capacity | Total Planned | Deficit |
+|------|---------------|---------------|---------|
+| w1 | 480 | 612 | **-132** |
+| w2 | 520 | 645 | **-125** |
+| w4 | 500 | 550 | **-50** |
+| w6 | 440 | 520 | **-80** |
+| w7 | 520 | 600 | **-80** |
+
+Using `factory_production.csv` to drill into the two worst weeks (w1 and w2):
+
+**Volume Bottleneck (Station 011):** Station 011 (FS IQB) is the primary structural bottleneck. In w1, it is scheduled to handle work from 7 projects simultaneously (P01, P02, P03, P04, P05, P07, P08). As the entry point of the manufacturing pipeline, it creates a massive initial capacity constraint.
+
+**Volume Driver (Project P05):** Project P05 (Sjukhus Linköping ET2) is the largest individual contributor (1200 meters of IQB). It heavily loads the early-stage stations in w1.
+
+**Efficiency Overruns (Station 016):** While 011 causes deficits via sheer scheduled volume, Station 016 (Gjutning / Casting) causes deficits through poor execution efficiency. Looking at the worst overruns by percentage (actual vs planned hours):
+
+| Project | Station | Week | Planned | Actual | Variance |
+|---------|---------|------|---------|--------|----------|
+| P03 | 016 Gjutning | w2 | 28.0 | 35.0 | **+25%** |
+| P04 | 018 SB B/F-hall | w1 | 19.0 | 22.0 | **+16%** |
+| P05 | 016 Gjutning | w2 | 35.0 | 40.0 | **+14%** |
+| P03 | 014 Svets o montage | w1 | 42.0 | 48.0 | **+14%** |
+| P08 | 016 Gjutning | w3 | 22.0 | 25.0 | **+14%** |
+
+Station 016 appears repeatedly in the worst overruns. Therefore, the factory capacity deficit is a dual problem: structural schedule overload at the start of the pipeline (011), and severe execution overruns at the finishing stages (016).
+
+### 2. Cypher Query
+
+```cypher
+MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+WHERE r.actual_hours > r.planned_hours * 1.1
+RETURN s.station_name AS Station, 
+       collect({
+           project: p.project_name, 
+           variance_pct: round((r.actual_hours - r.planned_hours) / r.planned_hours * 100, 1)
+       }) AS Overruns
+```
+
+### 3. Modeling the Alert as a Graph Pattern
+
+**Approach: Store `variance_pct` as a numeric property on SCHEDULED_AT.**
+
+During graph seeding, compute and store the variance percentage on each scheduling edge:
+
+```cypher
+SET r.variance_pct = round((r.actual_hours - r.planned_hours) / r.planned_hours * 100, 1)
+```
+
+This means the threshold is applied at **query time**, not at seed time — making it fully flexible:
+
+```cypher
+// 10% threshold for alerts
+MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+WHERE r.variance_pct > 10
+RETURN s.station_name, p.project_name, r.variance_pct ORDER BY r.variance_pct DESC;
+
+// 5% threshold for Q4's hybrid query (finding well-executed projects)
+MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+WHERE r.variance_pct < 5
+RETURN p.project_name, avg(r.variance_pct)
+```
+
+**Why this over a `(:Bottleneck)` node or a boolean flag:**
+- A dedicated `(:Bottleneck)` node adds schema complexity (extra nodes + relationships) for what is essentially a simple numeric comparison on existing data.
+- A boolean `overrun: true/false` flag loses the magnitude: an 11% overrun and a 50% overrun both say `true`, and changing the threshold requires re-seeding.
+- A numeric `variance_pct` preserves full fidelity, keeps the data where it naturally belongs (on the scheduling edge), and lets dashboards apply any threshold on the fly.
+
+---
+
+## Q4. Vector + Graph Hybrid (20 pts)
+
+*Prompt text:* "450 meters of IQB beams for a hospital extension in Linköping, similar scope to previous hospital projects, tight timeline"
+
+### 1. What to Embed
+There are two ways to handle this, ranging from a simple baseline to a robust production system:
+
+**Approach A: Composite Description (Baseline & Simplicity)**
+The simplest method is to create a single composite text block for each project combining its name, location, building type, and product scope (e.g., `"Sjukhus Linköping ET2, hospital, Linköping, 1200m IQB..."`). We embed this entire paragraph. This captures the overall semantic context well enough for basic similarity searches.
+
+**Approach B: Metadata Extraction & Filtering (Robust Precision)**
+Relying entirely on a single embedding can sometimes be risky (e.g., the model might heavily weight "tight timeline" and return a project from the wrong city). A more precise, production-grade approach is to use an LLM to extract structured metadata from the free-text query (e.g., `location: "Linköping"`, `material: "IQB beams"`). We then use those extracted properties to perform exact comparisons and use them as **hard graph filters**, relying on the vector embedding purely for the fuzzy semantic matching of the remaining context.
+
+*For the L5/L6 scope, Approach A is the standard expected baseline, but Approach B represents a more advanced architecture.*
+
+### 2. The Hybrid Query
+This query performs a two-stage pipeline: it uses Neo4j's vector index to find semantically similar projects, and then traverses the graph to filter out projects that were executed poorly.
+
+```cypher
+// Stage 1: Vector Search for top 5 semantic matches
+CALL db.index.vector.queryNodes('project_embeddings', 5, $queryEmbedding)
+YIELD node AS candidate, score
+
+// Stage 2: Graph Traversal for operational quality
+MATCH (candidate)-[r:SCHEDULED_AT]->(s:Station)
+WHERE s.station_code IN ["011", "012", "013", "014", "016", "017"] // IQB pipeline stations
+WITH candidate, score, max(r.variance_pct) AS worstVariance, collect(DISTINCT s.station_name) AS StationsUsed
+WHERE worstVariance < 5 // Every matched run must be well executed, not just one of them
+RETURN candidate.project_name AS ReferenceProject, 
+       score AS SimilarityScore,
+       StationsUsed
+ORDER BY score DESC
+```
+
+### 3. Why this is better than just filtering by product type
+If we only filtered the database by `product_type = 'IQB'`, we would return almost every project in the factory's history (P01–P06, P08). This is useless for accurate planning. 
+
+The hybrid approach provides two crucial layers of intelligence:
+1. **The Vector Layer** captures human context. A "hospital extension in Linköping" is semantically similar to past project P05 ("Sjukhus Linköping ET2") due to building type and location, whereas a standard filter would treat it exactly the same as a parking garage in Helsingborg (P04).
+2. **The Graph Layer** ensures operational reliability. By traversing the `SCHEDULED_AT` edges and checking the `variance_pct` (our property from Q3), we ensure that the semantically matched project was actually executed well on the factory floor, making it a trustworthy baseline for scheduling the new request.
+
+---
+
+## Q5. Your L6 Plan (20 pts)
+
+### 1. Node Labels → CSV Column Mappings
+
+| Node Label | Source | CSV Columns | Key | Count |
+|------------|--------|-------------|-----|-------|
+| **Project** | production.csv | `project_id`, `project_number`, `project_name` | `project_id` | 8 |
+| **Product** | production.csv | `product_type`, `unit` | `product_type` | 7 |
+| **Station** | production.csv | `station_code`, `station_name` | `station_code` | 10 |
+| **Worker** | workers.csv | `worker_id`, `name`, `role`, `hours_per_week`, `type` | `worker_id` | 14 |
+| **Week** | capacity.csv | `week` | `week` | 8 |
+| **Factory** | Implicit | — (single node) | — | 1 |
+| **Certification** | workers.csv | `certifications` (comma-split) | `cert_name` | 23 |
+| **Etapp** | production.csv | `etapp` | `etapp_name` | 2 |
+
+### 2. Relationship Types → What Creates Them
+
+| Relationship | Created By | Properties |
+|---|---|---|
+| **PRODUCES** | `MERGE` on unique `(project_id, product_type)` pairs from production.csv | `quantity`, `unit_factor`, `unit` |
+| **SCHEDULED_AT** | Each row of production.csv → `MERGE` with composite key `{week, etapp, bop, product_type}` | `planned_hours`, `actual_hours`, `completed_units`, `week`, `etapp`, `bop`, `product_type`, `variance_pct` |
+| **ACTIVE_IN** | Distinct `(project_id, week)` pairs from production.csv | — |
+| **IN_PHASE** | Distinct `(project_id, etapp)` pairs from production.csv | — |
+| **WORKS_AT** | workers.csv → `primary_station` column | — |
+| **CAN_COVER** | workers.csv → `can_cover_stations` (comma-split, one edge per station) | — |
+| **HOLDS** | workers.csv → `certifications` (comma-split, one edge per cert) | — |
+| **LOADED_IN** | Aggregated from SCHEDULED_AT per (station, week) during seeding | `total_planned`, `total_actual` |
+| **HAS_CAPACITY** | Each row of capacity.csv | `own_hours`, `hired_hours`, `overtime_hours`, `total_planned`, `deficit` |
+
+#### Seed Script Constraints & Idiosyncrasies
+- **Uniqueness Constraints:** To ensure idempotency during the `MERGE` process, the script must create constraints beforehand:
+  `CREATE CONSTRAINT IF NOT EXISTS FOR (p:Project) REQUIRE p.project_id IS UNIQUE` (and similarly for Station, Worker, Week, Product, Certification, and Etapp, each on its key property).
+- **Foreman Assignment:** Worker W11 (Victor Elm) is listed as a Foreman with `primary_station = "all"`. The seed script must handle `"all"` correctly (either by skipping the `WORKS_AT` edge and relying solely on his `can_cover_stations` list, or by explicitly creating edges to all 10 stations) to avoid creating a junk station node named "all".
+
+### 3. Streamlit Dashboard Panels (4 + Self-Test)
+
+#### Page 1: Project Overview (10 pts)
+A summary table showing all 8 projects with total planned hours, total actual hours, variance %, and products involved.
+
+```cypher
+MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+WITH p, sum(r.planned_hours) AS planned, sum(r.actual_hours) AS actual
+MATCH (p)-[:PRODUCES]->(prod:Product)
+RETURN p.project_name AS Project, planned AS PlannedHours, actual AS ActualHours,
+       round((actual - planned) / planned * 100, 1) AS VariancePct,
+       collect(DISTINCT prod.product_type) AS Products
+ORDER BY p.project_id
+```
+
+#### Page 2: Station Load (10 pts)
+Interactive Plotly bar chart showing hours per station across weeks. Stations where actual > planned are highlighted in red.
+
+```cypher
+MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+RETURN s.station_code AS StationCode, s.station_name AS Station, r.week AS Week,
+       sum(r.planned_hours) AS Planned, sum(r.actual_hours) AS Actual
+ORDER BY StationCode, Week
+```
+
+#### Page 3: Capacity Tracker (10 pts)
+Weekly capacity (own + hired + overtime) vs total planned demand. Deficit weeks are color-coded red.
+
+```cypher
+MATCH (w:Week)-[r:HAS_CAPACITY]->(f:Factory)
+RETURN w.week_id AS Week, 
+       r.own_hours + r.hired_hours + r.overtime_hours AS TotalCapacity,
+       r.total_planned AS PlannedDemand,
+       r.deficit AS Deficit
+ORDER BY w.week_id
+```
+
+#### Page 4: Worker Coverage (10 pts)
+Matrix showing which workers can cover which stations. Single-point-of-failure stations (only 1 worker) are flagged.
+
+```cypher
+MATCH (w:Worker)-[:CAN_COVER]->(s:Station)
+RETURN s.station_name AS Station, collect(w.name) AS Workers, count(w) AS WorkerCount
+ORDER BY WorkerCount ASC
+```
+
+#### Navigation (5 pts)
+A sidebar will be implemented to allow users to switch seamlessly between the 4 dashboard pages and the Self-Test page without reloading the app.
+
+#### Page 5: Self-Test (20 pts)
+Automated green/red checklist verifying: Neo4j connection, node count ≥ 50, relationship count ≥ 100, 6+ node labels, 8+ relationship types, and variance query returns results.
+
diff --git a/submissions/Touqeer-Hamdani/level5/schema.md b/submissions/Touqeer-Hamdani/level5/schema.md
new file mode 100644
index 000000000..5a4e1dde7
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level5/schema.md
@@ -0,0 +1,79 @@
+# Factory Knowledge Graph — Schema
+
+```mermaid
+graph LR
+    %% ── Node definitions ──
+    Project["🏗️ <b>Project</b><br/>project_id · project_number<br/>project_name"]
+    Product["📦 <b>Product</b><br/>product_type · unit"]
+    Station["🏭 <b>Station</b><br/>station_code · station_name"]
+    Worker["👷 <b>Worker</b><br/>worker_id · name<br/>role · hours_per_week · type"]
+    Week["📅 <b>Week</b><br/>week_id"]
+    Factory["🏭 <b>Factory</b><br/>factory_name"]
+    Certification["🎓 <b>Certification</b><br/>cert_name"]
+    Etapp["🔄 <b>Etapp</b><br/>etapp_name"]
+
+    %% ── Relationships ──
+    Project -->|"PRODUCES<br/>{quantity, unit_factor, unit}"| Product
+    Project -->|"SCHEDULED_AT<br/>{planned_hours, actual_hours, week,<br/>completed_units, etapp, bop, product_type, variance_pct}"| Station
+    Project -->|"ACTIVE_IN"| Week
+    Project -->|"IN_PHASE"| Etapp
+
+    Worker -->|"WORKS_AT"| Station
+    Worker -->|"CAN_COVER"| Station
+    Worker -->|"HOLDS"| Certification
+
+    Station -->|"LOADED_IN<br/>{total_planned,<br/>total_actual}"| Week
+    Week -->|"HAS_CAPACITY<br/>{own_hours, hired_hours, overtime_hours, total_planned, deficit}"| Factory
+
+    %% ── Styling ──
+    classDef proj fill:#4F46E5,stroke:#3730A3,color:#fff,rx:12
+    classDef prod fill:#059669,stroke:#047857,color:#fff,rx:12
+    classDef stat fill:#D97706,stroke:#B45309,color:#fff,rx:12
+    classDef work fill:#DC2626,stroke:#B91C1C,color:#fff,rx:12
+    classDef week fill:#7C3AED,stroke:#6D28D9,color:#fff,rx:12
+    classDef meta fill:#6B7280,stroke:#4B5563,color:#fff,rx:12
+    classDef cert fill:#0891B2,stroke:#0E7490,color:#fff,rx:12
+    classDef etap fill:#E11D48,stroke:#BE123C,color:#fff,rx:12
+
+    class Project proj
+    class Product prod
+    class Station stat
+    class Worker work
+    class Week week
+    class Factory meta
+    class Certification cert
+    class Etapp etap
+```
+
+## Node Labels (8)
+
+| # | Label | Source CSV | Key Properties | Count |
+|---|-------|-----------|----------------|-------|
+| 1 | **Project** | production.csv | project_id, project_number, project_name | 8 |
+| 2 | **Product** | production.csv | product_type, unit | 7 (IQB, IQP, SB, SD, SP, SR, HSQ) |
+| 3 | **Station** | production.csv | station_code, station_name | 10 (011–019, 021) |
+| 4 | **Worker** | workers.csv | worker_id, name, role, hours_per_week, type | 14 |
+| 5 | **Week** | capacity.csv | week_id | 8 (w1–w8) |
+| 6 | **Factory** | Implicit | factory_name | 1 |
+| 7 | **Certification** | workers.csv | cert_name | 23 unique certs |
+| 8 | **Etapp** | production.csv | etapp_name | 2 (ET1, ET2) |
+
+## Relationship Types (9)
+
+| # | Relationship | From → To | Properties (data-carrying?) |
+|---|-------------|-----------|----------------------------|
+| 1 | **PRODUCES** | Project → Product | ✅ `{quantity, unit_factor, unit}` |
+| 2 | **SCHEDULED_AT** | Project → Station | ✅ `{planned_hours, actual_hours, completed_units, week, etapp, bop, product_type, variance_pct}` |
+| 3 | **ACTIVE_IN** | Project → Week | — |
+| 4 | **IN_PHASE** | Project → Etapp | — |
+| 5 | **WORKS_AT** | Worker → Station | — (primary station) |
+| 6 | **CAN_COVER** | Worker → Station | — (coverage capability) |
+| 7 | **HOLDS** | Worker → Certification | — |
+| 8 | **LOADED_IN** | Station → Week | ✅ `{total_planned, total_actual}`* |
+| 9 | **HAS_CAPACITY**| Week → Factory | ✅ `{own_hours, hired_hours, overtime_hours, total_planned, deficit}` |
+
+> 4 relationships carry data properties (**PRODUCES**, **SCHEDULED_AT**, **LOADED_IN**, **HAS_CAPACITY**), exceeding the minimum of 2.
+>
+> *\*Note: `LOADED_IN` properties are calculated by aggregating the `SCHEDULED_AT` edges for each station/week.*
+>
+> *Note: `etapp` is also kept as a property on `SCHEDULED_AT` for direct querying. The `Etapp` node is included for L6 compliance, but from a pure design perspective, etapp works better as an edge property since it only has 2 values and carries no properties of its own.*

From d1386a6433f3b3f8ef3a7be02caa94a8c8f7f9a8 Mon Sep 17 00:00:00 2001
From: TouqeerHamdani <touqeerhamdani26@gmail.com>
Date: Tue, 12 May 2026 21:41:01 +0530
Subject: [PATCH 2/2] level-6: Touqeer Hamdani

---
 .../Touqeer-Hamdani/level6/.env.example       |   3 +
 .../Touqeer-Hamdani/level6/DASHBOARD_URL.txt  |   1 +
 submissions/Touqeer-Hamdani/level6/README.md  |  69 ++
 submissions/Touqeer-Hamdani/level6/app.py     | 824 ++++++++++++++++++
 .../Touqeer-Hamdani/level6/requirements.txt   |   6 +
 .../Touqeer-Hamdani/level6/seed_graph.py      | 297 +++++++
 6 files changed, 1200 insertions(+)
 create mode 100644 submissions/Touqeer-Hamdani/level6/.env.example
 create mode 100644 submissions/Touqeer-Hamdani/level6/DASHBOARD_URL.txt
 create mode 100644 submissions/Touqeer-Hamdani/level6/README.md
 create mode 100644 submissions/Touqeer-Hamdani/level6/app.py
 create mode 100644 submissions/Touqeer-Hamdani/level6/requirements.txt
 create mode 100644 submissions/Touqeer-Hamdani/level6/seed_graph.py

diff --git a/submissions/Touqeer-Hamdani/level6/.env.example b/submissions/Touqeer-Hamdani/level6/.env.example
new file mode 100644
index 000000000..bdd17bb95
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level6/.env.example
@@ -0,0 +1,3 @@
+NEO4J_URI = "neo4j+s://xxxxx.databases.neo4j.io"
+NEO4J_USER = "neo4j"
+NEO4J_PASSWORD = "your-password"
\ No newline at end of file
diff --git a/submissions/Touqeer-Hamdani/level6/DASHBOARD_URL.txt b/submissions/Touqeer-Hamdani/level6/DASHBOARD_URL.txt
new file mode 100644
index 000000000..6d1f8d412
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level6/DASHBOARD_URL.txt
@@ -0,0 +1 @@
+https://l6-factory-dashboard-touqeerhamdani.streamlit.app
diff --git a/submissions/Touqeer-Hamdani/level6/README.md b/submissions/Touqeer-Hamdani/level6/README.md
new file mode 100644
index 000000000..13313e1b5
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level6/README.md
@@ -0,0 +1,69 @@
+# Factory Knowledge Graph Dashboard — Level 6
+
+A **Neo4j knowledge graph** + **Streamlit dashboard** for a Swedish steel fabrication company managing 8 construction projects across 10 production stations.
+
+## Quick Start
+
+### 1. Prerequisites
+- Python 3.10+
+- A Neo4j instance (recommended: [Neo4j Aura Free](https://neo4j.io/aura))
+
+### 2. Setup
+```bash
+python -m venv venv
+venv\Scripts\activate          # Windows
+# source venv/bin/activate     # macOS/Linux
+pip install -r requirements.txt
+```
+
+### 3. Configure credentials
+Copy `.env.example` → `.env` and fill in your Neo4j credentials:
+```
+NEO4J_URI=neo4j+s://xxxxx.databases.neo4j.io
+NEO4J_USER=neo4j
+NEO4J_PASSWORD=your-password
+```
+
+### 4. Seed the graph (run once)
+```bash
+python seed_graph.py
+```
+
+### 5. Launch the dashboard
+```bash
+streamlit run app.py
+```
+
+## Dashboard Pages
+
+| Page | Description |
+|------|-------------|
+| **Project Overview** | All 8 projects with planned/actual hours, variance %, and products |
+| **Station Load** | Interactive bar chart — hours per station per week, overloads in red |
+| **Capacity Tracker** | Stacked capacity bars + demand line, deficit weeks highlighted |
+| **Worker Coverage** | Coverage matrix + SPOF (single-point-of-failure) station detection |
+| **Self-Test** | Automated 6-check verification (20 pts) |
+
+## Project Structure
+
+```
+l6-factory-dashboard/
+├── seed_graph.py        # CSV → Neo4j (idempotent, uses MERGE)
+├── app.py               # Streamlit dashboard (6 pages)
+├── requirements.txt
+├── .env.example
+├── README.md
+├── DASHBOARD_URL.txt
+└── data/
+    ├── factory_production.csv
+    ├── factory_workers.csv
+    └── factory_capacity.csv
+```
+
+## Deployed URL
+
+See `DASHBOARD_URL.txt`.
+
+## Author
+
+**Touqeer Hamdani** — Level 6 submission, May 2026.
diff --git a/submissions/Touqeer-Hamdani/level6/app.py b/submissions/Touqeer-Hamdani/level6/app.py
new file mode 100644
index 000000000..bda45bf8c
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level6/app.py
@@ -0,0 +1,824 @@
+"""
+app.py — Factory Knowledge Graph Dashboard
+Streamlit application with 6 pages powered by Neo4j.
+"""
+
+import streamlit as st
+from neo4j import GraphDatabase
+import pandas as pd
+import plotly.express as px
+import plotly.graph_objects as go
+from plotly.subplots import make_subplots
+import os
+from dotenv import load_dotenv
+import statsmodels.api as sm
+
+# ── Page config ──────────────────────────────────────────────────────────────
+
+st.set_page_config(
+    page_title="Factory Dashboard",
+    page_icon=None,
+    layout="wide",
+    initial_sidebar_state="expanded",
+)
+
+# ── Custom CSS ───────────────────────────────────────────────────────────────
+
+st.markdown("""
+<style>
+    /* Import modern font */
+    @import url('https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&display=swap');
+
+    html, body, [class*="css"] { font-family: 'Inter', sans-serif; }
+
+    /* KPI metric cards */
+    .kpi-card {
+        background: linear-gradient(135deg, #1e293b 0%, #334155 100%);
+        border-radius: 12px;
+        padding: 1.2rem 1.5rem;
+        color: #f1f5f9;
+        border-left: 4px solid;
+        box-shadow: 0 4px 12px rgba(0,0,0,0.15);
+    }
+    .kpi-card h3 { margin: 0 0 0.3rem 0; font-size: 0.85rem; color: #94a3b8; font-weight: 500; }
+    .kpi-card .value { font-size: 1.8rem; font-weight: 700; }
+    .kpi-card .sub { font-size: 0.75rem; color: #64748b; margin-top: 0.2rem; }
+    .kpi-blue   { border-color: #3b82f6; }
+    .kpi-green  { border-color: #22c55e; }
+    .kpi-red    { border-color: #ef4444; }
+    .kpi-amber  { border-color: #f59e0b; }
+
+    /* Self-test items */
+    .check-pass { color: #22c55e; font-weight: 600; }
+    .check-fail { color: #ef4444; font-weight: 600; }
+
+    /* Section dividers */
+    .section-header {
+        font-size: 1.1rem; font-weight: 600; color: #e2e8f0;
+        border-bottom: 2px solid #334155; padding-bottom: 0.4rem;
+        margin: 1.5rem 0 1rem 0;
+    }
+
+    /* Hide default Streamlit footer */
+    footer { visibility: hidden; }
+
+    /* Sidebar styling */
+    [data-testid="stSidebar"] {
+        background: linear-gradient(180deg, #0f172a 0%, #1e293b 100%);
+    }
+    [data-testid="stSidebar"] .stRadio label {
+        color: #cbd5e1 !important;
+        font-weight: 500;
+    }
+</style>
+""", unsafe_allow_html=True)
+
+
+# ── Neo4j connection ─────────────────────────────────────────────────────────
+
+@st.cache_resource
+def get_driver():
+    """Connect to Neo4j — supports both st.secrets (Cloud) and .env (local)."""
+    try:
+        uri      = st.secrets["NEO4J_URI"]
+        user     = st.secrets["NEO4J_USER"]
+        password = st.secrets["NEO4J_PASSWORD"]
+    except Exception:
+        load_dotenv()
+        uri      = os.getenv("NEO4J_URI")
+        user     = os.getenv("NEO4J_USER")
+        password = os.getenv("NEO4J_PASSWORD")
+    return GraphDatabase.driver(uri, auth=(user, password))
+
+
+def query_to_df(cypher: str) -> pd.DataFrame:
+    """Run a Cypher query and return the results as a DataFrame."""
+    driver = get_driver()
+    with driver.session() as session:
+        result = session.run(cypher)
+        return pd.DataFrame([dict(r) for r in result])
+
+
+# ── Sidebar navigation ──────────────────────────────────────────────────────
+
+st.sidebar.markdown("## Factory Dashboard")
+st.sidebar.markdown("---")
+page = st.sidebar.radio("Navigate", [
+    "Project Overview",
+    "Station Load",
+    "Capacity Tracker",
+    "Worker Coverage",
+    "Load Forecast",
+    "Self-Test",
+])
+st.sidebar.markdown("---")
+st.sidebar.caption("Level 6 · Touqeer Hamdani")
+
+
+# ── Helper: render a KPI card ────────────────────────────────────────────────
+
+def kpi(label, value, sub="", color="blue"):
+    st.markdown(f"""
+    <div class="kpi-card kpi-{color}">
+        <h3>{label}</h3>
+        <div class="value">{value}</div>
+        <div class="sub">{sub}</div>
+    </div>
+    """, unsafe_allow_html=True)
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# PAGE 1 — Project Overview
+# ══════════════════════════════════════════════════════════════════════════════
+
+def page_project_overview():
+    st.header("Project Overview")
+    st.caption("All 8 factory projects with planned vs actual hours and variance analysis.")
+
+    df = query_to_df("""
+        MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+        WITH p,
+             sum(r.planned_hours) AS planned,
+             sum(r.actual_hours)  AS actual
+        OPTIONAL MATCH (p)-[:PRODUCES]->(prod:Product)
+        RETURN p.project_id   AS ID,
+               p.project_name AS Project,
+               planned        AS PlannedHours,
+               actual         AS ActualHours,
+               CASE 
+                 WHEN planned = 0 THEN 0.0 
+                 ELSE round((actual - planned) / planned * 100, 1) 
+               END AS VariancePct,
+               collect(DISTINCT prod.product_type) AS Products
+        ORDER BY p.project_id
+    """)
+
+    if df.empty:
+        st.warning("No data found. Has `seed_graph.py` been run?")
+        return
+
+    # KPI cards
+    total_planned = df["PlannedHours"].sum()
+    total_actual  = df["ActualHours"].sum()
+    avg_var       = round((total_actual - total_planned) / total_planned * 100, 1) if total_planned else 0.0
+    overrun_count = len(df[df["VariancePct"] > 0])
+
+    c1, c2, c3, c4 = st.columns(4)
+    with c1: kpi("Projects", len(df), "active in schedule", "blue")
+    with c2: kpi("Total Planned Hours", f"{total_planned:,.0f} h", "across all stations", "green")
+    with c3: kpi("Total Actual Hours", f"{total_actual:,.0f} h", f"{total_actual - total_planned:+,.0f} h vs plan", "amber")
+    with c4: kpi("Average Plan Variance", f"{avg_var:+.1f}%", f"{overrun_count} projects over plan", "red" if avg_var > 0 else "green")
+
+    st.markdown("")
+
+    # Format products column for display
+    display_df = df.copy()
+    display_df["Products"] = display_df["Products"].apply(lambda x: ", ".join(x) if isinstance(x, list) else x)
+    display_df["VariancePct"] = display_df["VariancePct"].apply(lambda v: f"{v:+.1f}%")
+
+    st.dataframe(
+        display_df,
+        use_container_width=True,
+        hide_index=True,
+        column_config={
+            "ID":           st.column_config.TextColumn("ID", width="small"),
+            "Project":      st.column_config.TextColumn("Project"),
+            "PlannedHours": st.column_config.NumberColumn("Planned (h)", format="%.1f"),
+            "ActualHours":  st.column_config.NumberColumn("Actual (h)", format="%.1f"),
+            "VariancePct":  st.column_config.TextColumn("Variance"),
+            "Products":     st.column_config.TextColumn("Products"),
+        },
+    )
+
+    # Bar chart: planned vs actual per project
+    fig = go.Figure()
+    fig.add_trace(go.Bar(name="Planned", x=df["Project"], y=df["PlannedHours"],
+                         marker_color="#3b82f6"))
+    fig.add_trace(go.Bar(name="Actual",  x=df["Project"], y=df["ActualHours"],
+                         marker_color=["#22c55e" if a <= p else "#ef4444"
+                                       for a, p in zip(df["ActualHours"], df["PlannedHours"])]))
+    fig.update_layout(
+        barmode="group", template="plotly_dark",
+        title="Planned vs Actual Hours by Project",
+        xaxis_title="Project", yaxis_title="Hours",
+        height=420, margin=dict(t=50, b=40),
+        legend_title="Metric", showlegend=True
+    )
+    st.plotly_chart(fig, use_container_width=True)
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# PAGE 2 — Station Load
+# ══════════════════════════════════════════════════════════════════════════════
+
+def page_station_load():
+    st.header("Station Load")
+    st.caption("Hours per station across weeks. Red = actual exceeds planned.")
+
+    df = query_to_df("""
+        MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+        RETURN s.station_code AS StationCode,
+               s.station_name AS Station,
+               r.week         AS Week,
+               sum(r.planned_hours) AS Planned,
+               sum(r.actual_hours)  AS Actual
+        ORDER BY s.station_code, r.week
+    """)
+
+    if df.empty:
+        st.warning("No data found.")
+        return
+
+    df["Overloaded"] = df["Actual"] > df["Planned"]
+    df["Label"] = df["StationCode"] + " " + df["Station"]
+
+    # Week filter
+    weeks = sorted(df["Week"].unique())
+    selected_weeks = st.multiselect("Filter by week", weeks, default=weeks)
+    filtered = df[df["Week"].isin(selected_weeks)]
+
+    # Grouped bar chart
+    filtered = filtered.copy()
+    filtered["Label_Week"] = filtered["Label"] + " - " + filtered["Week"].astype(str)
+
+    tick_text = [
+        f'<span style="color: {"#ef4444" if row["Overloaded"] else "#e2e8f0"}">{row["Label_Week"]}</span>'
+        for _, row in filtered.iterrows()
+    ]
+
+    fig = go.Figure()
+    fig.add_trace(go.Bar(
+        name="Planned", x=filtered["Label_Week"],
+        y=filtered["Planned"], marker_color="#3b82f6",
+    ))
+    fig.add_trace(go.Bar(
+        name="Actual", x=filtered["Label_Week"],
+        y=filtered["Actual"],
+        marker_color=["#ef4444" if o else "#22c55e" for o in filtered["Overloaded"]],
+    ))
+    fig.update_layout(
+        barmode="group", template="plotly_dark",
+        title="Station Load: Planned vs Actual",
+        yaxis_title="Hours",
+        height=500, margin=dict(t=50, b=80),
+        legend_title="Metric", showlegend=True,
+        xaxis=dict(
+            title="Station - Week",
+            tickmode="array",
+            tickvals=filtered["Label_Week"],
+            ticktext=tick_text
+        )
+    )
+    st.plotly_chart(fig, use_container_width=True)
+
+    # Summary table
+    with st.expander("Detailed data"):
+        st.dataframe(filtered[["StationCode", "Station", "Week", "Planned", "Actual", "Overloaded"]],
+                      use_container_width=True, hide_index=True)
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# PAGE 3 — Capacity Tracker
+# ══════════════════════════════════════════════════════════════════════════════
+
+def page_capacity_tracker():
+    st.header("Capacity Tracker")
+    st.caption("Weekly capacity (own + hired + overtime) vs planned demand. Deficit weeks in red.")
+
+    df = query_to_df("""
+        MATCH (w:Week)-[r:HAS_CAPACITY]->(f:Factory)
+        RETURN w.week_id       AS Week,
+               r.own_hours     AS Own,
+               r.hired_hours   AS Hired,
+               r.overtime_hours AS Overtime,
+               r.own_hours + r.hired_hours + r.overtime_hours AS TotalCapacity,
+               r.total_planned AS PlannedDemand,
+               r.deficit       AS Deficit
+        ORDER BY w.week_id
+    """)
+
+    if df.empty:
+        st.warning("No data found.")
+        return
+
+    deficit_weeks = len(df[df["Deficit"] < 0])
+    total_deficit = df[df["Deficit"] < 0]["Deficit"].sum()
+
+    c1, c2, c3 = st.columns(3)
+    with c1: kpi("Deficit Weeks", f"{deficit_weeks} / {len(df)}", "weeks over capacity", "red")
+    with c2: kpi("Cumulative Capacity Deficit", f"{total_deficit:+,.0f} h", "summed over deficit weeks", "red")
+    with c3: kpi("Maximum Weekly Deficit", f"{df['Deficit'].min():+,.0f} h", f"in {df.loc[df['Deficit'].idxmin(), 'Week']}", "amber")
+
+    st.markdown("")
+
+    # Color x-axis labels red for deficit weeks (used on the bottom chart)
+    tick_text = [
+        f'<span style="color: {"#ef4444" if row["Deficit"] < 0 else "#e2e8f0"}">{row["Week"]}</span>'
+        for _, row in df.iterrows()
+    ]
+
+    fig = make_subplots(
+        rows=2, cols=1, 
+        shared_xaxes=True, 
+        vertical_spacing=0.1,
+        row_heights=[0.75, 0.25]
+    )
+
+    # Top Chart: Stacked bar (capacity components) + line (demand)
+    fig.add_trace(go.Bar(name="Own Staff",  x=df["Week"], y=df["Own"],      marker_color="#3b82f6"), row=1, col=1)
+    fig.add_trace(go.Bar(name="Hired",      x=df["Week"], y=df["Hired"],    marker_color="#8b5cf6"), row=1, col=1)
+    fig.add_trace(go.Bar(name="Overtime",   x=df["Week"], y=df["Overtime"], marker_color="#f59e0b"), row=1, col=1)
+
+    fig.add_trace(go.Scatter(
+        name="Planned Demand", x=df["Week"], y=df["PlannedDemand"],
+        mode="lines+markers", line=dict(color="#ef4444", width=3, dash="dot"),
+        marker=dict(size=8),
+    ), row=1, col=1)
+
+    # Bottom Chart: Surplus/Deficit Bar
+    fig.add_trace(go.Bar(
+        name="Surplus / Deficit", x=df["Week"], y=df["Deficit"],
+        marker_color=["#ef4444" if d < 0 else "#22c55e" for d in df["Deficit"]],
+        text=[f"{d:+.0f}" for d in df["Deficit"]],
+        textposition="outside",
+        showlegend=False,
+        hovertemplate="Variance: %{y:+.0f}h<extra></extra>"
+    ), row=2, col=1)
+
+    fig.update_layout(
+        barmode="stack", template="plotly_dark",
+        title="Weekly Capacity vs Demand",
+        height=600, margin=dict(t=50, b=40),
+        legend=dict(title="Capacity Type", orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1),
+    )
+    fig.update_yaxes(title_text="Hours", row=1, col=1)
+    fig.update_yaxes(title_text="Variance", row=2, col=1)
+    fig.update_xaxes(
+        title="Week", tickmode="array", 
+        tickvals=df["Week"], ticktext=tick_text, 
+        row=2, col=1
+    )
+    st.plotly_chart(fig, use_container_width=True)
+
+    # Data table
+    with st.expander("Detailed data"):
+        st.dataframe(df, use_container_width=True, hide_index=True)
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# PAGE 4 — Worker Coverage
+# ══════════════════════════════════════════════════════════════════════════════
+
+def page_worker_coverage():
+    st.header("Worker Coverage Matrix")
+    st.caption("Which workers can cover which stations. Single-point-of-failure (SPOF) stations (≤ 1 unique worker) are flagged in red.")
+
+    # Coverage data
+    df = query_to_df("""
+        MATCH (w:Worker)-[:CAN_COVER]->(s:Station)
+        RETURN s.station_code AS StationCode,
+               s.station_name AS Station,
+               collect(DISTINCT w.name) AS Workers,
+               count(DISTINCT w) AS WorkerCount
+        ORDER BY WorkerCount ASC
+    """)
+
+    if df.empty:
+        st.warning("No data found.")
+        return
+
+    spof = df[df["WorkerCount"] <= 1]
+    c1, c2 = st.columns(2)
+    with c1: kpi("Total Stations", len(df), "with assigned coverage", "blue")
+    with c2: kpi("SPOF Stations", len(spof),
+                 ", ".join(spof["Station"].tolist()) if len(spof) > 0 else "None",
+                 "red" if len(spof) > 0 else "green")
+
+    st.markdown("")
+
+    # -------------------------------------------------------------------------
+    # Heatmap Matrix
+    # -------------------------------------------------------------------------
+    st.markdown('<div class="section-header">Worker Certification Matrix</div>', unsafe_allow_html=True)
+    st.caption("A visual overview of certifications. Blue indicates a worker is certified for that station.")
+    
+    matrix_df = query_to_df("""
+        MATCH (w:Worker)-[:CAN_COVER]->(s:Station)
+        RETURN w.name AS Worker, s.station_code AS StationCode, s.station_name AS Station
+    """)
+
+    if not matrix_df.empty:
+        pivot = matrix_df.pivot_table(
+            index=["StationCode", "Station"],
+            columns="Worker",
+            aggfunc="size",
+            fill_value=0,
+        )
+        pivot = pivot.clip(upper=1)
+        pivot = pivot.sort_index()
+        pivot = pivot[sorted(pivot.columns)]
+        
+        y_labels = [f"{idx[0]} - {idx[1]}" for idx in pivot.index]
+        
+        # Hover text matrix
+        hover_text = []
+        for s, row in zip(y_labels, pivot.values):
+            hover_row = []
+            for w, val in zip(pivot.columns, row):
+                status = "Certified" if val == 1 else "Uncertified"
+                hover_row.append(f"Worker: {w}<br>Station: {s}<br>Status: {status}")
+            hover_text.append(hover_row)
+            
+        fig = go.Figure(data=go.Heatmap(
+            z=pivot.values,
+            x=pivot.columns,
+            y=y_labels,
+            colorscale=[[0, "rgba(255, 255, 255, 0.05)"], [1, "#3b82f6"]],
+            showscale=False,
+            xgap=2, ygap=2,
+            hoverinfo="text",
+            text=hover_text
+        ))
+        
+        fig.update_layout(
+            template="plotly_dark",
+            height=400,
+            margin=dict(t=10, b=80, l=180, r=20),
+            xaxis=dict(tickangle=-45, side="bottom"),
+            yaxis=dict(autorange="reversed")
+        )
+        st.plotly_chart(fig, use_container_width=True)
+
+    # Detailed Coverage Table (satisfies the "table" requirement cleanly without horizontal scroll issues)
+    st.markdown('<div class="section-header">Station Coverage Details</div>', unsafe_allow_html=True)
+    display_df = df.copy()
+    display_df["Workers"] = display_df["Workers"].apply(
+        lambda x: ", ".join(x) if isinstance(x, list) else x
+    )
+
+    def highlight_spof(row):
+        if row["WorkerCount"] <= 1:
+            return ["background-color: rgba(239,68,68,0.15)"] * len(row)
+        return [""] * len(row)
+
+    st.dataframe(
+        display_df.style.apply(highlight_spof, axis=1),
+        use_container_width=True, hide_index=True,
+    )
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# PAGE 5 — Load Forecast
+# ══════════════════════════════════════════════════════════════════════════════
+
+def page_load_forecast():
+    st.header("Load Forecast (Week 9)")
+    st.caption("Predictive analysis identifying where production load will exceed station capacity in the coming week.")
+
+    # 1. Get Historical Load + Variance
+    load_df = query_to_df("""
+        MATCH (s:Station)-[l:LOADED_IN]->(w:Week)
+        RETURN s.station_code AS StationCode,
+               s.station_name AS Station,
+               toInteger(substring(w.week_id, 1)) AS WeekNum,
+               l.total_actual AS ActualLoad,
+               l.total_planned AS PlannedLoad,
+               CASE WHEN l.total_planned > 0 
+                    THEN round((l.total_actual - l.total_planned) / l.total_planned * 100, 1) 
+                    ELSE 0.0 END AS VariancePct
+        ORDER BY StationCode, WeekNum
+    """)
+    
+    # 2. Get Graph-Aware Capacity
+    cap_df = query_to_df("""
+        MATCH (w:Worker)-[:CAN_COVER]->(s:Station)
+        WITH DISTINCT w, s
+        MATCH (w)-[:CAN_COVER]->(all_s:Station)
+        WITH w, s, count(DISTINCT all_s) AS total_coverage
+        RETURN s.station_code AS StationCode,
+               s.station_name AS Station,
+               sum(toFloat(w.hours_per_week) / total_coverage) AS Capacity
+        ORDER BY StationCode
+    """)
+
+    if load_df.empty or cap_df.empty:
+        st.warning("No data found. Ensure the graph is seeded.")
+        return
+        
+    load_df["Load"] = load_df["ActualLoad"].fillna(load_df["PlannedLoad"])
+
+    # 3. Process Forecasts
+    forecasts = []
+    
+    # Iterate over all known stations from both load and capacity queries
+    all_stations = sorted(list(set(load_df["StationCode"]).union(set(cap_df["StationCode"]))))
+    
+    for station_code in all_stations:
+        station_data = load_df[load_df["StationCode"] == station_code]
+        cap_series = cap_df[cap_df["StationCode"] == station_code]
+        
+        station_name = station_data["Station"].iloc[0] if not station_data.empty else cap_series["Station"].iloc[0]
+        cap = cap_series["Capacity"].iloc[0] if not cap_series.empty else 0.0
+        
+        if len(station_data) > 1:
+            X = sm.add_constant(station_data["WeekNum"])
+            model = sm.OLS(station_data["Load"], X).fit()
+            pred_9 = model.predict([1, 9])[0]
+        elif len(station_data) == 1:
+            pred_9 = station_data["Load"].iloc[0]
+        else:
+            pred_9 = 0
+            
+        pred_9 = max(0, pred_9)
+        
+        util_pct = (pred_9 / cap * 100) if cap > 0 else (float('inf') if pred_9 > 0 else 0)
+            
+        forecasts.append({
+            "StationCode": station_code,
+            "Station": station_name,
+            "Week9_Forecast": pred_9,
+            "Capacity": cap,
+            "UtilPct": util_pct
+        })
+        
+    forecast_df = pd.DataFrame(forecasts)
+    forecast_df["Status"] = forecast_df.apply(lambda x: "OVERLOAD" if x["Week9_Forecast"] > x["Capacity"] else "SAFE", axis=1)
+    
+    # KPIs
+    overloaded_count = len(forecast_df[forecast_df["Status"] == "OVERLOAD"])
+    avg_util = forecast_df["UtilPct"].replace(float("inf"), float("nan")).mean()  # skip zero-capacity stations
+
+    c1, c2, c3 = st.columns(3)
+    with c1: kpi("Average Factory Utilization", f"{avg_util:.1f}%", "projected for week 9", "blue")
+    with c2: kpi("Critical Stations", overloaded_count, "over capacity in week 9", "red" if overloaded_count > 0 else "green")
+    with c3: kpi("Highest Load", f"{forecast_df['UtilPct'].max():.0f}%", f"at {forecast_df.loc[forecast_df['UtilPct'].idxmax(), 'StationCode']}", "amber")
+
+    st.markdown('<div class="section-header">Global Forecast: Load vs. Capacity (Week 9)</div>', unsafe_allow_html=True)
+    
+    # Global comparison chart
+    fig_global = go.Figure()
+    fig_global.add_trace(go.Bar(
+        name="Projected Load", x=forecast_df["StationCode"], y=forecast_df["Week9_Forecast"],
+        marker_color=["#ef4444" if s == "OVERLOAD" else "#3b82f6" for s in forecast_df["Status"]],
+        hovertemplate="Load: %{y:.1f}h<extra></extra>"
+    ))
+    fig_global.add_trace(go.Scatter(
+        name="Station Capacity", x=forecast_df["StationCode"], y=forecast_df["Capacity"],
+        mode="markers", marker=dict(color="#ffffff", size=12, symbol="line-ew-open", line=dict(width=3)),
+        hovertemplate="Capacity: %{y:.1f}h<extra></extra>"
+    ))
+    fig_global.update_layout(
+        template="plotly_dark", height=350, margin=dict(t=20, b=40, l=10, r=10),
+        xaxis_title="Station Code", yaxis_title="Hours", barmode="group",
+        legend=dict(title="Metric", orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1)
+    )
+    st.plotly_chart(fig_global, use_container_width=True)
+
+    # 4. Station Deep-Dive
+    st.markdown('<div class="section-header">Station Deep-Dive</div>', unsafe_allow_html=True)
+    
+    selected_st = st.selectbox("Select a station to see trend details", 
+                              options=forecast_df["StationCode"].tolist(),
+                              format_func=lambda x: f"{x} - {forecast_df[forecast_df['StationCode']==x]['Station'].iloc[0]}")
+    
+    sd = forecast_df[forecast_df["StationCode"] == selected_st].iloc[0]
+    hist = load_df[load_df["StationCode"] == selected_st]
+    
+    c1, c2 = st.columns([2, 1])
+    
+    with c1:
+        fig_detail = go.Figure()
+        
+        # OLS logic
+        if len(hist) > 1:
+            X = sm.add_constant(hist["WeekNum"])
+            model = sm.OLS(hist["Load"], X).fit()
+            x_range = list(range(1, 10))
+            y_range = model.predict(sm.add_constant(x_range))
+            preds = model.get_prediction(sm.add_constant(x_range))
+            ci = preds.conf_int(alpha=0.1) # 90% confidence
+            y_lower, y_upper = ci[:, 0], ci[:, 1]
+            
+            # Confidence Band
+            fig_detail.add_trace(go.Scatter(
+                x=x_range + x_range[::-1], y=list(y_upper) + list(y_lower)[::-1],
+                fill='toself', fillcolor='rgba(59, 130, 246, 0.1)',
+                line=dict(color='rgba(255,255,255,0)'), name="90% Confidence Interval"
+            ))
+            # Trend
+            fig_detail.add_trace(go.Scatter(x=x_range, y=y_range, mode="lines", 
+                                          line=dict(color="#3b82f6", dash="dash"), name="Trendline"))
+        
+        # Capacity line
+        fig_detail.add_hline(y=sd["Capacity"], line_dash="dot", line_color="#ef4444", 
+                            annotation_text="CAPACITY LIMIT", annotation_position="top left")
+        
+        # Historical
+        fig_detail.add_trace(go.Scatter(x=hist["WeekNum"], y=hist["Load"], mode="lines+markers", 
+                                      marker=dict(color="#ffffff", size=10), name="Historical Load"))
+        
+        # Week 9 Target Point
+        fig_detail.add_trace(go.Scatter(x=[9], y=[sd["Week9_Forecast"]], mode="markers", 
+                                      marker=dict(color="#ef4444" if sd["Status"]=="OVERLOAD" else "#22c55e", size=14, symbol="star"),
+                                      name="W9 Projection"))
+
+        fig_detail.update_layout(
+            template="plotly_dark", height=400, title=f"Load Trend Analysis: {selected_st}",
+            xaxis=dict(title="Week", tickmode="linear", range=[0.5, 9.5]),
+            yaxis=dict(title="Hours"), showlegend=True,
+            legend=dict(title="Metric", orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1)
+        )
+        st.plotly_chart(fig_detail, use_container_width=True)
+        st.caption("💡 **Note**: The shaded area is the 90% confidence interval for the fitted OLS trend (the mean load), so individual weeks can fall outside it.")
+
+        # Variance Trend Chart
+        if not hist.empty and len(hist) > 1:
+            fig_var = px.line(hist, x="WeekNum", y="VariancePct", markers=True,
+                             title=f"Historical Plan Variance (%): {selected_st}",
+                             labels={"VariancePct": "Variance %", "WeekNum": "Week"},
+                             template="plotly_dark", height=200)
+            fig_var.add_hline(y=0, line_dash="dash", line_color="#94a3b8")
+            fig_var.update_traces(line_color="#f59e0b")
+            fig_var.update_layout(margin=dict(t=30, b=20, l=10, r=10))
+            st.plotly_chart(fig_var, use_container_width=True)
+        else:
+            st.info("Insufficient data to show variance trend.")
+
+    with c2:
+        st.markdown(f"### {sd['Status']}")
+        st.write(f"Station **{selected_st}** is projected to reach **{sd['Week9_Forecast']:.1f} hours** in Week 9.")
+        st.write(f"Current capacity limit is **{sd['Capacity']:.1f} hours**.")
+        
+        avg_v = hist["VariancePct"].mean() if not hist.empty else 0.0
+        st.metric("Avg Historical Variance", f"{avg_v:+.1f}%", 
+                  help="Positive means actual hours consistently exceed planned hours.")
+        
+        # A zero-capacity station with forecast load has infinite utilization; show it as "∞%"
+        util_display = f"{sd['UtilPct']:.1f}%" if sd["UtilPct"] != float('inf') else "∞%"
+        delta_display = f"{sd['UtilPct']-100:.1f}%" if sd["UtilPct"] > 100 and sd["UtilPct"] != float('inf') else None
+        
+        st.metric("Projected Utilization", util_display, 
+                  delta=delta_display,
+                  delta_color="inverse")
+        
+        if sd["Status"] == "OVERLOAD":
+            st.error(f"Action Required: Station will exceed capacity by {sd['Week9_Forecast'] - sd['Capacity']:.1f} hours.")
+        else:
+            st.success("No immediate capacity action required.")
+
+    # 5. Global Action Recommendations
+    overloads = forecast_df[forecast_df["Status"] == "OVERLOAD"]
+    healthy = forecast_df[forecast_df["Status"] == "SAFE"].copy()
+    healthy["Surplus"] = healthy["Capacity"] - healthy["Week9_Forecast"]
+    healthy = healthy[healthy["Surplus"] >= 5].sort_values("Surplus", ascending=False)
+
+    if not overloads.empty:
+        st.markdown('<div class="section-header" style="color: #ef4444; margin-top: 3rem;">⚠️ Action Recommendations</div>', unsafe_allow_html=True)
+        st.error(f"**{len(overloads)} station(s) projected to exceed capacity in Week 9.** Immediate action is required.")
+        
+        # Get worker coverage map to make smart, graph-aware recommendations
+        coverage_df = query_to_df("""
+            MATCH (w:Worker)-[:CAN_COVER]->(s:Station)
+            WHERE w.role <> 'Foreman'
+            RETURN w.name AS Worker, s.station_code AS StationCode
+        """)
+        
+        for _, row in overloads.iterrows():
+            target_station = row["StationCode"]
+            deficit = row["Week9_Forecast"] - row["Capacity"]
+            suggestion = f"- **{row['Station']} ({target_station}):** Short by **{deficit:.1f} hours**."
+            
+            # 1. Find workers certified for this overloaded station
+            capable_workers = coverage_df[coverage_df["StationCode"] == target_station]["Worker"].unique()
+            
+            # 2. Find which of these workers also cover a station with surplus capacity
+            reassignment_options = []
+            for worker in capable_workers:
+                # Other stations this worker covers
+                other_stations = coverage_df[
+                    (coverage_df["Worker"] == worker) & 
+                    (coverage_df["StationCode"] != target_station)
+                ]["StationCode"].tolist()
+                
+                # Check if any of these 'other' stations have a surplus
+                worker_surplus_stations = healthy[healthy["StationCode"].isin(other_stations)]
+                if not worker_surplus_stations.empty:
+                    # Sort by highest surplus and take the best one for this worker
+                    best_station = worker_surplus_stations.sort_values("Surplus", ascending=False).iloc[0]
+                    reassignment_options.append(f"**{worker}** (from {best_station['Station']})")
+            
+            # 3. Format and display
+            if reassignment_options:
+                # Show top 3 candidates to keep the UI clean
+                suggestion += f" *Suggestion: Reassign {', '.join(reassignment_options[:3])}*"
+            else:
+                suggestion += " *No cross-trained line workers available at stations with surplus.*"
+                
+            st.markdown(suggestion)
+    else:
+        st.markdown('<div class="section-header" style="color: #22c55e; margin-top: 3rem;">✅ System Healthy</div>', unsafe_allow_html=True)
+        st.success("All stations are projected to be safely within capacity limits for Week 9.")
+
+    with st.expander("Full Forecast Data Table"):
+        st.dataframe(forecast_df.sort_values("UtilPct", ascending=False), use_container_width=True, hide_index=True)
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# PAGE 6 — Self-Test
+# ══════════════════════════════════════════════════════════════════════════════
+
+def run_self_test():
+    """Run the 6 automated checks and return a list of (description, passed, points)."""
+    driver = get_driver()
+    checks = []
+
+    # Check 1: Connection
+    try:
+        with driver.session() as s:
+            s.run("RETURN 1").single()
+        checks.append(("Neo4j connected", True, 3))
+    except Exception:
+        checks.append(("Neo4j connected", False, 3))
+        return checks  # can't continue
+
+    with driver.session() as s:
+        # Check 2: Node count >= 50
+        count = s.run("MATCH (n) RETURN count(n) AS c").single()["c"]
+        checks.append((f"{count} nodes (min: 50)", count >= 50, 3))
+
+        # Check 3: Relationship count >= 100
+        count = s.run("MATCH ()-[r]->() RETURN count(r) AS c").single()["c"]
+        checks.append((f"{count} relationships (min: 100)", count >= 100, 3))
+
+        # Check 4: 6+ distinct node labels
+        count = s.run("CALL db.labels() YIELD label RETURN count(label) AS c").single()["c"]
+        checks.append((f"{count} node labels (min: 6)", count >= 6, 3))
+
+        # Check 5: 8+ distinct relationship types
+        count = s.run(
+            "CALL db.relationshipTypes() YIELD relationshipType RETURN count(relationshipType) AS c"
+        ).single()["c"]
+        checks.append((f"{count} relationship types (min: 8)", count >= 8, 3))
+
+        # Check 6: Variance query returns results
+        result = s.run("""
+            MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+            WHERE r.actual_hours > r.planned_hours * 1.1
+            RETURN p.project_name AS project, s.station_name AS station,
+                   r.planned_hours AS planned, r.actual_hours AS actual
+            LIMIT 10
+        """)
+        rows = [dict(r) for r in result]
+        checks.append((f"Variance query: {len(rows)} results", len(rows) > 0, 5))
+
+    return checks
+
+
+def page_self_test():
+    st.header("Self-Test")
+    st.caption("Automated verification of graph requirements.")
+
+    if st.button("Run Self-Test", type="primary"):
+        with st.spinner("Running checks…"):
+            checks = run_self_test()
+
+        total = 0
+        max_total = 0
+
+        for desc, passed, pts in checks:
+            max_total += pts
+            earned = pts if passed else 0
+            total += earned
+            icon = "PASS" if passed else "FAIL"
+            css  = "check-pass" if passed else "check-fail"
+            st.markdown(
+                f'<span class="{css}">{icon} {desc}</span> — **{earned}/{pts}**',
+                unsafe_allow_html=True,
+            )
+
+        st.markdown("---")
+        color = "check-pass" if total == max_total else "check-fail"
+        st.markdown(
+            f'<h3 class="{color}">SELF-TEST SCORE: {total}/{max_total}</h3>',
+            unsafe_allow_html=True,
+        )
+    else:
+        st.info("Click the button above to run the self-test checks.")
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# Router
+# ══════════════════════════════════════════════════════════════════════════════
+
+if page == "Project Overview":
+    page_project_overview()
+elif page == "Station Load":
+    page_station_load()
+elif page == "Capacity Tracker":
+    page_capacity_tracker()
+elif page == "Worker Coverage":
+    page_worker_coverage()
+elif page == "Load Forecast":
+    page_load_forecast()
+elif page == "Self-Test":
+    page_self_test()
diff --git a/submissions/Touqeer-Hamdani/level6/requirements.txt b/submissions/Touqeer-Hamdani/level6/requirements.txt
new file mode 100644
index 000000000..5a418921c
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level6/requirements.txt
@@ -0,0 +1,6 @@
+streamlit
+neo4j
+python-dotenv
+pandas
+plotly
+statsmodels
diff --git a/submissions/Touqeer-Hamdani/level6/seed_graph.py b/submissions/Touqeer-Hamdani/level6/seed_graph.py
new file mode 100644
index 000000000..ec9ec8597
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level6/seed_graph.py
@@ -0,0 +1,297 @@
+"""
+seed_graph.py — Populate Neo4j with factory production data.
+
+Run once:  python seed_graph.py
+Idempotent: safe to re-run (clears graph, then uses MERGE).
+"""
+
+import os
+import csv
+from neo4j import GraphDatabase
+from dotenv import load_dotenv
+
+load_dotenv()
+
+NEO4J_URI = os.getenv("NEO4J_URI")
+NEO4J_USER = os.getenv("NEO4J_USER")
+NEO4J_PASSWORD = os.getenv("NEO4J_PASSWORD")
+
+DATA_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), "data")
+
+
+# ── Helpers ──────────────────────────────────────────────────────────────────
+
+def read_csv(filename):
+    """Read a CSV file from the data/ directory and return a list of dicts."""
+    filepath = os.path.join(DATA_DIR, filename)
+    with open(filepath, newline="", encoding="utf-8-sig") as f:
+        return list(csv.DictReader(f))
+
+
+def run(session, query, **kwargs):
+    """Run a Cypher query, consume the result, and return its ResultSummary."""
+    return session.run(query, **kwargs).consume()
+
+
+# ── Seeding phases ───────────────────────────────────────────────────────────
+
+def create_constraints(session):
+    """Phase 1: Uniqueness constraints for idempotent MERGE."""
+    constraints = [
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (p:Project)       REQUIRE p.project_id   IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (s:Station)       REQUIRE s.station_code IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (w:Worker)        REQUIRE w.worker_id    IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (wk:Week)         REQUIRE wk.week_id     IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (prod:Product)    REQUIRE prod.product_type IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (c:Certification) REQUIRE c.cert_name    IS UNIQUE",
+        "CREATE CONSTRAINT IF NOT EXISTS FOR (e:Etapp)         REQUIRE e.etapp_name   IS UNIQUE",
+    ]
+    for c in constraints:
+        run(session, c)
+    print("  Created 7 uniqueness constraints")
+
+
+def seed_production(session, rows):
+    """Phase 3: Nodes and relationships from factory_production.csv."""
+
+    # ── Nodes ──
+    run(session, """
+        UNWIND $rows AS row
+        MERGE (p:Project {project_id: row.project_id})
+        SET p.project_number = row.project_number,
+            p.project_name   = row.project_name
+    """, rows=rows)
+
+    run(session, """
+        UNWIND $rows AS row
+        MERGE (prod:Product {product_type: row.product_type}) SET prod.unit = row.unit
+    """, rows=rows)
+
+    run(session, """
+        UNWIND $rows AS row
+        MERGE (s:Station {station_code: row.station_code})
+        SET s.station_name = row.station_name
+    """, rows=rows)
+
+    run(session, """
+        UNWIND $rows AS row
+        MERGE (:Week {week_id: row.week})
+    """, rows=rows)
+
+    run(session, """
+        UNWIND $rows AS row
+        MERGE (:Etapp {etapp_name: row.etapp})
+    """, rows=rows)
+
+    # ── Relationships ──
+    # PRODUCES — one per unique (project_id, product_type)
+    run(session, """
+        UNWIND $rows AS row
+        MATCH (p:Project {project_id: row.project_id})
+        MATCH (prod:Product {product_type: row.product_type})
+        MERGE (p)-[r:PRODUCES]->(prod)
+        SET r.quantity    = toInteger(row.quantity),
+            r.unit_factor = toFloat(row.unit_factor),
+            r.unit        = row.unit
+    """, rows=rows)
+
+    # SCHEDULED_AT — composite key includes product_type to avoid P05/018 collision
+    run(session, """
+        UNWIND $rows AS row
+        MATCH (p:Project {project_id: row.project_id})
+        MATCH (s:Station {station_code: row.station_code})
+        MERGE (p)-[r:SCHEDULED_AT {
+            week:         row.week,
+            etapp:        row.etapp,
+            bop:          row.bop,
+            product_type: row.product_type
+        }]->(s)
+        SET r.planned_hours   = toFloat(row.planned_hours),
+            r.actual_hours    = toFloat(row.actual_hours),
+            r.completed_units = toInteger(row.completed_units),
+            r.variance_pct    = CASE
+                WHEN toFloat(row.planned_hours) > 0
+                THEN round((toFloat(row.actual_hours) - toFloat(row.planned_hours))
+                     / toFloat(row.planned_hours) * 100, 1)
+                ELSE 0.0
+            END
+    """, rows=rows)
+
+    # ACTIVE_IN — project ↔ week
+    run(session, """
+        UNWIND $rows AS row
+        MATCH (p:Project {project_id: row.project_id})
+        MATCH (wk:Week {week_id: row.week})
+        MERGE (p)-[:ACTIVE_IN]->(wk)
+    """, rows=rows)
+
+    # IN_PHASE — project ↔ etapp
+    run(session, """
+        UNWIND $rows AS row
+        MATCH (p:Project {project_id: row.project_id})
+        MATCH (e:Etapp {etapp_name: row.etapp})
+        MERGE (p)-[:IN_PHASE]->(e)
+    """, rows=rows)
+
+    print(f"  Loaded {len(rows)} production rows → nodes + relationships")
+
+
+def seed_workers(session, rows):
+    """Phase 4: Nodes and relationships from factory_workers.csv."""
+
+    # Pre-process: split comma-separated fields in Python
+    worker_data = []
+    for w in rows:
+        certs = [c.strip() for c in w["certifications"].split(",") if c.strip()]
+        cover = [s.strip() for s in w["can_cover_stations"].split(",") if s.strip()]
+        worker_data.append({
+            "worker_id":       w["worker_id"],
+            "name":            w["name"],
+            "role":            w["role"],
+            "primary_station": w["primary_station"],
+            "hours_per_week":  int(w["hours_per_week"]),
+            "type":            w["type"],
+            "certifications":  certs,
+            "can_cover":       cover,
+        })
+
+    # Worker nodes
+    run(session, """
+        UNWIND $rows AS row
+        MERGE (w:Worker {worker_id: row.worker_id})
+        SET w.name           = row.name,
+            w.role           = row.role,
+            w.hours_per_week = row.hours_per_week,
+            w.type           = row.type
+    """, rows=worker_data)
+
+    # Certification nodes + HOLDS
+    run(session, """
+        UNWIND $rows AS row
+        MATCH (w:Worker {worker_id: row.worker_id})
+        UNWIND row.certifications AS cert
+        MERGE (c:Certification {cert_name: cert})
+        MERGE (w)-[:HOLDS]->(c)
+    """, rows=worker_data)
+
+    # WORKS_AT — skip W11 (primary_station = "all")
+    run(session, """
+        UNWIND $rows AS row
+        WITH row WHERE row.primary_station <> 'all'
+        MATCH (w:Worker {worker_id: row.worker_id})
+        MATCH (s:Station {station_code: row.primary_station})
+        MERGE (w)-[:WORKS_AT]->(s)
+    """, rows=worker_data)
+
+    # CAN_COVER
+    run(session, """
+        UNWIND $rows AS row
+        MATCH (w:Worker {worker_id: row.worker_id})
+        UNWIND row.can_cover AS sc
+        MATCH (s:Station {station_code: sc})
+        MERGE (w)-[:CAN_COVER]->(s)
+    """, rows=worker_data)
+
+    print(f"  Loaded {len(rows)} workers → Workers, Certifications + relationships")
+
+
+def seed_capacity(session, rows):
+    """Phase 5: HAS_CAPACITY relationships from factory_capacity.csv."""
+
+    cap_data = []
+    for c in rows:
+        cap_data.append({
+            "week":           c["week"],
+            "own_hours":      int(c["own_hours"]),
+            "hired_hours":    int(c["hired_hours"]),
+            "overtime_hours": int(c["overtime_hours"]),
+            "total_planned":  int(c["total_planned"]),
+            "deficit":        int(c["deficit"]),
+        })
+
+    run(session, """
+        UNWIND $rows AS row
+        MATCH (f:Factory)
+        MERGE (wk:Week {week_id: row.week})
+        MERGE (wk)-[r:HAS_CAPACITY]->(f)
+        SET r.own_hours      = row.own_hours,
+            r.hired_hours    = row.hired_hours,
+            r.overtime_hours = row.overtime_hours,
+            r.total_planned  = row.total_planned,
+            r.deficit        = row.deficit
+    """, rows=cap_data)
+
+    print(f"  Loaded {len(rows)} capacity rows → HAS_CAPACITY")
+
+
+def compute_loaded_in(session):
+    """Phase 6: Aggregate SCHEDULED_AT into LOADED_IN per (station, week)."""
+    run(session, """
+        MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+        WITH s, r.week AS week,
+             sum(r.planned_hours) AS tp,
+             sum(r.actual_hours)  AS ta
+        MATCH (wk:Week {week_id: week})
+        MERGE (s)-[l:LOADED_IN]->(wk)
+        SET l.total_planned = tp,
+            l.total_actual  = ta
+    """)
+    print("  Computed LOADED_IN aggregations")
+
+
+# ── Main ─────────────────────────────────────────────────────────────────────
+
+def main():
+    print("=" * 55)
+    print("  Factory Knowledge Graph — Seeder")
+    print("=" * 55)
+
+    driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD))
+    driver.verify_connectivity()
+    print("Connected to Neo4j\n")
+
+    # Read CSVs
+    production = read_csv("factory_production.csv")
+    workers    = read_csv("factory_workers.csv")
+    capacity   = read_csv("factory_capacity.csv")
+
+    with driver.session() as session:
+        # Phase 0: Clear
+        run(session, "MATCH (n) DETACH DELETE n")
+        print("Cleared existing graph\n")
+
+        # Phase 1: Constraints
+        create_constraints(session)
+
+        # Phase 2: Factory singleton
+        run(session, 'MERGE (:Factory {factory_name: "VSAB Stålbyggnad"})')
+        print("  Created Factory node")
+
+        # Phase 3–6
+        seed_production(session, production)
+        seed_workers(session, workers)
+        seed_capacity(session, capacity)
+        compute_loaded_in(session)
+
+    # ── Summary ──
+    with driver.session() as session:
+        nodes     = session.run("MATCH (n) RETURN count(n) AS c").single()["c"]
+        rels      = session.run("MATCH ()-[r]->() RETURN count(r) AS c").single()["c"]
+        labels    = session.run("CALL db.labels() YIELD label RETURN collect(label) AS l").single()["l"]
+        rel_types = session.run(
+            "CALL db.relationshipTypes() YIELD relationshipType RETURN collect(relationshipType) AS t"
+        ).single()["t"]
+
+    print(f"\n{'=' * 55}")
+    print("  Seeding complete!")
+    print(f"     Nodes:              {nodes}")
+    print(f"     Relationships:      {rels}")
+    print(f"     Labels ({len(labels)}):        {labels}")
+    print(f"     Rel types ({len(rel_types)}):     {rel_types}")
+    print(f"{'=' * 55}")
+
+    driver.close()
+
+
+if __name__ == "__main__":
+    main()
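
Reviewer note (not part of the patch): the two derived values in seed_graph.py, `variance_pct` on SCHEDULED_AT and the per-(station, week) sums on LOADED_IN, can be sanity-checked offline against factory_production.csv without a Neo4j instance. The sketch below is only an approximation of the Cypher logic under stated assumptions: the function names `variance_pct` and `loaded_in` are hypothetical helpers, the column names are taken from the seeder, and Python's `round` may differ from Cypher's `round(x, 1)` on exact ties.

```python
from collections import defaultdict


def variance_pct(planned, actual):
    """Percent deviation of actual from planned hours; 0.0 when nothing was planned.

    Mirrors the CASE expression in the SCHEDULED_AT query (hypothetical helper).
    """
    if planned > 0:
        return round((actual - planned) / planned * 100, 1)
    return 0.0


def loaded_in(rows):
    """Sum planned and actual hours per (station_code, week).

    Mirrors the grouping done by compute_loaded_in (hypothetical helper).
    `rows` is a list of dicts as produced by csv.DictReader.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for row in rows:
        key = (row["station_code"], row["week"])
        totals[key][0] += float(row["planned_hours"])
        totals[key][1] += float(row["actual_hours"])
    return {key: tuple(pair) for key, pair in totals.items()}
```

Feeding the output of `read_csv("factory_production.csv")` into `loaded_in` and comparing it with the `total_planned` and `total_actual` properties returned from the graph would be one way to confirm that the seeder aggregated every row.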