diff --git a/submissions/Touqeer-Hamdani/level5/answers.md b/submissions/Touqeer-Hamdani/level5/answers.md
new file mode 100644
index 000000000..5ee08a4fc
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level5/answers.md
@@ -0,0 +1,322 @@
+# Level 5 — Graph Thinking
+
+**Author:** Touqeer Hamdani
+**Date:** May 2026
+
+---
+
+## Q1. Model It (20 pts)
+
+### Graph Schema
+
+> Full diagram: [`schema.md`](./schema.md)
+
+The graph schema is designed around the 3 factory CSVs and captures the full production planning domain — projects, what they produce, where they're built, who builds them, and when.
+
+### Node Labels (8)
+
+| Label | Source | Key Properties | Count |
+|-------|--------|----------------|-------|
+| **Project** | production.csv → `project_id`, `project_number`, `project_name` | project_id, project_number, project_name | 8 |
+| **Product** | production.csv → `product_type`, `unit` | product_type, unit | 7 |
+| **Station** | production.csv → `station_code`, `station_name` | station_code, station_name | 10 |
+| **Worker** | workers.csv → `worker_id`, `name` | worker_id, name, role, hours_per_week, type | 14 |
+| **Week** | capacity.csv → `week` | week_id | 8 |
+| **Factory** | Implicit overall plant | factory_name | 1 |
+| **Certification** | workers.csv → `certifications` (split by comma) | cert_name | 23 unique |
+| **Etapp** | production.csv → `etapp` | etapp_name | 2 (ET1, ET2) |
+
+### Relationship Types (9)
+
+| Relationship | Direction | Properties |
+|-------------|-----------|------------|
+| **PRODUCES** | `(Project)→(Product)` | `quantity`, `unit_factor`, `unit` |
+| **SCHEDULED_AT** | `(Project)→(Station)` | `planned_hours`, `actual_hours`, `completed_units`, `week`, `etapp`, `bop`, `product_type`, `variance_pct` |
+| **ACTIVE_IN** | `(Project)→(Week)` | — |
+| **IN_PHASE** | `(Project)→(Etapp)` | — |
+| **WORKS_AT** | `(Worker)→(Station)` | — (primary station assignment) |
+| **CAN_COVER** | `(Worker)→(Station)` | — (cross-trained coverage) |
+| **HOLDS** | `(Worker)→(Certification)` | — |
+| **LOADED_IN** | `(Station)→(Week)` | `total_planned`, `total_actual` |
+| **HAS_CAPACITY** | `(Week)→(Factory)` | `own_hours`, `hired_hours`, `overtime_hours`, `total_planned`, `deficit` |
+
+### Data-Carrying Relationships (4)
+
+1. **PRODUCES** — Each project-to-product edge carries `{quantity: 600, unit_factor: 1.77, unit: "meter"}`, capturing the production spec. This lets you query things like "which projects produce more than 500 meters of IQB?" directly from the relationship.
+
+2. **SCHEDULED_AT** — The core operational edge. Each project-station-week combination carries `{planned_hours: 48.0, actual_hours: 45.2, completed_units: 28, week: "w1", etapp: "ET1", bop: "BOP1"}`. This is the richest relationship in the graph — it's where all the variance analysis lives, and where we track the phase (`etapp`, `bop`) of the work.
+
+3. **LOADED_IN** — Aggregated station load per week: `{total_planned: 393, total_actual: 410}`. Enables capacity-vs-demand queries at the station level without re-aggregating from SCHEDULED_AT every time. *(Note: these properties are calculated by aggregating `SCHEDULED_AT` edges during graph construction, as `factory_capacity.csv` only provides factory-wide totals).*
+
+4. **HAS_CAPACITY** — Links each week to the global factory, carrying the `{own_hours, hired_hours, overtime_hours, total_planned, deficit}` workforce metrics straight out of `factory_capacity.csv`. This mirrors the relationship pattern requested in the L6 instructions.
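+
+As a sketch of the query mentioned under PRODUCES, and assuming the edge stores `unit` as `"meter"` exactly as in the sample values above, "more than 500 meters of IQB" could look like this:
+
+```cypher
+// Projects producing more than 500 meters of IQB.
+// Assumption: unit strings and product codes match the sample values shown above.
+MATCH (p:Project)-[r:PRODUCES]->(prod:Product {product_type: "IQB"})
+WHERE r.unit = "meter" AND r.quantity > 500
+RETURN p.project_name AS Project, r.quantity AS Meters
+ORDER BY Meters DESC
+```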
+
+### Design Decisions
+
+- **Certification as a node** (not a Worker property): Workers share certifications (e.g., multiple workers hold MIG/MAG). Modeling it as a node enables queries like "find all workers certified for TIG welding" with a single hop instead of string parsing.
+- **Etapp and BOP as properties on SCHEDULED_AT**: Since a single project can move through different BOPs (phases) across different stations and weeks, treating `bop` and `etapp` as edge properties accurately models *when and where* that phase occurs, rather than applying a blanket phase to the entire project.
+- **SCHEDULED_AT carries `week` as a property** rather than routing through Week nodes: This keeps the most queried relationship (planned vs actual hours) as a direct Project→Station edge. The separate ACTIVE_IN relationship to Week handles the temporal dimension when needed.
+- **Etapp as a node** (for L6 compliance): The L6 spec explicitly requires `Etapp` as a node label. From a pure design perspective, etapp works better as an edge property on SCHEDULED_AT (only 2 values, no properties of its own), and we keep it there for direct querying. The `Etapp` node + `IN_PHASE` relationship is included to meet the L6 minimum graph requirements.
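+
+To illustrate the single-hop certification lookup described in the first decision, here is a minimal sketch. The exact `cert_name` strings are not shown in this document, so the case-insensitive substring match on "tig" is an assumption:
+
+```cypher
+// Workers holding any certification whose name mentions TIG.
+// Assumption: TIG certifications contain the text "tig" in cert_name.
+MATCH (w:Worker)-[:HOLDS]->(c:Certification)
+WHERE toLower(c.cert_name) CONTAINS "tig"
+RETURN w.name AS Worker, collect(c.cert_name) AS MatchingCertifications
+ORDER BY Worker
+```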
+
+### Implementation Notes for L6
+
+- **SCHEDULED_AT creates parallel edges**: A single (Project, Station) pair can have multiple SCHEDULED_AT edges — one per week/etapp/bop/product combination. For example, P01→Station 011 appears in both w1 and w2. Additionally, P05→Station 018 has two rows in the same week (w1) with the same etapp/bop (ET2/BOP3) but different product types (SB and SD). In `seed_graph.py`, use `MERGE` with a composite key including `week`, `etapp`, `bop`, **and** `product_type` to ensure idempotency without data loss:
+ ```cypher
+ MERGE (p:Project {project_id: row.project_id})
+ MERGE (s:Station {station_code: row.station_code})
+ MERGE (p)-[r:SCHEDULED_AT {week: row.week, etapp: row.etapp, bop: row.bop, product_type: row.product_type}]->(s)
+ SET r.planned_hours = toFloat(row.planned_hours),
+ r.actual_hours = toFloat(row.actual_hours),
+ r.completed_units = toInteger(row.completed_units)
+ ```
+- **PRODUCES needs deduplication**: The same (Project, Product) pair appears across many CSV rows (different weeks/stations), but the production spec (`quantity`, `unit_factor`, `unit`) is constant per pair. Create **one** PRODUCES edge per unique `(project_id, product_type)` — either deduplicate in Python before loading, or use `MERGE` on the pair:
+ ```cypher
+ MERGE (p:Project {project_id: row.project_id})
+ MERGE (prod:Product {product_type: row.product_type})
+ MERGE (p)-[r:PRODUCES]->(prod)
+ SET r.quantity = toInteger(row.quantity),
+ r.unit_factor = toFloat(row.unit_factor),
+ r.unit = row.unit
+ ```
+
+---
+
+## Q2. Why Not Just SQL? (20 pts)
+
+*Prompt: "Which workers are certified to cover Station 016 (Gjutning) when Per Gustafsson is on vacation, and which projects would be affected?"*
+
+> **Data Reality Check:** In `factory_workers.csv`, the worker at Station 016 is actually named **Per Hansen** (W07), not Per Gustafsson. The queries below reflect the actual data.
+
+### 1. The SQL Query
+Assuming a standard relational schema with normalized tables (`Workers`, `Worker_Coverage`, `Stations`, `Project_Schedules`, `Projects`), the query joins four of them (`Stations` is not needed because the station code is filtered directly on the junction table):
+
+```sql
+SELECT
+ w.name AS CoveringWorker,
+ GROUP_CONCAT(DISTINCT p.project_name) AS AffectedProjects
+FROM Workers w
+JOIN Worker_Coverage wc ON w.worker_id = wc.worker_id
+JOIN Project_Schedules ps ON wc.station_code = ps.station_code
+JOIN Projects p ON ps.project_id = p.project_id
+WHERE wc.station_code = '016'
+ AND w.name != 'Per Hansen'
+GROUP BY w.name;
+```
+
+### 2. The Cypher Query
+Using our graph schema, the query becomes a visual representation of the path: `Worker → Station ← Project`:
+
+```cypher
+MATCH (w:Worker)-[:CAN_COVER]->(s:Station {station_code: "016"})<-[:SCHEDULED_AT]-(p:Project)
+WHERE w.name <> "Per Hansen"
+RETURN w.name AS CoveringWorker, collect(DISTINCT p.project_name) AS AffectedProjects
+```
+
+### 3. What the graph makes obvious that SQL hides
+SQL forces you to think about database mechanics: resolving foreign keys across junction tables just to traverse a simple real-world relationship. The Cypher version hides those storage mechanics and exposes the topology directly, mirroring how a planner pictures the factory floor: "Find workers who point to this station, and find projects that point to this station."
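+
+The same path can be extended by one hop to answer the "certified" part of the prompt. This is only a sketch: it assumes W07 has a `WORKS_AT` edge to Station 016 and anchors on `worker_id` rather than on the name.
+
+```cypher
+// Start from the absent worker's primary station, then fan out to cover workers.
+MATCH (absent:Worker {worker_id: "W07"})-[:WORKS_AT]->(s:Station)
+MATCH (cover:Worker)-[:CAN_COVER]->(s)
+WHERE cover <> absent
+// Certifications are optional so uncertified cover workers still show up.
+OPTIONAL MATCH (cover)-[:HOLDS]->(c:Certification)
+WITH s, cover, collect(DISTINCT c.cert_name) AS certs
+OPTIONAL MATCH (p:Project)-[:SCHEDULED_AT]->(s)
+RETURN cover.name AS CoveringWorker,
+       certs AS Certifications,
+       collect(DISTINCT p.project_name) AS AffectedProjects
+```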
+
+---
+
+## Q3. Spot the Bottleneck (20 pts)
+
+### 1. Identifying the Overload
+
+From `factory_capacity.csv`, five out of eight weeks show capacity deficits:
+
+| Week | Total Capacity | Total Planned | Deficit |
+|------|---------------|---------------|---------|
+| w1 | 480 | 612 | **-132** |
+| w2 | 520 | 645 | **-125** |
+| w4 | 500 | 550 | **-50** |
+| w6 | 440 | 520 | **-80** |
+| w7 | 520 | 600 | **-80** |
+
+Using `factory_production.csv` to drill into the two worst weeks (w1 and w2):
+
+**Volume Bottleneck (Station 011):** Station 011 (FS IQB) is the primary structural bottleneck. In w1, it is scheduled to handle work from 7 projects simultaneously (P01, P02, P03, P04, P05, P07, P08). As the entry point of the manufacturing pipeline, it creates a massive initial capacity constraint.
+
+**Volume Driver (Project P05):** Project P05 (Sjukhus Linköping ET2) is the largest individual contributor (1200 meters of IQB). It heavily loads the early-stage stations in w1.
+
+**Efficiency Overruns (Station 016):** While 011 causes deficits via sheer scheduled volume, Station 016 (Gjutning / Casting) causes deficits through poor execution efficiency. The table below lists the worst overruns by percentage (actual vs planned hours); Station 016 accounts for three of the five rows:
+
+| Project | Station | Week | Planned | Actual | Variance |
+|---------|---------|------|---------|--------|----------|
+| P03 | 016 Gjutning | w2 | 28.0 | 35.0 | **+25%** |
+| P04 | 018 SB B/F-hall | w1 | 19.0 | 22.0 | **+16%** |
+| P05 | 016 Gjutning | w2 | 35.0 | 40.0 | **+14%** |
+| P03 | 014 Svets o montage | w1 | 42.0 | 48.0 | **+14%** |
+| P08 | 016 Gjutning | w3 | 22.0 | 25.0 | **+14%** |
+
+Station 016 appears repeatedly in the worst overruns. Therefore, the factory capacity deficit is a dual problem: structural schedule overload at the start of the pipeline (011), and severe execution overruns at the finishing stages (016).
+
+### 2. Cypher Query
+
+```cypher
+MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+WHERE r.planned_hours > 0 AND r.actual_hours > r.planned_hours * 1.1
+RETURN s.station_name AS Station,
+ collect({
+ project: p.project_name,
+ variance_pct: round((r.actual_hours - r.planned_hours) / r.planned_hours * 100, 1)
+ }) AS Overruns
+```
+
+### 3. Modeling the Alert as a Graph Pattern
+
+**Approach: Store `variance_pct` as a numeric property on SCHEDULED_AT.**
+
+During graph seeding, compute and store the variance percentage on each scheduling edge:
+
+```cypher
+SET r.variance_pct = round((r.actual_hours - r.planned_hours) / r.planned_hours * 100, 1)
+```
+
+This means the threshold is applied at **query time**, not at seed time — making it fully flexible:
+
+```cypher
+// 10% threshold for alerts
+MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+WHERE r.variance_pct > 10
+RETURN s.station_name, p.project_name, r.variance_pct ORDER BY r.variance_pct DESC;
+
+// 5% threshold for Q4's hybrid query (finding well-executed projects)
+MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+WHERE r.variance_pct < 5
+RETURN p.project_name, avg(r.variance_pct)
+```
+
+**Why this over a `(:Bottleneck)` node or a boolean flag:**
+- A dedicated `(:Bottleneck)` node adds schema complexity (extra nodes + relationships) for what is essentially a simple numeric comparison on existing data.
+- A boolean `overrun: true/false` flag loses the magnitude: an 11% overrun and a 50% overrun both say `true`, and changing the threshold requires re-seeding.
+- A numeric `variance_pct` preserves full fidelity, keeps the data where it naturally belongs (on the scheduling edge), and lets dashboards apply any threshold on the fly.
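+
+One way a dashboard could apply "any threshold on the fly" is to pass it as a query parameter. A minimal sketch, where the parameter name `$threshold` is an assumption:
+
+```cypher
+// Count overruns per station for a caller-supplied threshold (in percent).
+MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+WHERE r.variance_pct > $threshold
+RETURN s.station_code AS StationCode,
+       s.station_name AS Station,
+       count(r) AS Overruns,
+       max(r.variance_pct) AS WorstVariancePct
+ORDER BY Overruns DESC, WorstVariancePct DESC
+```
+
+Running it with `$threshold = 10` reproduces the alert view, and a stricter or looser value needs no re-seeding.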
+
+---
+
+## Q4. Vector + Graph Hybrid (20 pts)
+
+*Prompt text:* "450 meters of IQB beams for a hospital extension in Linköping, similar scope to previous hospital projects, tight timeline"
+
+### 1. What to Embed
+There are two ways to handle this, ranging from a simple baseline to a robust production system:
+
+**Approach A: Composite Description (Baseline & Simplicity)**
+The simplest method is to create a single composite text block for each project combining its name, location, building type, and product scope (e.g., `"Sjukhus Linköping ET2, hospital, Linköping, 1200m IQB..."`). We embed this entire paragraph. This captures the overall semantic context well enough for basic similarity searches.
+
+**Approach B: Metadata Extraction & Filtering (Robust Precision)**
+Relying entirely on a single embedding can sometimes be risky (e.g., the model might heavily weight "tight timeline" and return a project from the wrong city). A more precise, production-grade approach is to use an LLM to extract structured metadata from the free-text query (e.g., `location: "Linköping"`, `material: "IQB beams"`). We then use those extracted properties to perform exact comparisons and use them as **hard graph filters**, relying on the vector embedding purely for the fuzzy semantic matching of the remaining context.
+
+*For the L5/L6 scope, Approach A is the standard expected baseline, but Approach B represents a more advanced architecture.*
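+
+A rough sketch of how Approach B's hard filters might sit on top of the vector search. The schema above has no location property, so matching `$location` against `project_name` is an assumption, as are the parameter names and the over-fetch of 20 candidates:
+
+```cypher
+// Over-fetch candidates, then apply the LLM-extracted values as exact filters.
+CALL db.index.vector.queryNodes('project_embeddings', 20, $queryEmbedding)
+YIELD node AS candidate, score
+WHERE candidate.project_name CONTAINS $location          // e.g. "Linköping"
+MATCH (candidate)-[:PRODUCES]->(:Product {product_type: $productType})  // e.g. "IQB"
+RETURN candidate.project_name AS ReferenceProject, score AS SimilarityScore
+ORDER BY score DESC
+LIMIT 5
+```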
+
+### 2. The Hybrid Query
+This query performs a two-stage pipeline: it uses Neo4j's vector index to find semantically similar projects, then traverses the graph and keeps only candidates that have IQB-pipeline station runs less than 5% over plan.
+
+```cypher
+// Stage 1: Vector Search for top 5 semantic matches
+CALL db.index.vector.queryNodes('project_embeddings', 5, $queryEmbedding)
+YIELD node AS candidate, score
+
+// Stage 2: Graph Traversal for operational quality
+MATCH (candidate)-[r:SCHEDULED_AT]->(s:Station)
+WHERE s.station_code IN ["011", "012", "013", "014", "016", "017"] // IQB pipeline stations
+  AND r.variance_pct < 5 // Keep only station runs that ran less than 5% over plan
+
+RETURN candidate.project_name AS ReferenceProject,
+ score AS SimilarityScore,
+ collect(DISTINCT s.station_name) AS StationsUsed
+ORDER BY score DESC
+```
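+
+The query above assumes a vector index named `project_embeddings` already exists. A minimal sketch of creating it on a recent Neo4j 5 release follows; the `embedding` property name, the 1536 dimensions, and cosine similarity are assumptions that depend on the embedding model chosen:
+
+```cypher
+// Create the vector index once, before running the hybrid query.
+CREATE VECTOR INDEX project_embeddings IF NOT EXISTS
+FOR (p:Project) ON p.embedding
+OPTIONS {indexConfig: {
+  `vector.dimensions`: 1536,
+  `vector.similarity_function`: 'cosine'
+}}
+```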
+
+### 3. Why this is better than just filtering by product type
+If we only filtered the database by `product_type = 'IQB'`, we would return seven of the factory's eight projects (P01–P06, P08), which is too broad to guide planning.
+
+The hybrid approach provides two crucial layers of intelligence:
+1. **The Vector Layer** captures human context. A "hospital extension in Linköping" is semantically similar to past project P05 ("Sjukhus Linköping ET2") due to building type and location, whereas a standard filter would treat it exactly the same as a parking garage in Helsingborg (P04).
+2. **The Graph Layer** adds operational evidence. By traversing the `SCHEDULED_AT` edges and checking `variance_pct` (our property from Q3), we keep only candidates with station runs that stayed close to plan, and the returned stations show where the match executed well, which makes it a more trustworthy baseline for scheduling the new request.
+
+---
+
+## Q5. Your L6 Plan (20 pts)
+
+### 1. Node Labels → CSV Column Mappings
+
+| Node Label | Source | CSV Columns | Key | Count |
+|------------|--------|-------------|-----|-------|
+| **Project** | production.csv | `project_id`, `project_number`, `project_name` | `project_id` | 8 |
+| **Product** | production.csv | `product_type`, `unit` | `product_type` | 7 |
+| **Station** | production.csv | `station_code`, `station_name` | `station_code` | 10 |
+| **Worker** | workers.csv | `worker_id`, `name`, `role`, `hours_per_week`, `type` | `worker_id` | 14 |
+| **Week** | capacity.csv | `week` | `week` | 8 |
+| **Factory** | Implicit | — (single node) | — | 1 |
+| **Certification** | workers.csv | `certifications` (comma-split) | `cert_name` | 23 |
+| **Etapp** | production.csv | `etapp` | `etapp_name` | 2 |
+
+### 2. Relationship Types → What Creates Them
+
+| Relationship | Created By | Properties |
+|---|---|---|
+| **PRODUCES** | `MERGE` on unique `(project_id, product_type)` pairs from production.csv | `quantity`, `unit_factor`, `unit` |
+| **SCHEDULED_AT** | Each row of production.csv → `MERGE` with composite key `{week, etapp, bop, product_type}` | `planned_hours`, `actual_hours`, `completed_units`, `week`, `etapp`, `bop`, `product_type`, `variance_pct` |
+| **ACTIVE_IN** | Distinct `(project_id, week)` pairs from production.csv | — |
+| **IN_PHASE** | Distinct `(project_id, etapp)` pairs from production.csv | — |
+| **WORKS_AT** | workers.csv → `primary_station` column | — |
+| **CAN_COVER** | workers.csv → `can_cover_stations` (comma-split, one edge per station) | — |
+| **HOLDS** | workers.csv → `certifications` (comma-split, one edge per cert) | — |
+| **LOADED_IN** | Aggregated from SCHEDULED_AT per (station, week) during seeding | `total_planned`, `total_actual` |
+| **HAS_CAPACITY** | Each row of capacity.csv | `own_hours`, `hired_hours`, `overtime_hours`, `total_planned`, `deficit` |
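+
+As a sketch of the `LOADED_IN` aggregation step, and assuming all `SCHEDULED_AT` edges are already loaded and that `r.week` holds the same identifiers as `Week.week_id`, the seed script could run something like:
+
+```cypher
+// Roll up per-row hours into one LOADED_IN edge per (station, week).
+MATCH (:Project)-[r:SCHEDULED_AT]->(s:Station)
+WITH s, r.week AS week,
+     sum(r.planned_hours) AS planned,
+     sum(r.actual_hours) AS actual
+MERGE (w:Week {week_id: week})
+MERGE (s)-[l:LOADED_IN]->(w)
+SET l.total_planned = planned,
+    l.total_actual = actual
+```
+
+Because the totals are assigned with `SET` on a merged edge, re-running the statement overwrites rather than double-counts.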
+
+#### Seed Script Constraints & Idiosyncrasies
+- **Uniqueness Constraints:** `MERGE` alone makes re-runs idempotent, but uniqueness constraints guard against duplicate keys and give `MERGE` an index to look up by. The script should create them before loading any rows:
+  `CREATE CONSTRAINT IF NOT EXISTS FOR (p:Project) REQUIRE p.project_id IS UNIQUE` (and similarly for Station, Worker, Week, Product, Certification, and Etapp).
+- **Foreman Assignment:** Worker W11 (Victor Elm) is listed as a Foreman with `primary_station = "all"`. The seed script must handle `"all"` correctly (either by skipping the `WORKS_AT` edge and relying solely on his `can_cover_stations` list, or by explicitly creating edges to all 10 stations) to avoid creating a junk station node named "all".
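+
+One possible way to implement the "skip the edge" option for the foreman is sketched below. It assumes the worker rows are passed in as a `$rows` list parameter and that Worker and Station nodes were created in an earlier step:
+
+```cypher
+// $rows: list of maps parsed from factory_workers.csv (parameter name is an assumption)
+UNWIND $rows AS row
+MATCH (w:Worker {worker_id: row.worker_id})
+WITH w, row
+WHERE row.primary_station <> "all"   // skip the foreman's placeholder value
+// MATCH rather than MERGE, so an unknown code can never create a Station node
+MATCH (s:Station {station_code: row.primary_station})
+MERGE (w)-[:WORKS_AT]->(s)
+```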
+
+### 3. Streamlit Dashboard Panels (4 + Self-Test)
+
+#### Page 1: Project Overview (10 pts)
+A summary table showing all 8 projects with total planned hours, total actual hours, variance %, and products involved.
+
+```cypher
+MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+WITH p, sum(r.planned_hours) AS planned, sum(r.actual_hours) AS actual
+MATCH (p)-[:PRODUCES]->(prod:Product)
+RETURN p.project_name AS Project, planned AS PlannedHours, actual AS ActualHours,
+ round((actual - planned) / planned * 100, 1) AS VariancePct,
+ collect(DISTINCT prod.product_type) AS Products
+ORDER BY p.project_id
+```
+
+#### Page 2: Station Load (10 pts)
+Interactive Plotly bar chart showing hours per station across weeks. Stations where actual > planned are highlighted in red.
+
+```cypher
+MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+RETURN s.station_code AS StationCode, s.station_name AS Station, r.week AS Week,
+ sum(r.planned_hours) AS Planned, sum(r.actual_hours) AS Actual
+ORDER BY StationCode, Week
+```
+
+#### Page 3: Capacity Tracker (10 pts)
+Weekly capacity (own + hired + overtime) vs total planned demand. Deficit weeks are color-coded red.
+
+```cypher
+MATCH (w:Week)-[r:HAS_CAPACITY]->(f:Factory)
+RETURN w.week_id AS Week,
+ r.own_hours + r.hired_hours + r.overtime_hours AS TotalCapacity,
+ r.total_planned AS PlannedDemand,
+ r.deficit AS Deficit
+ORDER BY w.week_id
+```
+
+#### Page 4: Worker Coverage (10 pts)
+Matrix showing which workers can cover which stations. Single-point-of-failure stations (only 1 worker) are flagged.
+
+```cypher
+MATCH (w:Worker)-[:CAN_COVER]->(s:Station)
+RETURN s.station_name AS Station, collect(w.name) AS Workers, count(w) AS WorkerCount
+ORDER BY WorkerCount ASC
+```
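+
+The query above only returns stations that have at least one covering worker. A hedged variant for the single-point-of-failure flag, which also surfaces stations with no cover at all, might be:
+
+```cypher
+// Stations with zero or one covering worker.
+MATCH (s:Station)
+OPTIONAL MATCH (w:Worker)-[:CAN_COVER]->(s)
+WITH s, collect(w.name) AS workers   // collect() drops nulls, so no cover gives []
+WHERE size(workers) <= 1
+RETURN s.station_code AS StationCode,
+       s.station_name AS Station,
+       workers AS Workers,
+       size(workers) AS WorkerCount
+ORDER BY WorkerCount, StationCode
+```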
+
+#### Navigation (5 pts)
+A sidebar will be implemented to allow users to switch seamlessly between the 4 dashboard pages and the Self-Test page without reloading the app.
+
+#### Page 5: Self-Test (20 pts)
+Automated green/red checklist verifying: Neo4j connection, node count ≥ 50, relationship count ≥ 100, 6+ node labels, 8+ relationship types, and variance query returns results.
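+
+The graph-side checks could each be a one-line query returning a boolean. This is a sketch only; the result aliases are hypothetical and the 10% variance threshold is borrowed from Q3:
+
+```cypher
+// Each statement is run separately by the self-test page.
+MATCH (n) RETURN count(n) >= 50 AS enough_nodes;
+MATCH ()-[r]->() RETURN count(r) >= 100 AS enough_relationships;
+CALL db.labels() YIELD label RETURN count(label) >= 6 AS enough_labels;
+CALL db.relationshipTypes() YIELD relationshipType
+RETURN count(relationshipType) >= 8 AS enough_relationship_types;
+MATCH ()-[r:SCHEDULED_AT]->() WHERE r.variance_pct > 10
+RETURN count(r) > 0 AS variance_query_has_rows;
+```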
+
diff --git a/submissions/Touqeer-Hamdani/level5/schema.md b/submissions/Touqeer-Hamdani/level5/schema.md
new file mode 100644
index 000000000..5a4e1dde7
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level5/schema.md
@@ -0,0 +1,79 @@
+# Factory Knowledge Graph — Schema
+
+```mermaid
+graph LR
+ %% ── Node definitions ──
+    Project["🏗️ Project<br/>project_id · project_number<br/>project_name"]
+    Product["📦 Product<br/>product_type · unit"]
+    Station["🏭 Station<br/>station_code · station_name"]
+    Worker["👷 Worker<br/>worker_id · name<br/>role · hours_per_week · type"]
+    Week["📅 Week<br/>week_id"]
+    Factory["🏭 Factory<br/>factory_name"]
+    Certification["🎓 Certification<br/>cert_name"]
+    Etapp["🔄 Etapp<br/>etapp_name"]
+
+    %% ── Relationships ──
+    Project -->|"PRODUCES<br/>{quantity, unit_factor, unit}"| Product
+    Project -->|"SCHEDULED_AT<br/>{planned_hours, actual_hours, week,<br/>completed_units, etapp, bop, product_type, variance_pct}"| Station
+    Project -->|"ACTIVE_IN"| Week
+    Project -->|"IN_PHASE"| Etapp
+
+    Worker -->|"WORKS_AT"| Station
+    Worker -->|"CAN_COVER"| Station
+    Worker -->|"HOLDS"| Certification
+
+    Station -->|"LOADED_IN<br/>{total_planned,<br/>total_actual}"| Week
+    Week -->|"HAS_CAPACITY<br/>{own_hours, hired_hours, overtime_hours, total_planned, deficit}"| Factory
+
+ %% ── Styling ──
+ classDef proj fill:#4F46E5,stroke:#3730A3,color:#fff,rx:12
+ classDef prod fill:#059669,stroke:#047857,color:#fff,rx:12
+ classDef stat fill:#D97706,stroke:#B45309,color:#fff,rx:12
+ classDef work fill:#DC2626,stroke:#B91C1C,color:#fff,rx:12
+ classDef week fill:#7C3AED,stroke:#6D28D9,color:#fff,rx:12
+ classDef meta fill:#6B7280,stroke:#4B5563,color:#fff,rx:12
+ classDef cert fill:#0891B2,stroke:#0E7490,color:#fff,rx:12
+ classDef etap fill:#E11D48,stroke:#BE123C,color:#fff,rx:12
+
+ class Project proj
+ class Product prod
+ class Station stat
+ class Worker work
+ class Week week
+ class Factory meta
+ class Certification cert
+ class Etapp etap
+```
+
+## Node Labels (8)
+
+| # | Label | Source CSV | Key Properties | Count |
+|---|-------|-----------|----------------|-------|
+| 1 | **Project** | production.csv | project_id, project_number, project_name | 8 |
+| 2 | **Product** | production.csv | product_type, unit | 7 (IQB, IQP, SB, SD, SP, SR, HSQ) |
+| 3 | **Station** | production.csv | station_code, station_name | 10 (011–019, 021) |
+| 4 | **Worker** | workers.csv | worker_id, name, role, hours_per_week, type | 14 |
+| 5 | **Week** | capacity.csv | week_id | 8 (w1–w8) |
+| 6 | **Factory** | Implicit | factory_name | 1 |
+| 7 | **Certification** | workers.csv | cert_name | 23 unique certs |
+| 8 | **Etapp** | production.csv | etapp_name | 2 (ET1, ET2) |
+
+## Relationship Types (9)
+
+| # | Relationship | From → To | Properties (data-carrying?) |
+|---|-------------|-----------|----------------------------|
+| 1 | **PRODUCES** | Project → Product | ✅ `{quantity, unit_factor, unit}` |
+| 2 | **SCHEDULED_AT** | Project → Station | ✅ `{planned_hours, actual_hours, completed_units, week, etapp, bop, product_type, variance_pct}` |
+| 3 | **ACTIVE_IN** | Project → Week | — |
+| 4 | **IN_PHASE** | Project → Etapp | — |
+| 5 | **WORKS_AT** | Worker → Station | — (primary station) |
+| 6 | **CAN_COVER** | Worker → Station | — (coverage capability) |
+| 7 | **HOLDS** | Worker → Certification | — |
+| 8 | **LOADED_IN** | Station → Week | ✅ `{total_planned, total_actual}`* |
+| 9 | **HAS_CAPACITY**| Week → Factory | ✅ `{own_hours, hired_hours, overtime_hours, total_planned, deficit}` |
+
+> 4 relationships carry data properties (**PRODUCES**, **SCHEDULED_AT**, **LOADED_IN**, **HAS_CAPACITY**), exceeding the minimum of 2.
+>
+> *\*Note: `LOADED_IN` properties are calculated by aggregating the `SCHEDULED_AT` edges for each station/week.*
+>
+> *\*Note: `etapp` is also kept as a property on `SCHEDULED_AT` for direct querying. The `Etapp` node is included for L6 compliance, but from a pure design perspective, etapp works better as an edge property since it only has 2 values and carries no properties of its own.*
diff --git a/submissions/Touqeer-Hamdani/level6/.env.example b/submissions/Touqeer-Hamdani/level6/.env.example
new file mode 100644
index 000000000..bdd17bb95
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level6/.env.example
@@ -0,0 +1,3 @@
+NEO4J_URI = "neo4j+s://xxxxx.databases.neo4j.io"
+NEO4J_USER = "neo4j"
+NEO4J_PASSWORD = "your-password"
\ No newline at end of file
diff --git a/submissions/Touqeer-Hamdani/level6/DASHBOARD_URL.txt b/submissions/Touqeer-Hamdani/level6/DASHBOARD_URL.txt
new file mode 100644
index 000000000..6d1f8d412
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level6/DASHBOARD_URL.txt
@@ -0,0 +1 @@
+https://l6-factory-dashboard-touqeerhamdani.streamlit.app
diff --git a/submissions/Touqeer-Hamdani/level6/README.md b/submissions/Touqeer-Hamdani/level6/README.md
new file mode 100644
index 000000000..13313e1b5
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level6/README.md
@@ -0,0 +1,69 @@
+# Factory Knowledge Graph Dashboard — Level 6
+
+A **Neo4j knowledge graph** + **Streamlit dashboard** for a Swedish steel fabrication company managing 8 construction projects across 10 production stations.
+
+## Quick Start
+
+### 1. Prerequisites
+- Python 3.10+
+- A Neo4j instance (recommended: [Neo4j Aura Free](https://neo4j.io/aura))
+
+### 2. Setup
+```bash
+python -m venv venv
+venv\Scripts\activate # Windows
+# source venv/bin/activate # macOS/Linux
+pip install -r requirements.txt
+```
+
+### 3. Configure credentials
+Copy `.env.example` → `.env` and fill in your Neo4j credentials:
+```
+NEO4J_URI=neo4j+s://xxxxx.databases.neo4j.io
+NEO4J_USER=neo4j
+NEO4J_PASSWORD=your-password
+```
+
+### 4. Seed the graph (run once)
+```bash
+python seed_graph.py
+```
+
+### 5. Launch the dashboard
+```bash
+streamlit run app.py
+```
+
+## Dashboard Pages
+
+| Page | Description |
+|------|-------------|
+| **Project Overview** | All 8 projects with planned/actual hours, variance %, and products |
+| **Station Load** | Interactive bar chart — hours per station per week, overloads in red |
+| **Capacity Tracker** | Stacked capacity bars + demand line, deficit weeks highlighted |
+| **Worker Coverage** | Coverage matrix + SPOF (single-point-of-failure) station detection |
+| **Self-Test** | Automated 6-check verification (20 pts) |
+
+## Project Structure
+
+```
+l6-factory-dashboard/
+├── seed_graph.py # CSV → Neo4j (idempotent, uses MERGE)
+├── app.py # Streamlit dashboard (6 pages)
+├── requirements.txt
+├── .env.example
+├── README.md
+├── DASHBOARD_URL.txt
+└── data/
+ ├── factory_production.csv
+ ├── factory_workers.csv
+ └── factory_capacity.csv
+```
+
+## Deployed URL
+
+See `DASHBOARD_URL.txt`.
+
+## Author
+
+**Touqeer Hamdani** — Level 6 submission, May 2026.
diff --git a/submissions/Touqeer-Hamdani/level6/app.py b/submissions/Touqeer-Hamdani/level6/app.py
new file mode 100644
index 000000000..bda45bf8c
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level6/app.py
@@ -0,0 +1,824 @@
+"""
+app.py — Factory Knowledge Graph Dashboard
+Streamlit application with 6 pages powered by Neo4j.
+"""
+
+import streamlit as st
+from neo4j import GraphDatabase
+import pandas as pd
+import plotly.express as px
+import plotly.graph_objects as go
+from plotly.subplots import make_subplots
+import os
+from dotenv import load_dotenv
+import statsmodels.api as sm
+
+# ── Page config ──────────────────────────────────────────────────────────────
+
+st.set_page_config(
+ page_title="Factory Dashboard",
+ page_icon=None,
+ layout="wide",
+ initial_sidebar_state="expanded",
+)
+
+# ── Custom CSS ───────────────────────────────────────────────────────────────
+
+st.markdown("""
+
+""", unsafe_allow_html=True)
+
+
+# ── Neo4j connection ─────────────────────────────────────────────────────────
+
+@st.cache_resource
+def get_driver():
+    """Connect to Neo4j; supports both st.secrets (Cloud) and .env (local)."""
+    try:
+        uri = st.secrets["NEO4J_URI"]
+        user = st.secrets["NEO4J_USER"]
+        password = st.secrets["NEO4J_PASSWORD"]
+    except Exception:
+        # No usable Streamlit secrets (e.g. a local run): fall back to .env
+        load_dotenv()
+        uri = os.getenv("NEO4J_URI")
+        user = os.getenv("NEO4J_USER")
+        password = os.getenv("NEO4J_PASSWORD")
+    # Fail early with a readable message instead of an opaque driver error
+    if not (uri and user and password):
+        raise RuntimeError("Neo4j credentials missing: set NEO4J_URI, NEO4J_USER and NEO4J_PASSWORD")
+    return GraphDatabase.driver(uri, auth=(user, password))
+
+
+def query_to_df(cypher: str) -> pd.DataFrame:
+    """Run a Cypher query and return the results as a DataFrame."""
+    driver = get_driver()
+    with driver.session() as session:
+        result = session.run(cypher)
+        # Materialize records while the session is still open
+        return pd.DataFrame([dict(r) for r in result])
+
+
+# ── Sidebar navigation ──────────────────────────────────────────────────────
+
+st.sidebar.markdown("## Factory Dashboard")
+st.sidebar.markdown("---")
+page = st.sidebar.radio("Navigate", [
+ "Project Overview",
+ "Station Load",
+ "Capacity Tracker",
+ "Worker Coverage",
+ "Load Forecast",
+ "Self-Test",
+])
+st.sidebar.markdown("---")
+st.sidebar.caption("Level 6 · Touqeer Hamdani")
+
+
+# ── Helper: render a KPI card ────────────────────────────────────────────────
+
+def kpi(label, value, sub="", color="blue"):
+ st.markdown(f"""
+
+
{label}
+
{value}
+
{sub}
+
+ """, unsafe_allow_html=True)
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# PAGE 1 — Project Overview
+# ══════════════════════════════════════════════════════════════════════════════
+
+def page_project_overview():
+    st.header("Project Overview")
+    st.caption("All 8 factory projects with planned vs actual hours and variance analysis.")
+
+    df = query_to_df("""
+        MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+        WITH p,
+             sum(r.planned_hours) AS planned,
+             sum(r.actual_hours) AS actual
+        OPTIONAL MATCH (p)-[:PRODUCES]->(prod:Product)
+        RETURN p.project_id AS ID,
+               p.project_name AS Project,
+               planned AS PlannedHours,
+               actual AS ActualHours,
+               CASE
+                   WHEN planned = 0 THEN 0.0
+                   ELSE round((actual - planned) / planned * 100, 1)
+               END AS VariancePct,
+               collect(DISTINCT prod.product_type) AS Products
+        ORDER BY p.project_id
+    """)
+
+    if df.empty:
+        st.warning("No data found. Has `seed_graph.py` been run?")
+        return
+
+    # KPI cards
+    total_planned = df["PlannedHours"].sum()
+    total_actual = df["ActualHours"].sum()
+    # Overall variance across all projects; guard against a zero-hour plan
+    avg_var = round((total_actual - total_planned) / total_planned * 100, 1) if total_planned else 0.0
+    overrun_count = len(df[df["VariancePct"] > 0])
+
+    c1, c2, c3, c4 = st.columns(4)
+    with c1: kpi("Projects", len(df), "active in schedule", "blue")
+    with c2: kpi("Total Planned Hours", f"{total_planned:,.0f} h", "across all stations", "green")
+    # Show the signed hour difference rather than a bare "+" or "-"
+    with c3: kpi("Total Actual Hours", f"{total_actual:,.0f} h", f"{total_actual - total_planned:+,.0f} h vs plan", "amber")
+    with c4: kpi("Average Plan Variance", f"{avg_var:+.1f}%", f"{overrun_count} projects over plan", "red" if avg_var > 0 else "green")
+
+    st.markdown("")
+
+    # Format products column for display
+    display_df = df.copy()
+    display_df["Products"] = display_df["Products"].apply(lambda x: ", ".join(x) if isinstance(x, list) else x)
+    display_df["VariancePct"] = display_df["VariancePct"].apply(lambda v: f"{v:+.1f}%")
+
+    st.dataframe(
+        display_df,
+        use_container_width=True,
+        hide_index=True,
+        column_config={
+            "ID": st.column_config.TextColumn("ID", width="small"),
+            "Project": st.column_config.TextColumn("Project"),
+            "PlannedHours": st.column_config.NumberColumn("Planned (h)", format="%.1f"),
+            "ActualHours": st.column_config.NumberColumn("Actual (h)", format="%.1f"),
+            "VariancePct": st.column_config.TextColumn("Variance"),
+            "Products": st.column_config.TextColumn("Products"),
+        },
+    )
+
+    # Bar chart: planned vs actual per project
+    fig = go.Figure()
+    fig.add_trace(go.Bar(name="Planned", x=df["Project"], y=df["PlannedHours"],
+                         marker_color="#3b82f6"))
+    fig.add_trace(go.Bar(name="Actual", x=df["Project"], y=df["ActualHours"],
+                         marker_color=["#22c55e" if a <= p else "#ef4444"
+                                       for a, p in zip(df["ActualHours"], df["PlannedHours"])]))
+    fig.update_layout(
+        barmode="group", template="plotly_dark",
+        title="Planned vs Actual Hours by Project",
+        xaxis_title="Project", yaxis_title="Hours",
+        height=420, margin=dict(t=50, b=40),
+        legend_title="Metric", showlegend=True
+    )
+    st.plotly_chart(fig, use_container_width=True)
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# PAGE 2 — Station Load
+# ══════════════════════════════════════════════════════════════════════════════
+
+def page_station_load():
+    st.header("Station Load")
+    st.caption("Hours per station across weeks. Red = actual exceeds planned.")
+
+    df = query_to_df("""
+        MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+        RETURN s.station_code AS StationCode,
+               s.station_name AS Station,
+               r.week AS Week,
+               sum(r.planned_hours) AS Planned,
+               sum(r.actual_hours) AS Actual
+        ORDER BY s.station_code, r.week
+    """)
+
+    if df.empty:
+        st.warning("No data found.")
+        return
+
+    df["Overloaded"] = df["Actual"] > df["Planned"]
+    df["Label"] = df["StationCode"] + " " + df["Station"]
+
+    # Week filter
+    weeks = sorted(df["Week"].unique())
+    selected_weeks = st.multiselect("Filter by week", weeks, default=weeks)
+    filtered = df[df["Week"].isin(selected_weeks)]
+
+    # Grouped bar chart
+    filtered = filtered.copy()
+    filtered["Label_Week"] = filtered["Label"] + " - " + filtered["Week"].astype(str)
+
+    # Tick labels are the same "station - week" strings used on the x axis
+    tick_text = filtered["Label_Week"].tolist()
+
+    fig = go.Figure()
+    fig.add_trace(go.Bar(
+        name="Planned", x=filtered["Label_Week"],
+        y=filtered["Planned"], marker_color="#3b82f6",
+    ))
+    fig.add_trace(go.Bar(
+        name="Actual", x=filtered["Label_Week"],
+        y=filtered["Actual"],
+        marker_color=["#ef4444" if o else "#22c55e" for o in filtered["Overloaded"]],
+    ))
+    fig.update_layout(
+        barmode="group", template="plotly_dark",
+        title="Station Load: Planned vs Actual",
+        yaxis_title="Hours",
+        height=500, margin=dict(t=50, b=80),
+        legend_title="Metric", showlegend=True,
+        xaxis=dict(
+            title="Station - Week",
+            tickmode="array",
+            tickvals=filtered["Label_Week"],
+            ticktext=tick_text
+        )
+    )
+    st.plotly_chart(fig, use_container_width=True)
+
+    # Summary table
+    with st.expander("Detailed data"):
+        st.dataframe(filtered[["StationCode", "Station", "Week", "Planned", "Actual", "Overloaded"]],
+                     use_container_width=True, hide_index=True)
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# PAGE 3 — Capacity Tracker
+# ══════════════════════════════════════════════════════════════════════════════
+
+def page_capacity_tracker():
+ st.header("Capacity Tracker")
+ st.caption("Weekly capacity (own + hired + overtime) vs planned demand. Deficit weeks in red.")
+
+ df = query_to_df("""
+ MATCH (w:Week)-[r:HAS_CAPACITY]->(f:Factory)
+ RETURN w.week_id AS Week,
+ r.own_hours AS Own,
+ r.hired_hours AS Hired,
+ r.overtime_hours AS Overtime,
+ r.own_hours + r.hired_hours + r.overtime_hours AS TotalCapacity,
+ r.total_planned AS PlannedDemand,
+ r.deficit AS Deficit
+ ORDER BY w.week_id
+ """)
+
+ if df.empty:
+ st.warning("No data found.")
+ return
+
+ deficit_weeks = len(df[df["Deficit"] < 0])
+ total_deficit = df[df["Deficit"] < 0]["Deficit"].sum()
+
+ c1, c2, c3 = st.columns(3)
+ with c1: kpi("Deficit Weeks", f"{deficit_weeks} / {len(df)}", "weeks over capacity", "red")
+ with c2: kpi("Cumulative Capacity Deficit", f"{total_deficit:+,.0f} h", "cumulative shortfall", "red")
+ with c3: kpi("Maximum Weekly Deficit", f"{df['Deficit'].min():+,.0f} h", f"in {df.loc[df['Deficit'].idxmin(), 'Week']}", "amber")
+
+ st.markdown("")
+
+ # Color x-axis labels red for deficit weeks (used on the bottom chart)
+ tick_text = [
+ f'{row["Week"]}'
+ for _, row in df.iterrows()
+ ]
+
+ fig = make_subplots(
+ rows=2, cols=1,
+ shared_xaxes=True,
+ vertical_spacing=0.1,
+ row_heights=[0.75, 0.25]
+ )
+
+ # Top Chart: Stacked bar (capacity components) + line (demand)
+ fig.add_trace(go.Bar(name="Own Staff", x=df["Week"], y=df["Own"], marker_color="#3b82f6"), row=1, col=1)
+ fig.add_trace(go.Bar(name="Hired", x=df["Week"], y=df["Hired"], marker_color="#8b5cf6"), row=1, col=1)
+ fig.add_trace(go.Bar(name="Overtime", x=df["Week"], y=df["Overtime"], marker_color="#f59e0b"), row=1, col=1)
+
+ fig.add_trace(go.Scatter(
+ name="Planned Demand", x=df["Week"], y=df["PlannedDemand"],
+ mode="lines+markers", line=dict(color="#ef4444", width=3, dash="dot"),
+ marker=dict(size=8),
+ ), row=1, col=1)
+
+ # Bottom Chart: Surplus/Deficit Bar
+ fig.add_trace(go.Bar(
+ name="Surplus / Deficit", x=df["Week"], y=df["Deficit"],
+ marker_color=["#ef4444" if d < 0 else "#22c55e" for d in df["Deficit"]],
+ text=[f"{d:+.0f}" for d in df["Deficit"]],
+ textposition="outside",
+ showlegend=False,
+ hovertemplate="Variance: %{y:+.0f}h"
+ ), row=2, col=1)
+
+ fig.update_layout(
+ barmode="stack", template="plotly_dark",
+ title="Weekly Capacity vs Demand",
+ height=600, margin=dict(t=50, b=40),
+ legend=dict(title="Capacity Type", orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1),
+ )
+ fig.update_yaxes(title_text="Hours", row=1, col=1)
+ fig.update_yaxes(title_text="Variance", row=2, col=1)
+ fig.update_xaxes(
+ title="Week", tickmode="array",
+ tickvals=df["Week"], ticktext=tick_text,
+ row=2, col=1
+ )
+ st.plotly_chart(fig, use_container_width=True)
+
+ # Data table
+ with st.expander("Detailed data"):
+ st.dataframe(df, use_container_width=True, hide_index=True)
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# PAGE 4 — Worker Coverage
+# ══════════════════════════════════════════════════════════════════════════════
+
+def page_worker_coverage():
+ st.header("Worker Coverage Matrix")
+ st.caption("Which workers can cover which stations. SPOF stations (≤ 1 unique worker) flagged in red.")
+
+ # Coverage data
+ df = query_to_df("""
+ MATCH (w:Worker)-[:CAN_COVER]->(s:Station)
+ RETURN s.station_code AS StationCode,
+ s.station_name AS Station,
+ collect(w.name) AS Workers,
+ count(w) AS WorkerCount
+ ORDER BY WorkerCount ASC
+ """)
+
+ if df.empty:
+ st.warning("No data found.")
+ return
+
+ spof = df[df["WorkerCount"] <= 1]
+ c1, c2 = st.columns(2)
+ with c1: kpi("Total Stations", len(df), "with assigned coverage", "blue")
+ with c2: kpi("SPOF Stations", len(spof),
+ ", ".join(spof["Station"].tolist()) if len(spof) > 0 else "None",
+ "red" if len(spof) > 0 else "green")
+
+ st.markdown("")
+
+ # -------------------------------------------------------------------------
+ # Heatmap Matrix
+ # -------------------------------------------------------------------------
+ st.markdown('', unsafe_allow_html=True)
+ st.caption("A visual overview of certifications. Blue indicates a worker is certified for that station.")
+
+ matrix_df = query_to_df("""
+ MATCH (w:Worker)-[:CAN_COVER]->(s:Station)
+ RETURN w.name AS Worker, s.station_code AS StationCode, s.station_name AS Station
+ """)
+
+ if not matrix_df.empty:
+ pivot = matrix_df.pivot_table(
+ index=["StationCode", "Station"],
+ columns="Worker",
+ aggfunc="size",
+ fill_value=0,
+ )
+ pivot = pivot.clip(upper=1)
+ pivot = pivot.sort_index()
+ pivot = pivot[sorted(pivot.columns)]
+
+ y_labels = [f"{idx[0]} - {idx[1]}" for idx in pivot.index]
+
+ # Hover text matrix
+ hover_text = []
+ for s, row in zip(y_labels, pivot.values):
+ hover_row = []
+ for w, val in zip(pivot.columns, row):
+ status = "Certified" if val == 1 else "Uncertified"
+ hover_row.append(f"Worker: {w}<br>Station: {s}<br>Status: {status}")
+ hover_text.append(hover_row)
+
+ fig = go.Figure(data=go.Heatmap(
+ z=pivot.values,
+ x=pivot.columns,
+ y=y_labels,
+ colorscale=[[0, "rgba(255, 255, 255, 0.05)"], [1, "#3b82f6"]],
+ showscale=False,
+ xgap=2, ygap=2,
+ hoverinfo="text",
+ text=hover_text
+ ))
+
+ fig.update_layout(
+ template="plotly_dark",
+ height=400,
+ margin=dict(t=10, b=80, l=180, r=20),
+ xaxis=dict(tickangle=-45, side="bottom"),
+ yaxis=dict(autorange="reversed")
+ )
+ st.plotly_chart(fig, use_container_width=True)
+
+ # Detailed Coverage Table (satisfies the "table" requirement cleanly without horizontal scroll issues)
+ st.markdown('', unsafe_allow_html=True)
+ display_df = df.copy()
+ display_df["Workers"] = display_df["Workers"].apply(
+ lambda x: ", ".join(x) if isinstance(x, list) else x
+ )
+
+ def highlight_spof(row):
+ if row["WorkerCount"] <= 1:
+ return ["background-color: rgba(239,68,68,0.15)"] * len(row)
+ return [""] * len(row)
+
+ st.dataframe(
+ display_df.style.apply(highlight_spof, axis=1),
+ use_container_width=True, hide_index=True,
+ )
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# PAGE 5 — Load Forecast
+# ══════════════════════════════════════════════════════════════════════════════
+
+def page_load_forecast():
+ st.header("Load Forecast (Week 9)")
+ st.caption("Predictive analysis identifying where production load will exceed station capacity in the coming week.")
+
+ # 1. Get Historical Load + Variance
+ load_df = query_to_df("""
+ MATCH (s:Station)-[l:LOADED_IN]->(w:Week)
+ RETURN s.station_code AS StationCode,
+ s.station_name AS Station,
+ toInteger(substring(w.week_id, 1)) AS WeekNum,
+ l.total_actual AS ActualLoad,
+ l.total_planned AS PlannedLoad,
+ CASE WHEN l.total_planned > 0
+ THEN round((l.total_actual - l.total_planned) / l.total_planned * 100, 1)
+ ELSE 0.0 END AS VariancePct
+ ORDER BY StationCode, WeekNum
+ """)
+
+ # 2. Get Graph-Aware Capacity
+ cap_df = query_to_df("""
+ MATCH (w:Worker)-[:CAN_COVER]->(s:Station)
+ WITH w, s
+ MATCH (w)-[:CAN_COVER]->(all_s:Station)
+ WITH w, s, count(all_s) AS total_coverage
+ RETURN s.station_code AS StationCode,
+ s.station_name AS Station,
+ sum(toFloat(w.hours_per_week) / total_coverage) AS Capacity
+ ORDER BY StationCode
+ """)
+
+ if load_df.empty or cap_df.empty:
+ st.warning("No data found. Ensure the graph is seeded.")
+ return
+
+ load_df["Load"] = load_df["ActualLoad"].fillna(load_df["PlannedLoad"])
+
+ # 3. Process Forecasts
+ forecasts = []
+
+ # Iterate over all known stations from both load and capacity queries
+ all_stations = sorted(list(set(load_df["StationCode"]).union(set(cap_df["StationCode"]))))
+
+ for station_code in all_stations:
+ station_data = load_df[load_df["StationCode"] == station_code]
+ cap_series = cap_df[cap_df["StationCode"] == station_code]
+
+ station_name = station_data["Station"].iloc[0] if not station_data.empty else cap_series["Station"].iloc[0]
+ cap = cap_series["Capacity"].iloc[0] if not cap_series.empty else 0.0
+
+ if len(station_data) > 1:
+ X = sm.add_constant(station_data["WeekNum"])
+ model = sm.OLS(station_data["Load"], X).fit()
+ pred_9 = model.predict([1, 9])[0]
+ elif len(station_data) == 1:
+ pred_9 = station_data["Load"].iloc[0]
+ else:
+ pred_9 = 0
+
+ pred_9 = max(0, pred_9)
+
+ util_pct = (pred_9 / cap * 100) if cap > 0 else (float('inf') if pred_9 > 0 else 0)
+
+ forecasts.append({
+ "StationCode": station_code,
+ "Station": station_name,
+ "Week9_Forecast": pred_9,
+ "Capacity": cap,
+ "UtilPct": util_pct
+ })
+
+ forecast_df = pd.DataFrame(forecasts)
+ forecast_df["Status"] = forecast_df.apply(lambda x: "OVERLOAD" if x["Week9_Forecast"] > x["Capacity"] else "SAFE", axis=1)
+
+ # KPIs
+ overloaded_count = len(forecast_df[forecast_df["Status"] == "OVERLOAD"])
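+ # Stations with load but no capacity have infinite utilization; exclude them from the average
+ avg_util = forecast_df["UtilPct"].replace(float("inf"), float("nan")).mean()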
+
+ c1, c2, c3 = st.columns(3)
+ with c1: kpi("Average Factory Utilization", f"{avg_util:.1f}%", "projected for week 9", "blue")
+ with c2: kpi("Critical Stations", overloaded_count, "over capacity in week 9", "red" if overloaded_count > 0 else "green")
+ with c3: kpi("Highest Load", f"{forecast_df['UtilPct'].max():.0f}%", f"at {forecast_df.loc[forecast_df['UtilPct'].idxmax(), 'StationCode']}", "amber")
+
+ st.markdown('', unsafe_allow_html=True)
+
+ # Global comparison chart
+ fig_global = go.Figure()
+ fig_global.add_trace(go.Bar(
+ name="Projected Load", x=forecast_df["StationCode"], y=forecast_df["Week9_Forecast"],
+ marker_color=["#ef4444" if s == "OVERLOAD" else "#3b82f6" for s in forecast_df["Status"]],
+ hovertemplate="Load: %{y:.1f}h"
+ ))
+ fig_global.add_trace(go.Scatter(
+ name="Station Capacity", x=forecast_df["StationCode"], y=forecast_df["Capacity"],
+ mode="markers", marker=dict(color="#ffffff", size=12, symbol="line-ew-open", line=dict(width=3)),
+ hovertemplate="Capacity: %{y:.1f}h"
+ ))
+ fig_global.update_layout(
+ template="plotly_dark", height=350, margin=dict(t=20, b=40, l=10, r=10),
+ xaxis_title="Station Code", yaxis_title="Hours", barmode="group",
+ legend=dict(title="Metric", orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1)
+ )
+ st.plotly_chart(fig_global, use_container_width=True)
+
+ # 4. Station Deep-Dive
+ st.markdown('', unsafe_allow_html=True)
+
+ selected_st = st.selectbox("Select a station to see trend details",
+ options=forecast_df["StationCode"].tolist(),
+ format_func=lambda x: f"{x} - {forecast_df[forecast_df['StationCode']==x]['Station'].iloc[0]}")
+
+ sd = forecast_df[forecast_df["StationCode"] == selected_st].iloc[0]
+ hist = load_df[load_df["StationCode"] == selected_st]
+
+ c1, c2 = st.columns([2, 1])
+
+ with c1:
+ fig_detail = go.Figure()
+
+ # OLS logic
+ if len(hist) > 1:
+ X = sm.add_constant(hist["WeekNum"])
+ model = sm.OLS(hist["Load"], X).fit()
+ x_range = list(range(1, 10))
+ y_range = model.predict(sm.add_constant(x_range))
+ preds = model.get_prediction(sm.add_constant(x_range))
+ ci = preds.conf_int(alpha=0.1) # 90% confidence
+ y_lower, y_upper = ci[:, 0], ci[:, 1]
+
+ # Confidence Band
+ fig_detail.add_trace(go.Scatter(
+ x=x_range + x_range[::-1], y=list(y_upper) + list(y_lower)[::-1],
+ fill='toself', fillcolor='rgba(59, 130, 246, 0.1)',
+ line=dict(color='rgba(255,255,255,0)'), name="90% Confidence Interval"
+ ))
+ # Trend
+ fig_detail.add_trace(go.Scatter(x=x_range, y=y_range, mode="lines",
+ line=dict(color="#3b82f6", dash="dash"), name="Trendline"))
+
+ # Capacity line
+ fig_detail.add_hline(y=sd["Capacity"], line_dash="dot", line_color="#ef4444",
+ annotation_text="CAPACITY LIMIT", annotation_position="top left")
+
+ # Historical
+ fig_detail.add_trace(go.Scatter(x=hist["WeekNum"], y=hist["Load"], mode="lines+markers",
+ marker=dict(color="#ffffff", size=10), name="Historical Load"))
+
+ # Week 9 Target Point
+ fig_detail.add_trace(go.Scatter(x=[9], y=[sd["Week9_Forecast"]], mode="markers",
+ marker=dict(color="#ef4444" if sd["Status"]=="OVERLOAD" else "#22c55e", size=14, symbol="star"),
+ name="W9 Projection"))
+
+ fig_detail.update_layout(
+ template="plotly_dark", height=400, title=f"Load Trend Analysis: {selected_st}",
+ xaxis=dict(title="Week", tickmode="linear", range=[0.5, 9.5]),
+ yaxis=dict(title="Hours"), showlegend=True,
+ legend=dict(title="Metric", orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1)
+ )
+ st.plotly_chart(fig_detail, use_container_width=True)
+ st.caption("💡 **Note**: The shaded area is the 90% confidence interval for the fitted OLS trend (the mean prediction), so it shows uncertainty in the trend line rather than the full spread of individual weekly loads.")
+
+ # Variance Trend Chart
+ if not hist.empty and len(hist) > 1:
+ fig_var = px.line(hist, x="WeekNum", y="VariancePct", markers=True,
+ title=f"Historical Plan Variance (%): {selected_st}",
+ labels={"VariancePct": "Variance %", "WeekNum": "Week"},
+ template="plotly_dark", height=200)
+ fig_var.add_hline(y=0, line_dash="dash", line_color="#94a3b8")
+ fig_var.update_traces(line_color="#f59e0b")
+ fig_var.update_layout(margin=dict(t=30, b=20, l=10, r=10))
+ st.plotly_chart(fig_var, use_container_width=True)
+ else:
+ st.info("Insufficient data to show variance trend.")
+
+ with c2:
+ st.markdown(f"### {sd['Status']}")
+ st.write(f"Station **{selected_st}** is projected to reach **{sd['Week9_Forecast']:.1f} hours** in Week 9.")
+ st.write(f"Current capacity limit is **{sd['Capacity']:.1f} hours**.")
+
+ avg_v = hist["VariancePct"].mean() if not hist.empty else 0.0
+ st.metric("Avg Historical Variance", f"{avg_v:+.1f}%",
+ help="Positive means actual hours exceeded planned hours on average.")
+
+ util_color = "red" if sd["UtilPct"] > 100 else "green"
+ util_display = f"{sd['UtilPct']:.1f}%" if sd["UtilPct"] != float('inf') else "∞%"
+ delta_display = f"{sd['UtilPct']-100:.1f}%" if sd["UtilPct"] > 100 and sd["UtilPct"] != float('inf') else None
+
+ st.metric("Projected Utilization", util_display,
+ delta=delta_display,
+ delta_color="inverse")
+
+ if sd["Status"] == "OVERLOAD":
+ st.error(f"Action Required: Station will exceed capacity by {sd['Week9_Forecast'] - sd['Capacity']:.1f} hours.")
+ else:
+ st.success("No immediate capacity action required.")
+
+ # 5. Global Action Recommendations
+ overloads = forecast_df[forecast_df["Status"] == "OVERLOAD"]
+ healthy = forecast_df[forecast_df["Status"] == "SAFE"].copy()
+ healthy["Surplus"] = healthy["Capacity"] - healthy["Week9_Forecast"]
+ healthy = healthy[healthy["Surplus"] >= 5].sort_values("Surplus", ascending=False)
+
+ if not overloads.empty:
+ st.markdown('', unsafe_allow_html=True)
+ st.error(f"**{len(overloads)} Stations are projected to exceed capacity in Week 9.** Immediate action is required.")
+
+ # Get worker coverage map to make smart, graph-aware recommendations
+ coverage_df = query_to_df("""
+ MATCH (w:Worker)-[:CAN_COVER]->(s:Station)
+ WHERE w.role <> 'Foreman'
+ RETURN w.name AS Worker, s.station_code AS StationCode
+ """)
+
+ for _, row in overloads.iterrows():
+ target_station = row["StationCode"]
+ deficit = row["Week9_Forecast"] - row["Capacity"]
+ suggestion = f"- **{row['Station']} ({target_station}):** Short by **{deficit:.1f} hours**."
+
+ # 1. Find workers certified for this overloaded station
+ capable_workers = coverage_df[coverage_df["StationCode"] == target_station]["Worker"].unique()
+
+ # 2. Find which of these workers are currently at surplus stations
+ reassignment_options = []
+ for worker in capable_workers:
+ # Other stations this worker covers
+ other_stations = coverage_df[
+ (coverage_df["Worker"] == worker) &
+ (coverage_df["StationCode"] != target_station)
+ ]["StationCode"].tolist()
+
+ # Check if any of these 'other' stations have a surplus
+ worker_surplus_stations = healthy[healthy["StationCode"].isin(other_stations)]
+ if not worker_surplus_stations.empty:
+ # Sort by highest surplus and take the best one for this worker
+ best_station = worker_surplus_stations.sort_values("Surplus", ascending=False).iloc[0]
+ reassignment_options.append(f"**{worker}** (from {best_station['Station']})")
+
+ # 3. Format and display
+ if reassignment_options:
+ # Show top 3 candidates to keep the UI clean
+ suggestion += f" *Suggestion: Reassign {', '.join(reassignment_options[:3])}*"
+ else:
+ suggestion += f" *No cross-trained line workers available at stations with surplus.*"
+
+ st.markdown(suggestion)
+ else:
+ st.markdown('', unsafe_allow_html=True)
+ st.success("All stations are projected to be safely within capacity limits for Week 9.")
+
+ with st.expander("Full Forecast Data Table"):
+ st.dataframe(forecast_df.sort_values("UtilPct", ascending=False), use_container_width=True, hide_index=True)
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# PAGE 6 — Self-Test
+# ══════════════════════════════════════════════════════════════════════════════
+
+def run_self_test():
+ """Run the 6 automated checks and return a list of (description, passed, points)."""
+ driver = get_driver()
+ checks = []
+
+ # Check 1: Connection
+ try:
+ with driver.session() as s:
+ s.run("RETURN 1").single()
+ checks.append(("Neo4j connected", True, 3))
+ except Exception:
+ checks.append(("Neo4j connected", False, 3))
+ return checks # can't continue
+
+ with driver.session() as s:
+ # Check 2: Node count >= 50
+ count = s.run("MATCH (n) RETURN count(n) AS c").single()["c"]
+ checks.append((f"{count} nodes (min: 50)", count >= 50, 3))
+
+ # Check 3: Relationship count >= 100
+ count = s.run("MATCH ()-[r]->() RETURN count(r) AS c").single()["c"]
+ checks.append((f"{count} relationships (min: 100)", count >= 100, 3))
+
+ # Check 4: 6+ distinct node labels
+ count = s.run("CALL db.labels() YIELD label RETURN count(label) AS c").single()["c"]
+ checks.append((f"{count} node labels (min: 6)", count >= 6, 3))
+
+ # Check 5: 8+ distinct relationship types
+ count = s.run(
+ "CALL db.relationshipTypes() YIELD relationshipType RETURN count(relationshipType) AS c"
+ ).single()["c"]
+ checks.append((f"{count} relationship types (min: 8)", count >= 8, 3))
+
+ # Check 6: Variance query returns results
+ result = s.run("""
+ MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+ WHERE r.actual_hours > r.planned_hours * 1.1
+ RETURN p.project_name AS project, s.station_name AS station,
+ r.planned_hours AS planned, r.actual_hours AS actual
+ LIMIT 10
+ """)
+ rows = [dict(r) for r in result]
+ checks.append((f"Variance query: {len(rows)} results", len(rows) > 0, 5))
+
+ return checks
+
+
+def page_self_test():
+ st.header("Self-Test")
+ st.caption("Automated verification of graph requirements.")
+
+ if st.button("Run Self-Test", type="primary"):
+ with st.spinner("Running checks…"):
+ checks = run_self_test()
+
+ total = 0
+ max_total = 0
+
+ for desc, passed, pts in checks:
+ max_total += pts
+ earned = pts if passed else 0
+ total += earned
+ icon = "PASS" if passed else "FAIL"
+ css = "check-pass" if passed else "check-fail"
+ st.markdown(
+ f'{icon} {desc} — **{earned}/{pts}**',
+ unsafe_allow_html=True,
+ )
+
+ st.markdown("---")
+ color = "check-pass" if total == max_total else "check-fail"
+ st.markdown(
+ f'SELF-TEST SCORE: {total}/{max_total}',
+ unsafe_allow_html=True,
+ )
+ else:
+ st.info("Click the button above to run the self-test checks.")
+
+
+# ══════════════════════════════════════════════════════════════════════════════
+# Router
+# ══════════════════════════════════════════════════════════════════════════════
+
+if page == "Project Overview":
+ page_project_overview()
+elif page == "Station Load":
+ page_station_load()
+elif page == "Capacity Tracker":
+ page_capacity_tracker()
+elif page == "Worker Coverage":
+ page_worker_coverage()
+elif page == "Load Forecast":
+ page_load_forecast()
+elif page == "Self-Test":
+ page_self_test()
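
The Load Forecast page combines two calculations: each worker's weekly hours are shared equally across the stations they can cover, and each station's load is extrapolated to week 9 with a straight-line fit. The sketch below reproduces that arithmetic in plain Python so it can be checked by hand. The helper names `split_capacity` and `forecast_week`, the station codes, and the sample numbers are all hypothetical, and the closed-form least-squares fit stands in for the `statsmodels` OLS call used in the app.

```python
from collections import defaultdict


def split_capacity(workers):
    """Share each worker's weekly hours equally across the stations they can cover."""
    capacity = defaultdict(float)
    for hours_per_week, stations in workers:
        for code in stations:
            capacity[code] += hours_per_week / len(stations)
    return dict(capacity)


def forecast_week(history, week):
    """Fit a least-squares line through (week, load) points and evaluate it at `week`.

    Mirrors the page's fallbacks: no history gives 0, a single point is
    carried forward, and negative predictions are floored at 0.
    """
    if not history:
        return 0.0
    if len(history) == 1:
        return max(0.0, history[0][1])
    n = len(history)
    mean_x = sum(x for x, _ in history) / n
    mean_y = sum(y for _, y in history) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in history)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in history)
    slope = sxy / sxx  # assumes at least two distinct week numbers
    return max(0.0, mean_y + slope * (week - mean_x))


# Hypothetical input: (hours_per_week, station codes the worker can cover)
workers = [(40, ["011", "012"]), (40, ["011"])]
capacity = split_capacity(workers)

# Hypothetical load history for station 011: (week number, hours)
history = [(1, 50.0), (2, 55.0), (3, 60.0)]
pred = forecast_week(history, 9)

status = "OVERLOAD" if pred > capacity["011"] else "SAFE"
print(capacity, pred, status)
```

With these numbers station 011 gets 20 h from the first worker plus 40 h from the second, and the trend rises 5 h per week, so the week-9 projection exceeds the station's share of capacity. Splitting hours equally is a simplifying assumption: a worker who covers many stations contributes little to each, which can understate capacity at their primary station.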
diff --git a/submissions/Touqeer-Hamdani/level6/requirements.txt b/submissions/Touqeer-Hamdani/level6/requirements.txt
new file mode 100644
index 000000000..5a418921c
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level6/requirements.txt
@@ -0,0 +1,6 @@
+streamlit
+neo4j
+python-dotenv
+pandas
+plotly
+statsmodels
diff --git a/submissions/Touqeer-Hamdani/level6/seed_graph.py b/submissions/Touqeer-Hamdani/level6/seed_graph.py
new file mode 100644
index 000000000..ec9ec8597
--- /dev/null
+++ b/submissions/Touqeer-Hamdani/level6/seed_graph.py
@@ -0,0 +1,297 @@
+"""
+seed_graph.py — Populate Neo4j with factory production data.
+
+Run once: python seed_graph.py
+Idempotent: safe to re-run (clears graph, then uses MERGE).
+"""
+
+import os
+import csv
+from neo4j import GraphDatabase
+from dotenv import load_dotenv
+
+load_dotenv()
+
+NEO4J_URI = os.getenv("NEO4J_URI")
+NEO4J_USER = os.getenv("NEO4J_USER")
+NEO4J_PASSWORD = os.getenv("NEO4J_PASSWORD")
+
+DATA_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), "data")
+
+
+# ── Helpers ──────────────────────────────────────────────────────────────────
+
+def read_csv(filename):
+ """Read a CSV file from the data/ directory and return a list of dicts."""
+ filepath = os.path.join(DATA_DIR, filename)
+ with open(filepath, newline="", encoding="utf-8-sig") as f:
+ return list(csv.DictReader(f))
+
+
+def run(session, query, **kwargs):
+ """Run a Cypher query and return the driver's Result object."""
+ return session.run(query, **kwargs)
+
+
+# ── Seeding phases ───────────────────────────────────────────────────────────
+
+def create_constraints(session):
+ """Phase 1: Uniqueness constraints for idempotent MERGE."""
+ constraints = [
+ "CREATE CONSTRAINT IF NOT EXISTS FOR (p:Project) REQUIRE p.project_id IS UNIQUE",
+ "CREATE CONSTRAINT IF NOT EXISTS FOR (s:Station) REQUIRE s.station_code IS UNIQUE",
+ "CREATE CONSTRAINT IF NOT EXISTS FOR (w:Worker) REQUIRE w.worker_id IS UNIQUE",
+ "CREATE CONSTRAINT IF NOT EXISTS FOR (wk:Week) REQUIRE wk.week_id IS UNIQUE",
+ "CREATE CONSTRAINT IF NOT EXISTS FOR (prod:Product) REQUIRE prod.product_type IS UNIQUE",
+ "CREATE CONSTRAINT IF NOT EXISTS FOR (c:Certification) REQUIRE c.cert_name IS UNIQUE",
+ "CREATE CONSTRAINT IF NOT EXISTS FOR (e:Etapp) REQUIRE e.etapp_name IS UNIQUE",
+ ]
+ for c in constraints:
+ run(session, c)
+ print(" Created 7 uniqueness constraints")
+
+
+def seed_production(session, rows):
+ """Phase 3: Nodes and relationships from factory_production.csv."""
+
+ # ── Nodes ──
+ run(session, """
+ UNWIND $rows AS row
+ MERGE (p:Project {project_id: row.project_id})
+ SET p.project_number = row.project_number,
+ p.project_name = row.project_name
+ """, rows=rows)
+
+ run(session, """
+ UNWIND $rows AS row
+ MERGE (prod:Product {product_type: row.product_type})
+ SET prod.unit = row.unit
+ """, rows=rows)
+
+ run(session, """
+ UNWIND $rows AS row
+ MERGE (s:Station {station_code: row.station_code})
+ SET s.station_name = row.station_name
+ """, rows=rows)
+
+ run(session, """
+ UNWIND $rows AS row
+ MERGE (:Week {week_id: row.week})
+ """, rows=rows)
+
+ run(session, """
+ UNWIND $rows AS row
+ MERGE (:Etapp {etapp_name: row.etapp})
+ """, rows=rows)
+
+ # ── Relationships ──
+ # PRODUCES — one per unique (project_id, product_type)
+ run(session, """
+ UNWIND $rows AS row
+ MATCH (p:Project {project_id: row.project_id})
+ MATCH (prod:Product {product_type: row.product_type})
+ MERGE (p)-[r:PRODUCES]->(prod)
+ SET r.quantity = toInteger(row.quantity),
+ r.unit_factor = toFloat(row.unit_factor),
+ r.unit = row.unit
+ """, rows=rows)
+
+ # SCHEDULED_AT — composite key includes product_type to avoid P05/018 collision
+ run(session, """
+ UNWIND $rows AS row
+ MATCH (p:Project {project_id: row.project_id})
+ MATCH (s:Station {station_code: row.station_code})
+ MERGE (p)-[r:SCHEDULED_AT {
+ week: row.week,
+ etapp: row.etapp,
+ bop: row.bop,
+ product_type: row.product_type
+ }]->(s)
+ SET r.planned_hours = toFloat(row.planned_hours),
+ r.actual_hours = toFloat(row.actual_hours),
+ r.completed_units = toInteger(row.completed_units),
+ r.variance_pct = CASE
+ WHEN toFloat(row.planned_hours) > 0
+ THEN round((toFloat(row.actual_hours) - toFloat(row.planned_hours))
+ / toFloat(row.planned_hours) * 100, 1)
+ ELSE 0.0
+ END
+ """, rows=rows)
+
+ # ACTIVE_IN — project ↔ week
+ run(session, """
+ UNWIND $rows AS row
+ MATCH (p:Project {project_id: row.project_id})
+ MATCH (wk:Week {week_id: row.week})
+ MERGE (p)-[:ACTIVE_IN]->(wk)
+ """, rows=rows)
+
+ # IN_PHASE — project ↔ etapp
+ run(session, """
+ UNWIND $rows AS row
+ MATCH (p:Project {project_id: row.project_id})
+ MATCH (e:Etapp {etapp_name: row.etapp})
+ MERGE (p)-[:IN_PHASE]->(e)
+ """, rows=rows)
+
+ print(f" Loaded {len(rows)} production rows → nodes + relationships")
+
+
+def seed_workers(session, rows):
+ """Phase 4: Nodes and relationships from factory_workers.csv."""
+
+ # Pre-process: split comma-separated fields in Python
+ worker_data = []
+ for w in rows:
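+ # Drop empty entries so a blank CSV field does not create an empty-named node
+ certs = [c.strip() for c in w["certifications"].split(",") if c.strip()]
+ cover = [s.strip() for s in w["can_cover_stations"].split(",") if s.strip()]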
+ worker_data.append({
+ "worker_id": w["worker_id"],
+ "name": w["name"],
+ "role": w["role"],
+ "primary_station": w["primary_station"],
+ "hours_per_week": int(w["hours_per_week"]),
+ "type": w["type"],
+ "certifications": certs,
+ "can_cover": cover,
+ })
+
+ # Worker nodes
+ run(session, """
+ UNWIND $rows AS row
+ MERGE (w:Worker {worker_id: row.worker_id})
+ SET w.name = row.name,
+ w.role = row.role,
+ w.hours_per_week = row.hours_per_week,
+ w.type = row.type
+ """, rows=worker_data)
+
+ # Certification nodes + HOLDS
+ run(session, """
+ UNWIND $rows AS row
+ MATCH (w:Worker {worker_id: row.worker_id})
+ UNWIND row.certifications AS cert
+ MERGE (c:Certification {cert_name: cert})
+ MERGE (w)-[:HOLDS]->(c)
+ """, rows=worker_data)
+
+ # WORKS_AT — skip W11 (primary_station = "all")
+ run(session, """
+ UNWIND $rows AS row
+ WITH row WHERE row.primary_station <> 'all'
+ MATCH (w:Worker {worker_id: row.worker_id})
+ MATCH (s:Station {station_code: row.primary_station})
+ MERGE (w)-[:WORKS_AT]->(s)
+ """, rows=worker_data)
+
+ # CAN_COVER
+ run(session, """
+ UNWIND $rows AS row
+ MATCH (w:Worker {worker_id: row.worker_id})
+ UNWIND row.can_cover AS sc
+ MATCH (s:Station {station_code: sc})
+ MERGE (w)-[:CAN_COVER]->(s)
+ """, rows=worker_data)
+
+ print(f" Loaded {len(rows)} workers → Workers, Certifications + relationships")
+
+
+def seed_capacity(session, rows):
+ """Phase 5: HAS_CAPACITY relationships from factory_capacity.csv."""
+
+ cap_data = []
+ for c in rows:
+ cap_data.append({
+ "week": c["week"],
+ "own_hours": int(c["own_hours"]),
+ "hired_hours": int(c["hired_hours"]),
+ "overtime_hours": int(c["overtime_hours"]),
+ "total_planned": int(c["total_planned"]),
+ "deficit": int(c["deficit"]),
+ })
+
+ run(session, """
+ UNWIND $rows AS row
+ MATCH (f:Factory)
+ MERGE (wk:Week {week_id: row.week})
+ MERGE (wk)-[r:HAS_CAPACITY]->(f)
+ SET r.own_hours = row.own_hours,
+ r.hired_hours = row.hired_hours,
+ r.overtime_hours = row.overtime_hours,
+ r.total_planned = row.total_planned,
+ r.deficit = row.deficit
+ """, rows=cap_data)
+
+ print(f" Loaded {len(rows)} capacity rows → HAS_CAPACITY")
+
+
+def compute_loaded_in(session):
+ """Phase 6: Aggregate SCHEDULED_AT into LOADED_IN per (station, week)."""
+ run(session, """
+ MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station)
+ WITH s, r.week AS week,
+ sum(r.planned_hours) AS tp,
+ sum(r.actual_hours) AS ta
+ MATCH (wk:Week {week_id: week})
+ MERGE (s)-[l:LOADED_IN]->(wk)
+ SET l.total_planned = tp,
+ l.total_actual = ta
+ """)
+ print(" Computed LOADED_IN aggregations")
+
+
+# ── Main ─────────────────────────────────────────────────────────────────────
+
+def main():
+ print("=" * 55)
+ print(" Factory Knowledge Graph — Seeder")
+ print("=" * 55)
+
+ driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD))
+ driver.verify_connectivity()
+ print("Connected to Neo4j\n")
+
+ # Read CSVs
+ production = read_csv("factory_production.csv")
+ workers = read_csv("factory_workers.csv")
+ capacity = read_csv("factory_capacity.csv")
+
+ with driver.session() as session:
+ # Phase 0: Clear
+ run(session, "MATCH (n) DETACH DELETE n")
+ print("Cleared existing graph\n")
+
+ # Phase 1: Constraints
+ create_constraints(session)
+
+ # Phase 2: Factory singleton
+ run(session, 'MERGE (:Factory {factory_name: "VSAB Stålbyggnad"})')
+ print(" Created Factory node")
+
+ # Phase 3–6
+ seed_production(session, production)
+ seed_workers(session, workers)
+ seed_capacity(session, capacity)
+ compute_loaded_in(session)
+
+ # ── Summary ──
+ with driver.session() as session:
+ nodes = session.run("MATCH (n) RETURN count(n) AS c").single()["c"]
+ rels = session.run("MATCH ()-[r]->() RETURN count(r) AS c").single()["c"]
+ labels = session.run("CALL db.labels() YIELD label RETURN collect(label) AS l").single()["l"]
+ rel_types = session.run(
+ "CALL db.relationshipTypes() YIELD relationshipType RETURN collect(relationshipType) AS t"
+ ).single()["t"]
+
+ print(f"\n{'=' * 55}")
+ print(f" Seeding complete!")
+ print(f" Nodes: {nodes}")
+ print(f" Relationships: {rels}")
+ print(f" Labels ({len(labels)}): {labels}")
+ print(f" Rel types ({len(rel_types)}): {rel_types}")
+ print(f"{'=' * 55}")
+
+ driver.close()
+
+
+if __name__ == "__main__":
+ main()
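
For reviewers who want to sanity-check the seeded values without a running database, the following sketch mirrors two computations from `seed_graph.py` in plain Python: the `variance_pct` CASE expression on `SCHEDULED_AT` and the per-(station, week) sums that `compute_loaded_in` writes to `LOADED_IN`. The function names and the sample rows are hypothetical, and Python's `round` may differ from Cypher's `round` on exact ties, so treat it as an approximation of the graph logic rather than a replacement for it.

```python
from collections import defaultdict


def variance_pct(planned, actual):
    """Percent deviation of actual from planned; 0.0 when nothing was planned."""
    if planned > 0:
        return round((actual - planned) / planned * 100, 1)
    return 0.0


def aggregate_load(rows):
    """Sum planned and actual hours per (station_code, week)."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for row in rows:
        key = (row["station_code"], row["week"])
        totals[key][0] += float(row["planned_hours"])
        totals[key][1] += float(row["actual_hours"])
    return {key: tuple(value) for key, value in totals.items()}


# Hypothetical rows shaped like csv.DictReader output (all values are strings)
rows = [
    {"station_code": "011", "week": "w1", "planned_hours": "40", "actual_hours": "44"},
    {"station_code": "011", "week": "w1", "planned_hours": "10", "actual_hours": "9"},
    {"station_code": "012", "week": "w1", "planned_hours": "20", "actual_hours": "20"},
]

loaded_in = aggregate_load(rows)
planned, actual = loaded_in[("011", "w1")]
print(planned, actual, variance_pct(planned, actual))
```

Comparing this output against a `MATCH (s:Station)-[l:LOADED_IN]->(w:Week)` query is a quick way to confirm that the seeder aggregated every production row.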