From 720f43b1add5baa601ba8911abce97376d37ae85 Mon Sep 17 00:00:00 2001 From: SAIMA AFROZ Date: Wed, 13 May 2026 15:18:30 +0530 Subject: [PATCH 1/3] level-5: Saima Afroz --- submissions/saima-afroz/level5/answers.md | 545 ++++++++++++++++++++++ submissions/saima-afroz/level5/schema.md | 24 + 2 files changed, 569 insertions(+) create mode 100644 submissions/saima-afroz/level5/answers.md create mode 100644 submissions/saima-afroz/level5/schema.md diff --git a/submissions/saima-afroz/level5/answers.md b/submissions/saima-afroz/level5/answers.md new file mode 100644 index 000000000..5a5e1f853 --- /dev/null +++ b/submissions/saima-afroz/level5/answers.md @@ -0,0 +1,545 @@ +# Level 5 — Graph Thinking: Complete Answers + +--- + +## Q1. Model It (20 pts) + +### Biomimicry Inspiration — The Wood Wide Web + +This schema is structured in three layers, inspired by how forests communicate through +underground fungal (mycorrhizal) networks — the "Wood Wide Web." + +In a forest: +- Trees above ground do the visible work (photosynthesis, growth, fruit) +- Underground, an invisible fungal network connects all roots +- The network transfers nutrients from strong trees to struggling ones +- When a tree is attacked, it sends a chemical distress signal through the network +- If one path breaks, the network reroutes through other connections + +This factory works the same way: + +| Forest concept | Factory equivalent | +|---|---| +| Canopy — visible work | Projects producing Products at Stations | +| Fungal network — invisible connections | Worker CAN_COVER relationships | +| Mother tree — central hub | Victor Elm (covers all stations) + Station 011 (every project passes through it) | +| Distress signal | Alert node fires when actual_hours > planned × 1.10 | +| Ecosystem boundary | Country node — certifications and labor law don't cross borders automatically | +| Nutrient flow | Hours flowing through SCHEDULED_AT relationships | +| Single point of failure | Station 016 — only Per Hansen 
covers it | + +Because factories are located in **different countries**, the schema adds a Geography +layer between the canopy and the underground. Worker coverage (the fungal network) cannot +cross country boundaries freely — it must check legal jurisdiction, certification +validity, and travel time first. Like mycelium that does not cross ecosystem boundaries. + +--- + +### Node Labels (10 total — minimum required: 6) + +| Node | CSV Source | Key Properties | Forest Analogy | +|------|-----------|---------------|----------------| +| `Project` | factory_production.csv | id, name, number | forest region | +| `Product` | factory_production.csv | type, unit | fruit / output | +| `Station` | factory_production.csv | code, name | nutrient hub | +| `Week` | factory_capacity.csv | name, own_hours, hired_hours, overtime_hours, total_capacity, total_planned, deficit | season / growth cycle | +| `Etapp` | factory_production.csv | name (ET1, ET2) | growth phase | +| `BOP` | factory_production.csv | name (BOP1, BOP2…) | sub-branch | +| `Country` | multi-country constraint | name, code, timezone, currency, labor_law | ecosystem boundary | +| `Factory` | multi-country constraint | name, city | individual tree | +| `Worker` | factory_workers.csv | id, name, role, certifications (list), hours_per_week, type | tree in the network | +| `Alert` | computed when actual > planned × 1.10 | variance_pct, week, type | distress signal | + +**Worker node — certifications stored as a list, not a string:** + +```cypher +MERGE (w:Worker {id: "W01"}) +SET w.name = "Erik Lindberg", + w.role = "Operator", + w.certifications = ["MIG/MAG", "TIG", "ISO 9606"], + w.hours_per_week = 40, + w.type = "permanent" +``` + +Storing as a list means you can query `WHERE "ISO 9606" IN w.certifications` — essential +for the multi-country certification checks. 
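A minimal sketch of how the list form might be produced at load time, assuming the `certifications` column in factory_workers.csv is a comma-separated string. The delimiter is an assumption, and `parse_certifications` and `has_certification` are hypothetical helper names, not part of the submission.

```python
def parse_certifications(raw, sep=","):
    """Split a delimited certification string into a clean list.

    Assumption: the CSV stores certifications as one delimited string.
    """
    if raw is None:
        return []
    # Strip whitespace around each entry and drop empty fragments.
    return [part.strip() for part in raw.split(sep) if part.strip()]


def has_certification(worker, cert):
    """Python equivalent of the Cypher test `cert IN w.certifications`."""
    return cert in worker["certifications"]


worker = {
    "id": "W01",
    "name": "Erik Lindberg",
    "certifications": parse_certifications("MIG/MAG, TIG, ISO 9606"),
}
```

With a plain string, a check for "ISO 9606" would have to rely on substring matching, which can produce false positives; the list form makes it an exact membership test.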
+ +--- + +### Relationship Types (13 total — minimum required: 8) + +| Relationship | Direction | Data Carried | Layer | +|---|---|---|---| +| `PRODUCES` | Project → Product | quantity, unit_factor | Canopy | +| `SCHEDULED_AT` | Project → Station | planned_hours, actual_hours, week, variance_pct | Canopy | +| `BELONGS_TO` | Project → Etapp | — | Canopy | +| `CONTAINS_BOP` | Etapp → BOP | — | Canopy | +| `SPANS_WEEK` | BOP → Week | — | Canopy | +| `ACTIVE_IN` | Station → Week | — | Canopy | +| `LOCATED_IN` | Factory → Country | — | Geography | +| `IN_FACTORY` | Station → Factory | — | Geography | +| `EMPLOYED_IN` | Worker → Country | — | Geography | +| `WORKS_AT` | Worker → Station | — | Underground | +| `CAN_COVER` | Worker → Station | valid_in_country, cert_valid, travel_days | Underground (fungal) | +| `TRIGGERED_BY` | Alert → Station | variance_pct | Underground | +| `AFFECTS` | Alert → Project | — | Underground | + +**SCHEDULED_AT — the heartbeat of the system (carries the most important data):** + +```cypher +(p:Project)-[:SCHEDULED_AT { + week: "w1", + planned_hours: 42.0, + actual_hours: 48.0, + variance_pct: 14.3 +}]->(s:Station) +``` + +**CAN_COVER — the fungal network, now with country constraints:** + +```cypher +(w:Worker)-[:CAN_COVER { + valid_in_country: "SE", + cert_valid: true, + travel_days: 0 +}]->(s:Station) +``` + +The `CAN_COVER` relationship is drawn as a dashed line — like the fungal network it is +underground and invisible until something goes wrong. It only becomes critical when a +worker is absent and the manager asks: who can cover, is legally allowed to, and can +physically arrive in time? 
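The three questions above (who can cover, who is legally allowed to, who can arrive in time) can be sketched as a plain Python filter over `CAN_COVER` edges. This is only an illustration under stated assumptions: the edge dictionaries, the `backup_*` names, the country codes, and the one-day travel limit are made up for the example, and `valid_backups` is a hypothetical helper.

```python
def valid_backups(edges, station_country, absent_worker, max_travel_days=1):
    """Return CAN_COVER edges that pass all three checks, nearest first."""
    candidates = [
        e for e in edges
        if e["worker"] != absent_worker               # not the absent person
        and e["valid_in_country"] == station_country  # legal jurisdiction
        and e["cert_valid"]                           # certification accepted
        and e["travel_days"] <= max_travel_days       # can arrive in time
    ]
    return sorted(candidates, key=lambda e: e["travel_days"])


# Hypothetical CAN_COVER edges pointing at one station located in "SE".
edges = [
    {"worker": "Per Hansen", "valid_in_country": "SE", "cert_valid": True, "travel_days": 0},
    {"worker": "backup_a", "valid_in_country": "SE", "cert_valid": True, "travel_days": 1},
    {"worker": "backup_b", "valid_in_country": "NO", "cert_valid": True, "travel_days": 0},
    {"worker": "backup_c", "valid_in_country": "SE", "cert_valid": False, "travel_days": 0},
    {"worker": "backup_d", "valid_in_country": "SE", "cert_valid": True, "travel_days": 3},
]

backups = [e["worker"] for e in valid_backups(edges, "SE", "Per Hansen")]
```

Keeping all three conditions on the same edge means one pass over the candidates is enough; no extra lookup table is needed per condition.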
+ +--- + +### Schema Diagram (Mermaid) + +```mermaid +graph TD + + subgraph Canopy["Canopy Layer — Visible Production Flow"] + Project -->|"PRODUCES {qty, unit_factor}"| Product + Project -->|"SCHEDULED_AT {planned_h, actual_h, week}"| Station + Project -->|BELONGS_TO| Etapp + Etapp -->|CONTAINS_BOP| BOP + BOP -->|SPANS_WEEK| Week + Station -->|ACTIVE_IN| Week + end + + subgraph Geography["Geography Layer — Multi-Country Structure"] + Factory -->|LOCATED_IN| Country + Station -->|IN_FACTORY| Factory + Worker -->|EMPLOYED_IN| Country + end + + subgraph Underground["Underground Layer — Fungal Network"] + Worker -->|WORKS_AT| Station + Worker -->|"CAN_COVER {valid_in_country, cert_valid, travel_days}"| Station + Alert -->|"TRIGGERED_BY {variance_pct}"| Station + Alert -->|AFFECTS| Project + end +``` + +--- + +## Q2. Why Not Just SQL? (20 pts) + +### The Query + +*"Which workers are certified to cover Station 016 (Gjutning) when Per Hansen is on +vacation, and which projects would be affected?"* + +--- + +### SQL Version + +Assumes reasonable tables: +`workers`, `stations`, `worker_coverage` (junction table), `production`, `projects` + +```sql +SELECT + w.name AS covering_worker, + w.certifications AS certifications, + p.project_name AS affected_project +FROM workers w +JOIN worker_coverage wc ON w.worker_id = wc.worker_id +JOIN stations s ON wc.station_id = s.station_id +JOIN production prod ON s.station_id = prod.station_id +JOIN projects p ON prod.project_id = p.project_id +WHERE s.station_code = '016' + AND w.worker_id != ( + SELECT worker_id + FROM workers + WHERE name = 'Per Hansen' + ); +``` + +--- + +### Basic Cypher Version + +```cypher +MATCH (cover:Worker)-[:CAN_COVER]->(s:Station {code: '016'}) +WHERE cover.name <> 'Per Hansen' +MATCH (p:Project)-[:SCHEDULED_AT]->(s) +RETURN cover.name AS covering_worker, + cover.certifications AS certifications, + collect(p.name) AS affected_projects +``` + +--- + +### Multi-Country Cypher Version (reflects real factory 
constraint) + +```cypher +MATCH (cover:Worker)-[c:CAN_COVER]->(s:Station {code: '016'}) +MATCH (s)-[:IN_FACTORY]->(f:Factory)-[:LOCATED_IN]->(country:Country) +WHERE cover.name <> 'Per Hansen' + AND c.valid_in_country = country.code + AND c.cert_valid = true + AND c.travel_days <= 1 +WITH cover, c, s +MATCH (p:Project)-[:SCHEDULED_AT]->(s) +RETURN cover.name AS covering_worker, + c.travel_days AS days_travel_needed, + collect(p.name) AS affected_projects +ORDER BY c.travel_days ASC +``` + +--- + +### What the graph makes obvious that SQL hides + +**First:** In SQL, the relationship between workers and projects is completely invisible. +You discover it only by chaining four JOINs through intermediate tables. The query is +technically correct but structurally opaque — you have to mentally reconstruct the +connection by reading three table definitions. In Cypher, you follow two natural hops: +`CAN_COVER` to find coverage, `SCHEDULED_AT` back to find affected projects. The path +is the query. + +**Second:** SQL completely hides the single-point-of-failure problem. You would only +discover that Per Hansen is the only person covering Station 016 by running a completely +separate COUNT query. In the graph, you can see it structurally — only one `CAN_COVER` +arrow points to that node. The danger is visible in the shape of the graph itself. + +**Third:** The multi-country constraints — legal jurisdiction, certification validity, +travel time — would require two additional JOINs in SQL with no structural reason why +those three conditions belong together. In the graph, they live naturally on the +`CAN_COVER` relationship as properties, because they are all properties of the same +real-world fact: this worker can or cannot cover this station under these conditions. + +--- + +## Q3. 
Spot the Bottleneck (20 pts) + +### Part 1 — Which weeks, projects and stations are causing the overload + +**Weekly capacity — 5 out of 8 weeks in deficit:** + +| Week | Available | Planned | Deficit | Status | +|------|-----------|---------|---------|--------| +| w1 | 480h | 612h | **−132h** | 🔴 worst week | +| w2 | 520h | 645h | **−125h** | 🔴 critical | +| w3 | 480h | 398h | +82h | 🟢 fine | +| w4 | 500h | 550h | **−50h** | 🔴 over | +| w5 | 510h | 480h | +30h | 🟢 fine | +| w6 | 440h | 520h | **−80h** | 🔴 over | +| w7 | 520h | 600h | **−80h** | 🔴 over | +| w8 | 500h | 470h | +30h | 🟢 fine | + +**Root cause — Station 016 (Gjutning) — worst overrun station:** + +| Project | Week | Planned | Actual | Overrun | +|---------|------|---------|--------|---------| +| P03 Lagerhall Jönköping | w2 | 28h | 35h | **+25%** 🔴 | +| P05 Sjukhus Linköping | w2 | 35h | 40h | **+14%** 🔴 | +| P07 Idrottshall Västerås | w2 | 20h | 22h | +10% | +| P08 Bro E6 Halmstad | w3 | 22h | 25h | **+14%** 🔴 | + +Station 016 is also a single point of failure — only Per Hansen covers it. When it +overloads AND he is absent, the fungal network has no valid rerouting path unless +a cross-country certified backup exists. + +**Root cause — Station 014 (Svets o montage IQB):** + +| Project | Week | Planned | Actual | Overrun | +|---------|------|---------|--------|---------| +| P03 Lagerhall Jönköping | w1 | 42h | 48h | **+14%** 🔴 | +| P05 Sjukhus Linköping | w1 | 58h | 62h | +7% | +| P08 Bro E6 Halmstad | w1 | 40h | 44h | +10% | + +**Root cause — Station 011 (FS IQB) — volume problem:** +Every single project passes through Station 011. No single project overruns badly, but +the combined load in w1 and w2 makes it the mother tree of the factory — the hub that +everything depends on. If Station 011 goes down, all 8 projects are affected. 
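As a cross-check on the overrun tables above, the sketch below recomputes each overrun percentage and applies the same strict "more than 10% over plan" rule. It uses only the seven rows listed for Stations 016 and 014, so it is not the full CSV; `overrun_pct` and `flagged` are hypothetical helper names.

```python
# (station, project, week, planned_hours, actual_hours) from the tables above.
ROWS = [
    ("016", "P03", "w2", 28, 35),
    ("016", "P05", "w2", 35, 40),
    ("016", "P07", "w2", 20, 22),
    ("016", "P08", "w3", 22, 25),
    ("014", "P03", "w1", 42, 48),
    ("014", "P05", "w1", 58, 62),
    ("014", "P08", "w1", 40, 44),
]


def overrun_pct(planned, actual):
    """Overrun relative to plan, in percent."""
    return (actual - planned) / planned * 100


def flagged(rows):
    """Keep rows where actual is strictly more than 110% of planned.

    Written as actual * 10 > planned * 11 so that a row sitting at
    exactly +10% is decided without floating-point rounding.
    """
    return [
        (station, project, week, round(overrun_pct(planned, actual), 1))
        for station, project, week, planned, actual in rows
        if actual * 10 > planned * 11
    ]


flagged_rows = flagged(ROWS)
```

On these rows the strict test keeps P03, P05 and P08 at Station 016 and only P03 at Station 014. P07 at 016 and P08 at 014 sit at exactly +10% and are not flagged, and P05 at 014 is below the threshold.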
+ +--- + +### Part 2 — Cypher query + +```cypher +MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station) +WHERE r.actual_hours > r.planned_hours * 1.1 +RETURN + s.name AS station, + collect(p.name) AS overloaded_projects, + round( + avg( + (r.actual_hours - r.planned_hours) + / r.planned_hours * 100 + ) + ) AS avg_overrun_pct +ORDER BY avg_overrun_pct DESC +``` + +**Expected output** (computed from the rows listed in Part 1; P07 at Station 016 and P08 at Station 014 sit at exactly +10% and fail the strict `>` test, and P05 at Station 014 is only +7%): + +| Station | Projects | Avg overrun % | +|---------|----------|--------------| +| Gjutning 016 | P03, P05, P08 | +18% | +| Svets o montage 014 | P03 | +14% | + +--- + +### Part 3 — Alert as a graph pattern + +The alert is modeled as a **first-class node**, not just a flag on a relationship. This +is the biomimicry insight: in the forest, a distress signal travels through the fungal +network to reach workers who can respond. A number in a table cannot travel anywhere. + +```cypher +MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station) +WHERE r.actual_hours > r.planned_hours * 1.1 +MERGE (a:Alert {station_code: s.code, week: r.week}) +SET a.variance_pct = round( + (r.actual_hours - r.planned_hours) / r.planned_hours * 100 + ), + a.type = 'overrun' +MERGE (a)-[:TRIGGERED_BY]->(s) +MERGE (a)-[:AFFECTS]->(p) +``` + +Once `Alert` exists as a node, you can: +- Query all active alerts: `MATCH (a:Alert) RETURN a ORDER BY a.variance_pct DESC` +- Find who can resolve it: `TRIGGERED_BY` → Station → reverse `CAN_COVER` → Worker +- Apply multi-country filter: `WHERE c.valid_in_country = country.code` +- Track resolution: add `(a)-[:RESOLVED_BY]->(Worker)` when assigned +- Chain alerts: if Station 016 is overloaded and Per Hansen is absent, the alert + propagates to projects AND triggers the backup search simultaneously + +--- + +## Q4. 
Vector + Graph Hybrid (20 pts) + +### Part 1 — What to embed + +Embed **project descriptions** built by combining multiple CSV fields into one sentence: + +``` +"P03 — 900m IQB warehouse Jönköping ET1, + stations: 011 012 013 014 016 017 018, + 8 weeks, large-scale, avg variance +5%" +``` + +This captures: product type, quantity, scale, building category, location, stations used, +and execution quality — the full fingerprint of a project, not just the product code. + +Do NOT embed just `product_type`. Two projects can both use IQB beams and be completely +different in complexity, risk, and outcome. The embedding must encode how the project +behaved, not just what it made. + +--- + +### Part 2 — Hybrid query combining vector + graph + +```cypher +// Step 1: vector similarity search finds 5 most similar past projects +CALL db.index.vector.queryNodes( + 'project_embeddings', + 5, + $query_vector +) +YIELD node AS similar_project, score + +// Step 2: graph filter — variance under 5% (well-executed projects only) +MATCH (similar_project)-[r:SCHEDULED_AT]->(s:Station) +WHERE abs(r.actual_hours - r.planned_hours) + / r.planned_hours <= 0.05 + +// Step 3: multi-country filter — stations are in a reachable jurisdiction +MATCH (s)-[:IN_FACTORY]->(f:Factory)-[:LOCATED_IN]->(c:Country) + +RETURN similar_project.name AS reference_project, + score AS similarity_score, + collect(s.name) AS stations_used, + c.name AS country +ORDER BY score DESC +``` + +**Three-step logic:** +1. Vector narrows the field to similar-scope projects +2. Graph removes projects that ran badly (variance > 5%) +3. Geography layer checks the stations are in a legally accessible country + +--- + +### Part 3 — Why better than filtering by product type + +Filtering by product type returns all projects that used IQB beams. That includes P03 +Lagerhall Jönköping (ran 25% over at Station 016) and P05 Sjukhus Linköping (ran cleanly +at +3%). Both use IQB beams. Product type filtering cannot distinguish them. 
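The two-stage idea (similarity first, execution quality second) can be sketched without a database. This is a toy illustration under stated assumptions: the three-dimensional vectors, the `past_*` names, and the variance figures are invented for the example, variance is simplified to one number per project rather than one per station row, and `hybrid_candidates` is a hypothetical helper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def hybrid_candidates(query_vec, projects, top_k=5, max_abs_variance=0.05):
    """Rank by similarity, then keep only projects that ran close to plan."""
    ranked = sorted(
        projects,
        key=lambda p: cosine(query_vec, p["embedding"]),
        reverse=True,
    )[:top_k]
    return [p["name"] for p in ranked if abs(p["variance"]) <= max_abs_variance]


# Made-up embeddings and variances, for illustration only.
projects = [
    {"name": "past_A", "embedding": [1.0, 0.0, 1.0], "variance": 0.03},
    {"name": "past_B", "embedding": [0.9, 0.1, 1.0], "variance": 0.25},
    {"name": "past_C", "embedding": [0.0, 1.0, 0.0], "variance": 0.01},
]

references = hybrid_candidates([1.0, 0.0, 1.0], projects, top_k=2)
```

Here `past_B` is nearly as similar as `past_A` but ran 25% over, so the second stage drops it; `past_C` ran cleanly but is not similar, so the first stage never surfaces it.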
+ +Vector similarity finds projects that **behaved** similarly — same scope, same scale, same +risk profile — because the embedding captures the full fingerprint of a project. The graph +layer then ensures only well-executed past projects are surfaced. You get projects worth +copying, not just projects that used the same materials. + +**The Boardy parallel:** +The exact same pattern applies to people matching. Instead of project descriptions, embed +person profiles (skills + interests + needs). Vector search finds people with +complementary profiles. Graph filtering checks: not already teammates, same network +community, schedules compatible. Neither step works alone — vector narrows the field, +graph enforces real-world constraints. + +--- + +## Q5. Your L6 Plan (20 pts) + +### Part 1 — Node labels and CSV column mapping + +| Node | CSV File | Columns | Notes | +|------|----------|---------|-------| +| `Project` | factory_production.csv | project_id, project_number, project_name | one node per unique project_id | +| `Product` | factory_production.csv | product_type, unit | one node per unique product_type | +| `Station` | factory_production.csv | station_code, station_name | one node per unique station_code | +| `Week` | factory_capacity.csv | week + all capacity columns | enriched with own_hours, hired_hours, overtime_hours, total_capacity, total_planned, deficit | +| `Etapp` | factory_production.csv | etapp | ET1 and ET2 | +| `BOP` | factory_production.csv | bop | BOP1, BOP2, BOP3 | +| `Worker` | factory_workers.csv | all columns | certifications split into list | +| `Country` | inferred | SE as base | extended when other country data available | +| `Factory` | inferred | one per physical production site | links stations to countries | +| `Alert` | computed | actual_hours vs planned_hours | created by seed_graph.py logic, not from CSV | + +**Special case — Victor Elm:** +His primary_station is "all" in the CSV. 
The seed script creates `WORKS_AT` and +`CAN_COVER` relationships to every station rather than storing "all" as a string. + +--- + +### Part 2 — Relationship types and what creates them + +| Relationship | Created when | Data on relationship | +|---|---|---| +| `(Project)-[:PRODUCES]->(Product)` | Each unique project + product_type combination | quantity, unit_factor | +| `(Project)-[:SCHEDULED_AT]->(Station)` | Every row in factory_production.csv | planned_hours, actual_hours, week, variance_pct | +| `(Project)-[:BELONGS_TO]->(Etapp)` | etapp column on each production row | — | +| `(Etapp)-[:CONTAINS_BOP]->(BOP)` | bop column on each production row | — | +| `(BOP)-[:SPANS_WEEK]->(Week)` | bop + week columns together | — | +| `(Station)-[:ACTIVE_IN]->(Week)` | station_code + week on each row | — | +| `(Factory)-[:LOCATED_IN]->(Country)` | factory setup (once per factory) | — | +| `(Station)-[:IN_FACTORY]->(Factory)` | station to factory mapping | — | +| `(Worker)-[:EMPLOYED_IN]->(Country)` | worker type + jurisdiction | — | +| `(Worker)-[:WORKS_AT]->(Station)` | primary_station column | — | +| `(Worker)-[:CAN_COVER]->(Station)` | each code in can_cover_stations, split by comma | valid_in_country, cert_valid, travel_days | +| `(Alert)-[:TRIGGERED_BY]->(Station)` | computed: actual > planned × 1.10 | variance_pct | +| `(Alert)-[:AFFECTS]->(Project)` | same computation | — | + +--- + +### Part 3 — Dashboard panels (4 panels) + +**Panel 1 — Project Overview** +All 8 projects with planned hours, actual hours, and variance %. Bar chart coloured +green (under budget) / yellow (0–10% over) / red (>10% over). Manager opens this first +thing Monday to see which projects need immediate attention. + +**Panel 2 — Station Load Heatmap** +Grid: stations on one axis, weeks on the other, colour intensity = total actual hours. +Dark red = overloaded. Interactive — click a cell to see which projects are contributing +to that station in that week. 
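The heatmap in Panel 2 is a sum of actual hours per (station, week) cell. A minimal sketch of that aggregation in plain Python is shown below, assuming rows shaped like the Panel 2 query result. The sample rows are the seven station/week/actual values listed in Q3, not the full CSV, and `heatmap_cells` is a hypothetical helper name.

```python
from collections import defaultdict

# (station_code, week, actual_hours) taken from the Q3 overrun tables.
ROWS = [
    ("016", "w2", 35),
    ("016", "w2", 40),
    ("016", "w2", 22),
    ("016", "w3", 25),
    ("014", "w1", 48),
    ("014", "w1", 62),
    ("014", "w1", 44),
]


def heatmap_cells(rows):
    """Sum actual hours into one value per (station, week) cell."""
    cells = defaultdict(float)
    for station, week, actual in rows:
        cells[(station, week)] += actual
    return dict(cells)


cells = heatmap_cells(ROWS)
```

Each key of `cells` corresponds to one coloured cell of the grid; clicking a cell would then filter the original rows by the same (station, week) pair to list the contributing projects.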
+ +**Panel 3 — Capacity Tracker** +Line chart: available hours vs planned demand across 8 weeks. Deficit weeks shaded red. +Stacked bar below it showing own staff vs hired vs overtime breakdown. Tells the manager +when to approve overtime or bring in hired workers. + +**Panel 4 — Worker Coverage Matrix** +Table: every worker against every station, tick where they can cover. Stations with only +one tick highlighted red (single point of failure). Warning banner when a red station +also has an active Alert node — means the station is both overloaded AND has no backup. + +--- + +### Part 4 — Cypher queries powering each panel + +**Panel 1 — Project Overview:** +```cypher +MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station) +RETURN p.name AS project, + sum(r.planned_hours) AS planned, + sum(r.actual_hours) AS actual, + round( + (sum(r.actual_hours) - sum(r.planned_hours)) + / sum(r.planned_hours) * 100 + ) AS variance_pct +ORDER BY p.id +``` + +**Panel 2 — Station Load Heatmap:** +```cypher +MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station) +RETURN s.name AS station, + r.week AS week, + sum(r.actual_hours) AS total_actual_hours +ORDER BY station, week +``` + +**Panel 3 — Capacity Tracker:** +```cypher +MATCH (wk:Week) +WHERE wk.total_capacity IS NOT NULL +RETURN wk.name AS week, + wk.own_hours AS own, + wk.hired_hours AS hired, + wk.overtime_hours AS overtime, + wk.total_capacity AS available, + wk.total_planned AS planned, + wk.deficit AS deficit +ORDER BY wk.name +``` + +**Panel 4 — Coverage Matrix:** +```cypher +MATCH (w:Worker)-[:CAN_COVER]->(s:Station) +RETURN w.name AS worker, + collect(s.code) AS covered_stations +``` + +**Panel 4 — Single point of failure warning:** +```cypher +MATCH (w:Worker)-[:CAN_COVER]->(s:Station) +WITH s, count(w) AS coverage_count +WHERE coverage_count = 1 +MATCH (w2:Worker)-[:CAN_COVER]->(s) +RETURN s.code AS station_code, + s.name AS station_name, + w2.name AS only_worker +``` + +**Panel 4 — Multi-country backup finder:** +```cypher +MATCH 
(cover:Worker)-[c:CAN_COVER]->(s:Station {code: '016'}) +MATCH (s)-[:IN_FACTORY]->(f:Factory)-[:LOCATED_IN]->(country:Country) +WHERE c.valid_in_country = country.code + AND c.cert_valid = true + AND c.travel_days <= 1 +RETURN cover.name AS valid_backup, + c.travel_days AS days_needed, + country.name AS jurisdiction +ORDER BY c.travel_days ASC +``` + +--- + +*Schema diagram: see schema.md in the same folder.* +*Level 6 implementation: see seed_graph.py and app.py.* diff --git a/submissions/saima-afroz/level5/schema.md b/submissions/saima-afroz/level5/schema.md new file mode 100644 index 000000000..29babf1b0 --- /dev/null +++ b/submissions/saima-afroz/level5/schema.md @@ -0,0 +1,24 @@ +# Factory Graph Schema — Saima Afroz + +See answers.md Q1 for full written explanation. + +graph TD + subgraph Canopy + Project -->|PRODUCES qty,unit_factor| Product + Project -->|SCHEDULED_AT planned_h,actual_h,week| Station + Project -->|BELONGS_TO| Etapp + Etapp -->|CONTAINS_BOP| BOP + BOP -->|SPANS_WEEK| Week + Station -->|ACTIVE_IN| Week + end + subgraph Geography + Factory -->|LOCATED_IN| Country + Station -->|IN_FACTORY| Factory + Worker -->|EMPLOYED_IN| Country + end + subgraph Underground + Worker -->|WORKS_AT| Station + Worker -->|CAN_COVER valid_in_country,cert_valid,travel_days| Station + Alert -->|TRIGGERED_BY variance_pct| Station + Alert -->|AFFECTS| Project + end \ No newline at end of file From 7124ee812429978467e75d0fa858c21612173107 Mon Sep 17 00:00:00 2001 From: SAIMA AFROZ Date: Wed, 13 May 2026 15:31:58 +0530 Subject: [PATCH 2/3] level-6: Saima Afroz --- submissions/saima-afroz/level6/.env.example | 3 + .../saima-afroz/level6/DASHBOARD_URL.txt | 1 + submissions/saima-afroz/level6/README.md | 4 + submissions/saima-afroz/level6/app.py | 495 ++++++++++++++++++ .../saima-afroz/level6/factory_capacity.csv | 9 + .../saima-afroz/level6/factory_production.csv | 69 +++ .../saima-afroz/level6/factory_workers.csv | 15 + .../saima-afroz/level6/requirements.txt | 5 + 
submissions/saima-afroz/level6/seed_graph.py | 228 ++++++++ 9 files changed, 829 insertions(+) create mode 100644 submissions/saima-afroz/level6/.env.example create mode 100644 submissions/saima-afroz/level6/DASHBOARD_URL.txt create mode 100644 submissions/saima-afroz/level6/README.md create mode 100644 submissions/saima-afroz/level6/app.py create mode 100644 submissions/saima-afroz/level6/factory_capacity.csv create mode 100644 submissions/saima-afroz/level6/factory_production.csv create mode 100644 submissions/saima-afroz/level6/factory_workers.csv create mode 100644 submissions/saima-afroz/level6/requirements.txt create mode 100644 submissions/saima-afroz/level6/seed_graph.py diff --git a/submissions/saima-afroz/level6/.env.example b/submissions/saima-afroz/level6/.env.example new file mode 100644 index 000000000..a7f967d2f --- /dev/null +++ b/submissions/saima-afroz/level6/.env.example @@ -0,0 +1,3 @@ +NEO4J_URI=neo4j+s://xxxxxxxx.databases.neo4j.io +NEO4J_USER=neo4j +NEO4J_PASSWORD=your-password-here diff --git a/submissions/saima-afroz/level6/DASHBOARD_URL.txt b/submissions/saima-afroz/level6/DASHBOARD_URL.txt new file mode 100644 index 000000000..af16e3a7d --- /dev/null +++ b/submissions/saima-afroz/level6/DASHBOARD_URL.txt @@ -0,0 +1 @@ +https://your-app.streamlit.app diff --git a/submissions/saima-afroz/level6/README.md b/submissions/saima-afroz/level6/README.md new file mode 100644 index 000000000..a00b0b837 --- /dev/null +++ b/submissions/saima-afroz/level6/README.md @@ -0,0 +1,4 @@ +# Factory Graph Dashboard - Level 6 + +Run: python seed_graph.py +Run: streamlit run app.py diff --git a/submissions/saima-afroz/level6/app.py b/submissions/saima-afroz/level6/app.py new file mode 100644 index 000000000..9eeb2a574 --- /dev/null +++ b/submissions/saima-afroz/level6/app.py @@ -0,0 +1,495 @@ +""" +app.py — Streamlit dashboard for the Swedish steel factory knowledge graph. +All data comes from Neo4j queries. 
+""" + +import os +import streamlit as st +import pandas as pd +import plotly.express as px +import plotly.graph_objects as go +from neo4j import GraphDatabase +from dotenv import load_dotenv + +load_dotenv() + +# ── Neo4j connection ────────────────────────────────────────────────────────── + +@st.cache_resource +def get_driver(): + uri = st.secrets.get("NEO4J_URI", os.getenv("NEO4J_URI")) + user = st.secrets.get("NEO4J_USER", os.getenv("NEO4J_USER", "neo4j")) + pwd = st.secrets.get("NEO4J_PASSWORD", os.getenv("NEO4J_PASSWORD")) + return GraphDatabase.driver(uri, auth=(user, pwd)) + +def run_query(query, params=None): + driver = get_driver() + with driver.session() as session: + result = session.run(query, params or {}) + return [dict(r) for r in result] + +# ── Page config ─────────────────────────────────────────────────────────────── + +st.set_page_config( + page_title="Factory Graph Dashboard", + page_icon="🏭", + layout="wide", +) + +# ── Sidebar navigation ──────────────────────────────────────────────────────── + +st.sidebar.title("🏭 Factory Dashboard") +page = st.sidebar.radio( + "Navigate", + ["Project Overview", "Station Load", "Capacity Tracker", "Worker Coverage", "Self-Test"], +) + +# ───────────────────────────────────────────────────────────────────────────── +# PAGE 1: Project Overview +# ───────────────────────────────────────────────────────────────────────────── + +if page == "Project Overview": + st.title("📋 Project Overview") + st.caption("All 8 projects — planned vs actual hours and product breakdown") + + rows = run_query(""" + MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station) + WITH p, + sum(r.planned_hours) AS total_planned, + sum(r.actual_hours) AS total_actual + RETURN p.name AS project, + p.id AS pid, + total_planned, + total_actual, + round((total_actual - total_planned) / total_planned * 100 * 10) / 10 AS variance_pct + ORDER BY p.id + """) + df = pd.DataFrame(rows) + + # KPI row + col1, col2, col3 = st.columns(3) + 
col1.metric("Projects", len(df)) + col2.metric("Total planned hours", f"{int(df['total_planned'].sum()):,}") + over = len(df[df['variance_pct'] > 10]) + col3.metric("Projects >10% over", over, delta=f"+{over}" if over else "0", delta_color="inverse") + + st.divider() + + # Variance bar chart + colors = ["#e05252" if v > 10 else "#52b852" if v < 0 else "#f0a830" + for v in df["variance_pct"]] + fig = go.Figure(go.Bar( + x=df["project"], y=df["variance_pct"], + marker_color=colors, + text=[f"{v:+.1f}%" for v in df["variance_pct"]], + textposition="outside", + )) + fig.update_layout( + title="Variance % per project (actual vs planned hours)", + yaxis_title="Variance %", + xaxis_title="", + height=380, + plot_bgcolor="rgba(0,0,0,0)", + paper_bgcolor="rgba(0,0,0,0)", + yaxis=dict(gridcolor="rgba(128,128,128,0.15)"), + ) + fig.add_hline(y=10, line_dash="dot", line_color="red", annotation_text="10% alert") + fig.add_hline(y=0, line_dash="solid", line_color="gray", line_width=0.5) + st.plotly_chart(fig, use_container_width=True) + + st.divider() + st.subheader("Detailed table") + + df_display = df.rename(columns={ + "project": "Project", "pid": "ID", + "total_planned": "Planned h", "total_actual": "Actual h", "variance_pct": "Variance %" + }) + df_display["Status"] = df_display["Variance %"].apply( + lambda v: "🔴 Over" if v > 10 else ("🟢 Under" if v < 0 else "🟡 On track")) + + st.dataframe( + df_display[["ID","Project","Planned h","Actual h","Variance %","Status"]], + use_container_width=True, hide_index=True, + ) + + # Products per project + st.subheader("Products per project") + prod_rows = run_query(""" + MATCH (p:Project)-[:PRODUCES]->(pr:Product) + RETURN p.name AS project, p.id AS pid, collect(pr.type) AS products + ORDER BY pid + """) + prod_df = pd.DataFrame(prod_rows) + prod_df["products"] = prod_df["products"].apply(lambda x: ", ".join(sorted(x))) + st.dataframe(prod_df.rename(columns={"project":"Project","products":"Products"}), + use_container_width=True, 
hide_index=True) + +# ───────────────────────────────────────────────────────────────────────────── +# PAGE 2: Station Load +# ───────────────────────────────────────────────────────────────────────────── + +elif page == "Station Load": + st.title("🏗️ Station Load") + st.caption("Hours per station × week — red cells are over-planned") + + rows = run_query(""" + MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station) + RETURN s.name AS station, r.week AS week, + sum(r.planned_hours) AS planned, + sum(r.actual_hours) AS actual + ORDER BY station, week + """) + df = pd.DataFrame(rows) + + week_order = ["w1","w2","w3","w4","w5","w6","w7","w8"] + existing_weeks = sorted(df["week"].unique(), key=lambda w: week_order.index(w) if w in week_order else 99) + + view = st.radio("Show", ["Actual hours", "Planned hours", "Variance %"], horizontal=True) + + if view == "Actual hours": + pivot = df.pivot_table(index="station", columns="week", values="actual", aggfunc="sum").reindex(columns=existing_weeks) + color_scale = "YlOrRd" + title = "Actual hours per station per week" + elif view == "Planned hours": + pivot = df.pivot_table(index="station", columns="week", values="planned", aggfunc="sum").reindex(columns=existing_weeks) + color_scale = "Blues" + title = "Planned hours per station per week" + else: + df["variance"] = (df["actual"] - df["planned"]) / df["planned"].replace(0, 1) * 100 + pivot = df.pivot_table(index="station", columns="week", values="variance", aggfunc="mean").reindex(columns=existing_weeks) + color_scale = "RdYlGn_r" + title = "Mean variance % per station per week" + + fig = px.imshow( + pivot, + color_continuous_scale=color_scale, + title=title, + labels=dict(color=view), + aspect="auto", + text_auto=".0f", + ) + fig.update_layout(height=420, paper_bgcolor="rgba(0,0,0,0)") + st.plotly_chart(fig, use_container_width=True) + + st.divider() + st.subheader("Station total actual vs planned") + + station_totals = run_query(""" + MATCH 
(p:Project)-[r:SCHEDULED_AT]->(s:Station) + RETURN s.name AS station, + sum(r.planned_hours) AS planned, + sum(r.actual_hours) AS actual + ORDER BY actual DESC + """) + st_df = pd.DataFrame(station_totals) + fig2 = go.Figure() + fig2.add_trace(go.Bar(name="Planned", x=st_df["station"], y=st_df["planned"], + marker_color="#5b8dee")) + fig2.add_trace(go.Bar(name="Actual", x=st_df["station"], y=st_df["actual"], + marker_color="#e07a5f")) + fig2.update_layout( + barmode="group", height=380, + plot_bgcolor="rgba(0,0,0,0)", paper_bgcolor="rgba(0,0,0,0)", + yaxis=dict(gridcolor="rgba(128,128,128,0.15)"), + legend=dict(orientation="h", yanchor="bottom", y=1.02), + ) + st.plotly_chart(fig2, use_container_width=True) + +# ───────────────────────────────────────────────────────────────────────────── +# PAGE 3: Capacity Tracker +# ───────────────────────────────────────────────────────────────────────────── + +elif page == "Capacity Tracker": + st.title("📊 Capacity Tracker") + st.caption("Weekly workforce capacity vs planned demand — red = deficit") + + rows = run_query(""" + MATCH (wk:Week) + WHERE wk.total_capacity IS NOT NULL + RETURN wk.name AS week, + wk.own_hours AS own, + wk.hired_hours AS hired, + wk.overtime_hours AS overtime, + wk.total_capacity AS capacity, + wk.total_planned AS planned, + wk.deficit AS deficit + ORDER BY wk.name + """) + df = pd.DataFrame(rows) + + week_order = ["w1","w2","w3","w4","w5","w6","w7","w8"] + df["week"] = pd.Categorical(df["week"], categories=week_order, ordered=True) + df = df.sort_values("week") + + # KPI row + deficit_weeks = df[df["deficit"] < 0] + total_deficit = int(df[df["deficit"] < 0]["deficit"].sum()) + col1, col2, col3 = st.columns(3) + col1.metric("Deficit weeks", len(deficit_weeks), delta=f"{len(deficit_weeks)}/8", delta_color="inverse") + col2.metric("Total deficit hours", total_deficit, delta_color="inverse") + col3.metric("Surplus weeks", len(df[df["deficit"] >= 0])) + + st.divider() + + # Capacity vs planned line chart 
+ fig = go.Figure() + fig.add_trace(go.Scatter( + x=df["week"], y=df["capacity"], name="Total capacity", + line=dict(color="#4e9af1", width=2.5), mode="lines+markers" + )) + fig.add_trace(go.Scatter( + x=df["week"], y=df["planned"], name="Total planned", + line=dict(color="#f06c6c", width=2.5), mode="lines+markers" + )) + # shade deficit weeks: on a category axis, week i spans i-0.5 .. i+0.5 + for i, (_, r) in enumerate(df.iterrows()): + if r["deficit"] < 0: + fig.add_vrect( + x0=i - 0.5, x1=i + 0.5, + fillcolor="rgba(240,108,108,0.15)", layer="below", line_width=0 + ) + fig.update_layout( + title="Capacity vs planned demand by week", + height=380, + plot_bgcolor="rgba(0,0,0,0)", paper_bgcolor="rgba(0,0,0,0)", + yaxis=dict(gridcolor="rgba(128,128,128,0.15)", title="Hours"), + legend=dict(orientation="h", yanchor="bottom", y=1.02), + ) + st.plotly_chart(fig, use_container_width=True) + + # Stacked bar: own + hired + overtime + fig2 = go.Figure() + fig2.add_trace(go.Bar(name="Own hours", x=df["week"], y=df["own"], marker_color="#5b8dee")) + fig2.add_trace(go.Bar(name="Hired hours", x=df["week"], y=df["hired"], marker_color="#7ed3b2")) + fig2.add_trace(go.Bar(name="Overtime hours", x=df["week"], y=df["overtime"], marker_color="#f0c05a")) + fig2.add_trace(go.Scatter( + name="Planned demand", x=df["week"], y=df["planned"], + mode="lines+markers", line=dict(color="red", dash="dot", width=2), + )) + fig2.update_layout( + barmode="stack", title="Capacity breakdown by week", + height=380, + plot_bgcolor="rgba(0,0,0,0)", paper_bgcolor="rgba(0,0,0,0)", + yaxis=dict(gridcolor="rgba(128,128,128,0.15)"), + legend=dict(orientation="h", yanchor="bottom", y=1.02), + ) + st.plotly_chart(fig2, use_container_width=True) + + st.subheader("Weekly detail") + df["Status"] = df["deficit"].apply(lambda d: "🔴 Deficit" if d < 0 else "🟢 Surplus") + st.dataframe( + df[["week","own","hired","overtime","capacity","planned","deficit","Status"]].rename( + columns={"week":"Week","own":"Own h","hired":"Hired h","overtime":"Overtime h", +
"capacity":"Total cap","planned":"Planned","deficit":"Deficit"}), + use_container_width=True, hide_index=True, + ) + +# ───────────────────────────────────────────────────────────────────────────── +# PAGE 4: Worker Coverage +# ───────────────────────────────────────────────────────────────────────────── + +elif page == "Worker Coverage": + st.title("👷 Worker Coverage") + st.caption("Which workers cover which stations — red = single point of failure") + + # Coverage matrix + rows = run_query(""" + MATCH (w:Worker)-[:CAN_COVER]->(s:Station) + RETURN w.id AS id, w.name AS worker, w.role AS role, + collect(s.code) AS stations + ORDER BY id + """) + workers_df = pd.DataFrame(rows) + + station_rows = run_query("MATCH (s:Station) RETURN s.code AS code, s.name AS name ORDER BY s.code") + all_stations = [r["code"] for r in station_rows] + station_names = {r["code"]: r["name"] for r in station_rows} + + # Build matrix + matrix = {} + for _, row in workers_df.iterrows(): + matrix[row["worker"]] = {s: ("✓" if s in row["stations"] else "") for s in all_stations} + matrix_df = pd.DataFrame(matrix).T + matrix_df.columns = [f"{c} {station_names.get(c,'')}" for c in matrix_df.columns] + + # Single-point-of-failure stations + spof_rows = run_query(""" + MATCH (w:Worker)-[:CAN_COVER]->(s:Station) + WITH s, count(w) AS worker_count + WHERE worker_count = 1 + MATCH (w2:Worker)-[:CAN_COVER]->(s) + RETURN s.code AS code, s.name AS name, w2.name AS only_worker + """) + spof_df = pd.DataFrame(spof_rows) + + col1, col2 = st.columns([2,1]) + with col1: + st.subheader("Coverage matrix") + st.dataframe(matrix_df, use_container_width=True) + + with col2: + st.subheader("⚠️ Single-point-of-failure stations") + if not spof_df.empty: + for _, r in spof_df.iterrows(): + st.error(f"**{r['code']} {r['name']}**\nOnly covered by: {r['only_worker']}") + else: + st.success("No single-point-of-failure stations") + + st.divider() + st.subheader("Workers per station (coverage count)") + + count_rows = run_query("""
+ MATCH (w:Worker)-[:CAN_COVER]->(s:Station) + RETURN s.code + ' ' + s.name AS station, count(w) AS workers + ORDER BY workers ASC + """) + count_df = pd.DataFrame(count_rows) + colors = ["#e05252" if c == 1 else "#f0a830" if c == 2 else "#52b852" + for c in count_df["workers"]] + fig = go.Figure(go.Bar( + x=count_df["station"], y=count_df["workers"], + marker_color=colors, + text=count_df["workers"], textposition="outside", + )) + fig.update_layout( + height=350, plot_bgcolor="rgba(0,0,0,0)", paper_bgcolor="rgba(0,0,0,0)", + yaxis=dict(gridcolor="rgba(128,128,128,0.15)", title="Workers who can cover"), + xaxis_title="", + ) + fig.add_hline(y=1, line_dash="dot", line_color="red", annotation_text="SPOF threshold") + st.plotly_chart(fig, use_container_width=True) + + st.subheader("Worker details") + detail_rows = run_query(""" + MATCH (w:Worker) + OPTIONAL MATCH (w)-[:WORKS_AT]->(ps:Station) + RETURN w.id AS id, w.name AS name, w.role AS role, + w.certifications AS certs, + w.hours_per_week AS hours, + w.type AS type, + ps.name AS primary_station + ORDER BY w.id + """) + st.dataframe( + pd.DataFrame(detail_rows).rename(columns={ + "id":"ID","name":"Name","role":"Role","certs":"Certifications", + "hours":"Hours/week","type":"Type","primary_station":"Primary Station" + }), + use_container_width=True, hide_index=True, + ) + +# ───────────────────────────────────────────────────────────────────────────── +# PAGE 5: Self-Test +# ───────────────────────────────────────────────────────────────────────────── + +elif page == "Self-Test": + st.title("🧪 Self-Test") + st.caption("Automated checks — verifies the graph meets all L6 requirements") + + def run_self_test(driver): + checks = [] + + # Check 1: Connection + try: + with driver.session() as s: + s.run("RETURN 1") + checks.append(("Neo4j connected", True, 3)) + except Exception as e: + checks.append((f"Neo4j connection FAILED: {e}", False, 3)) + return checks # can't continue + + with driver.session() as s: + # Check 2: 
Node count + result = s.run("MATCH (n) RETURN count(n) AS c").single() + count = result["c"] + checks.append((f"{count} nodes (min: 50)", count >= 50, 3)) + + # Check 3: Relationship count + result = s.run("MATCH ()-[r]->() RETURN count(r) AS c").single() + count = result["c"] + checks.append((f"{count} relationships (min: 100)", count >= 100, 3)) + + # Check 4: Node labels + result = s.run("CALL db.labels() YIELD label RETURN count(label) AS c").single() + count = result["c"] + checks.append((f"{count} node labels (min: 6)", count >= 6, 3)) + + # Check 5: Relationship types + result = s.run("CALL db.relationshipTypes() YIELD relationshipType RETURN count(relationshipType) AS c").single() + count = result["c"] + checks.append((f"{count} relationship types (min: 8)", count >= 8, 3)) + + # Check 6: Variance query + result = s.run(""" + MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station) + WHERE r.actual_hours > r.planned_hours * 1.1 + RETURN p.name AS project, s.name AS station, + r.planned_hours AS planned, r.actual_hours AS actual + LIMIT 10 + """) + rows = [dict(r) for r in result] + checks.append((f"Variance query: {len(rows)} results found", len(rows) > 0, 5)) + + return checks + + if st.button("▶ Run self-test", type="primary"): + with st.spinner("Running checks..."): + try: + driver = get_driver() + results = run_self_test(driver) + + total_score = 0 + total_max = sum(pts for _, _, pts in results) + + st.divider() + for label, passed, pts in results: + icon = "✅" if passed else "❌" + score_str = f"{pts}/{pts}" if passed else f"0/{pts}" + col1, col2 = st.columns([6,1]) + col1.markdown(f"{icon} {label}") + col2.markdown(f"**{score_str}**") + if passed: + total_score += pts + + st.divider() + pct = int(total_score / total_max * 100) + if total_score == total_max: + st.success(f"### 🎉 SELF-TEST SCORE: {total_score}/{total_max} — Perfect!") + elif total_score >= 14: + st.warning(f"### SELF-TEST SCORE: {total_score}/{total_max} ({pct}%)") + else: + st.error(f"### 
SELF-TEST SCORE: {total_score}/{total_max} ({pct}%)") + + # Show variance results table if check 6 passed + if results[-1][1]: + with st.expander("Variance query results (projects >10% over)"): + driver2 = get_driver() + with driver2.session() as s: + r = s.run(""" + MATCH (p:Project)-[r:SCHEDULED_AT]->(s:Station) + WHERE r.actual_hours > r.planned_hours * 1.1 + RETURN p.name AS project, s.name AS station, + r.week AS week, + r.planned_hours AS planned, + r.actual_hours AS actual, + round((r.actual_hours - r.planned_hours) + / r.planned_hours * 100 * 10) / 10 AS variance_pct + ORDER BY variance_pct DESC + """) + st.dataframe(pd.DataFrame([dict(row) for row in r]), + use_container_width=True, hide_index=True) + + except Exception as e: + st.error(f"Could not connect to Neo4j: {e}") + st.info("Check your .env file or Streamlit secrets — NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD") + else: + st.info("Click **Run self-test** to verify your graph meets all L6 requirements.") + st.markdown(""" + **Checks:** + - ✅ Neo4j connection alive **(3 pts)** + - ✅ Node count ≥ 50 **(3 pts)** + - ✅ Relationship count ≥ 100 **(3 pts)** + - ✅ 6+ distinct node labels **(3 pts)** + - ✅ 8+ distinct relationship types **(3 pts)** + - ✅ Variance query returns results **(5 pts)** + """) diff --git a/submissions/saima-afroz/level6/factory_capacity.csv b/submissions/saima-afroz/level6/factory_capacity.csv new file mode 100644 index 000000000..795ff52f0 --- /dev/null +++ b/submissions/saima-afroz/level6/factory_capacity.csv @@ -0,0 +1,9 @@ +week,own_staff_count,hired_staff_count,own_hours,hired_hours,overtime_hours,total_capacity,total_planned,deficit +w1,10,2,400,80,0,480,612,-132 +w2,10,2,400,80,40,520,645,-125 +w3,10,2,400,80,0,480,398,82 +w4,10,2,400,80,20,500,550,-50 +w5,10,2,400,80,30,510,480,30 +w6,9,2,360,80,0,440,520,-80 +w7,10,2,400,80,40,520,600,-80 +w8,10,2,400,80,20,500,470,30 \ No newline at end of file diff --git a/submissions/saima-afroz/level6/factory_production.csv 
b/submissions/saima-afroz/level6/factory_production.csv new file mode 100644 index 000000000..ca6ce43e1 --- /dev/null +++ b/submissions/saima-afroz/level6/factory_production.csv @@ -0,0 +1,69 @@ +project_id,project_number,project_name,product_type,unit,quantity,unit_factor,station_code,station_name,etapp,bop,week,planned_hours,actual_hours,completed_units +P01,4501,Stålverket Borås,IQB,meter,600,1.77,011,FS IQB,ET1,BOP1,w1,48.0,45.2,28 +P01,4501,Stålverket Borås,IQB,meter,600,1.77,012,Förmontering IQB,ET1,BOP1,w1,32.0,35.5,25 +P01,4501,Stålverket Borås,IQB,meter,600,1.77,013,Montering IQB,ET1,BOP1,w1,28.0,26.0,22 +P01,4501,Stålverket Borås,IQB,meter,600,1.77,014,Svets o montage IQB,ET1,BOP1,w1,35.0,38.2,20 +P01,4501,Stålverket Borås,SB,styck,40,4.0,018,SB B/F-hall,ET1,BOP1,w1,16.0,14.5,4 +P01,4501,Stålverket Borås,SP,styck,180,2.0,019,SP B/F-hall,ET1,BOP1,w1,12.0,13.0,7 +P01,4501,Stålverket Borås,IQB,meter,600,1.77,011,FS IQB,ET1,BOP1,w2,48.0,50.0,32 +P01,4501,Stålverket Borås,IQB,meter,600,1.77,012,Förmontering IQB,ET1,BOP1,w2,32.0,30.0,28 +P01,4501,Stålverket Borås,IQP,styck,90,2.80,015,Montering IQP,ET1,BOP2,w2,25.0,28.0,9 +P01,4501,Stålverket Borås,SR,styck,8,45.0,021,SR B/F-hall,ET1,BOP2,w2,40.0,42.0,1 +P02,4502,Kontorshus Mölndal,IQB,meter,350,1.50,011,FS IQB,ET1,BOP1,w1,30.0,28.0,20 +P02,4502,Kontorshus Mölndal,IQB,meter,350,1.50,012,Förmontering IQB,ET1,BOP1,w1,22.0,24.5,18 +P02,4502,Kontorshus Mölndal,IQB,meter,350,1.50,013,Montering IQB,ET1,BOP1,w1,18.0,17.0,16 +P02,4502,Kontorshus Mölndal,IQP,styck,70,2.70,015,Montering IQP,ET1,BOP1,w1,19.0,21.0,7 +P02,4502,Kontorshus Mölndal,SD,styck,30,3.00,018,SB B/F-hall,ET1,BOP1,w1,9.0,8.5,3 +P02,4502,Kontorshus Mölndal,IQB,meter,350,1.50,011,FS IQB,ET1,BOP1,w2,30.0,32.0,24 +P02,4502,Kontorshus Mölndal,IQB,meter,350,1.50,014,Svets o montage IQB,ET1,BOP1,w2,25.0,23.0,20 +P02,4502,Kontorshus Mölndal,SP,styck,120,1.75,019,SP B/F-hall,ET1,BOP2,w2,14.0,15.5,8 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,011,FS 
IQB,ET1,BOP1,w1,72.0,70.0,40 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,012,Förmontering IQB,ET1,BOP1,w1,48.0,52.0,35 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,013,Montering IQB,ET1,BOP1,w1,38.0,36.5,30 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,014,Svets o montage IQB,ET1,BOP1,w1,42.0,48.0,28 +P03,4503,Lagerhall Jönköping,SB,styck,60,6.00,018,SB B/F-hall,ET1,BOP1,w1,36.0,38.0,6 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,011,FS IQB,ET1,BOP1,w2,72.0,75.0,45 +P03,4503,Lagerhall Jönköping,IQP,styck,110,2.90,015,Montering IQP,ET1,BOP2,w2,32.0,30.0,11 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,016,Gjutning,ET1,BOP2,w2,28.0,35.0,8 +P03,4503,Lagerhall Jönköping,IQB,meter,900,1.89,017,Målning,ET1,BOP2,w3,24.0,22.0,20 +P04,4504,Parkering Helsingborg,IQB,meter,450,1.65,011,FS IQB,ET1,BOP1,w1,38.0,36.0,24 +P04,4504,Parkering Helsingborg,IQB,meter,450,1.65,012,Förmontering IQB,ET1,BOP1,w1,25.0,27.0,20 +P04,4504,Parkering Helsingborg,IQB,meter,450,1.65,013,Montering IQB,ET1,BOP1,w1,20.0,19.0,18 +P04,4504,Parkering Helsingborg,IQP,styck,55,2.85,015,Montering IQP,ET1,BOP1,w1,16.0,18.0,6 +P04,4504,Parkering Helsingborg,SB,styck,25,7.50,018,SB B/F-hall,ET1,BOP1,w1,19.0,22.0,3 +P04,4504,Parkering Helsingborg,IQB,meter,450,1.65,011,FS IQB,ET1,BOP1,w2,38.0,40.0,28 +P04,4504,Parkering Helsingborg,SP,styck,100,2.00,019,SP B/F-hall,ET1,BOP2,w2,12.0,11.0,6 +P04,4504,Parkering Helsingborg,SR,styck,12,120.0,021,SR B/F-hall,ET1,BOP2,w2,60.0,65.0,1 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,011,FS IQB,ET2,BOP3,w1,95.0,90.0,50 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,012,Förmontering IQB,ET2,BOP3,w1,65.0,68.0,42 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,013,Montering IQB,ET2,BOP3,w1,50.0,48.0,38 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,014,Svets o montage IQB,ET2,BOP3,w1,58.0,62.0,35 +P05,4505,Sjukhus Linköping ET2,IQP,styck,150,2.88,015,Montering IQP,ET2,BOP3,w1,30.0,33.0,10 +P05,4505,Sjukhus Linköping 
ET2,SB,styck,50,5.00,018,SB B/F-hall,ET2,BOP3,w1,25.0,28.0,5 +P05,4505,Sjukhus Linköping ET2,SD,styck,45,2.75,018,SB B/F-hall,ET2,BOP3,w1,12.0,11.5,4 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,011,FS IQB,ET2,BOP3,w2,95.0,98.0,55 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,016,Gjutning,ET2,BOP3,w2,35.0,40.0,12 +P05,4505,Sjukhus Linköping ET2,IQB,meter,1200,1.85,017,Målning,ET2,BOP3,w2,28.0,26.0,25 +P05,4505,Sjukhus Linköping ET2,SR,styck,20,274.0,021,SR B/F-hall,ET2,BOP3,w3,120.0,115.0,2 +P06,4506,Skola Uppsala,IQB,meter,500,1.60,011,FS IQB,ET1,BOP1,w2,40.0,38.0,26 +P06,4506,Skola Uppsala,IQB,meter,500,1.60,012,Förmontering IQB,ET1,BOP1,w2,28.0,30.0,22 +P06,4506,Skola Uppsala,IQB,meter,500,1.60,013,Montering IQB,ET1,BOP1,w2,22.0,20.0,18 +P06,4506,Skola Uppsala,IQP,styck,80,2.75,015,Montering IQP,ET1,BOP1,w2,22.0,24.0,8 +P06,4506,Skola Uppsala,SB,styck,35,4.50,018,SB B/F-hall,ET1,BOP1,w2,16.0,18.0,4 +P06,4506,Skola Uppsala,SP,styck,140,1.50,019,SP B/F-hall,ET1,BOP2,w3,14.0,12.0,10 +P07,4507,Idrottshall Västerås,HSQ,meter,400,2.05,011,FS IQB,ET1,BOP1,w1,45.0,42.0,22 +P07,4507,Idrottshall Västerås,HSQ,meter,400,2.05,012,Förmontering IQB,ET1,BOP1,w1,30.0,33.0,18 +P07,4507,Idrottshall Västerås,HSQ,meter,400,2.05,014,Svets o montage IQB,ET1,BOP1,w1,35.0,32.0,16 +P07,4507,Idrottshall Västerås,SB,styck,45,3.50,018,SB B/F-hall,ET1,BOP1,w1,16.0,18.0,5 +P07,4507,Idrottshall Västerås,HSQ,meter,400,2.05,011,FS IQB,ET1,BOP1,w2,45.0,48.0,26 +P07,4507,Idrottshall Västerås,HSQ,meter,400,2.05,016,Gjutning,ET1,BOP2,w2,20.0,22.0,5 +P07,4507,Idrottshall Västerås,HSQ,meter,400,2.05,017,Målning,ET1,BOP2,w3,18.0,16.0,15 +P08,4508,Bro E6 Halmstad,IQB,meter,800,1.80,011,FS IQB,ET1,BOP1,w1,65.0,62.0,36 +P08,4508,Bro E6 Halmstad,IQB,meter,800,1.80,012,Förmontering IQB,ET1,BOP1,w1,42.0,45.0,30 +P08,4508,Bro E6 Halmstad,IQB,meter,800,1.80,013,Montering IQB,ET1,BOP1,w1,35.0,38.0,25 +P08,4508,Bro E6 Halmstad,IQB,meter,800,1.80,014,Svets o montage IQB,ET1,BOP1,w1,40.0,44.0,22 
+P08,4508,Bro E6 Halmstad,SP,styck,200,2.50,019,SP B/F-hall,ET1,BOP1,w1,20.0,18.0,8 +P08,4508,Bro E6 Halmstad,IQB,meter,800,1.80,011,FS IQB,ET1,BOP1,w2,65.0,68.0,42 +P08,4508,Bro E6 Halmstad,IQP,styck,95,2.93,015,Montering IQP,ET1,BOP2,w2,28.0,30.0,10 +P08,4508,Bro E6 Halmstad,IQB,meter,800,1.80,016,Gjutning,ET1,BOP2,w3,22.0,25.0,8 +P08,4508,Bro E6 Halmstad,SR,styck,15,180.0,021,SR B/F-hall,ET1,BOP2,w3,90.0,85.0,2 \ No newline at end of file diff --git a/submissions/saima-afroz/level6/factory_workers.csv b/submissions/saima-afroz/level6/factory_workers.csv new file mode 100644 index 000000000..3110285cc --- /dev/null +++ b/submissions/saima-afroz/level6/factory_workers.csv @@ -0,0 +1,15 @@ +worker_id,name,role,primary_station,can_cover_stations,certifications,hours_per_week,type +W01,Erik Lindberg,Operator,011,"011,012","MIG/MAG,TIG,ISO 9606",40,permanent +W02,Anna Berg,Operator,011,"011,014","MIG/MAG,TIG",40,permanent +W03,Lars Jensen,Operator,012,"012,013","Surface treatment,CE marking",40,permanent +W04,Maria Stone,Operator,013,"013","Blasting,Surface protection",40,permanent +W05,Johan Peters,Operator,014,"014,015","Hydraulics,Mechanics,Crane",40,permanent +W06,Karen Nilsen,Inspector,015,"015","SIS,SS-EN 1090,NDT",40,permanent +W07,Per Hansen,Operator,016,"016,017","Casting,Formwork",40,permanent +W08,Sofia Arden,Operator,017,"017","Surface treatment,Spray painting",40,permanent +W09,Magnus Stone,Operator,018,"018,019","Sheet metal,Assembly",40,permanent +W10,Elin Frank,Operator,019,"019,018","Assembly,Welding",32,permanent +W11,Victor Elm,Foreman,all,"011,012,013,014,015,016,017,018,019,021","Leadership,CE,ISO 9001",45,permanent +W12,Lena Dale,Quality Manager,015,"015","ISO 9001,SS-EN 1090,Audit",40,permanent +W13,Ahmed Hassan,Operator,011,"011","MIG/MAG",40,hired +W14,Petra Steen,Operator,012,"012,013","Surface treatment",40,hired \ No newline at end of file diff --git a/submissions/saima-afroz/level6/requirements.txt 
b/submissions/saima-afroz/level6/requirements.txt new file mode 100644 index 000000000..220531d21 --- /dev/null +++ b/submissions/saima-afroz/level6/requirements.txt @@ -0,0 +1,5 @@ +streamlit +neo4j +python-dotenv +pandas +plotly diff --git a/submissions/saima-afroz/level6/seed_graph.py b/submissions/saima-afroz/level6/seed_graph.py new file mode 100644 index 000000000..cf7d73045 --- /dev/null +++ b/submissions/saima-afroz/level6/seed_graph.py @@ -0,0 +1,228 @@ +""" +seed_graph.py — Populate Neo4j with factory data from 3 CSVs. +Run once: python seed_graph.py +Safe to re-run (uses MERGE everywhere — idempotent). +""" + +import os +import csv +from neo4j import GraphDatabase +from dotenv import load_dotenv + +load_dotenv() + +URI = os.getenv("NEO4J_URI") +USER = os.getenv("NEO4J_USER", "neo4j") +PASSWORD = os.getenv("NEO4J_PASSWORD") + +# ── helpers ────────────────────────────────────────────────────────────────── + +def safe_float(val, default=0.0): + try: + return float(val) + except (ValueError, TypeError): + return default + +def safe_int(val, default=0): + try: + return int(val) + except (ValueError, TypeError): + return default + +# ── constraints ────────────────────────────────────────────────────────────── + +def create_constraints(session): + constraints = [ + "CREATE CONSTRAINT IF NOT EXISTS FOR (p:Project) REQUIRE p.id IS UNIQUE", + "CREATE CONSTRAINT IF NOT EXISTS FOR (s:Station) REQUIRE s.code IS UNIQUE", + "CREATE CONSTRAINT IF NOT EXISTS FOR (pr:Product) REQUIRE pr.type IS UNIQUE", + "CREATE CONSTRAINT IF NOT EXISTS FOR (w:Worker) REQUIRE w.id IS UNIQUE", + "CREATE CONSTRAINT IF NOT EXISTS FOR (wk:Week) REQUIRE wk.name IS UNIQUE", + "CREATE CONSTRAINT IF NOT EXISTS FOR (e:Etapp) REQUIRE e.name IS UNIQUE", + "CREATE CONSTRAINT IF NOT EXISTS FOR (b:BOP) REQUIRE b.name IS UNIQUE", + ] + for c in constraints: + session.run(c) + print("✅ Constraints created") + +# ── production.csv ──────────────────────────────────────────────────────────── + +def 
load_production(session, filepath): + with open(filepath, newline="", encoding="utf-8") as f: + rows = list(csv.DictReader(f)) + + for row in rows: + project_id = row["project_id"].strip() + project_num = row["project_number"].strip() + project_name = row["project_name"].strip() + product_type = row["product_type"].strip() + unit = row["unit"].strip() + quantity = safe_int(row["quantity"]) + unit_factor = safe_float(row["unit_factor"]) + station_code = row["station_code"].strip() + station_name = row["station_name"].strip() + etapp = row["etapp"].strip() + bop = row["bop"].strip() + week = row["week"].strip() + planned = safe_float(row["planned_hours"]) + actual = safe_float(row["actual_hours"]) + completed = safe_int(row["completed_units"]) + + session.run(""" + MERGE (p:Project {id: $pid}) + ON CREATE SET p.number = $pnum, p.name = $pname + + MERGE (pr:Product {type: $ptype}) + ON CREATE SET pr.unit = $unit + + MERGE (s:Station {code: $scode}) + ON CREATE SET s.name = $sname + + MERGE (e:Etapp {name: $etapp}) + MERGE (b:BOP {name: $bop}) + MERGE (wk:Week {name: $week}) + + MERGE (p)-[:BELONGS_TO]->(e) + MERGE (e)-[:CONTAINS_BOP]->(b) + MERGE (b)-[:SPANS_WEEK]->(wk) + + MERGE (p)-[prod:PRODUCES]->(pr) + ON CREATE SET prod.quantity = $qty, prod.unit_factor = $uf + + MERGE (p)-[sched:SCHEDULED_AT {week: $week}]->(s) + ON CREATE SET + sched.planned_hours = $planned, + sched.actual_hours = $actual, + sched.completed_units = $completed, + sched.variance_pct = CASE WHEN $planned > 0 + THEN round(($actual - $planned) / $planned * 100 * 10) / 10 + ELSE 0 END + + MERGE (s)-[:ACTIVE_IN]->(wk) + """, pid=project_id, pnum=project_num, pname=project_name, + ptype=product_type, unit=unit, + scode=station_code, sname=station_name, + etapp=etapp, bop=bop, week=week, + qty=quantity, uf=unit_factor, + planned=planned, actual=actual, completed=completed) + + print(f"✅ Production loaded ({len(rows)} rows → Projects, Products, Stations, Weeks, Etapps, BOPs)") + +# ── 
factory_workers.csv ─────────────────────────────────────────────────────── + +def load_workers(session, filepath): + with open(filepath, newline="", encoding="utf-8") as f: + rows = list(csv.DictReader(f)) + + for row in rows: + worker_id = row["worker_id"].strip() + name = row["name"].strip() + role = row["role"].strip() + primary = row["primary_station"].strip() + can_cover_raw = row["can_cover_stations"].strip() + certs = row["certifications"].strip() + hours_per_week = safe_int(row["hours_per_week"]) + wtype = row["type"].strip() + + session.run(""" + MERGE (w:Worker {id: $wid}) + ON CREATE SET + w.name = $name, + w.role = $role, + w.certifications = $certs, + w.hours_per_week = $hpw, + w.type = $wtype + """, wid=worker_id, name=name, role=role, + certs=certs, hpw=hours_per_week, wtype=wtype) + + # primary station + if primary and primary != "all": + session.run(""" + MATCH (w:Worker {id: $wid}) + MERGE (s:Station {code: $scode}) + MERGE (w)-[:WORKS_AT]->(s) + """, wid=worker_id, scode=primary) + + # coverage stations + cover_codes = [c.strip() for c in can_cover_raw.split(",") if c.strip()] + if primary == "all": + # Victor Elm can cover all — fetch all stations + session.run(""" + MATCH (w:Worker {id: $wid}) + MATCH (s:Station) + MERGE (w)-[:CAN_COVER]->(s) + """, wid=worker_id) + else: + for code in cover_codes: + session.run(""" + MATCH (w:Worker {id: $wid}) + MERGE (s:Station {code: $scode}) + MERGE (w)-[:CAN_COVER]->(s) + """, wid=worker_id, scode=code) + + print(f"✅ Workers loaded ({len(rows)} workers)") + +# ── factory_capacity.csv ────────────────────────────────────────────────────── + +def load_capacity(session, filepath): + with open(filepath, newline="", encoding="utf-8") as f: + rows = list(csv.DictReader(f)) + + for row in rows: + week = row["week"].strip() + own_staff = safe_int(row["own_staff_count"]) + hired_staff = safe_int(row["hired_staff_count"]) + own_hours = safe_float(row["own_hours"]) + hired_hours = safe_float(row["hired_hours"]) + 
overtime = safe_float(row["overtime_hours"]) + total_cap = safe_float(row["total_capacity"]) + total_plan = safe_float(row["total_planned"]) + deficit = safe_float(row["deficit"]) + + session.run(""" + MERGE (wk:Week {name: $week}) + SET + wk.own_staff_count = $own_staff, + wk.hired_staff_count = $hired_staff, + wk.own_hours = $own_hours, + wk.hired_hours = $hired_hours, + wk.overtime_hours = $overtime, + wk.total_capacity = $total_cap, + wk.total_planned = $total_plan, + wk.deficit = $deficit + """, week=week, own_staff=own_staff, hired_staff=hired_staff, + own_hours=own_hours, hired_hours=hired_hours, overtime=overtime, + total_cap=total_cap, total_plan=total_plan, deficit=deficit) + + print(f"✅ Capacity loaded ({len(rows)} weeks)") + +# ── main ────────────────────────────────────────────────────────────────────── + +def main(): + print("Connecting to Neo4j...") + driver = GraphDatabase.driver(URI, auth=(USER, PASSWORD)) + driver.verify_connectivity() + print(f"✅ Connected to {URI}\n") + + with driver.session() as session: + create_constraints(session) + load_production(session, "factory_production.csv") + load_workers(session, "factory_workers.csv") + load_capacity(session, "factory_capacity.csv") + + # Summary + with driver.session() as session: + nodes = session.run("MATCH (n) RETURN count(n) AS c").single()["c"] + rels = session.run("MATCH ()-[r]->() RETURN count(r) AS c").single()["c"] + labels = session.run("CALL db.labels() YIELD label RETURN count(label) AS c").single()["c"] + rel_types = session.run("CALL db.relationshipTypes() YIELD relationshipType RETURN count(relationshipType) AS c").single()["c"] + + print(f"\n📊 Graph summary:") + print(f" Nodes: {nodes}") + print(f" Relationships: {rels}") + print(f" Node labels: {labels}") + print(f" Relationship types: {rel_types}") + driver.close() + +if __name__ == "__main__": + main() From 98a0df4c33d80739a60ff3f1dbda44b59f8593f4 Mon Sep 17 00:00:00 2001 From: SAIMA AFROZ Date: Wed, 13 May 2026
15:49:19 +0530 Subject: [PATCH 3/3] level-6: add dashboard URL --- submissions/saima-afroz/level6/DASHBOARD_URL.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/submissions/saima-afroz/level6/DASHBOARD_URL.txt b/submissions/saima-afroz/level6/DASHBOARD_URL.txt index af16e3a7d..236cbc905 100644 --- a/submissions/saima-afroz/level6/DASHBOARD_URL.txt +++ b/submissions/saima-afroz/level6/DASHBOARD_URL.txt @@ -1 +1 @@ -https://your-app.streamlit.app +https://lpi-developer-kit-dwynwc8iz7ezj4zrpz3ris.streamlit.app
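
Reviewer note: the self-test's variance check and the `variance_pct` property written by `seed_graph.py` rest on the same arithmetic. The sketch below restates that rule in plain Python so it can be checked without a Neo4j instance. It assumes the 10% threshold from the Cypher `WHERE` clause; the helper names `variance_pct` and `is_over_budget` are illustrative and are not part of the submission.

```python
def variance_pct(planned: float, actual: float) -> float:
    """Percent deviation of actual from planned hours, to one decimal place."""
    if planned <= 0:
        # Same guard as the CASE expression in seed_graph.py: no plan, no variance.
        return 0.0
    return round((actual - planned) / planned * 100 * 10) / 10


def is_over_budget(planned: float, actual: float, threshold: float = 1.1) -> bool:
    """True when actual hours exceed planned hours by more than 10% (assumed threshold)."""
    return actual > planned * threshold


# Two rows taken from factory_production.csv:
# P03 at station 016 in w2 (28.0 planned, 35.0 actual) and
# P01 at station 011 in w1 (48.0 planned, 45.2 actual).
for planned, actual in [(28.0, 35.0), (48.0, 45.2)]:
    print(planned, actual, variance_pct(planned, actual), is_over_budget(planned, actual))
```

Rows for which `is_over_budget` returns true are the ones the dashboard's variance query should list, which gives a quick way to cross-check the graph against the source CSV.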