# Khairunnisa Maharani

Data Engineer | Building Production-Ready Data Platforms

Data Science Student @ ITERA

I design and build scalable, reliable data systems — from real-time streaming pipelines to batch lakehouse architectures. My focus is on production engineering practices: idempotent pipelines, data quality enforcement, fault tolerance, and reproducibility.


## 🛠️ Core Engineering Stack

| Domain | Technologies |
| --- | --- |
| Orchestration & Infrastructure | Docker, Airflow, Dagster |
| Data Processing | Python, Spark, dbt, DuckDB |
| Streaming | Kafka |
| Storage & Warehouse | PostgreSQL, MinIO, Azure |
| Data Quality & Observability | Great Expectations, Soda |
| Other | GitHub Actions, Bash |

## 🚀 Featured Data Engineering Projects

### glowcart

Kafka · Spark · dbt · Airflow · FastAPI · DuckDB · Docker

Designed and implemented a production-oriented data platform simulating real-world e-commerce analytics with streaming ingestion and robust failure handling.

  • Architecture: Real-time pipeline using Kafka + Medallion Architecture (Bronze → Silver → Gold)
  • Reliability: Implemented Dead Letter Queue (DLQ) and idempotent pipelines (safe retries, zero duplication)
  • Data Quality: Enforced validation gates at the Silver layer before downstream processing
  • Orchestration: Managed workflows using Airflow DAGs with failure handling
  • Serving Layer: FastAPI + DuckDB for zero-ETL analytics endpoints (revenue, funnel, traffic, top products)
  • Engineering Practice: Documented all major decisions using Architecture Decision Records (ADR)
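The DLQ and idempotency guarantees above can be sketched with an in-memory model (an illustrative sketch with hypothetical field names, not the actual glowcart code): each event is applied at most once, and malformed events are quarantined instead of crashing the pipeline.

```python
# Illustrative sketch: idempotent event processing with a Dead Letter Queue.
# Seen event IDs are skipped, so retries never create duplicates; events that
# fail validation are routed to the DLQ for later inspection or replay.

def process_events(events, store, seen_ids, dlq):
    """Apply each event at most once; quarantine malformed ones."""
    for event in events:
        event_id = event.get("event_id")
        if event_id is None or "amount" not in event:
            dlq.append(event)          # quarantine instead of failing the batch
            continue
        if event_id in seen_ids:       # idempotency guard: retries are safe
            continue
        seen_ids.add(event_id)
        store[event_id] = event["amount"]

store, seen, dlq = {}, set(), []
batch = [
    {"event_id": "e1", "amount": 10},
    {"event_id": "e2"},                # malformed -> DLQ
    {"event_id": "e1", "amount": 10},  # duplicate retry -> skipped
]
process_events(batch, store, seen, dlq)
```

Because processing is keyed on `event_id`, replaying the same batch after a failure leaves the store unchanged, which is what "safe retries, zero duplication" means in practice.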

### modern-data-platform-dagster

Dagster · dbt Core · DuckDB · Soda Core · GitHub Actions

Built a modern data stack (MDS) with strong data contracts and fully automated CI/CD.

  • Implemented SCD Type 2 for historical change tracking
  • Fully automated CI/CD pipeline with testing & validation on every push
  • Enforced data quality checks as deployment gates via Soda Core
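SCD Type 2, as used above, preserves history by closing the current row and appending a new version whenever a tracked attribute changes. A minimal list-of-dicts sketch of that upsert logic (hypothetical field names, not the project's dbt models):

```python
# Illustrative SCD Type 2 upsert over a list-of-dicts "table".
# Each row carries valid_from / valid_to / is_current; an attribute change
# closes the old version and appends a new current one.

def scd2_upsert(table, key, new_row, as_of):
    current = next(
        (r for r in table if r[key] == new_row[key] and r["is_current"]), None
    )
    if current is not None:
        if all(current[k] == v for k, v in new_row.items()):
            return  # no change: keep the existing current row
        current["is_current"] = False   # close the old version
        current["valid_to"] = as_of
    table.append({**new_row, "valid_from": as_of, "valid_to": None,
                  "is_current": True})

history = []
scd2_upsert(history, "customer_id",
            {"customer_id": 1, "city": "Bandar Lampung"}, "2024-01-01")
scd2_upsert(history, "customer_id",
            {"customer_id": 1, "city": "Jakarta"}, "2024-06-01")
```

In dbt, the same behavior is typically achieved declaratively with snapshots rather than hand-written merge logic.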

### nyc-taxi-pipeline-spark-airflow

Apache Spark · Apache Airflow · MinIO · PostgreSQL · Great Expectations

  • Processed 2.9M+ records using distributed Spark clusters
  • Built an end-to-end pipeline with automated data validation
  • Simulated cloud data lake storage using MinIO (S3-compatible)
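Automated validation of this kind reduces to running declarative checks over the data and halting the pipeline when any fail. A stdlib-only sketch of that gate (illustrative check names, not the real Great Expectations API):

```python
# Illustrative data-quality gate: each expectation is a (name, predicate)
# pair applied to every record; any failure blocks downstream processing.

def run_quality_gate(records, expectations):
    failures = []
    for name, predicate in expectations:
        bad = [r for r in records if not predicate(r)]
        if bad:
            failures.append((name, len(bad)))
    return failures  # empty list => gate passes

trips = [
    {"fare": 12.5, "distance": 3.1},
    {"fare": -4.0, "distance": 0.8},   # negative fare should trip the gate
]
checks = [
    ("fare_non_negative", lambda r: r["fare"] >= 0),
    ("distance_positive", lambda r: r["distance"] > 0),
]
failures = run_quality_gate(trips, checks)
```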

dbt Core · PostgreSQL · Docker

  • Designed a star schema data warehouse for analytics-ready reporting
  • Built modular transformations with dbt, including tests and lineage tracking
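In a star schema, fact rows reference dimension rows through surrogate keys rather than natural business keys. The lookup-or-insert step at load time can be sketched as follows (illustrative names only, not the project's dbt models):

```python
# Illustrative surrogate-key assignment when loading a dimension:
# natural keys map to stable integer surrogates that fact rows reference.

def get_surrogate_key(dim, natural_key):
    """Return the surrogate key for natural_key, inserting it if new."""
    if natural_key not in dim:
        dim[natural_key] = len(dim) + 1  # next surrogate key
    return dim[natural_key]

dim_product = {}
fact_sales = []
for sale in [{"sku": "A-1", "qty": 2}, {"sku": "B-7", "qty": 1},
             {"sku": "A-1", "qty": 5}]:
    fact_sales.append({
        "product_key": get_surrogate_key(dim_product, sale["sku"]),
        "qty": sale["qty"],
    })
```

Decoupling facts from natural keys is what lets dimensions change (including SCD-style history) without rewriting the fact table.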

Apache Airflow · Docker · PostgreSQL · Python

  • Developed a fault-tolerant ETL pipeline for real-time market data ingestion
  • Implemented retry logic, scheduling, and containerized deployment
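Retry logic for ingestion like this typically uses bounded exponential backoff: re-attempt a transient failure a fixed number of times, doubling the wait between attempts. A generic stdlib sketch (not the project's Airflow configuration, which would use task-level `retries`/`retry_delay` instead):

```python
import time

def fetch_with_retry(fetch, max_attempts=3, base_delay=0.01):
    """Call fetch(); on failure wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the error
            time.sleep(base_delay * (2 ** attempt))

# A hypothetical flaky source that succeeds on the third call.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return {"price": 101.5}

result = fetch_with_retry(flaky)
```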

### LPMPP-Data-Warehouse-Project

Azure · SSIS · SQL Server

  • Role: Principal Data Engineer & Team Lead
  • Delivered an institutional warehouse with dimensional modeling and cloud ETL pipelines on Azure

## 🧠 Additional Experience

PyTorch · EfficientNet · Docker · DVC

  • Built a fully reproducible ML pipeline with Docker + DVC + fixed seeds
  • Achieved ROC-AUC 0.9801 on blind test set with quality-aware loss engineering
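Reproducibility here mostly comes down to pinning every source of randomness and every input. A stdlib illustration of the fixed-seed principle (the actual pipeline would also pin framework seeds such as `torch.manual_seed` and data versions via DVC):

```python
import random

def make_train_split(items, seed=42, train_frac=0.8):
    """Deterministic shuffle + split: same seed => identical split every run."""
    rng = random.Random(seed)   # local RNG avoids global-state leakage
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

data = list(range(10))
split_a = make_train_split(data)
split_b = make_train_split(data)
```

Because the RNG is seeded locally, two runs (or two machines) produce byte-identical splits, which is a precondition for comparing metrics like the ROC-AUC above across experiments.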

My primary focus is data engineering and platform reliability; the ML projects demonstrate systems thinking applied to the full model lifecycle.


## 📊 GitHub Analytics
