Real-time-sensor-data-processing

Real-time sensor data processing and analytics using Kafka and Tableau

A scalable, cloud-based solution for ingesting, storing, cataloging, and visualizing high-frequency sensor data, built for real-world sensor-driven industries.

Project Summary

This project showcases a real-time data processing pipeline built for sensor-intensive industries such as Oil & Gas, Manufacturing, and Smart Infrastructure. It combines Apache Kafka, AWS services (EC2, S3, Glue, Athena), and interactive Tableau dashboards to connect sensor simulators, data streaming, and analytics in a single flow.

Developed as part of a systems-focused academic initiative, the pipeline covers data engineering, cloud architecture, and real-time analytics end to end.

Pipeline Flow:

1. Sensor Data Simulation: Sensor values (e.g., pressure, temperature) are simulated from real-world CSV datasets.
2. Data Producer: A Python script acts as a Kafka producer, pushing records to a Kafka broker hosted on Amazon EC2.
3. Kafka Streaming: Apache Kafka handles high-throughput, low-latency streaming to the consumer layer.
4. S3 Storage: A Kafka consumer writes incoming data to Amazon S3 in real time.
5. Data Cataloging with AWS Glue: Glue crawlers scan the S3 bucket and update the AWS Glue Data Catalog for schema discovery.
6. Query with Amazon Athena: Athena runs SQL queries directly on the raw S3 data via the Glue catalog.
7. Interactive Dashboards in Tableau: Real-time insights are visualized in Tableau, helping stakeholders monitor KPIs and detect anomalies.
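
The simulation and producer steps above can be sketched roughly as follows. This is a minimal illustration, not the code from KafkaProducer.ipynb: the column names, the topic name `sensor-readings`, and the helpers `read_sensor_rows`, `publish_rows`, and `RecordingProducer` are all assumptions made for the example. The producer is passed in as an argument so the sketch runs without a broker; any object with a kafka-python style `send(topic, value=...)` method should work.

```python
import csv
import io
import json
import time

# Hypothetical sample data; the real project replays its own CSV datasets.
SAMPLE_CSV = (
    "sensor_id,pressure,temperature\n"
    "P-101,101.3,22.5\n"
    "P-102,99.8,23.1\n"
)


def read_sensor_rows(csv_text):
    """Yield one dict per CSV row, converting numeric-looking fields to float."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        parsed = {}
        for key, value in row.items():
            try:
                parsed[key] = float(value)
            except (TypeError, ValueError):
                parsed[key] = value  # keep non-numeric fields (e.g. IDs) as-is
        yield parsed


def publish_rows(producer, topic, rows, delay_s=0.0):
    """Send each row as a UTF-8 JSON message; return the number of rows sent.

    delay_s optionally spaces out messages to mimic a sensor's sampling rate.
    """
    sent = 0
    for row in rows:
        payload = json.dumps(row).encode("utf-8")
        producer.send(topic, value=payload)
        sent += 1
        if delay_s:
            time.sleep(delay_s)
    return sent


class RecordingProducer:
    """In-memory stand-in for a Kafka producer, used only for local runs."""

    def __init__(self):
        self.sent = []

    def send(self, topic, value=None):
        self.sent.append((topic, value))


# With a real broker (assumption: kafka-python is installed), the stand-in
# could be swapped for:
#   from kafka import KafkaProducer
#   producer = KafkaProducer(bootstrap_servers="<ec2-host>:9092")
producer = RecordingProducer()
count = publish_rows(producer, "sensor-readings", read_sensor_rows(SAMPLE_CSV))
```

Passing the producer in keeps the CSV parsing and serialization logic testable on a laptop, and leaves broker addresses and credentials out of the function bodies.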

Tech Stack

Layer	Technologies Used
Data Simulation	Python, Pandas
Streaming Broker	Apache Kafka (EC2-hosted)
Cloud Platform	Amazon Web Services (EC2, S3, Glue, Athena)
Visualization	Tableau
Data Format	CSV, JSON
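
To illustrate how the JSON format and the S3 layer listed above might meet on the consumer side, here is a hedged sketch that groups raw JSON messages into newline-delimited objects and uploads them under date-partitioned keys. It is not taken from KafkaConsumer.ipynb: the key layout, batch size, and the function names `s3_key_for_batch` and `write_batches` are assumptions. The S3 client is injected; `put_object(Bucket=..., Key=..., Body=...)` matches the boto3 S3 client call.

```python
import json
from datetime import datetime, timezone


def s3_key_for_batch(prefix, batch_time, batch_id):
    """Build a date-partitioned key, e.g. raw/sensors/2024/01/31/batch-000007.json."""
    return f"{prefix}/{batch_time:%Y/%m/%d}/batch-{batch_id:06d}.json"


def write_batches(messages, s3_client, bucket, prefix, batch_size=100, now=None):
    """Upload messages (an iterable of JSON-encoded bytes) in fixed-size batches.

    Each uploaded object holds one JSON record per line. Returns the list of
    keys written, in upload order.
    """
    now = now or (lambda: datetime.now(timezone.utc))
    keys = []
    batch = []

    def flush():
        key = s3_key_for_batch(prefix, now(), len(keys))
        body = b"\n".join(batch) + b"\n"
        s3_client.put_object(Bucket=bucket, Key=key, Body=body)
        keys.append(key)
        batch.clear()

    for raw in messages:
        batch.append(raw)
        if len(batch) >= batch_size:
            flush()
    if batch:  # upload the final partial batch
        flush()
    return keys


# Possible wiring (assumptions: kafka-python and boto3 are installed, and the
# topic and bucket names are placeholders):
#   import boto3
#   from kafka import KafkaConsumer
#   consumer = KafkaConsumer("sensor-readings",
#                            bootstrap_servers="<ec2-host>:9092",
#                            consumer_timeout_ms=10000)
#   write_batches((m.value for m in consumer), boto3.client("s3"),
#                 "<bucket-name>", "raw/sensors")
```

Batching trades a little latency for fewer, larger S3 objects, which tends to suit crawler-based cataloging and SQL scans better than one object per message. Note that the final partial batch is only flushed once the message iterable ends, which is why the wiring comment sets a consumer timeout.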

Prerequisites: Python 3.10+, a Kafka installation on EC2, and an AWS account with S3, Glue, and Athena enabled.
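
Once S3, Glue, and Athena are enabled, a query against the cataloged data could also be issued from Python. The following is a sketch under stated assumptions: the table name `sensor_readings`, the database `sensor_db`, the column names, and the helper `run_athena_query` are hypothetical (a Glue crawler derives the actual table name from the S3 layout). The client is injected; `start_query_execution` and `get_query_execution` are the boto3 Athena client calls the helper relies on.

```python
import time

# Hypothetical query; table and column names depend on what the crawler finds.
AVG_PRESSURE_SQL = """
SELECT sensor_id, AVG(pressure) AS avg_pressure
FROM sensor_readings
GROUP BY sensor_id
"""

TERMINAL_STATES = ("SUCCEEDED", "FAILED", "CANCELLED")


def run_athena_query(athena, sql, database, output_location,
                     poll_s=1.0, max_polls=60):
    """Start a query and poll until it finishes; return (query_id, state)."""
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    for _ in range(max_polls):
        info = athena.get_query_execution(QueryExecutionId=query_id)
        state = info["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return query_id, state
        time.sleep(poll_s)
    raise TimeoutError(f"Athena query {query_id} did not finish in time")


# Possible usage (assumption: boto3 is installed and credentials are configured):
#   import boto3
#   athena = boto3.client("athena")
#   run_athena_query(athena, AVG_PRESSURE_SQL, "sensor_db",
#                    "s3://<bucket-name>/athena-results/")
```

Athena runs queries asynchronously, so the helper polls for a terminal state instead of assuming the result is ready when the start call returns.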

Clone the repo:

git clone https://github.com/rasaghnak/Real-time-sensor-data-processing
cd Real-time-sensor-data-processing

Repository files:

Architecture.png
KafkaConsumer.ipynb
KafkaProducer.ipynb
README.md
