The Pollution Monitoring and Forecasting System is a Cloud-Edge-IoT application designed to monitor and forecast air quality in real time. The system processes environmental data, including pollutants such as NO2, O3, PM1, and PM2.5, from various stations sourced from the European Environment Agency [1], aggregates it for visualization, and performs forecasting. The primary goals are to visualize pollution trends, aggregate historical data, and predict future pollution levels to optimize environmental management.
In the IoT layer, devices transmit air quality data to the Edge at a variable rate that depends on the sensor and environmental conditions. At the Edge, the data is gathered, stored in a time-series database, and aggregated before being forwarded to the Cloud.
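The Edge-side aggregation step can be illustrated with a minimal sketch. It assumes measurements arrive as `(timestamp_seconds, pollutant, value)` tuples and averages them per pollutant and fixed time window. The tuple layout, the `aggregate` function, and the use of a plain mean are assumptions for illustration, not the repository's actual implementation.

```python
from collections import defaultdict
from statistics import mean


def aggregate(measurements, period_seconds=3600):
    """Average values per (pollutant, window start) over fixed time windows.

    `measurements` is an iterable of (timestamp_seconds, pollutant, value)
    tuples; this layout is a hypothetical stand-in for the real message format.
    """
    buckets = defaultdict(list)
    for timestamp, pollutant, value in measurements:
        # Align each timestamp to the start of its aggregation window.
        window_start = int(timestamp // period_seconds) * period_seconds
        buckets[(pollutant, window_start)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


sample = [
    (0, "NO2", 10.0),
    (1800, "NO2", 20.0),
    (3600, "NO2", 30.0),
    (100, "O3", 5.0),
]
result = aggregate(sample)
```

Here the two NO2 readings in the first hour collapse into one averaged value, while the reading at 3600 seconds starts a new window.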
In the Cloud, the aggregated data from all Edge devices (and thereby from all stations) is collected for forecasting. The Cloud processor fits machine learning models to this historical data and forecasts future pollution levels. Together, the Edge and Cloud layers give the system both near-real-time views of current air quality and forecasts of upcoming pollution levels to support air quality management decisions.
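To make the forecasting idea concrete, the sketch below fits a least-squares linear trend to a series of aggregated values and extrapolates it. This is a deliberately simple stand-in: the README does not state which models the processor uses, and the function names here are hypothetical.

```python
def fit_linear_trend(values):
    """Return (slope, intercept) of the least-squares line through the values.

    The x-coordinates are the indices 0, 1, 2, ... of the series.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    covariance = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    variance = sum((i - x_mean) ** 2 for i in range(n))
    slope = covariance / variance
    intercept = y_mean - slope * x_mean
    return slope, intercept


def forecast(values, steps):
    """Extrapolate the fitted line `steps` points past the end of the series."""
    slope, intercept = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(steps)]


history = [1.0, 2.0, 3.0, 4.0]
predicted = forecast(history, 2)
```

A real forecasting model would also need to account for seasonality and noise; the sketch only shows the fit-then-predict shape of the workflow.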
- Python: Ensure Python 3.12+ is installed.
- Ensure NATS is running:
docker run -p 4222:4222 -d nats:latest
- Ensure Redis with Time Series is running:
docker run -p 6379:6379 -d redis/redis-stack-server
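Before starting the modules, it can help to confirm that both containers accept connections. The following is a minimal sketch using only the standard library. It assumes the services run on `localhost` with the ports published above, and `is_port_open` is a hypothetical helper, not part of this repository.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the docker commands above; the host is an assumption.
    for name, port in (("NATS", 4222), ("Redis", 6379)):
        status = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {status}")
```

A successful TCP connection only shows that something is listening on the port; it does not verify that the service behind it is healthy.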
- Clone this repository:
git clone https://github.com/UIBK-DPS-DC/AirQuality.git
cd AirQuality
- Install the dependencies with Poetry:
poetry install
Set the following environment variables as needed:
- Preprocessor Module:
  - NATS_SERVER_URL: URL of the NATS server (default: nats://localhost:4222).
  - RETENTION_PERIOD_IN_MINUTES: Data retention period in minutes (default: 120).
  - AGGREGATION_PERIOD_IN_MINUTES: Aggregation period for measurement data (default: 60).
  - AGGREGATION_INTERVAL_IN_SECONDS: Aggregation job interval in seconds (default: 6).
- Processor Module:
  - FIT_INTERVAL_IN_SECONDS: Interval in seconds for fitting forecasting models (default: 144).
- Station Module:
  - STATION_ID: Unique station identifier (default: 0).
  - WAIT_INTERVAL_IN_MILLISECONDS: Interval between each data transmission (default: 100).
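As an illustration of how these variables and their defaults fit together, here is a minimal sketch that reads them from the environment. The variable names and default values come from the list above. The `load_settings` helper, the single combined dictionary, and the choice to keep `STATION_ID` as a string are assumptions, and the modules themselves may read their configuration differently.

```python
import os

# Defaults as documented above; integer defaults are parsed as integers.
DEFAULTS = {
    "NATS_SERVER_URL": "nats://localhost:4222",
    "RETENTION_PERIOD_IN_MINUTES": 120,
    "AGGREGATION_PERIOD_IN_MINUTES": 60,
    "AGGREGATION_INTERVAL_IN_SECONDS": 6,
    "FIT_INTERVAL_IN_SECONDS": 144,
    "STATION_ID": "0",  # kept as a string here; this is an assumption
    "WAIT_INTERVAL_IN_MILLISECONDS": 100,
}


def load_settings(env=None):
    """Return settings from `env` (default: os.environ), falling back to DEFAULTS."""
    source = os.environ if env is None else env
    settings = {}
    for name, default in DEFAULTS.items():
        raw = source.get(name)
        if raw is None:
            settings[name] = default
        elif isinstance(default, int):
            settings[name] = int(raw)  # raises ValueError on non-numeric input
        else:
            settings[name] = raw
    return settings
```

Calling `load_settings({"FIT_INTERVAL_IN_SECONDS": "30"})` overrides only that one value and leaves every other setting at its documented default.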
- Start the Station Module
Run the station module to simulate pollution data transmission:
python station.py
- Start the Preprocessor Module
Run the preprocessor module to aggregate and process pollution data:
python preprocessor.py
- Start the Processor Module
Run the processor module for forecasting pollution levels:
python processor.py
Preprocessor (preprocessor.py):
- GET /measurements: Retrieves aggregated measurement data for various pollutants and sampling points, visualized as time series plots.
- Response: HTML containing base64-encoded images of the plots.
Processor (processor.py):
- GET /predictions: Retrieves the predicted pollution levels for the next 24 hours, visualized as a time series plot.
- Response: HTML containing base64-encoded images of the predicted pollution levels.
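Since both endpoints return HTML with embedded base64 images, a client that wants the raw image bytes has to decode them. The sketch below assumes the images are embedded as `data:` URIs in double-quoted `src` attributes, which is a common convention but is not confirmed by this README. The helper names are hypothetical, and the host and port of each service are not stated here, so the URL passed to `fetch_html` must be supplied by the caller.

```python
import base64
import re
from urllib.request import urlopen

# Matches src="data:image/<format>;base64,<payload>" (assumed embedding style).
IMG_PATTERN = re.compile(
    r'src="data:image/(?P<fmt>[a-zA-Z0-9.+-]+);base64,(?P<data>[A-Za-z0-9+/=]+)"'
)


def extract_images(html):
    """Return a list of (format, raw_bytes) for each embedded base64 image."""
    return [
        (match.group("fmt"), base64.b64decode(match.group("data")))
        for match in IMG_PATTERN.finditer(html)
    ]


def fetch_html(url):
    """Fetch a page and decode it as UTF-8; the URL is supplied by the caller."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


# Self-contained demonstration with a synthetic payload instead of a real plot.
demo_html = '<img src="data:image/png;base64,' + base64.b64encode(b"abc").decode() + '">'
images = extract_images(demo_html)
```

Each decoded entry can then be written to a file with the matching extension, for example `plot.png` for a `png` entry.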
This project is licensed under the GPL License. See LICENSE for details.
This project makes use of the following resources:
- [1]: European Environment Agency: Air pollution