This project addresses the classification of drone faults using sound recordings from multiple microphones.
Instead of building three separate models for:
- ⚙ Fault type
- 🧭 Maneuvering direction
- 🚁 Drone model type
we implement a single Multi-Task Learning (MTL) Convolutional Neural Network (CNN) that learns all three tasks simultaneously.
Traditional ML pipelines train one model per task:
- 🛠 A model for fault classification
- 🛠 A model for direction classification
- 🛠 A model for drone model classification

This approach suffers from:
- Redundant feature extraction
- Higher compute cost
- No benefit from shared knowledge between tasks
The MTL approach instead shares one network:
- 📡 Shared CNN feature extractor
- 🎯 Three task-specific heads
- 🔄 Knowledge transfer across tasks for better generalization
- 🎤 `.wav` audio files from mic1 and mic2
- 🚁 Drone models: A, B, C
- 📚 Train, validation, and test sets
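A stratified split along these lines can be sketched with scikit-learn; the 70/15/15 ratios and the placeholder data here are assumptions for illustration, not from the source:

```python
from sklearn.model_selection import train_test_split

# Hypothetical feature matrix X and labels y (ratios are assumptions).
X = [[i] for i in range(100)]
y = [i % 3 for i in range(100)]  # e.g. drone models A/B/C encoded as 0/1/2

# Carve out 70% for training, then split the remainder evenly
# into validation and test (15% each), stratified by label.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 70 15 15
```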
- MFCCs (`n_mfcc=48`) extracted with `librosa`
- Mean pooling across time frames
- Output: 48-dim vector per audio file
- Extracted labels:
  - Drone model
  - Maneuvering direction
  - Fault type
- Encoded with `LabelEncoder`
- Combined mic1 & mic2 data
- Stored features + labels in DataFrames
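The combining and label-encoding steps can be sketched like this; the column names and example label strings are assumptions, not from the source:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical per-mic tables: one row per audio file (labels are made up).
mic1 = pd.DataFrame({"fault": ["none", "prop_crack"],
                     "direction": ["hover", "left"],
                     "model": ["A", "B"]})
mic2 = pd.DataFrame({"fault": ["motor", "none"],
                     "direction": ["right", "hover"],
                     "model": ["C", "A"]})

# Combine mic1 & mic2 rows into one DataFrame.
df = pd.concat([mic1, mic2], ignore_index=True)

# One LabelEncoder per task turns string labels into integer class ids.
encoders = {col: LabelEncoder() for col in ["fault", "direction", "model"]}
for col, enc in encoders.items():
    df[col + "_id"] = enc.fit_transform(df[col])

print(list(encoders["model"].classes_))  # ['A', 'B', 'C']
```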
- Conv1D (1→64) → BatchNorm → ReLU → MaxPool
- Conv1D (64→128) → BatchNorm → ReLU → MaxPool
- Conv1D (128→256) → BatchNorm → ReLU → MaxPool
- Conv1D (256→128) → BatchNorm → ReLU
- Fault Classification (9 classes): Linear → Dropout(0.4) → Linear
- Direction Classification (6 classes): Linear → Dropout(0.4) → Linear
- Model Classification (3 classes): Linear
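Putting the shared trunk and the three heads together, a PyTorch sketch might look like this. Kernel sizes, padding, and the 128-unit hidden width in the heads are assumptions; the source only fixes the channel counts, dropout rate, and class counts:

```python
import torch
import torch.nn as nn

class MTLDroneNet(nn.Module):
    def __init__(self, n_mfcc=48, n_fault=9, n_dir=6, n_model=3):
        super().__init__()
        # Shared feature extractor over the 48-dim MFCC vector (as 1 channel).
        self.backbone = nn.Sequential(
            nn.Conv1d(1, 64, 3, padding=1), nn.BatchNorm1d(64), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(64, 128, 3, padding=1), nn.BatchNorm1d(128), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(128, 256, 3, padding=1), nn.BatchNorm1d(256), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(256, 128, 3, padding=1), nn.BatchNorm1d(128), nn.ReLU(),
        )
        feat = 128 * (n_mfcc // 8)  # three poolings halve the length three times
        self.fault_head = nn.Sequential(nn.Linear(feat, 128), nn.Dropout(0.4), nn.Linear(128, n_fault))
        self.dir_head = nn.Sequential(nn.Linear(feat, 128), nn.Dropout(0.4), nn.Linear(128, n_dir))
        self.model_head = nn.Linear(feat, n_model)

    def forward(self, x):  # x: (batch, 1, 48)
        z = self.backbone(x).flatten(1)
        return self.fault_head(z), self.dir_head(z), self.model_head(z)

net = MTLDroneNet()
outputs = net(torch.randn(2, 1, 48))
print([tuple(o.shape) for o in outputs])  # [(2, 9), (2, 6), (2, 3)]
```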
- Loss: CrossEntropyLoss per task
- Optimizer: Adam (lr=0.001)
- Batch Size: 32
- Epochs: 40
- Device: GPU if available
- Total Loss: `loss_fault + loss_direction + loss_model`
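One training step under this configuration can be sketched as follows. To keep the example self-contained, a tiny linear trunk stands in for the CNN; the summed cross-entropy loss, Adam with `lr=0.001`, batch size 32, and the GPU-if-available device choice are from the source:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in shared trunk + the three task heads (the real trunk is the CNN).
trunk = nn.Linear(48, 64).to(device)
heads = nn.ModuleDict({
    "fault": nn.Linear(64, 9),
    "direction": nn.Linear(64, 6),
    "model": nn.Linear(64, 3),
}).to(device)

optimizer = torch.optim.Adam(
    list(trunk.parameters()) + list(heads.parameters()), lr=0.001)
criterion = nn.CrossEntropyLoss()

# One random batch of 32 feature vectors with random labels per task.
x = torch.randn(32, 48, device=device)
targets = {
    "fault": torch.randint(0, 9, (32,), device=device),
    "direction": torch.randint(0, 6, (32,), device=device),
    "model": torch.randint(0, 3, (32,), device=device),
}

optimizer.zero_grad()
z = trunk(x)
# Total loss is the unweighted sum of the three per-task cross-entropies.
loss = sum(criterion(heads[k](z), targets[k]) for k in heads.keys())
loss.backward()
optimizer.step()
print(torch.isfinite(loss).item())  # True
```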