Welcome to the Network Security Project for Phishing Data repository! This project is part of the Complete MLOps Bootcamp With End to End Data Science Project and demonstrates an end-to-end data science solution for detecting phishing data. It includes data processing, model training, prediction pipelines, and deployment on AWS EC2 using Docker and GitHub Actions.
This repository contains a cybersecurity application that identifies phishing data by applying machine learning to network data. The deployment pipeline is containerized with Docker, hosted on AWS EC2, and automated via GitHub Actions.
- Environment Setup: Uses Anaconda with Python 3.10 for virtual environment management.
- Project Structure: Includes `network_data` (for datasets), `notebooks`, and `network_security` (with subfolders like `components`, `constants`, `entity`, `exception`, `logging`, `pipeline`, `utils`, and `cloud`).
- Version Control: Managed with Git and GitHub.
- Deployment: Containerized with Docker and deployed on AWS EC2 using GitHub Actions CI/CD pipelines.
- Logging & Exception Handling: Custom logging and exception classes for error management.
- Prediction Pipeline: Flask-based web application for real-time phishing detection using pickle files.
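The custom logging and exception handling listed above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the class name `NetworkSecurityException`, the helper `get_logger`, the log format, and the toy function `safe_divide` are all assumptions made for the example.

```python
import logging
import sys


class NetworkSecurityException(Exception):
    """Hypothetical custom exception that records where an error occurred."""

    def __init__(self, error: Exception):
        super().__init__(str(error))
        # sys.exc_info() exposes the traceback of the exception being handled,
        # so this class is meant to be instantiated inside an `except` block.
        _, _, tb = sys.exc_info()
        self.file_name = tb.tb_frame.f_code.co_filename if tb else "<unknown>"
        self.line_number = tb.tb_lineno if tb else -1

    def __str__(self) -> str:
        return (
            f"Error in [{self.file_name}] at line [{self.line_number}]: "
            f"{self.args[0]}"
        )


def get_logger(name: str = "network_security") -> logging.Logger:
    """Return a logger with a timestamped format (format is an assumption)."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter(
                "[%(asctime)s] %(lineno)d %(name)s - %(levelname)s - %(message)s"
            )
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger


def safe_divide(a: float, b: float) -> float:
    """Toy function showing how a low-level error is logged and wrapped."""
    try:
        return a / b
    except Exception as exc:
        get_logger().error("Division failed: %s", exc)
        raise NetworkSecurityException(exc) from exc
```

Wrapping errors this way keeps the original exception attached (via `from exc`) while adding the file name and line number to the message, which can make pipeline failures easier to trace in logs.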
- Python 3.10
- Anaconda
- Git
- Docker
- AWS CLI (configured with credentials)
- Basic understanding of Flask and machine learning concepts
- Clone the repository:
git clone https://github.com/yourusername/network_security.git
- Navigate to the project directory:
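cd network_security
- Create and activate the virtual environment (the environment is created at the path `./venv`, so it must be activated by path rather than by name):
conda create -p venv python=3.10
conda activate ./venv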
- Install dependencies:
pip install -r requirements.txt
- Start the Flask application:
python app.py
- Access the web app at http://127.0.0.1:5000/predict_data.
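Since the prediction pipeline behind the web app relies on pickle files, the load-and-predict step might look roughly like the sketch below. Everything here is illustrative: the artifact is a plain dictionary of weights standing in for a trained model, and the names `save_object`, `load_object`, `predict_rows`, and the path `final_model/model.pkl` are hypothetical, not taken from the project.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, List, Sequence


def save_object(path: Path, obj: Any) -> None:
    """Serialize an object to disk, creating parent folders if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(obj, f)


def load_object(path: Path) -> Any:
    """Load a pickled object. Only unpickle files from a trusted source."""
    with path.open("rb") as f:
        return pickle.load(f)


def predict_rows(model_path: Path, rows: Sequence[Sequence[float]]) -> List[int]:
    """Score each row with a stand-in linear model: 1 = phishing, 0 = legitimate."""
    model = load_object(model_path)
    predictions = []
    for row in rows:
        score = sum(w * x for w, x in zip(model["weights"], row)) + model["bias"]
        predictions.append(1 if score > 0 else 0)
    return predictions


def demo() -> List[int]:
    """Round-trip a toy artifact through a temporary directory and predict."""
    with tempfile.TemporaryDirectory() as tmp:
        model_path = Path(tmp) / "final_model" / "model.pkl"  # hypothetical path
        save_object(model_path, {"weights": [1.0, -1.0], "bias": 0.0})
        return predict_rows(model_path, [[2.0, 1.0], [0.0, 3.0]])
```

In a real deployment the pickled object would typically be a fitted preprocessor and classifier rather than a dictionary, and a Flask route would call something like `predict_rows` on the uploaded data.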
- GitHub Secrets Configuration: Set the following secrets in your GitHub repository settings under "Secrets and variables" > "Actions":
  - `AWS_ACCESS_KEY_ID`: Your AWS access key ID
  - `AWS_SECRET_ACCESS_KEY`: Your AWS secret access key
  - `AWS_REGION`: `us-east-1`
  - `AWS_ECR_LOGIN_URI`: `788614365622.dkr.ecr.us-east-1.amazonaws.com/networkssecurity`
  - `ECR_REPOSITORY_NAME`: `networkssecurity`
- Build the Docker Image:

      docker build -t networkssecurity .

- Docker Setup on EC2: Execute the following commands on your EC2 instance:

      # Optional updates
      sudo apt-get update -y
      sudo apt-get upgrade

      # Required Docker installation
      curl -fsSL https://get.docker.com -o get-docker.sh
      sudo sh get-docker.sh
      sudo usermod -aG docker ubuntu
      newgrp docker

- Push to ECR and Deploy: Configure AWS CLI, push the image to ECR, and trigger GitHub Actions for deployment to EC2 (see `main.yml`).
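A missing secret is a common reason for a failed deployment run, so a small pre-flight check can help. The sketch below assumes the five secrets listed above are exposed to the process as environment variables with the same names (for example, mapped in the workflow's `env:` section); that mapping and the function name `missing_secrets` are assumptions for illustration, not part of the project.

```python
import os
from typing import List, Mapping

# Secret names taken from the configuration list above.
REQUIRED_SECRETS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "AWS_ECR_LOGIN_URI",
    "ECR_REPOSITORY_NAME",
)


def missing_secrets(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]
```

Calling `missing_secrets()` with no arguments checks the current process environment and returns an empty list when everything is set; a deployment script could exit early with the returned names otherwise.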
network_security/
├── network_data/ # Phishing datasets
├── notebooks/ # Jupyter notebooks
├── network_security/ # Main package
│ ├── components/
│ ├── constants/
│ ├── entity/
│ ├── exception/
│ ├── logging/
│ ├── pipeline/
│ ├── utils/
│ └── cloud/
├── .gitignore # Excludes venv and sensitive files
├── Dockerfile # Docker configuration
├── requirements.txt # Project dependencies
├── setup.py # Package configuration
├── app.py # Flask web application
├── .github/workflows/ # GitHub Actions workflows
│ └── main.yml
└── README.md # This file
Fork this repository, submit issues, or create pull requests. Contributions to enhance the phishing detection model, documentation, or deployment process are welcome!
This project is licensed under the MIT License - see the LICENSE file for details.
- Inspired by the Complete MLOps Bootcamp With End to End Data Science Project course.
- Thanks to the open-source community for tools like Docker, Flask, and GitHub Actions.