Fake News Detection System

A professional Machine Learning powered web application that predicts whether a news article is Real or Fake, with comprehensive analytics, AI explainability, and advanced features.

🌟 Highlights

AI Explainability: See which words influenced the prediction with LIME integration
Batch Processing: Analyze hundreds of articles at once
Interactive Dashboards: History tracking, statistics, and comparison tools
Professional UI: Dark theme with mobile-responsive design and granular loading states
Export Options: PDF reports/CSV downloads

📌 Core Features

🔍 Prediction Engine

Machine Learning: Passive Aggressive Classifier with TF-IDF Vectorization
Confidence Score: Probability percentage with interactive tooltips
Multiple Input Methods:
- Full article text
- Headlines only
- URL (with robust validation and automatic text extraction)

✨ Advanced Features

AI Explainability (LIME): Visual word-level influence analysis
Source Credibility: Domain reputation scoring for URLs with security checks
Prediction History: Searchable SQLite database of all past predictions
Batch Analysis: CSV upload for bulk article processing with progress tracking
Comparison Mode: Side-by-side analysis of multiple articles
Interactive Charts: Plotly visualizations for insights
Export Tools: PDF reports and CSV downloads
User Feedback: Rating system with analytics

📂 Project Structure

FakeNewsDetection/
│
├── 🎯 Core Application
│   ├── app.py                  # Main Streamlit application
│   ├── train_model.py          # Model training script
│   └── utils.py                # Text processing & URL validation utilities
│
├── 🔧 Feature Modules
│   ├── database.py             # SQLite database operations
│   ├── explainer.py            # LIME AI explainability
│   ├── credibility.py          # Source credibility checker
│   └── export_utils.py         # PDF/CSV export tools
│
├── 📊 Data & Models
│   ├── dataset/
│   │   ├── news.csv            # Training dataset
│   │   └── sample_data.csv     # Demo dataset
│   ├── model/
│   │   ├── fake_news_model.pkl      # Trained model
│   │   └── tfidf_vectorizer.pkl     # TF-IDF vectorizer
│   └── data/
│       └── predictions.db      # SQLite database (auto-created)
│
├── 📖 Documentation
│   ├── README.md               # Main documentation
│   ├── QUICK_START.md          # Quick start guide
│   ├── NEW_FEATURES.md         # Detailed feature documentation
│   ├── UI_ENHANCEMENTS.md      # UI design documentation
│   └── IMPLEMENTATION_SUMMARY.md  # Development summary
│
└── ⚙️ Configuration
    ├── requirements.txt        # Python dependencies
    ├── .gitignore             # Git ignore rules
    └── download_data.py        # Dataset downloader script

Quick Start

1. Clone the Repository

git clone https://github.com/YOUR_USERNAME/FakeNewsDetection.git

cd FakeNewsDetection

2. Install Dependencies

pip install -r requirements.txt

3. Prepare Dataset

The project includes sample_data.csv for testing. To train on a full dataset:

Run python download_data.py to fetch Politifact data.
OR place your own news.csv in the dataset/ folder.

4. Train the Model

python train_model.py

Creates the model and vectorizer in the model/ directory.

5. Run the Application

streamlit run app.py

6. Access the App

Open your browser to http://localhost:8501

📱 Application Modes

🔍 Single Prediction
- Analyze individual articles with granular loading states
- AI explanations with LIME highlighting
- URL validation and content-type checking
- Export to PDF/CSV
📦 Batch Processing
- Upload CSV with multiple articles
- Bulk analysis with progress tracking
- Summary statistics and charts
⚖️ Comparison Mode
- Compare multiple articles side-by-side
- Visual probability distribution charts
📜 History
- View all past predictions with search and filter
- Export history as CSV
📊 Statistics
- Total predictions and class distribution
- User feedback analytics

🎓 How It Works

1. The Model

The system uses a Passive Aggressive Classifier calibrated for probability outputs. It is particularly well-suited for large-scale text classification.

2. TF-IDF Vectorization

Text is converted into numerical features using TF-IDF (Term Frequency - Inverse Document Frequency) with N-grams (1,2), capturing both individual words and common phrases.

3. AI Explainability

Using LIME, the system perturbs the input text to see which words most influence the model's decision, highlighting them in the UI (Red for Fake indicators, Green for Real).

4. URL Validation & Credibility

The system performs a HEAD request to verify content types (preventing non-HTML downloads) and evaluates domain reputation based on a curated list of credible and unreliable sources.

🛠️ Technologies Used

Python 3.x
Streamlit (Web UI)
Scikit-learn (Machine Learning)
LIME (AI Explainability)
Plotly (Visualizations)
SQLite (Data Persistence)
FPDF2 (PDF Generation)
BeautifulSoup4 (Web Scraping)

🤝 Contributing

Fork the repository
Create a feature branch
Commit your changes
Open a Pull Request

⭐ If you find this project useful, please give it a star!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Fake News Detection System

🌟 Highlights

📌 Core Features

🔍 Prediction Engine

✨ Advanced Features

📂 Project Structure

Quick Start

1. Clone the Repository

2. Install Dependencies

3. Prepare Dataset

4. Train the Model

5. Run the Application

6. Access the App

📱 Application Modes

🎓 How It Works

1. The Model

2. TF-IDF Vectorization

3. AI Explainability

4. URL Validation & Credibility

🛠️ Technologies Used

🤝 Contributing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
.github		.github
__pycache__		__pycache__
data		data
dataset		dataset
model		model
tests		tests
.gitignore		.gitignore
Fake_news_mockup.png		Fake_news_mockup.png
IMPLEMENTATION_SUMMARY.md		IMPLEMENTATION_SUMMARY.md
NEW_FEATURES.md		NEW_FEATURES.md
QUICK_START.md		QUICK_START.md
README.md		README.md
UI_ENHANCEMENTS.md		UI_ENHANCEMENTS.md
app.py		app.py
credibility.py		credibility.py
database.py		database.py
download_data.py		download_data.py
explainer.py		explainer.py
export_utils.py		export_utils.py
requirements.txt		requirements.txt
train_model.py		train_model.py
utils.py		utils.py

Folders and files

Latest commit

History

Repository files navigation

Fake News Detection System

🌟 Highlights

📌 Core Features

🔍 Prediction Engine

✨ Advanced Features

📂 Project Structure

Quick Start

1. Clone the Repository

2. Install Dependencies

3. Prepare Dataset

4. Train the Model

5. Run the Application

6. Access the App

📱 Application Modes

🎓 How It Works

1. The Model

2. TF-IDF Vectorization

3. AI Explainability

4. URL Validation & Credibility

🛠️ Technologies Used

🤝 Contributing

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages