
devanshcodesx/CNN_BiLSTM-Research-Paper

CNN-BiLSTM-ATT for Sentiment Analysis

This repository contains the official implementation of the CNN-BiLSTM-ATT architecture for sentiment analysis proposed in our research paper. The codebase supports training, evaluation against established baselines (CNN, LSTM, BiLSTM, CNN-LSTM), ablation studies, and automated generation of publication-ready figures.

The primary dataset used is the IMDb movie review dataset, with built-in extensions for the SST-2 benchmark.

Important Note on Operating System & Hardware

This implementation uses TensorFlow for GPU-accelerated deep learning. Starting with TensorFlow 2.11, native GPU support on Windows has been discontinued, so this codebase should be run on Linux or under Windows Subsystem for Linux (WSL2) to use NVIDIA GPU acceleration. On plain Windows it will fall back to much slower CPU execution.

Table of Contents

  1. Prerequisites & Environment Setup
  2. Dataset Pre-requisites (GloVe)
  3. Experiment Execution (Training)
  4. Statistical Analysis & Result Consolidation
  5. Visualization Generation
  6. Interactive Prediction & Attention Analysis
  7. Repository Structure

Prerequisites & Environment Setup

  • Operating System: Linux (Ubuntu 20.04/22.04 recommended) or Windows Subsystem for Linux (WSL2).
  • Python: 3.8 to 3.11.
  • CUDA & cuDNN: Ensure that CUDA ≥ 11.8 and a compatible cuDNN version are successfully configured in your environment path.

Virtual Environment Setup (Recommended): Isolating dependencies in a virtual environment improves reproducibility.

python -m venv venv
source venv/bin/activate  # On Linux/WSL
# or for Windows: .\venv\Scripts\activate

Dependencies: With the virtual environment activated, install the required libraries using pip:

pip install tensorflow numpy scikit-learn matplotlib seaborn tqdm scipy datasets
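For a more reproducible install, the dependency list above can also be captured in a requirements.txt. The version bound below is an illustrative assumption (matching the CUDA ≥ 11.8 note above), not a tested pin:

```text
tensorflow>=2.11
numpy
scikit-learn
matplotlib
seaborn
tqdm
scipy
datasets
```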

Dataset Pre-requisites (GloVe)

The architecture initializes its embeddings using pre-trained GloVe vectors (300-dimensional).

  1. Download the glove.6B.zip package from the Stanford NLP Group.
  2. Extract the archive and copy the glove.6B.300d.txt file directly into the code/ directory.
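The GloVe file is plain text: one word per line, followed by its vector components. As a rough sketch of what the loading step in code/data.py does (the function names here are illustrative, not the repository's actual API), the file is parsed into a word-to-vector map and then aligned with the tokenizer's vocabulary to build the embedding matrix:

```python
def load_glove(path, dim):
    """Parse a GloVe text file into {word: [float, ...]} of length `dim`."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            if len(values) != dim:  # skip malformed lines
                continue
            vectors[word] = [float(v) for v in values]
    return vectors

def build_embedding_matrix(word_index, vectors, dim, vocab_size):
    """Row i holds the GloVe vector for the word with index i; unknown
    words (and the padding index 0) remain all-zero."""
    matrix = [[0.0] * dim for _ in range(vocab_size)]
    for word, idx in word_index.items():
        if idx < vocab_size and word in vectors:
            matrix[idx] = vectors[word]
    return matrix
```

Words missing from GloVe are left as zero rows, a common default that lets the embedding layer learn them from scratch during fine-tuning.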

Experiment Execution (Training)

Navigate into the application directory to run the experiments:

cd code

The system automatically handles dataset downloads (via Keras and Hugging Face's datasets library), text preprocessing, and tokenization. To estimate run-to-run variance, each model is trained across multiple random seeds (5 independent runs by default).

1. Train the Baseline Models:

python run_model.py --model cnn
python run_model.py --model lstm
python run_model.py --model bilstm
python run_model.py --model cnn_lstm

2. Train the Proposed Architecture:

python run_model.py --model proposed

3. Train Ablation Variants (Isolating Attention & CNN Contributions):

python run_model.py --model bilstm_att
python run_model.py --model cnn_bilstm

Note: You can target the SST-2 dataset with --dataset sst2 or change the number of runs with --runs N. Example:

python run_model.py --model proposed --dataset sst2
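To reproduce the full experiment grid without typing each command, the invocations above can be scripted. This is a convenience sketch (not part of the repository); it only composes the run_model.py flags documented above:

```python
import subprocess

# All model flags accepted by run_model.py, per the commands above.
MODELS = ["cnn", "lstm", "bilstm", "cnn_lstm",
          "proposed", "bilstm_att", "cnn_bilstm"]

def commands(extra_args=()):
    """One run_model.py invocation per model variant, with optional
    shared flags such as ("--dataset", "sst2") or ("--runs", "5")."""
    return [["python", "run_model.py", "--model", m, *extra_args]
            for m in MODELS]

if __name__ == "__main__":
    for cmd in commands():
        subprocess.run(cmd, check=True)  # stop on the first failed run
```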

Statistical Analysis & Result Consolidation

Once the individual training runs have finished and populated the results/ folder, aggregate them to obtain the statistical spread (mean ± std) and perform paired t-tests for significance against the baselines:

python combine_results.py

This script will output results/all_results.json and print a summary table directly to standard output.
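The core of this aggregation is simple: per-seed accuracies are summarized as mean ± std, and the paired t statistic compares two models on matched seeds. A minimal stdlib-only sketch of those two computations (the actual combine_results.py may use scipy.stats instead):

```python
import math
import statistics

def summarize(accs):
    """Mean and sample standard deviation of per-seed scores."""
    return statistics.mean(accs), statistics.stdev(accs)

def paired_t(a, b):
    """Paired t statistic for per-seed scores of two models
    (seeds matched position-by-position)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)
    return statistics.mean(diffs) / (sd / math.sqrt(n))
```

Pairing by seed removes the shared seed-to-seed variance, which is why it detects smaller gaps than an unpaired test over the same runs.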

Visualization Generation

To generate the publication-ready figures (e.g., accuracy comparisons, model architectures, ablation impacts, and training curves):

python visualize.py

Outputs are written to the figures/ directory.

Interactive Prediction & Attention Analysis

To inspect the model's token-level decision-making, we provide an interactive prediction CLI. Given raw text, the script outputs class probabilities and an attention heatmap showing which tokens most influenced the classification.

python predict.py
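Conceptually, the attention weights visualized by predict.py come from a softmax over per-token scores, which then form a weighted sum of the BiLSTM hidden states. A plain-Python sketch of that pooling step (illustrative only; the real layer lives in code/model.py and operates on tensors):

```python
import math

def softmax(scores):
    """Normalize raw attention scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, scores):
    """Weighted sum of per-token hidden states; the weights are what
    the heatmap displays, one value per input token."""
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states))
              for d in range(dim)]
    return pooled, weights
```

Because the weights sum to 1, they can be read directly as each token's relative contribution to the pooled sentence representation.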

Repository Structure

.
├── code/
│   ├── run_model.py        # Training/evaluation entry point
│   ├── model.py            # Model architecture definitions
│   ├── data.py             # Data loading, preprocessing, GloVe alignment
│   ├── config.py           # Global hyperparameter configuration
│   ├── combine_results.py  # Result aggregation and significance testing
│   ├── visualize.py        # Manuscript figure generation
│   ├── predict.py          # Interactive prediction CLI
│   └── gpu_setup.py        # GPU configuration logic
├── results/                # Per-run metric logs
├── figures/                # Auto-generated figures
└── saved_model/            # Saved Keras models (.h5)
