This repository contains the official implementation of the proposed CNN-BiLSTM-ATT architecture for sentiment analysis, as described in our research paper. The codebase supports end-to-end training, evaluation against established baselines (CNN, LSTM, BiLSTM, CNN-LSTM), ablation studies, and automated generation of publication-ready visualizations.
The primary dataset used is the IMDb movie review dataset, with built-in extensions for the SST-2 benchmark.
This implementation is built on TensorFlow. Starting with TensorFlow 2.11, native GPU support on Windows was discontinued, so the codebase should be run on Linux or under Windows Subsystem for Linux (WSL2) to use NVIDIA GPU acceleration. Running on plain Windows without WSL2 falls back to much slower CPU-only execution.
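Before launching experiments, it can help to confirm that TensorFlow actually sees the GPU. A minimal check might look like the following (illustrative helper, not part of the repository; it assumes TensorFlow is installed as described in the setup section):

```python
def gpu_status():
    """Report whether TensorFlow can see a CUDA GPU (illustrative helper)."""
    try:
        import tensorflow as tf  # installed during environment setup
    except ImportError:
        return "tensorflow not installed"
    gpus = tf.config.list_physical_devices("GPU")
    return f"{len(gpus)} GPU(s) visible" if gpus else "CPU only (no GPU visible)"

if __name__ == "__main__":
    print(gpu_status())
```

If this reports CPU only inside WSL2, re-check the CUDA/cuDNN installation before training, since CPU runs will be dramatically slower.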
- Prerequisites & Environment Setup
- Dataset Pre-requisites (GloVe)
- Experiment Execution (Training)
- Statistical Analysis & Result Consolidation
- Visualization Generation
- Interactive Prediction & Attention Analysis
- Repository Structure
## Prerequisites & Environment Setup

- Operating System: Linux (Ubuntu 20.04/22.04 recommended) or Windows Subsystem for Linux (WSL2).
- Python: 3.8 to 3.11.
- CUDA & cuDNN: CUDA ≥ 11.8 and a compatible cuDNN version, correctly configured on your system.
Virtual Environment Setup (Recommended): Isolate dependencies in a virtual environment for reproducibility:
```bash
python -m venv venv
source venv/bin/activate   # On Linux/WSL
# or for Windows: .\venv\Scripts\activate
```

Dependencies: With the virtual environment activated, install the required libraries using pip:

```bash
pip install tensorflow numpy scikit-learn matplotlib seaborn tqdm scipy datasets
```

## Dataset Pre-requisites (GloVe)

The architecture initializes its embeddings using pre-trained GloVe vectors (300-dimensional).
- Download the `glove.6B.zip` package from the Stanford NLP Group.
- Extract the archive and copy the `glove.6B.300d.txt` file directly into the `code/` directory.
## Experiment Execution (Training)

Navigate into the application directory to run the experiments:

```bash
cd code
```

The system automatically handles dataset downloads (via Keras and HuggingFace's `datasets` library), text preprocessing, and tokenization. To quantify run-to-run variance, each model is trained across multiple random seeds (5 independent runs by default).
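The multi-seed protocol amounts to repeating training under different seeds and summarizing the resulting metric. Schematically (a sketch only; `train_and_eval` is a hypothetical stand-in for the real training call):

```python
import random
import statistics

def multi_seed(train_and_eval, runs=5, base_seed=42):
    """Run an experiment under several seeds; return (mean, std) of its metric."""
    scores = []
    for i in range(runs):
        random.seed(base_seed + i)  # real code would also seed numpy / tensorflow
        scores.append(train_and_eval(seed=base_seed + i))
    return statistics.mean(scores), statistics.stdev(scores)
```

The mean ± std pairs produced this way are what the result-consolidation step later compares across models.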
1. Train the baseline models:

```bash
python run_model.py --model cnn
python run_model.py --model lstm
python run_model.py --model bilstm
python run_model.py --model cnn_lstm
```

2. Train the proposed architecture:

```bash
python run_model.py --model proposed
```

3. Train the ablation variants (isolating the attention and CNN contributions):

```bash
python run_model.py --model bilstm_att
python run_model.py --model cnn_bilstm
```

Note: You can target the SST-2 dataset with `--dataset sst2` or change the number of evaluation runs with `--runs N`. Example:

```bash
python run_model.py --model proposed --dataset sst2
```

## Statistical Analysis & Result Consolidation

Once the individual training runs have completed and populated the `results/` folder, aggregate them to obtain the statistical spread (mean ± std) and perform paired t-tests for significance against the baselines:
```bash
python combine_results.py
```

This script writes `results/all_results.json` and prints a summary table to standard output.
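The significance test here is a paired t-test over per-seed scores of two models. A stdlib-only version of the statistic (for exposition; `scipy.stats.ttest_rel` computes the same t along with a p-value) is:

```python
import math
import statistics

def paired_t(scores_a, scores_b):
    """Paired t statistic over matched per-seed scores: t = mean(d) / (std(d) / sqrt(n))."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample std with n-1 denominator
    return statistics.mean(diffs) / (sd / math.sqrt(n))
```

Pairing by seed matters: it cancels the shared run-to-run variation, so the test is sensitive to the model difference rather than to seed noise.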
## Visualization Generation

To generate the publication-ready figures (e.g., accuracy comparisons, model architectures, ablation impacts, and training curves):

```bash
python visualize.py
```

Outputs are written to the `figures/` directory.
## Interactive Prediction & Attention Analysis

To inspect the model's token-level decision-making, we provide an interactive prediction CLI. Given raw text, the script computes class probabilities and renders an attention heatmap highlighting which tokens drove the classification.
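The highlighting boils down to normalizing per-token attention scores with a softmax and reading the weights as token importances. A toy version (hypothetical scores; in the model these are learned):

```python
import math

def attention_weights(scores):
    """Softmax per-token scores into weights summing to 1 (higher = more influential)."""
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

tokens = ["the", "movie", "was", "brilliant"]
weights = attention_weights([0.1, 0.4, 0.2, 2.5])
top = tokens[weights.index(max(weights))]  # token with the highest weight
```

In this toy example, "brilliant" dominates the distribution, which is exactly the behavior the heatmap makes visible.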
```bash
python predict.py
```

## Repository Structure

```
.
├── code/
│   ├── run_model.py        # Entry point for training/evaluation runs
│   ├── model.py            # Neural network architecture definitions
│   ├── data.py             # Data loading, preprocessing, GloVe alignment
│   ├── config.py           # Global hyperparameter registry
│   ├── combine_results.py  # Results aggregation and significance testing
│   ├── visualize.py        # Generation of manuscript figures
│   ├── predict.py          # Interactive prediction & attention CLI
│   └── gpu_setup.py        # GPU detection and configuration
├── results/                # Per-run metrics and logs
├── figures/                # Auto-generated figures
└── saved_model/            # Trained Keras models (.h5)
```