PyTorch CNN Implementation: A Deep Dive into Custom BatchNorm & Dropout

This project is a hands-on exploration of key components in modern Convolutional Neural Networks. Using the STL-10 dataset, it follows a four-part journey: we begin with a simple baseline CNN, diagnose its weaknesses, implement foundational layers (BatchNorm, Dropout) from scratch to fix them, and finally, contrast this approach with the power of modern Transfer Learning (ResNet-18).

All code, analysis, and experiments are contained in the main.ipynb notebook.

The Project's Journey in Four Parts

This repository is structured as a progressive exploration, with each part building on the last.

Part I: The Baseline (ToyCNN)

We first implement a small, two-layer CNN (ToyCNN) to establish a performance baseline. The goal is to observe its raw performance and, more importantly, analyze its training curves to diagnose its primary weakness.

Key Result: The baseline model quickly overfits the training data, motivating the need for more advanced stabilization and regularization techniques.
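
For orientation, a two-layer baseline of this kind might look roughly like the sketch below. It is not the notebook's actual ToyCNN: the class name `ToyCNNSketch`, the channel widths, and the kernel sizes are illustrative assumptions, and the input is assumed to be STL-10's native 3x96x96 images with 10 classes.

```python
import torch
from torch import nn


class ToyCNNSketch(nn.Module):
    """Hypothetical two-conv-layer baseline (not the notebook's exact ToyCNN)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Channel widths and kernel sizes are assumptions chosen for illustration.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),  # 96x96 -> 48x48 (assumes 96x96 inputs)
            nn.Conv2d(16, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),  # 48x48 -> 24x24
        )
        # No normalization and no dropout: nothing here limits overfitting.
        self.classifier = nn.Linear(32 * 24 * 24, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(torch.flatten(x, start_dim=1))
```

With one large fully connected layer and no regularization, a model like this has plenty of capacity to memorize a small training set, which is consistent with the overfitting described above.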

Part II: Building the Toolkit (From Scratch)

Before improving our model, we build the tools. This section is a "deep dive" into two critical layers, implementing them from first principles to understand why they work.

Batch Normalization: Implemented from scratch (forward and backward passes) to build intuition for how it makes training faster and more stable, an effect originally attributed to reducing internal covariate shift.
Dropout: Implemented inverted dropout from scratch to understand how it regularizes a model and improves generalization by discouraging co-adaptation between neurons.
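
As a rough illustration of what "from scratch" involves, the NumPy sketch below shows training-mode batch normalization over an (N, D) batch and inverted dropout, each with a backward pass. It is not the code in the cs6740 package: the function names, signatures, and cache layout are assumptions, and running statistics for inference are left out for brevity.

```python
import numpy as np


def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Training-mode batch norm over an (N, D) batch; returns output and cache."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)  # biased (population) variance of the batch
    inv_std = 1.0 / np.sqrt(var + eps)
    x_hat = (x - mu) * inv_std  # zero mean, unit variance per feature
    out = gamma * x_hat + beta  # learnable scale and shift
    return out, (x_hat, gamma, inv_std)


def batchnorm_backward(dout, cache):
    """Gradients with respect to x, gamma, and beta."""
    x_hat, gamma, inv_std = cache
    n = dout.shape[0]
    dgamma = (dout * x_hat).sum(axis=0)
    dbeta = dout.sum(axis=0)
    dx_hat = dout * gamma
    # Compact form that accounts for the dependence of mean and variance on x.
    dx = (inv_std / n) * (
        n * dx_hat - dx_hat.sum(axis=0) - x_hat * (dx_hat * x_hat).sum(axis=0)
    )
    return dx, dgamma, dbeta


def dropout_forward(x, p_drop, rng, train=True):
    """Inverted dropout: rescale kept units during training, identity at test time."""
    if not train or p_drop == 0.0:
        return x, None
    keep = 1.0 - p_drop
    # Dividing by keep preserves the expected activation, so inference needs no scaling.
    mask = (rng.random(x.shape) < keep) / keep
    return x * mask, mask


def dropout_backward(dout, mask):
    """Gradient flows only through the kept (and rescaled) units."""
    return dout if mask is None else dout * mask
```

A common way to validate a backward pass like this is to compare it against a centered finite-difference estimate of the gradient.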

Part III: Iterative Improvement (Building `ToyCNNModified`)

Armed with a deep, first-principles understanding of how these layers work (from Part II), we now apply these concepts iteratively. We will build our final ToyCNNModified in stages, adding one feature at a time to observe its specific impact. This methodical process is central to machine learning engineering.

Step 3.1: Tackling Overfitting with Dropout: First, we apply the library's nn.Dropout to our baseline model to control the severe overfitting diagnosed in Part I.
Step 3.2: Increasing Capacity by Going Deeper: With regularization in place, we increase the model's capacity by adding another convolutional block, allowing it to learn more complex features.
Step 3.3: Stabilizing the Deeper Network with Batch Norm: Finally, we add the library's batch normalization layer (nn.BatchNorm2d for convolutional feature maps) to stabilize the training of our new, deeper network, applying the technique we studied in Part II.
Key Result: The final ToyCNNModified combines all three improvements, resulting in a model that successfully overcomes overfitting, has sufficient capacity, and trains stably, achieving a significant boost in validation accuracy.
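
The three steps above can be sketched together as follows. This is an illustrative layout rather than the notebook's ToyCNNModified: the names `conv_block` and `ToyCNNModifiedSketch`, the channel widths, the dropout rate, and the global-average-pooling head are all assumptions.

```python
import torch
from torch import nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Conv -> BatchNorm -> ReLU -> Pool; the BatchNorm2d layer is Step 3.3."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
        nn.MaxPool2d(2),
    )


class ToyCNNModifiedSketch(nn.Module):
    """Hypothetical deeper, regularized variant (not the notebook's exact model)."""

    def __init__(self, num_classes: int = 10, p_drop: float = 0.5):
        super().__init__()
        # Step 3.2: a third convolutional block adds capacity.
        self.features = nn.Sequential(
            conv_block(3, 16),
            conv_block(16, 32),
            conv_block(32, 64),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # makes the head independent of input size
            nn.Flatten(),
            nn.Dropout(p_drop),       # Step 3.1: regularize before the classifier
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))
```

Remember to call `model.train()` before training and `model.eval()` before validation, since both Dropout and BatchNorm behave differently in the two modes.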

Part IV: The State-of-the-Art (Transfer Learning with ResNet-18)

Finally, we explore the most common and powerful technique in applied computer vision. Instead of training from scratch, we adapt a ResNet-18 model pre-trained on ImageNet, leveraging its powerful learned features.

Key Result: By fine-tuning only the final classification layer, this model achieves strong performance (>90% validation accuracy) with minimal training, demonstrating the clear advantage of transfer learning for most practical tasks.

Performance Summary

This table, drawn from main.ipynb, summarizes the journey:

| Stage | What Was Implemented | Main Benefit | Typical Outcome (Val Acc) |
| --- | --- | --- | --- |
| I. ToyCNN (Baseline) | A basic CNN + training loop | Establish baseline & diagnose overfitting | > 50% |
| II. Custom Layers | `BatchNorm` & `Dropout` from scratch | Understand internals by building them | (Implementations validated by tests) |
| III. ToyCNNModified | Iterative improvement: (1) add library `Dropout`, (2) go deeper, (3) add library `BatchNorm` | Systematic improvement & generalization | > 60% (final model) |
| IV. ResNet-18 | Head swap + fine-tuning | Leverage pre-trained models for SOTA results | > 90% |

How to Run

Clone the project
```
git clone https://github.com/<your-user>/CNN-Custom-Layers.git
cd CNN-Custom-Layers
```

Create an environment & install dependencies
- Conda (recommended):
```
bash conda/install.sh
conda activate cnn_custom_layers
```
  The helper script provisions everything defined in conda/environment.yml and finishes with pip install -e . so the local cs6740 package is importable from anywhere.
- Plain pip (if you already have Python ≥3.9 available):
```
python -m venv .venv
source .venv/bin/activate
pip install torch torchvision numpy matplotlib pandas altair jupyter pytest
pip install -e .
```
  Feel free to mirror the conda dependencies if you need additional tooling such as bandit or mypy.
(Optional but recommended) Run the unit tests
```
pytest
```
These tests validate the from-scratch BatchNorm, Dropout, CNN architectures, and the ResNet fine-tuning wrapper before you start experimenting.
Launch the notebook
```
jupyter lab main.ipynb  # or `jupyter notebook main.ipynb`
```
Execute the notebook top to bottom. The first setup cell mirrors the steps above when running in Colab; on a local machine you can skip the duplicate installations if your environment is already prepared.
Data & checkpoints
- When you first instantiate cs6740.image_loader.ImageLoader, PyTorch will download STL-10 into ./data/ automatically (or whatever path you set in the setup cell).
- Model checkpoints are written to ./model_checkpoints/ by default; create the folder beforehand if you want to persist intermediate models.
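
If you prefer to create the checkpoint folder from Python rather than the shell, a minimal sketch is shown below. The helper name `ensure_checkpoint_dir` is hypothetical and not part of the cs6740 package; the default path simply mirrors the location mentioned above.

```python
from pathlib import Path


def ensure_checkpoint_dir(path: str = "./model_checkpoints") -> Path:
    """Hypothetical helper: create the checkpoint directory if it is missing."""
    directory = Path(path)
    # parents=True creates intermediate folders; exist_ok=True makes reruns safe.
    directory.mkdir(parents=True, exist_ok=True)
    return directory
```

Calling it once in a setup cell before training is enough; repeated calls do nothing if the folder already exists.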

Acknowledgements

This project was originally based on a course assignment. The foundational code (the Solver module, data loaders, and general project structure) was provided.

My core contributions and the focus of this repository are:

The complete from-scratch implementation (forward and backward pass) of the Batch Normalization and Dropout layers.
The architecture design and implementation of ToyCNN and ToyCNNModified.
The setup and fine-tuning logic for the MyResNet18 transfer learning model.
All experiment analysis, visualization, and discussion in the main.ipynb notebook.
