Planar Data Classification with One Hidden Layer

Overview

This project implements a neural network with a single hidden layer to classify planar (2D) data that is not linearly separable. The implementation demonstrates how neural networks can learn complex decision boundaries that linear models like logistic regression cannot capture.

Project Structure

Planar_data_classification_with_one_hidden_layer/
|
|--- Planar_data_classification_with_onehidden_layer_v6c.ipynb  # Main Jupyter notebook
|--- planar_utils.py                                            # Utility functions
|--- testCases_v2.py                                            # Test cases for functions
|--- images/                                                     # Diagrams and visualizations
|    |--- classification_kiank.png                               # Neural network architecture
|    |--- grad_summary.png                                       # Gradient computation summary
|    |--- sgd.gif                                                # Gradient descent animation
|    |--- sgd_bad.gif                                            # Poor gradient descent example
|--- README.md                                                   # This file

Problem Statement

The dataset consists of 2D points arranged in a "flower" pattern with two classes (red and blue). This data is not linearly separable, meaning a simple line cannot separate the two classes effectively.

Dataset Characteristics

Input: 2 features (x1, x2 coordinates)
Output: Binary classification (0 = red, 1 = blue)
Samples: 400 training examples
Pattern: Circular/radial distribution

Neural Network Architecture

Architecture Overview

Input Layer (2 neurons) 
    |
    v
Hidden Layer (4 neurons) - tanh activation
    |
    v
Output Layer (1 neuron) - sigmoid activation

Mathematical Formulation

For each example $x^{(i)}$:

Forward Propagation:
- $z^{1} = W^{[1]}x^{(i)} + b^{[1]}$
- $a^{1} = \tanh(z^{1})$
- $z^{2} = W^{[2]}a^{1} + b^{[2]}$
- $\hat{y}^{(i)} = a^{2} = \sigma(z^{2})$
Cost Function: $$J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(a^{2}) + (1-y^{(i)})\log(1-a^{2})\right]$$
Backward Propagation:
- $dZ^{[2]} = A^{[2]} - Y$
- $dW^{[2]} = \frac{1}{m}dZ^{[2]}A^{[1]T}$
- $db^{[2]} = \frac{1}{m}\sum_{i}dZ^{2}$
- $dZ^{[1]} = W^{[2]T}dZ^{[2]} * (1 - A^{[1]2})$
- $dW^{[1]} = \frac{1}{m}dZ^{[1]}X^T$
- $db^{[1]} = \frac{1}{m}\sum_{i}dZ^{1}$

Implementation Details

Key Functions Implemented

layer_sizes(X, Y)
- Defines the neural network architecture
- Returns: (n_x, n_h, n_y) = (2, 4, 1)
initialize_parameters(n_x, n_h, n_y)
- Initializes weights with small random values
- Initializes biases with zeros
- Uses seed for reproducibility
forward_propagation(X, parameters)
- Computes forward propagation through the network
- Returns: A2 (predictions) and cache (intermediate values)
compute_cost(A2, Y, parameters)
- Computes cross-entropy loss
- Returns: cost as float
backward_propagation(parameters, cache, X, Y)
- Implements backpropagation algorithm
- Returns: gradients for all parameters
update_parameters(parameters, grads, learning_rate)
- Updates parameters using gradient descent
- Returns: updated parameters
nn_model(X, Y, n_h, learning_rate, num_iterations)
- Integrates all functions into complete training loop
- Returns: trained parameters
predict(parameters, X)
- Makes predictions using trained model
- Returns: binary predictions (0 or 1)

Performance Results

Comparison with Baseline

Model	Accuracy	Decision Boundary	Performance
Logistic Regression	47%	Linear	Poor (worse than random)
Neural Network	90%	Complex/Non-linear	Excellent

Training Progress

Initial Cost: ~138.6
Final Cost: ~39.5 (after 9000 iterations)
Learning Rate: 1.2
Iterations: 10,000

Hidden Layer Size Analysis

Hidden Units	Accuracy	Observations
1	47%	Underfitting (same as logistic)
2	52%	Still underfitting
3	73%	Good improvement
4	90%	Excellent performance
5	91%	Optimal for this dataset
20	91%	Slight overfitting
50	90%	Overfitting begins

Key Insights

Why Neural Networks Work Better

Non-linearity: The tanh activation function introduces non-linearity
Complex Boundaries: Can learn circular/curved decision boundaries
Feature Learning: Hidden layer learns useful intermediate representations

Limitations of Linear Models

Can only learn linear decision boundaries
Cannot capture circular/radial patterns
Poor performance on non-linearly separable data

Optimal Architecture

Hidden Layer Size: 4-5 neurons optimal for this dataset
Learning Rate: 1.2 provides good convergence
Training Iterations: 10,000 sufficient for convergence

Usage Instructions

Running the Notebook

Prerequisites:

pip install numpy matplotlib scikit-learn jupyter

Launch Jupyter:

jupyter notebook Planar_data_classification_with_onehidden_layer_v6c.ipynb

Execute Cells:
- Run cells sequentially
- Each function is tested with provided test cases
- Final results show model performance

Custom Dataset Testing

The notebook includes support for multiple datasets:

noisy_circles: Circular patterns with noise
`noisy_moons**: Crescent moon patterns
blobs: Gaussian clusters
gaussian_quantiles: Quantized Gaussian distributions

To test a different dataset, modify this line in the notebook:

dataset = "noisy_moons"  # Change to desired dataset

Technical Specifications

Dependencies

Python: 3.7+
NumPy: For numerical computations
Matplotlib: For visualizations
Scikit-learn: For baseline comparison

Performance Metrics

Training Time: ~2 minutes for 10,000 iterations
Memory Usage: < 100MB for standard dataset
Convergence: Typically within 5,000 iterations

File Sizes

Main Notebook: ~800KB
Utility Files: < 10KB total
Images: ~500KB total

Future Improvements

Potential Enhancements

Regularization: L1/L2 regularization to prevent overfitting
Different Activations: ReLU, Leaky ReLU comparison
Optimization: Adam, RMSprop optimizers
Multiple Hidden Layers: Deep neural networks
Cross-validation: Better hyperparameter tuning

Extensions

Multi-class classification
Different dataset patterns
Real-world data applications
Performance benchmarking

Educational Value

This project serves as an excellent educational resource for:

Understanding neural network fundamentals
Implementing backpropagation from scratch
Comparing linear vs non-linear models
Visualizing decision boundaries
Hyperparameter tuning

Contributing

Feel free to:

Report issues or bugs
Suggest improvements
Add new datasets
Optimize performance
Enhance visualizations

License

This project is provided for educational purposes. Feel free to use and modify for learning and research.

Author: leadylearn
Date: April 2026
Framework: Pure NumPy (no deep learning libraries)
Purpose: Educational implementation of neural networks

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
__pycache__		__pycache__
images		images
Planar_data_classification_with_onehidden_layer_v6c.ipynb		Planar_data_classification_with_onehidden_layer_v6c.ipynb
README.md		README.md
planar_utils.py		planar_utils.py
testCases_v2.py		testCases_v2.py

Folders and files

Latest commit

History

Repository files navigation

Planar Data Classification with One Hidden Layer

Overview

Project Structure

Problem Statement

Dataset Characteristics

Neural Network Architecture

Architecture Overview

Mathematical Formulation

Implementation Details

Key Functions Implemented

Performance Results

Comparison with Baseline

Training Progress

Hidden Layer Size Analysis

Key Insights

Why Neural Networks Work Better

Limitations of Linear Models

Optimal Architecture

Usage Instructions

Running the Notebook

Custom Dataset Testing

Technical Specifications

Dependencies

Performance Metrics

File Sizes

Future Improvements

Potential Enhancements

Extensions

Educational Value

Contributing

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages