Planar Data Classification with One Hidden Layer
This project implements a neural network with a single hidden layer to classify planar (2D) data that is not linearly separable. The implementation demonstrates how neural networks can learn complex decision boundaries that linear models like logistic regression cannot capture.
Planar_data_classification_with_one_hidden_layer/
|
|--- Planar_data_classification_with_onehidden_layer_v6c.ipynb # Main Jupyter notebook
|--- planar_utils.py # Utility functions
|--- testCases_v2.py # Test cases for functions
|--- images/ # Diagrams and visualizations
| |--- classification_kiank.png # Neural network architecture
| |--- grad_summary.png # Gradient computation summary
| |--- sgd.gif # Gradient descent animation
| |--- sgd_bad.gif # Poor gradient descent example
|--- README.md # This file
The dataset consists of 2D points arranged in a "flower" pattern with two classes (red and blue). This data is not linearly separable, meaning a simple line cannot separate the two classes effectively.
- Input: 2 features (x1, x2 coordinates)
- Output: Binary classification (0 = red, 1 = blue)
- Samples: 400 training examples
- Pattern: Circular/radial distribution
Input Layer (2 neurons)
|
v
Hidden Layer (4 neurons) - tanh activation
|
v
Output Layer (1 neuron) - sigmoid activation
For each example
-
Forward Propagation:
-
Cost Function: $$J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(a^{2}) + (1-y^{(i)})\log(1-a^{2})\right]$$
-
Backward Propagation:
-
layer_sizes(X, Y)- Defines the neural network architecture
- Returns: (n_x, n_h, n_y) = (2, 4, 1)
-
initialize_parameters(n_x, n_h, n_y)- Initializes weights with small random values
- Initializes biases with zeros
- Uses seed for reproducibility
-
forward_propagation(X, parameters)- Computes forward propagation through the network
- Returns: A2 (predictions) and cache (intermediate values)
-
compute_cost(A2, Y, parameters)- Computes cross-entropy loss
- Returns: cost as float
-
backward_propagation(parameters, cache, X, Y)- Implements backpropagation algorithm
- Returns: gradients for all parameters
-
update_parameters(parameters, grads, learning_rate)- Updates parameters using gradient descent
- Returns: updated parameters
-
nn_model(X, Y, n_h, learning_rate, num_iterations)- Integrates all functions into complete training loop
- Returns: trained parameters
-
predict(parameters, X)- Makes predictions using trained model
- Returns: binary predictions (0 or 1)
| Model | Accuracy | Decision Boundary | Performance |
|---|---|---|---|
| Logistic Regression | 47% | Linear | Poor (worse than random) |
| Neural Network | 90% | Complex/Non-linear | Excellent |
- Initial Cost: ~138.6
- Final Cost: ~39.5 (after 9000 iterations)
- Learning Rate: 1.2
- Iterations: 10,000
Hidden Layer Size Analysis
| Hidden Units | Accuracy | Observations |
|---|---|---|
| 1 | 47% | Underfitting (same as logistic) |
| 2 | 52% | Still underfitting |
| 3 | 73% | Good improvement |
| 4 | 90% | Excellent performance |
| 5 | 91% | Optimal for this dataset |
| 20 | 91% | Slight overfitting |
| 50 | 90% | Overfitting begins |
- Non-linearity: The tanh activation function introduces non-linearity
- Complex Boundaries: Can learn circular/curved decision boundaries
- Feature Learning: Hidden layer learns useful intermediate representations
- Can only learn linear decision boundaries
- Cannot capture circular/radial patterns
- Poor performance on non-linearly separable data
- Hidden Layer Size: 4-5 neurons optimal for this dataset
- Learning Rate: 1.2 provides good convergence
- Training Iterations: 10,000 sufficient for convergence
-
Prerequisites:
pip install numpy matplotlib scikit-learn jupyter
-
Launch Jupyter:
jupyter notebook Planar_data_classification_with_onehidden_layer_v6c.ipynb
-
Execute Cells:
- Run cells sequentially
- Each function is tested with provided test cases
- Final results show model performance
The notebook includes support for multiple datasets:
noisy_circles: Circular patterns with noise- `noisy_moons**: Crescent moon patterns
blobs: Gaussian clustersgaussian_quantiles: Quantized Gaussian distributions
To test a different dataset, modify this line in the notebook:
dataset = "noisy_moons" # Change to desired dataset- Python: 3.7+
- NumPy: For numerical computations
- Matplotlib: For visualizations
- Scikit-learn: For baseline comparison
- Training Time: ~2 minutes for 10,000 iterations
- Memory Usage: < 100MB for standard dataset
- Convergence: Typically within 5,000 iterations
- Main Notebook: ~800KB
- Utility Files: < 10KB total
- Images: ~500KB total
- Regularization: L1/L2 regularization to prevent overfitting
- Different Activations: ReLU, Leaky ReLU comparison
- Optimization: Adam, RMSprop optimizers
- Multiple Hidden Layers: Deep neural networks
- Cross-validation: Better hyperparameter tuning
- Multi-class classification
- Different dataset patterns
- Real-world data applications
- Performance benchmarking
This project serves as an excellent educational resource for:
- Understanding neural network fundamentals
- Implementing backpropagation from scratch
- Comparing linear vs non-linear models
- Visualizing decision boundaries
- Hyperparameter tuning
Feel free to:
- Report issues or bugs
- Suggest improvements
- Add new datasets
- Optimize performance
- Enhance visualizations
This project is provided for educational purposes. Feel free to use and modify for learning and research.
Author: leadylearn
Date: April 2026
Framework: Pure NumPy (no deep learning libraries)
Purpose: Educational implementation of neural networks