
Clarify that the Softmax derivative is good-enough #5

Open
MadLittleMods wants to merge 1 commit into SebLague:main from MadLittleMods:madlittlemods/note-on-multi-input-softmax

Conversation

@MadLittleMods

First, thank you so much for this amazing resource and video series! 🙇 Your videos are a gold standard for understanding the concepts, and the polished end products impress everyone 🌠

While following along and writing my own implementation in Zig, I added some gradient check tests to ensure my backpropagation code/math was correct, and saw that they were failing whenever I used Softmax. I banged my head against this for a long while and even compared my network's outputs against this implementation, only to find they were exactly the same.
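
(For anyone unfamiliar, a gradient check compares the analytic gradients from backpropagation against numerical finite differences of the loss. Here's a minimal sketch in Python/NumPy of the kind of test I mean; the function shape is just for illustration and not this repo's API:)

```python
import numpy as np

def gradient_check(loss_fn, analytic_grad, params, eps=1e-5):
    """Compare backprop gradients against central finite differences.

    loss_fn: params -> scalar loss; analytic_grad: gradient from backprop.
    Hypothetical helper for illustration, not part of this repo.
    """
    numeric = np.zeros_like(params)
    for i in range(params.size):
        plus, minus = params.copy(), params.copy()
        plus.flat[i] += eps
        minus.flat[i] -= eps
        numeric.flat[i] = (loss_fn(plus) - loss_fn(minus)) / (2 * eps)
    # Relative error; values much above ~1e-4 usually signal a backprop bug.
    denom = np.maximum(1e-8, np.abs(analytic_grad) + np.abs(numeric))
    return np.abs(analytic_grad - numeric) / denom
```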

Finally, after some external help, I realized the difference between single-input activation functions like Sigmoid, TanH, and ReLU, and multi-input activation functions like Softmax, which require more work to find the full derivative. I wrote some notes on the difference; or perhaps the source code I ended up with is easier to understand.
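
Concretely, every Softmax output depends on every input, so the true derivative is the full Jacobian $\frac{\partial s_i}{\partial z_j} = s_i(\delta_{ij} - s_j)$, and the backward pass needs a Jacobian-vector product rather than an elementwise multiply by the diagonal. A rough sketch of both versions in Python/NumPy (names are just for illustration, not this repo's code):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # shift by the max for numerical stability
    return e / e.sum()

def softmax_backward_full(z, upstream):
    """Exact gradient: multiply upstream by the full Jacobian J[i][j] = s_i * (delta_ij - s_j)."""
    s = softmax(z)
    jacobian = np.diag(s) - np.outer(s, s)
    return jacobian @ upstream

def softmax_backward_diagonal(z, upstream):
    """'Good-enough' version: keep only the diagonal s_i * (1 - s_i), as if Softmax were a single-input activation."""
    s = softmax(z)
    return s * (1 - s) * upstream
```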

Just wanted to add a note to the code here so others don't hit the same pitfall as hard.

It's really interesting how the "good-enough" derivative of Softmax, using only the diagonal elements of the Jacobian matrix, empirically still works well enough for the neural network to converge. The best way I found to understand this, and relate it to a concept with more research/documentation behind it, is stochastic gradient descent, which trains on mini-batches to take quick, imperfect, but good-enough steps down the cost gradient.
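
Continuing the sketch above (reusing the `softmax_backward_*` helpers), a quick numerical comparison shows the intuition: the diagonal-only gradient has the wrong magnitude, but at least in examples like this one it still points in roughly the same direction as the exact gradient, so descent steps still make progress:

```python
z = np.array([1.0, 2.0, 0.5])
upstream = np.array([0.1, -0.3, 0.2])  # some incoming gradient from the loss

exact = softmax_backward_full(z, upstream)
approx = softmax_backward_diagonal(z, upstream)

cosine = exact @ approx / (np.linalg.norm(exact) * np.linalg.norm(approx))
print(cosine)  # ~0.97 here: different magnitudes, similar direction
```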
