Description
I added a dropout feature to the sequential model. Preliminary results are a bit hard to assess.
I trained two equivalent networks for 800k steps with a learning rate of 1e-3. The orange curve is a network with dropout = 0.3 on the linear layer and 0.1 on all conv and deconv layers except the last deconv. The blue curve is the same network without any dropout.
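To make the placement concrete, here is a minimal PyTorch-style sketch of where the dropout goes. The layer counts, channel sizes, and input shape are just placeholders for illustration, not the actual model definition:

```python
import torch.nn as nn

def build_net(p_linear=0.3, p_conv=0.1):
    """Toy conv -> linear -> deconv stack illustrating the dropout placement
    described above (placeholder shapes, assuming an input of shape (B, 1, 64))."""
    return nn.Sequential(
        nn.Conv1d(1, 16, kernel_size=5, padding=2),
        nn.ReLU(),
        nn.Dropout(p_conv),                  # light dropout after each conv block
        nn.Conv1d(16, 32, kernel_size=5, padding=2),
        nn.ReLU(),
        nn.Dropout(p_conv),
        nn.Flatten(),
        nn.Linear(32 * 64, 32 * 64),         # placeholder sizes
        nn.ReLU(),
        nn.Dropout(p_linear),                # heavier dropout on the linear layer
        nn.Unflatten(1, (32, 64)),
        nn.ConvTranspose1d(32, 16, kernel_size=5, padding=2),
        nn.ReLU(),
        nn.Dropout(p_conv),
        # no dropout after the last deconv
        nn.ConvTranspose1d(16, 1, kernel_size=5, padding=2),
    )
```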
I think the sudden jump in the training SNR of the orange curve happened when I restarted training with dropout = 0.3 on the linear layer (before that it was 0.5, though I'm not completely sure).
It seems to work as expected: with dropout, performance is better on the validation set and worse on the training set.
What do you think? Should I run more tests? Do these parameters seem reasonable to you (30% on the linear layer and 10% on the convs)?
I also tried the same net with only dropout = 50% on the convs (blue):


