But I was seeing the opposite effect.
My next attempt at understanding the observed behavior was to use sufficiently high and low clipping values, but nothing as drastic as epsilon. I chose 0.99999 as the high value and a correspondingly small value just above 0 as the low. Using these, my loss dropped: an improvement, but still a lot higher than with my unaltered predictions.
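The boosting code itself isn't shown above, so here is a minimal sketch of what such a scheme might look like, assuming every prediction gets pushed to the nearest of the two bounds. The 0.5 threshold, the 0.00001 lower bound, and the name `boost_predictions` are assumptions for illustration, not values from the experiment:

```python
import numpy as np

def boost_predictions(preds, low=0.00001, high=0.99999):
    """Push every prediction out to the nearest extreme bound.

    NOTE: hypothetical reconstruction of the 'boosting' idea described
    above; the low/high bounds and the 0.5 threshold are assumptions.
    """
    preds = np.asarray(preds, dtype=float)
    return np.where(preds >= 0.5, high, low)

print(boost_predictions([0.9, 0.2, 0.51]))
```

The effect is that a mildly confident prediction like 0.9 becomes an extremely confident 0.99999, which is exactly what makes the scheme dangerous when the model is wrong.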
It was clear that my modification was worsening the error rather than improving it, but I didn't yet understand why. Mathematically, it should have improved. I felt the only way to get more clarity was to check what was really happening to my predictions.
Since I was using a Kaggle dataset, I didn’t have the labels for the test set.
I painstakingly labeled the first 500 images manually, and then calculated the loss based on my predictions.
The loss was 0.00960, an excellent score.
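Computing the score on a hand-labeled subset like this is straightforward. Here is a sketch, assuming the standard binary log-loss formula with natural log (the metric Kaggle uses), with a small clip so that a prediction of exactly 0 or 1 never produces log(0):

```python
import numpy as np

def log_loss(y_true, y_pred, eps=1e-15):
    """Mean binary cross-entropy (natural log).

    y_true: 0/1 labels; y_pred: predicted probabilities.
    Predictions are clipped to [eps, 1 - eps] to avoid log(0).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1 - y_true) * np.log(1 - y_pred)))

# A confident, correct prediction scores very low:
print(log_loss([1, 0], [0.99, 0.01]))
```

Feeding in the 500 manual labels and the corresponding predictions would reproduce a score like the one above.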
I then calculated the loss with my modified predictions: a significant increase. But this time, I had the data to identify the root of the problem.
On closer inspection of my predictions and the actual labels, I noticed that my model had made an error: it had labelled one dog as a cat, and with a prediction of 0.03824, it was pretty confident that the image was indeed a cat.
My boosting logic had taken this value and pushed it closer to 0.
Therein was the source of the problem.
Log Loss penalizes incorrect predictions heavily.

My one incorrect prediction was already costing me in my loss, but my alteration of the prediction exacerbated the error, increasing the loss further. A good explanation of this is in this blog, excerpts of which I am mentioning below.
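To see how much damage one boosted-but-wrong prediction can do, here is a small calculation using the confidently wrong prediction above (0.03824 for an image that was actually a dog) and an assumed lower bound of 0.00001 for the boosting, spread over the 500 labeled images:

```python
import math

n = 500                    # manually labelled images
p_wrong = 0.03824          # predicted probability for an image that is actually a dog
p_boosted = 0.00001        # ASSUMED lower bound used by the boosting logic

# Per-image penalty when the true label is 1 is -log(p).
before = -math.log(p_wrong)     # roughly 3.26
after = -math.log(p_boosted)    # roughly 11.51

# Extra contribution of this single image to the mean loss over 500 images:
print((after - before) / n)
```

One image out of 500 moves the mean loss by more than the entire 0.00960 score of the unaltered predictions, which is exactly the behavior described above.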
Let's say the actual value is 1.

If your model was unsure and predicted 0.5, the loss would be:

Loss = -(1 * log(0.5)) = 0.69314

If your model was correctly confident and predicted 0.9, the loss would be:

Loss = -(1 * log(0.9)) = 0.10536

The loss drops when the prediction is closer to the actual value.

If your model was incorrect, but also confident, and predicted 0.1, the loss would be:

Loss = -(1 * log(0.1)) = 2.30258

The loss gets much worse.

When dealing with the Log Loss function, it is better to be doubtful of your prediction than confidently wrong.
This was my oversight about my model. I was assuming that when my model predicted confidently, it was always correct, and that if it was getting some of the images wrong, it must be predicting in the neighborhood of 0.5. I completely overlooked the case where my model was confidently predicting incorrectly.