Regularization in Deep Learning: the intuition behind it.
Regularization 101, starting with the problem it fixes: the model does well on training data but not so well on unseen data. Overfitting.
But is there more to it than that? Let's figure it out.
Remember that one guy in school who memorized everything mentioned in the books or uttered from the teacher's mouth, but didn't perform well when the questions were twisted a bit?
What happened?
He only memorized the lessons but didn't understand the concepts behind them, so he couldn't apply them to previously unseen questions.
That's overfitting, and to correct it we need regularization.
Regularization acts as that good teacher, guiding the student to focus on core concepts rather than memorizing irrelevant details.
Regularization essentially solves 3 problems.
1️⃣ Overfitting: Prevents the model from fitting the noise or irrelevant details in the training data.
2️⃣ Model Complexity: Reduces the complexity of the model by constraining its capacity, ensuring it doesn't overlearn.
3️⃣ Bias-Variance Tradeoff: Strikes a balance between underfitting (too simple) and overfitting (too complex).
So, how do we do regularization?
Quite a few ways, actually.
Let's see the most important ones, and let's try to understand them without getting any maths involved. Shall we? (Small code sketches follow the list, for anyone who wants to see these in practice.)
1️⃣ L1 and L2 Regularization – a way to discourage large weights. A penalty term ensures that large weights are dampened: the penalty is added on the absolute weights (L1) or the squared weights (L2). (Sketch 1 below.)
2️⃣ Dropout – randomly "drops out" (sets to zero) a fraction of neurons during training. This forces the network not to rely too heavily on specific neurons and promotes generalization. (Sketch 2 below.)
3️⃣ Data Augmentation – why not give different variants of the questions to that friend, so that he becomes really good at grasping the concepts? (Sketch 3 below.)
4️⃣ Early Stopping – stop your training before the model starts memorizing. (Sketch 3 below.)
5️⃣ Batch Norm – normalize activations (center the mean at 0 and the variance at 1) within each mini-batch, so every neuron gets a fair chance in the next layer. (Sketch 2 below.)
6️⃣ Elastic Net – a combination of the L1 and L2 penalties. (Sketch 1 below.)
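Sketch 1 – L1, L2 and Elastic Net. A minimal sketch of how the penalties get added to the loss, assuming PyTorch; the model size, dummy batch and lambda values are made up purely for illustration. (For plain L2 on its own, most optimizers also expose it directly as the weight_decay argument.)

```python
import torch
import torch.nn as nn

# Tiny made-up model and data, purely for illustration
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

def penalized_loss(outputs, targets, l1_lambda=1e-4, l2_lambda=1e-4):
    """Base loss + L1 (absolute weights) + L2 (squared weights).
    Using both penalties at once is Elastic Net."""
    base = criterion(outputs, targets)
    l1 = sum(p.abs().sum() for p in model.parameters())
    l2 = sum(p.pow(2).sum() for p in model.parameters())
    return base + l1_lambda * l1 + l2_lambda * l2

x, y = torch.randn(32, 10), torch.randn(32, 1)   # dummy batch
optimizer.zero_grad()
loss = penalized_loss(model(x), y)
loss.backward()
optimizer.step()
```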
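Sketch 2 – Dropout and Batch Norm. This is roughly where they sit inside a network definition; again a PyTorch sketch, with the layer sizes and dropout rate picked arbitrarily.

```python
import torch
import torch.nn as nn

# Made-up layer sizes and dropout probability, just to show where these go
net = nn.Sequential(
    nn.Linear(784, 256),
    nn.BatchNorm1d(256),   # normalize activations: mean 0, variance 1 per mini-batch
    nn.ReLU(),
    nn.Dropout(p=0.5),     # randomly zero half the neurons during training
    nn.Linear(256, 10),
)

net.train()                      # dropout and batch norm are active
out = net(torch.randn(32, 784))  # dummy batch of 32 flattened "images"
net.eval()                       # inference: no dropping, batch norm uses running stats
```

Note the train()/eval() switch: both techniques behave differently at training time and at inference time.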
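Sketch 3 – Data Augmentation and Early Stopping. An illustrative sketch assuming PyTorch and torchvision; the transforms, toy data, patience value and checkpoint filename are all made-up placeholders.

```python
import torch
import torch.nn as nn
from torchvision import transforms

# Data augmentation: show the model "twisted" variants of each image.
# Pass this as the transform= argument of an image dataset.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.ToTensor(),
])

# Early stopping on a toy regression task (dummy data, made-up sizes)
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
x_train, y_train = torch.randn(100, 10), torch.randn(100, 1)
x_val, y_val = torch.randn(30, 10), torch.randn(30, 1)

best_val, patience, bad_epochs = float("inf"), 3, 0
for epoch in range(100):
    optimizer.zero_grad()
    loss = criterion(model(x_train), y_train)
    loss.backward()
    optimizer.step()

    with torch.no_grad():
        val_loss = criterion(model(x_val), y_val).item()
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
        torch.save(model.state_dict(), "best.pt")  # keep the best weights so far
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # stop before the model starts memorizing the training set
```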