Trade-offs: How to aim for the sweet spot.

Low bias with high variance overfits the data: the model pays too much attention to the training set and doesn't generalize well. When bias and variance find a middle ground, or sweet spot, the total error is lowest.

Thinking of the classic bull's-eye diagram:

i. Upper left is what we aim for: low bias and low variance.
ii. Lower left has high bias, so the points are far from the target, but close to each other due to low variance.
iii. Upper right's points are spread out due to high variance, but close to the target due to low bias.
iv. Lower right is far from the target, and the points are far from each other too, because of high variance and high bias.

→ To build a good model, we need to find a balance between bias and variance that minimizes the total error. An optimal balance of bias and variance will neither overfit nor underfit the model. This is achieved by hyper-parameter tuning on the basis of trial and error.

Part 2: Precision Recall Trade-offs

2.1 First things first: what are Precision, Recall, and the other terminology?

They are performance metrics for classification problems. I think every classification concept is best explained using an example.

Example: Let's say you're thinking about giving an extra sugar cube to customers who are likely to return. But of course you want to avoid giving out sugar cubes unnecessarily, so you only give them to customers that the model says are at least 30% likely to return.

Confusion Matrix

TP: Classified by the model as Will return and had in fact Returned in reality.
FP: Classified by the model as Will return but actually Didn't return (Type 1 error, or false alarm).
TN: Classified by the model as Won't return and in fact Didn't return in reality.
FN: Classified by the model as Won't return but had actually Returned in reality (Type 2 error, or miss).

(Note: I am assuming that the reader knows about the Confusion Matrix; if not, please go through it first.)

[Figure: Venn diagram for Precision & Recall]

2.1.1 Precision:

The fraction of relevant instances among the retrieved instances.

→ Calculated as: Precision = TP / (TP + FP)

→ From the above example, we can interpret it as,
“Of those classified as Will return, what proportion actually did?”

→ So, Precision expresses what proportion of the data points our model said were relevant actually were relevant.

2.1.2 Recall (a.k.a. Sensitivity):

The fraction of relevant instances that have been retrieved over the total number of relevant instances.

→ Calculated as: Recall = TP / (TP + FN)

→ From the above example, we can interpret it as, “Of those that in fact Returned, what proportion were classified that way?”

→ In general, Sensitivity tells us the percentage of the positive target that was correctly identified.
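To make the definitions concrete, here is a minimal sketch that tallies the four confusion-matrix cells and computes Precision and Recall from them. The labels below are made-up illustrative data (1 = the customer Returned, 0 = Didn't return), not numbers from this article:

```python
# Hypothetical ground truth and model predictions (1 = "will return" / "returned")
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]

# Tally the four cells of the confusion matrix
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

precision = tp / (tp + fp)  # of those flagged "Will return", how many did?
recall = tp / (tp + fn)     # of those who returned, how many were flagged?

print(tp, fp, tn, fn)        # 3 1 2 2
print(precision, recall)     # 0.75 0.6
```

In practice you would use a library (e.g. scikit-learn's `precision_score` and `recall_score`) rather than hand-counting, but the arithmetic is exactly this.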
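The "at least 30% likely" threshold in the sugar-cube example is exactly where the trade-off lives: raising the threshold tends to raise Precision and lower Recall, and vice versa. A small sketch with hypothetical model scores (the scores and labels are invented for illustration):

```python
# Hypothetical predicted probabilities of returning, and true outcomes
scores = [0.95, 0.80, 0.65, 0.45, 0.40, 0.30, 0.20, 0.10]
y_true = [1,    1,    0,    1,    0,    1,    0,    0]

def precision_recall(threshold):
    """Precision and recall when we flag everyone scoring >= threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, t in zip(pred, y_true) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, y_true) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, y_true) if p == 0 and t == 1)
    prec = tp / (tp + fp) if tp + fp else 1.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return prec, rec

# Sweeping the threshold trades recall for precision
for th in (0.3, 0.5, 0.7):
    prec, rec = precision_recall(th)
    print(f"threshold={th}: precision={prec:.2f}, recall={rec:.2f}")
```

With this toy data, moving the cutoff from 0.3 to 0.7 lifts precision from about 0.67 to 1.0 while recall drops from 1.0 to 0.5: fewer wasted sugar cubes, but more returning customers missed.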
