In the middle of the curve, around (0.3, 0.8), we are correctly labeling about 80% of the poor-care cases, with a 30% false positive rate. The ROC curve captures all thresholds simultaneously. The higher the threshold (closer to (0, 0)), the higher the specificity and the lower the sensitivity. The lower the threshold (closer to (1, 1)), the higher the sensitivity and the lower the specificity.

So which threshold value should one pick? One should select the threshold that best fits the trade-off one wants to make. If you are more concerned with having a high specificity, or a low false positive rate, pick the threshold that maximizes the true positive rate while keeping the false positive rate really low; a threshold around (0.1, 0.5) on this ROC curve looks like a good choice in that case. On the other hand, if one is more concerned with having a high sensitivity, or a high true positive rate, one should pick a threshold that maximizes the true positive rate, even at the cost of a higher false positive rate.

Prediction on Test Set

In this particular example, we used a threshold value of 0.3 and obtained the following confusion matrix.

> predictTest = predict(QualityLog, type = "response", newdata = qualityTest)
> table(qualityTest$PoorCare, predictTest >= 0.3)

    FALSE TRUE
  0    19    5
  1     2    6

# Accuracy
> (19+6)/32
[1] 0.78125

There are 32 cases in the test set, of which 24 actually received good care and 8 actually received poor care.

Conclusion

The model can identify patients receiving low-quality care with a test-set accuracy of 78%, which is greater than the accuracy of our baseline model that simply predicts good care for every patient (24/32 = 75%). In practice, the probabilities returned by the logistic regression model can be used to prioritize patients for intervention.

This was all about Logistic Regression in R. We studied the intuition and math behind it, and also how logistic regression makes it easy to solve a problem with a categorical outcome variable.

Click here for Guide to Machine Learning (in R) for Beginners: Linear Regression.
Click here for Guide to Machine Learning (in R) for Beginners: Decision Trees.
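As a recap, the threshold arithmetic discussed above can be sketched in a few lines of base R. This is a minimal, self-contained example, not the tutorial's actual code: the probabilities in prob are fabricated so that a 0.3 cutoff reproduces the confusion matrix shown earlier (in the real analysis they would come from predictTest, and actual would be qualityTest$PoorCare), and metrics is a hypothetical helper introduced here for illustration.

```r
# Made-up stand-ins for qualityTest$PoorCare and predictTest: 24 good-care
# cases (0) and 8 poor-care cases (1), with probabilities chosen so that a
# 0.3 threshold yields the 19/5/2/6 confusion matrix from the tutorial.
actual <- c(rep(0, 24), rep(1, 8))
prob   <- c(rep(0.10, 19), rep(0.60, 5),   # good care: 19 below, 5 above 0.3
            rep(0.20, 2),  rep(0.70, 6))   # poor care: 2 below, 6 above 0.3

# Hypothetical helper: confusion-matrix metrics at a given threshold.
metrics <- function(threshold) {
  pred <- as.integer(prob >= threshold)
  tp <- sum(actual == 1 & pred == 1); fn <- sum(actual == 1 & pred == 0)
  tn <- sum(actual == 0 & pred == 0); fp <- sum(actual == 0 & pred == 1)
  c(accuracy    = (tp + tn) / length(actual),
    sensitivity = tp / (tp + fn),          # true positive rate
    specificity = tn / (tn + fp))          # 1 - false positive rate
}

metrics(0.3)   # accuracy 0.78125, sensitivity 0.75, specificity ~0.79
metrics(0.05)  # very low threshold: sensitivity 1.0, specificity 0.0
metrics(0.65)  # high threshold: specificity 1.0, sensitivity 0.75
```

Moving the threshold sweeps out exactly the trade-off the ROC curve summarizes: lowering it buys sensitivity at the cost of specificity, and raising it does the reverse.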