Scores correspond to each variation’s test set, of course. The accuracy and averaged F1 score for the RF baselines are as follows:

[Table: Performance of RF baseline model on all variations of the dataset]

[Table: Class level scores]

As we can see, the best accuracy is obtained with the original variation, closely followed by the oversampled one. However, the best F1 score is obtained with the oversampled variation. Also, by looking at the performance metrics at the class level, we clearly see that the averaged F1 score of the original variation is mostly due to the score for class N, while the scores for the other classes are much lower. On the other hand, the F1 score for the oversampled variation comes from class-level F1 scores that are much more balanced with each other. So, the F1 score of 0.6879 for the oversampled variation is much more representative of what is going on at the class level. Moreover, for each class in the oversampled variation, the F1 score turns out to be a very good balance between precision and recall. All these details make the oversampled variation the best candidate on which to perform the grid search.

Grid Search

Next, we present the results of doing a grid search over the parameters max_depth, n_estimators, and min_samples_split, together with their corresponding F1 scores. Here, the scores are averaged over a 3-fold cross-validation process.

The resulting best estimator is the one trained with max_depth=70, n_estimators=200, and min_samples_split=2.

As an additional insight, it’s interesting to note that for every n_estimators and max_depth that was considered, the highest scores are obtained with min_samples_split=2. This becomes evident by looking at the color stripes in the viz. One could argue that if we consider only these three parameters, then the best way to increase performance is by adding more trees to the forest.

Also, when we use the best estimator to make predictions on the test set, we get an accuracy of 0.7227 and an F1 score of 0.7231, which are higher than their corresponding training values.

ADA Boost

Again, first we present the results for the baseline model.

[Table: Averaged scores]

[Table: Class level scores]

In this case, both the best accuracy and the best averaged F1 score are obtained with the original variation. Nevertheless, as with RF, this is mostly driven by the score for class N. And since we want a score that is evenly representative of each class, we can’t choose this variation. Instead, we will choose the oversampled variation: although it has a much lower value, its behavior across classes is very similar to that of RF.

Grid Search

For this algorithm, we present the results of a grid search over the parameters n_estimators and learning_rate.

In this case, the interesting part comes from noticing that for each value of n_estimators, the best test score is achieved with a learning rate of 1.

More details
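The two grid searches described above (RF over max_depth, n_estimators, and min_samples_split; AdaBoost over n_estimators and learning_rate) could be sketched with scikit-learn’s GridSearchCV. This is a minimal sketch, not the article’s actual code: the dataset here is a synthetic stand-in (the real oversampled variation is not shown), the parameter value lists are illustrative, and scoring with macro-averaged F1 is an assumption chosen to match the class-balanced comparison used throughout.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the oversampled variation of the dataset.
X, y = make_classification(
    n_samples=600, n_classes=3, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# RF grid over the three parameters discussed above; scores are
# macro-averaged F1 over 3-fold cross-validation.
rf_grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={
        "max_depth": [10, 70],          # illustrative values
        "n_estimators": [100, 200],
        "min_samples_split": [2, 5],
    },
    scoring="f1_macro",
    cv=3,
)
rf_grid.fit(X_train, y_train)
print("RF best params:", rf_grid.best_params_)

# AdaBoost grid over n_estimators and learning_rate.
ada_grid = GridSearchCV(
    AdaBoostClassifier(random_state=0),
    param_grid={
        "n_estimators": [50, 100, 200],
        "learning_rate": [0.1, 0.5, 1.0],
    },
    scoring="f1_macro",
    cv=3,
)
ada_grid.fit(X_train, y_train)
print("AdaBoost best params:", ada_grid.best_params_)
```

The `best_estimator_` attribute of each fitted grid is already refit on the full training split, so it can be used directly to score the held-out test set, as the article does for RF.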