Can Machine Learning model simple Math functions?Modelling some fundamental mathematical functions using machine learningHarsh SahuBlockedUnblockFollowFollowingApr 30Photo Credits: UnsplashIt has become a norm these days to apply machine learning on every kind of task.
It is a symptom of the fact that Machine learning is seemingly a permanent fixture in Gartner’s hype cycle for emerging technologies.
These algorithms are being treated like figure-out-it-yourself models: Breakdown data of any sort into a bunch of features, apply a number of black-box machine learning models, evaluate each and choose the one that works out best.
But can machine learning really solve all those problems or is it just good for a small handful of tasks.
We tend to answer an even more fundamental question in this article, and that is, can ML even deduce mathematical relationships that very commonly occur in everyday life.
Here, I will try to fit few basic functions using some popular machine learning techniques.
Lets see if these algorithms can discern and model these fundamental mathematical relations.
Functions that we will try to fit:LinearExponentialLogarithmPowerModulusTrigonometricThe Machine learning techniques that will be used:XGBoostLinear RegressionSupport Vector Regression (SVR)Decision TreeRandom ForestMulti-layer perceptron (Feed-forward neural network)Preparing DataI am keeping dependent variables to be of size 4 (no reason for choosing this specific number).
So, the relation between X (Independent variables) and Y(dependent variable) will be:f :- function we are trying to fitEpsilon:- Random noise (To make our Y a bit more realistic because real-life data will always have some noise)Each function type is using a set of parameters.
These parameters are chosen via generating random numbers using below methods:numpy.
randint()randint() is used for getting parameters of Power function so as to avoid Y values getting very small.
normal() is used for all other cases.
Generate Independent variables (or X):For logarithmic, normal distribution with mean 5 (mean>>variance) is used to avoid getting any negative values.
Get Dependent variable (or Y):Noise is randomly sampled with 0 mean.
I have kept variance of noise equal to variance of f(X) to make sure ‘signal and noise’, in our data, are comparable and one does not get lost in the other.
TrainingNote: No hyperparamater tuning is done for any of the model.
The Rationale is to get a rough estimation of performance of these models on mentioned functions and, therefore, not much has been done to optimize these models for each of these cases.
ResultsResultsMost of the performance outcomes are much better than a mean baseline.
With an average R-squared coming out to be 70.
83%, we can say that machine learning techniques are indeed decent at modelling these simple mathematical functions.
But by this experiment, we not only get to know if machine learning can model these functions, but also how different techniques perform on these varied underlying functions.
Some results are surprising (at-least for me), while some are reassuring.
Overall, these outcomes reinstates some of our prior beliefs and also make some new ones.
In conclusion, we can say that:Linear Regression although a simpler model outperforms everything else on linearly correlated dataIn most of the cases, Decision Tree < Random Forest < XGBoost, according to performance (as is evident in 5 out of 6 cases above)Unlike the trend these days in practice, XGBoost (turning out best in just 2 out of 6 cases) should not be a one-stop-shop for all kinds of tabular data, instead a fair comparison should be madeAs opposed to what we might presume, Linear function may not necessarily be easiest to predict.
We are getting best aggregate R-squared of 92.
98% for Logarithmic functionTechniques are working (relatively) quite different on different underlying functions, therefore, technique for a task should be selected via thorough thought and experimentationComplete code can be found on my github.
Applaud, comment, share.
Constructive criticism and feedback is always welcome!.. More details