Why you should do Feature Engineering first, Hyperparameter Tuning second as a Data Scientist
Admond Lee
Apr 21

The realization that feature engineering is more important than hyperparameter tuning came to me as a vital lesson, an awakening that drastically changed how I approach problems and handle data even before building any machine learning models.
When I started my first full-time job as a research engineer in machine learning, I was so excited and obsessed with building fancy machine learning models that I paid little attention to the data that I had.
As a matter of fact, I was impatient.
I wanted results so badly that I only cared about squeezing every single percent of performance out of my model.
Needless to say, I failed after so many attempts and wondered why.
“You should focus more on getting good features (Feature Engineering) instead of optimizing your model’s hyperparameters (Hyperparameter Tuning).
You see… If you don’t have good features that the model can learn from, even the optimum hyperparameters will not improve your model’s performance,” one of my team members said to me.
From that moment onward, I knew something had to change: my approach had to change, my mindset had to change to accept others’ opinions, literally everything.
Once I tried to understand the real business problem that I was trying to solve and the data that I had, I added some new features for better representation of the problem so that the model could learn the underlying pattern effectively.
Results? I managed to improve the model’s AUC (it was a classification problem) significantly, compared to little or no improvement from hyperparameter tuning.
This is how I learned the importance of feature engineering, the hard way.
And I hope to share the importance of feature engineering and hyperparameter tuning with you.
By the end of this article, I hope you’ll understand why feature engineering is more important than hyperparameter tuning and use this approach before going into the tuning part to solve your problems.
Data will talk to you if you’re willing to listen.
Let’s get started!

Importance of Feature Engineering

I remember that when I first started learning data science, feature engineering was rarely a topic included in books and online courses.
This gave me a false impression that perhaps feature engineering is not that important in applying machine learning to solve problems.
Before talking about what feature engineering is and why it matters, let’s take a step back and try to understand how machine learning models work.

How do machine learning models work?

Essentially, a machine learning model is just an algorithm that learns a “pattern” by being trained on historical data, so that it can ultimately make predictions on unseen test data.
In other words, your model will not be able to learn the underlying “pattern” if the data isn’t representative enough to describe the problem that you’re trying to solve.
And this is where the role of feature engineering comes into play.

What is feature engineering?

Jason Brownlee, a brilliant machine learning practitioner who founded Machine Learning Mastery, defines it this way:
“Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy on unseen data.”
Although your model’s performance depends on several factors (the data and features prepared, the models used in training, the problem statement, the metrics used to measure the model’s success, etc.), great features still play a crucial part in determining the success of a model.

What is the importance of feature engineering?

In my opinion, although you can aggregate data to generate additional features (mean, max, etc.), having strong business domain knowledge lets you understand more about the data that you have and generate new features based on their relevance and relationships.
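To make the difference concrete, here is a minimal sketch using only the Python standard library. The transaction data, the `build_features` helper and the `max_to_mean_ratio` feature are all hypothetical, chosen purely for illustration; which domain feature is actually useful depends on your own business problem.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw data: one row per transaction (customer_id, amount).
transactions = [
    ("c1", 20.0), ("c1", 30.0), ("c1", 100.0),
    ("c2", 50.0), ("c2", 50.0),
]

def build_features(rows):
    """Turn raw transactions into one feature dict per customer."""
    amounts = defaultdict(list)
    for customer_id, amount in rows:
        amounts[customer_id].append(amount)

    features = {}
    for customer_id, values in amounts.items():
        avg, largest = mean(values), max(values)
        features[customer_id] = {
            # Generic aggregates that need no domain knowledge.
            "mean_amount": avg,
            "max_amount": largest,
            # Domain-inspired feature (an assumption for this example):
            # how unusual is the largest purchase for this customer?
            "max_to_mean_ratio": largest / avg,
        }
    return features

features = build_features(transactions)
print(features["c1"])
print(features["c2"])
```

The aggregates describe each customer in general terms, while the ratio encodes a guess about what matters for the problem, which is the kind of feature that domain knowledge tends to suggest.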
Great features also give you more room when it comes to model selection.
You could choose a simpler model yet still be able to obtain good results as your data is now more representative and the less complex model can learn the underlying pattern easily.
At the end of the day, feature engineering boils down to problem representation.
If your data has great features that represent the problem well, chances are your model will give better results as it has learned the pattern well.
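As a rough illustration of problem representation, consider a synthetic toy problem (not taken from any real project): points are labelled by whether they fall inside the unit circle. The sketch below, which assumes a deliberately simple model (a single threshold on one feature, written here as the hypothetical helper `best_stump_accuracy`), compares the raw coordinates with one engineered feature, the squared distance from the origin.

```python
import random

def best_stump_accuracy(values, labels):
    """Accuracy of the best single-threshold rule on one feature.

    Tries every observed value as a threshold, in both directions.
    """
    best = 0.0
    for t in values:
        hits = sum((v <= t) == bool(y) for v, y in zip(values, labels))
        acc = max(hits, len(labels) - hits) / len(labels)
        best = max(best, acc)
    return best

rng = random.Random(0)  # fixed seed so the run is repeatable
points = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(200)]
# Synthetic target: 1 if the point lies inside the unit circle.
labels = [1 if x * x + y * y < 1 else 0 for x, y in points]

raw_x = [x for x, _ in points]
raw_y = [y for _, y in points]
radius_sq = [x * x + y * y for x, y in points]  # engineered feature

acc_raw = max(best_stump_accuracy(raw_x, labels),
              best_stump_accuracy(raw_y, labels))
acc_engineered = best_stump_accuracy(radius_sq, labels)
print(f"best raw feature: {acc_raw:.2f}, engineered: {acc_engineered:.2f}")
```

The model is the same in both cases; only the representation changes. With the engineered feature, a single threshold separates the two classes, which no threshold on a raw coordinate can do here.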
Personally, I find this article well written and extremely helpful for getting started with feature engineering: “Discover Feature Engineering, How to Engineer Features and How to Get Good at It”.
Check it out and you’ll know what I mean.

Importance of Hyperparameter Tuning

Here I want to talk about the importance of hyperparameter tuning to give you an overall picture in comparison.

What is hyperparameter tuning?

Wikipedia describes it this way:
“In machine learning, hyperparameter tuning is the problem of choosing a set of optimal hyperparameters for a learning algorithm.”
A model hyperparameter is a configuration that is external to the model and whose value cannot be estimated from data.
As data scientists (or machine learning practitioners, whatever the name), we do not know the best hyperparameter values in advance.
We can only look for good values by starting from the defaults, applying rules of thumb, or searching the space of hyperparameters by trial and error.
To give you a clearer picture, some of the hyperparameters are the learning rate for training a neural network, C and sigma values for Support Vector Machine (SVM), or the k value in k-nearest neighbours (KNN).
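To show what “external to the model” means in practice, here is a minimal k-nearest neighbours sketch in plain Python. The tiny one-dimensional dataset and the `knn_predict` helper are made up for illustration; the point is only that k is chosen by us before prediction rather than learned from the data.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest neighbours."""
    nearest = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy 1-D training set: (feature value, class label).
train = [(1.0, "a"), (2.0, "a"), (3.0, "b"), (6.0, "b"), (7.0, "b")]

# Same data, same query; only the hyperparameter k differs.
print(knn_predict(train, 2.6, k=1))
print(knn_predict(train, 2.6, k=3))
```

With k=1 the single closest point (3.0, labelled "b") decides the answer, while with k=3 the two "a" neighbours outvote it, so the same data gives a different prediction.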

What is the importance of hyperparameter tuning?

Hyperparameters are crucial as they control the overall behaviour of a machine learning model.
The ultimate goal is to find an optimal combination of hyperparameters that minimizes a predefined loss function to give better results.
Failure to do so gives sub-optimal results, as the model does not converge or minimize the loss function effectively.
It’s like exploring a range of possibilities and trying to locate the best combination that gives you the best results.
Some of the common techniques used to tune hyperparameters include Grid Search, Random Search, Bayesian Optimization and others.
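The first two techniques can be sketched in a few lines of plain Python. Note the assumptions: `validation_loss` is a toy stand-in for a real validation score, and the hyperparameter names (`learning_rate`, `depth`) and their ranges are hypothetical. In a real project each evaluation would mean training and validating a model. Bayesian optimization is not shown here.

```python
import itertools
import math
import random

def validation_loss(learning_rate, depth):
    """Toy stand-in for a validation loss; lowest near lr=0.01, depth=5."""
    return (math.log10(learning_rate) + 2) ** 2 + 0.1 * (depth - 5) ** 2

def grid_search(learning_rates, depths):
    """Evaluate every combination and return (loss, params) of the best."""
    return min(
        (validation_loss(lr, d), (lr, d))
        for lr, d in itertools.product(learning_rates, depths)
    )

def random_search(n_trials, seed=0):
    """Sample n_trials random combinations and return the best one."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-4, -1)  # log-uniform learning rate
        depth = rng.randint(2, 8)
        trials.append((validation_loss(lr, depth), (lr, depth)))
    return min(trials)

grid_loss, grid_params = grid_search([1e-4, 1e-3, 1e-2, 1e-1], [3, 5, 7])
rand_loss, rand_params = random_search(n_trials=12)
print("grid search best:", grid_params)
print("random search best:", rand_params)
```

Grid search tries every listed combination (12 here), while random search spends the same budget on randomly sampled points, which lets it try values that are not on the grid.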

Why is Feature Engineering more important than Hyperparameter Tuning for a Data Scientist?

Now that we’ve understood the importance of both feature engineering and hyperparameter tuning, let’s dig deeper and see why the former is more important than the latter.
This isn’t to say that hyperparameter tuning is unimportant; rather, it is a matter of priority when we talk about improving a model’s performance and the final results, especially in real-life business scenarios (which I’ll explain later).
First, the default hyperparameter values in most machine learning libraries are sufficient for most use cases.
Tuning beyond those defaults typically yields only a small improvement in performance.
On Kaggle, hyperparameter tuning matters a lot.
In real life, it hardly matters at all.
Second, let’s face it: hyperparameter tuning is time-consuming and computationally expensive.
It takes a lot of time to iterate over different combinations of hyperparameters to achieve a minor improvement.
Even worse, each iteration requires heavy resources if you have a massive amount of data and a complex model.
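A quick back-of-the-envelope calculation shows how fast the cost grows. All the numbers below (four hyperparameters with five candidate values each, 5-fold cross-validation, two minutes per model fit) are assumptions picked for illustration, not measurements.

```python
import math

# Hypothetical search space: number of candidate values per hyperparameter.
candidates = {"learning_rate": 5, "max_depth": 5, "n_estimators": 5, "subsample": 5}
cv_folds = 5  # assumed 5-fold cross-validation

combinations = math.prod(candidates.values())
total_fits = combinations * cv_folds
print(f"{combinations} combinations x {cv_folds} folds = {total_fits} model fits")

minutes_per_fit = 2  # assumed training time for one fit
print(f"about {total_fits * minutes_per_fit / 60:.0f} hours if run one after another")
```

Under these assumptions a full grid search needs 3125 model fits, and adding a fifth hyperparameter with five values would multiply that by five again.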
In a business context, time is money.
If the effort and time spent searching for optimum hyperparameters don’t justify the final ROI (money!), hyperparameter tuning is probably not needed at all, provided your model is already good enough for deployment on data with great features.
Third and last, it is extremely difficult to achieve both optimum features and optimum hyperparameters in real life given time constraints.
Therefore, to achieve a large improvement in a shorter time, a more intelligent choice is to first perform feature engineering so that the problem is represented well enough for models to learn and predict accurately.
Only after we have great features should we consider tuning hyperparameters, if time allows or the business context requires it.
This is the main reason why feature engineering should come first and hyperparameter tuning should come second.

Final Thoughts

Thank you for reading.
By sharing my mistakes and learning experience, I hope you’ve understood the importance of both and why feature engineering should be the priority when it comes to improving your model’s performance.
If you want to learn more about feature engineering and how to apply it to your machine learning problems, then this book is for you: Feature Engineering for Machine Learning.
Feature engineering is not a formal topic in typical machine learning courses, so this book aims to give you practical experience, with exercises throughout that cover several feature-engineering techniques.
Hope that helps!
As always, if you have any questions or comments, feel free to leave your feedback below, or you can always reach me on LinkedIn.
Till then, see you in the next post!

About the Author

Admond Lee is a Big Data Engineer at work, Data Scientist in action.
He is known as one of the highly sought-after data scientists and consultants in helping start-up founders and various companies tackle their problems through business and data strategy with deep data science and industry expertise.
He has been guiding aspiring data scientists from various backgrounds to learn data science skills effectively and ultimately land a job in data science through one-to-one mentorship and career coaching.
You can connect with him on LinkedIn, Medium, Twitter, and Facebook or book a call appointment with him here.