How to explain the components of machine learning projects to anyone who’s ever cooked
Jason Pan, May 11

As a machine learning team lead in the pharmaceutical industry, I often find myself educating non-technical audiences on how machine learning projects work.
This analogy to cooking has really resonated with people and helped them understand the role and importance of subject matter expertise, quality data and data engineering, and why moving a successful proof-of-concept into production takes so long!
My hope is that this helps you in your journey to understanding or explaining machine learning projects to others!
Without further ado, here’s an executive one-slider that captures the story:
Source of Pasta Image

Data = Ingredients
Ingredients and data are the raw materials.
As the popular phrase “garbage in, garbage out” suggests, if the ingredients (data) are rotting and infested, nothing can be done to make the end result palatable.

Data Engineering = Ingredient Preparation
Anyone who’s lent a helping hand in the kitchen knows that most of the time spent cooking goes into preparing and processing the ingredients.
So it is with data engineering.
Where cooking has slicing, chopping and marinating, data engineering has feature engineering, data scrubbing and normalization.
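For readers who want to see what this prep work looks like in practice, here is a minimal sketch using only the Python standard library. The records, the field names (`age`, `weight_kg`, `height_m`) and the derived `bmi` column are all made up for illustration; real pipelines usually lean on dedicated libraries and involve far more steps.

```python
from statistics import mean, pstdev

# Hypothetical raw records; field names and values are invented for illustration.
raw_rows = [
    {"age": "34", "weight_kg": "70.0", "height_m": "1.75"},
    {"age": " 51 ", "weight_kg": "82.5", "height_m": "1.80"},
    {"age": "", "weight_kg": "64.0", "height_m": "1.60"},  # missing age
    {"age": "45", "weight_kg": "90.0", "height_m": "1.70"},
]


def scrub(rows):
    """Data scrubbing: trim whitespace, drop incomplete rows, convert to floats."""
    clean = []
    for row in rows:
        values = {key: value.strip() for key, value in row.items()}
        if all(values.values()):  # skip rows with any empty field
            clean.append({key: float(value) for key, value in values.items()})
    return clean


def add_bmi(rows):
    """Feature engineering: derive a new column from existing ones."""
    return [{**row, "bmi": row["weight_kg"] / row["height_m"] ** 2} for row in rows]


def normalize(rows, column):
    """Normalization: rescale one column to zero mean and unit variance."""
    values = [row[column] for row in rows]
    mu, sigma = mean(values), pstdev(values)
    return [{**row, column: (row[column] - mu) / sigma} for row in rows]


# Chop, marinate, slice: each step hands its output to the next.
prepared = normalize(add_bmi(scrub(raw_rows)), "age")
```

Even in this toy version, the preparation code is longer than the single line that eventually calls a model, which is the point of the analogy.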

Machine Learning Algorithms = Cooking Techniques
For the most part, these things just happen on their own.
· A raw potato + a boiling pot of water + 15 minutes = soft potato.
· Labeled data + logistic regression + 15 minutes = coefficients and odds ratios.
Of course, chefs (and ML practitioners) who understand the properties of their ingredients and techniques are critical to a good result.
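To make the "labeled data + logistic regression" recipe concrete, here is a rough sketch in plain Python. The tiny data set is invented, and the hand-rolled gradient descent (with an assumed learning rate and number of epochs) stands in for what a statistics or machine learning library would do for you. It is meant to show the shape of the process, not a production implementation.

```python
import math

# Made-up labeled data: one feature per row and a 0/1 label.
X = [[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]]
y = [0, 0, 0, 1, 0, 1, 1, 1]


def sigmoid(z):
    """Map a raw score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=0.1, epochs=5000):
    """Fit an intercept and coefficients by batch gradient descent on log loss."""
    n_features = len(X[0])
    intercept = 0.0
    coefs = [0.0] * n_features
    for _ in range(epochs):
        grad_intercept = 0.0
        grad_coefs = [0.0] * n_features
        for features, label in zip(X, y):
            score = intercept + sum(w * x for w, x in zip(coefs, features))
            error = sigmoid(score) - label  # predicted probability minus truth
            grad_intercept += error
            for j, x in enumerate(features):
                grad_coefs[j] += error * x
        intercept -= learning_rate * grad_intercept / len(X)
        coefs = [w - learning_rate * g / len(X) for w, g in zip(coefs, grad_coefs)]
    return intercept, coefs


intercept, coefs = fit_logistic(X, y)

# An odds ratio is the exponential of a coefficient: the multiplicative
# change in the odds of label 1 for a one-unit increase in that feature.
odds_ratios = [math.exp(w) for w in coefs]
print("coefficients:", coefs)
print("odds ratios:", odds_ratios)
```

The algorithm itself runs unattended, like the pot of boiling water. Choosing the features, checking the fit and interpreting the odds ratios is where the practitioner earns their keep.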

Hardware & Software Architecture = Cookware and Utensils
Different tools and different-sized tools are needed for different problems.
A cozy recipe for two requires different skills and equipment than catering for 2,000.
Likewise, processing 1,000 rows of data or documents may run on a laptop, but processing a billion rows may require distributed computing frameworks and clusters of servers.

Domain Expertise = Chef Expertise
Actually, any first-time amateur can cook!
That said, having expert chefs do the cooking, or at least advise, will greatly improve any dish.
Likewise, having expertise greatly improves machine learning projects and products.
There are actually two types of expertise here: business domain (e.g., financial market, clinical pathways) and technical domain (e.g., natural language processing).
The most successful projects require a close partnership between both types of experts.

~~~A plug for the in-house technical teams~~~
If you work at a large company where there is a growing demand for and reliance on machine learning to help the business grow or scale, it’s critical to invest in internal talent.
Outsourcing every single machine learning project to a third-party vendor is like a restaurant outsourcing every single order to other restaurants.
It’s costly, slow and inflexible, and it lacks consistency and scale.

Extras
There are a host of bonus analogies that can be drawn.
If there is interest, I can write a second post expanding these concepts:
· Projects/platforms are typically made from individual components just like a dish is a combination of intermediate cooked or raw ingredients.
· Acquiring data can be a chore just like sourcing quality ingredients from various providers can be logistically challenging.
· Operating in the cloud is like renting kitchens (or specific utensils, stovetops, refrigerators, etc.).
· APIs are waiters: https://www.