is this: five main skills are central to data science, but you don’t need them all.
They combine into roles (in the same way Brandon Rohrer’s pillars combined to form archetypes).
The Five Skills

1) Data Bootstrap Skills: The Basics

This includes not only data wrangling (as in Rohrer’s Data Mechanics pillar), but also enough knowledge of a domain, and an ability to summarize the data and get basic insights through Exploratory Data Analysis.
2) Information Design and Presentation: Visualize things the user didn’t expect to see

Not just beautiful charts, but ones that enable consumption of data.
UX skills are useful here, too, such as wireframing, modern design and pleasing aesthetics.
3) Statistics and Machine Learning

Stats, modeling and scripting to extract key value from data.
This is sort of a mashup of Rohrer’s Data Analysis and Data Modeling pillars, or in terms of his archetypes, a Detective who is also something of an Oracle.
4) Deep Programming

Strong skills in programming (such as those of a software developer who pivoted to data science) are needed for data engineering at the beginning of a project, and for the development of consumable data science applications on the production end.
Brandon Rohrer put these skills in the Data Engineering archetype.
The smaller the organization, the more likely a small team of data scientists will be called upon to do all the work, from start to finish, of a data science project.
Not ideal, but it may still be the reality in some shops.
5) Domain Expertise

Any data science project requires some understanding of the domain, often gained by communicating with subject matter experts.
And, for these SMEs, a path to becoming a data scientist begins with being a data-curious member of a domain.
They can then learn analysis techniques and how to work with data.
These are the “citizen data scientists” you may have heard about, the ones who, I predict, will be either fighting or collaborating with AI to take over all the data scientists’ jobs!
Relax, I was just kidding, sort of.
The Data Science Roles, Created by Combining Skills

Kesari sees the five skills combining, by pairs, into four main roles.
Each role has the Data Bootstrap Skills at the center along with one of the other four skills.
People working in his four data science roles may precede (or preclude) the data science unicorns.
Here are the four:
1) Data Science Engineer or Specialist: Data Bootstrap skills combined with Deep Programming
2) Data Analyst: Data Bootstrap + Stats & Machine Learning
3) Information Designer or Data Visualization Designer: Data Bootstrap + Info Design & Presentation
4) Functional Data Consultant: Data Bootstrap + Deep Domain Expertise
Kesari says that people with various backgrounds can focus on Data Bootstrap and one other skill to become data science practitioners.
He goes on to elaborate on five versions of data science unicorns, each with Data Bootstrap plus two other main skills.
Some job titles with these combinations would include Master Data Storyteller, Data Architect, Application Architect and “that rare techno-functional data science expert”.
Wisely, Kesari points out that a combination of four or five skills is not likely to be found, any more than a unicorn can be found outside of fiction.
Ganes Kesari’s job advice: While you need a combination of skills to be successful, be aware that job descriptions asking for most or all five of the skills are unrealistic.
They are likely just trying to cast a wide net to pull in as many data scientists as possible (my addition: along with those who still believe in data science unicorns).
“The deeper a candidate’s competence in two or more of the above skill areas, better prepared will the person be to make impacting contributions…translates to higher perceived value and much better negotiating power in interviews.”

Jeremie Harris: Five Data Science Job Types

Moving on to the third approach to figuring out the combination of skill sets into job roles, we have an offering by Jeremie Harris, called “Why you shouldn’t be a data science generalist.”
In his experience, “companies don’t hire generic, jack-of-all-trades ‘data scientists’” (although Kesari points out that most job postings lead you to believe that they do), “but rather individuals with very specialized skill sets.”
I don’t completely agree with him, though.
I have been part of a team of data scientists, all of whom I would call generalists, but where each has greater strengths than the others in certain areas.
I believe that being a well-rounded data scientist is a worthwhile target, because the team members with enough skills in common can understand each other better and put their synergy to work.
Still, I agree that some specialized strengths, which you could refer to as your “data science super power,” can get you noticed and can significantly enhance your career rewards.
Here are Harris’ five data science roles (to which he refers as “problem classes” … uh, still scratching my head on that one).
For each role, Harris lists the technologies you would likely be working with.
I’ll let you read his post for those details.
Also, I find it very helpful that, for each role, he lists a couple of typical “questions you’ll be dealing with.”

1) Data Engineer: creating and managing Big Data pipelines

The data engineer would also do ETL and preliminary cleaning of datasets, perhaps on the SQL side.
2) Data Analyst: a go-between for tech and business teams

Harris says the data analyst translates the business problem to the data scientists, and later uses visualization to “convert a trained and tested model and mounds of user data into a digestible format….
Feeding back the opposite way, data analysts help to make sure that data science teams don’t waste their time solving problems that don’t deliver business value.”
Great description!

3) Data Scientist: Clean, Explore, Train, Optimize, even Deploy Models

His description here is pretty much what we have been told a data scientist does day in and day out.
He pictures the data scientist picking up a data set prepared by a data engineer, and doing the munging, EDA, model training, testing and optimizing, and sometimes deploying the models they create.
In Brandon Rohrer terms, this role seems to be a mix of the Oracle and Maker archetypes.
5) Machine Learning Researcher: finding “new ways to solve challenging problems”

Perhaps this person would be so well-versed in math and statistics that they can develop their own algorithms that do more to solve a problem than one of the algorithms from popular packages, or maybe they are very creative in combining techniques.
Again, it’s a lot like Rohrer’s Oracle archetype.
Jeremie feels that most data science jobs will fit into these five categories, although he acknowledges that smaller companies do tend to look for a data scientist who can also fetch data and create visualizations.
Jeremie Harris’ job advice: You’re more likely to get hired, especially if applying to larger companies, if you build “a more focused skillset”.
He says it’s counterproductive to try to excel, for example, at both Tableau and TensorFlow.
I agree that would be stretching a human data scientist a bit too thin.
So, Where Are We After All of That? Stuffed?

I hope this information wasn’t too much to take in.
I condensed the blog posts as much as I could, yet I do feel a little overwhelmed trying to taxonomize the output of these different approaches in my brain.
But this is why I mentioned Brandon Rohrer’s archetypes in the descriptions of the other two authors, and mixed Ganes Kesari’s views in with Jeremie Harris’ categorizations.
It will help you compare the approaches.
Do check out each of their articles to get a better understanding of each one’s approach to categorization.
I don’t doubt you have already begun mentally organizing these concepts in the way that makes the most sense to you.
Save this post and read it again a day or two later, as it helps to “sleep on it”.
I believe you will soon be able to pick out an archetype or role that speaks to you, one that will fit you like the mantle that has been waiting for you.
Use this understanding of roles, along with the tasks they do and the technologies they need to master, to help you plan, learn and practice your own way toward becoming the type of data scientist that best fits your talents and dreams.