Data science unicorns might be right under your nose

Amelia Winger-Bearskin
Jan 24

Our society produces data at an astounding rate.
By some estimates, as many as 2.5 million terabytes of new information appear on servers around the world every day.
That’s as much data as could fit on a billion iPhones, a quantity of zeros and ones so large you need eighteen zeros just to count it.
In the bestselling book 21 Lessons for the 21st Century, author Yuval Noah Harari said “data will eclipse both land and machinery as the most important asset.”
This glut of information has produced a skyrocketing demand for people with the skills and sense to make use of it.
Data science is projected to be the fastest-growing field within the IT industry.
IBM predicts demand for data scientists will grow by 28 percent by 2020, to a projected total of about 2.7 million jobs in data science nationwide.
But the supply of qualified data scientists lags behind.
According to a recent LinkedIn Workforce Report, there are over 150,000 job openings nationwide that remain unfilled for want of candidates with data science skills.
If IBM’s projections hold up, we can expect this figure to triple as 364,000 new roles in data science are added to the economy.
The gap between the number of openings for high-paying data scientist jobs and the number of qualified applicants to fill them has become so pronounced that employers, educators, and policy-makers are all scrambling to answer the same question: how can we train more data scientists?

New degree programs, certifications, and bootcamps abound in this space and are helping to close the gap in data science hiring.
But in the push to generate more data scientists, many of these programs overlook one of the most effective and reliable sources of candidates to fill these roles — people who are already working in data-adjacent jobs.
To understand what I mean by this, it is helpful to get a clear definition of a data scientist.
A data scientist is someone who understands the principles of data engineering, who can confidently model data and parse the model for insights, and who can communicate these insights effectively.
In other words, a data scientist is a kind of generalist who can deploy the tools and methods from several disciplines at once.
First, an effective data scientist needs to understand the processes by which information is collected, the architecture of where the data is stored, and how it gets served: what is often referred to as “data engineering.”

Next, a data scientist needs to have a solid grasp of the statistical models required to parse the data, and be comfortable in the languages, libraries, and platforms that are used to run these models: Python, R, Hadoop, TensorFlow, and many more.
A few of my favorite projects using data science tools:

Using Natural Language Processing to write poetry in Python
ADAM is a library that helps you parallelize genomic data in Spark
Rspotify is a package that allows R programmers to connect to the Spotify API
This project used TensorFlow to get Alexa to respond to sign language

Finally, a data scientist should understand how the insights from data can be most impactful in a given project.
This involves developing a familiarity with the overall objectives of a business, exercising judgment in planning a project, and being able to communicate the stakes and results of their work in a variety of formats — written correspondence, in-person presentation, and/or visual representation of data.
An effective data scientist should be comfortable in all of these skill-sets.
Right now it is relatively rare to find someone who has this breadth and depth of knowledge; this is one reason why recruiters and employers struggle to identify qualified candidates in data science.
Data science is a new-ish field, and training programs have not yet caught up with market demand.
But in their search for the unicorn that is a Fully Formed Data Scientist, many companies are ignoring an obvious source of human potential right under their noses: current employees.
True, there may be relatively few people who possess all of the qualities and skills that make a successful data scientist, but many companies already have people on their payrolls who have many of these key skills.
Bringing them up to speed in the areas where they are less proficient could be accomplished with a smart and efficient retraining program.
For instance, a software developer may already have the technical skills to manipulate data, given a refresher course in statistical modeling.
A designer who already knows how to communicate ideas and is proficient in data visualization libraries like D3.js could make a fine data scientist given the opportunity to learn a bit more about information systems architecture.
There are several benefits to this kind of approach.
Rather than sinking money and resources into an expensive search process, companies that invest in retraining their own employees can be certain that the candidates for their data scientist roles already understand the business and are cultural fits within the organization.
Training up candidates internally can also reduce the amount of on-the-job training needed to familiarize new hires with the company’s software stack.
The prospect of on-the-job training lets employees know that their employers are committed to helping them build their careers, which gives them a sense of stability and boosts morale, improving overall productivity.
This also helps with retention. At last year’s World Economic Forum in Davos, McKinsey & Company reported that “80 percent of CEOs who were investing heavily in artificial intelligence also publicly pledged to retain and retrain existing employees.”

It may seem daunting to find the people who can help you enter an emerging field like data science, but the place to begin is with an honest evaluation of the abilities and knowledge already on your team.
You may find that you already have exactly what you’re looking for.