4 questions about data scientists, and the answers

52% of all data scientists have earned that title within the past four years. How do we know this? In our newest benchmark, The State of Data Science, our team took a look at 360 million LinkedIn profiles in order to answer four questions:

  1. How many data scientists are there?

  2. What companies do they work for?

  3. What is their academic background?

  4. What is their skill set?

To get these answers, we analyzed millions of LinkedIn profiles, 60,200 professional experience records, 27,700 education records, and 254,600 skill records. Here are our findings.

1. There are 11,400 self-identified data scientists in the world today

This number is a modest estimate; not all data scientists are on LinkedIn, after all. The number gets really interesting when you look at it over time: more than half of these individuals began identifying themselves as a “data scientist” after 2011.

Cumulative number of data scientists over time
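For readers who want to reproduce this kind of view on their own data, here is a minimal sketch of how a cumulative count and an "after 2011" share could be computed. It assumes each profile has been reduced to the year the person first held the title; the function names and the sample years are hypothetical and are not taken from the report.

```python
from collections import Counter
from itertools import accumulate


def cumulative_by_year(start_years):
    """Return {year: running total of people who started in or before that year}."""
    counts = Counter(start_years)
    years = sorted(counts)
    totals = accumulate(counts[y] for y in years)
    return dict(zip(years, totals))


def share_after(start_years, cutoff_year):
    """Fraction of people whose title started strictly after cutoff_year."""
    if not start_years:
        return 0.0
    return sum(1 for y in start_years if y > cutoff_year) / len(start_years)


# Hypothetical, made-up start years; NOT data from the report.
sample_start_years = [2008, 2010, 2011, 2012, 2012, 2013, 2013, 2014]
cumulative = cumulative_by_year(sample_start_years)
recent_share = share_after(sample_start_years, 2011)
```

With real data, `recent_share` would be the figure to compare against the "more than half" claim above.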

To get a better sense of how incredible this growth is, we compared it to two other titles: Software Engineer and Data Analyst. Since 2012, the number of data scientists has increased at a rate that is consistently 50% higher than that for Software Engineers and Data Analysts.

Current data scientists starting their first job

This chart may even understate the difference between disciplines because we don’t have a way of determining how many people have left the field. If you were to take into account outflow, we believe the difference between data scientists and software engineers/data analysts would be even more pronounced.

2. Microsoft employs more data scientists than any other company

The software giant isn’t winning this battle by a small margin, either: Microsoft employs nearly twice as many data scientists as the next company.

Data scientists per company

Most of the companies in this list are well-established, older companies, but relatively young brands like LinkedIn and Twitter are also investing in data science.

Microsoft has an impressive lead in both the number of data scientists it employs and its hiring pace.

Net current data scientists at top 10 data scientist employers

While it may seem odd that Google is not on this list, we suspect this is more a reflection of how the company refers to its data scientists than a lack of investment in data science.

3. A Master’s is the most popular degree for data scientists

Of the roughly 10,000 data scientists who listed a degree, 42% reported earning a Master’s.

Highest education level of data scientists

This preference for Master’s degrees shows just how in-demand specialists are in the field. We saw a similar distribution when we looked at degrees by seniority level, with the exception of Senior Data Scientists, who actually have more PhDs than Master’s degrees.

Highest education level of data scientists across seniority levels

When we took a look at what those data scientists studied when getting their graduate degrees, not surprisingly, STEM fields dominated the list.

Top 20 backgrounds of data scientists

4. The top five skills of a data scientist are data analysis, R, Python, data mining, and machine learning

While technologies like Hadoop are exciting and generate a lot of buzz, skills in Hadoop are hardly the norm, not even cracking the top 10.

Top 20 skills of data scientists

These skill sets do change as people advance in their role. Chief Data Scientists were much more likely to list business intelligence, analytics, leadership, strategy and management among their skills than both Junior and Senior Data Scientists.

Skill differences of data scientists across seniority levels

Chief Data Scientists are also less likely to emphasize technical skills: only 27% and 26% listed Python and R, respectively. Compare this to the corresponding 52% and 53% of Junior Data Scientists, and 38% and 43% of Senior Data Scientists.
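Percentages like these can be derived by counting, within each seniority level, how many profiles list a given skill. The sketch below is one way to do that, assuming each profile is a (level, set of skills) pair; the function name, the level labels, and the sample profiles are hypothetical and are not the report's data or method.

```python
from collections import defaultdict


def skill_share_by_level(profiles, skill):
    """Map each seniority level to the fraction of its profiles listing `skill`."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for level, skills in profiles:
        totals[level] += 1
        if skill in skills:
            hits[level] += 1
    return {level: hits[level] / totals[level] for level in totals}


# Hypothetical, made-up profiles; NOT data from the report.
sample_profiles = [
    ("junior", {"Python", "R"}),
    ("junior", {"Python"}),
    ("senior", {"R", "Machine Learning"}),
    ("senior", {"Python", "R"}),
    ("chief", {"Leadership", "Strategy"}),
    ("chief", {"Python", "Management"}),
]
python_share = skill_share_by_level(sample_profiles, "Python")
```

Running the same function once per skill gives a table of shares by seniority level, which is the shape of comparison described above.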

The bottom line

Lillian Pierson, Founder and Chief Data Scientist at Data-Mania, provided her own thoughts on these numbers:

This report shows consolidation among the skills of data scientists, coupled with a growth in people with this title. I expect to see this upward trend continue as pioneers realize that the blend of machine learning, Python, and deep domain expertise that they have already mastered is actually data science, and as they inspire others to acquire these same skills.
This is exactly how my own career has played out. I spent years in technical roles in engineering and analytics, but there was limited opportunity for me to use the breadth of my skills in a single job. The growth in demand for data scientists led me to round out my own skillset and pursue a role within the field.
My personal belief is that the next four years of growth in the profession will come from those who work in adjacent fields and now just need to sharpen their skills.

The data indicates that this is exactly what we can expect to see happen. As demand for people with the skills to wrangle massive volumes of data continues to grow, we can expect to see more and more people rounding out their skill set and landing data scientist jobs.