May 12th, 2022

Developing and Retaining Data Science Talent

Category: Company, Makers

It’s been almost a decade since the Harvard Business Review proclaimed that “Data Scientist” is the sexiest job of the 21st century. Since then, there has been an explosion of job opportunities and university degree programs claiming to give students all of the skills they need to excel in the field of data science. Yet the scarcity of battle-hardened data science talent is as evident today as it was ten years ago.

This scarcity is certainly not for lack of interest: a quick scan of the “r/datascience” and “r/machinelearning” forums on Reddit reveals how many employees with any sort of technical background are keenly interested in ditching their own industries just to “get into data science”. In fact, these two subreddits have approximately 754k and 2.4M members, respectively.

Inadequate pay is assuredly not to blame either. According to glassdoor.com, the median salary for a data scientist with between 0 and 1 year of experience in the San Francisco area is about $128k. This relatively high level of compensation for junior talent is not limited to areas of the US known for their focus on technology and innovation. The same estimate for a junior data scientist’s salary in Boise, Idaho is $126k, only about 1.6% less than in San Francisco. For comparison, a review of the US Bureau of Labor Statistics 2020 earnings data shows that even achieving a doctoral degree yields annual earnings of only $98k, or about 22% less than an entry-level data scientist earns in Boise.
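The percentage comparisons above are plain relative differences. As a minimal sketch, assuming the rounded salary figures quoted in the text and a hypothetical helper named `pct_less`, they can be reproduced like this:

```python
def pct_less(value, reference):
    """Return how much smaller `value` is than `reference`, in percent."""
    return (reference - value) / reference * 100


# Rounded figures quoted in the article (US dollars per year).
sf_junior = 128_000      # junior data scientist, San Francisco area
boise_junior = 126_000   # junior data scientist, Boise, Idaho
doctoral = 98_000        # doctoral degree holders, BLS 2020 earnings data

boise_vs_sf = pct_less(boise_junior, sf_junior)
doctoral_vs_boise = pct_less(doctoral, boise_junior)

print(f"Boise vs. San Francisco: {boise_vs_sf:.1f}% less")
print(f"Doctoral earnings vs. Boise junior salary: {doctoral_vs_boise:.1f}% less")
```

Because the inputs are already rounded to the nearest thousand dollars, the results are only indicative.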

One reason for this apparent misalignment between supply and demand in the labor market might be the general miscommunication of, and confusion surrounding, the skills required to be an effective data scientist. Like a chef turning assorted ingredients into a delicious meal, an effective data scientist applies advanced analytical techniques to potentially large and disparate sets of collected data in order to drive value for their organization. Typically, this value is clearly defined in a business case demonstrating a quantifiable return on investment (ROI). The work is technical by its very nature, yet it remains critically dependent upon clear communication to stakeholders and decision makers.

Of course, the job of a data scientist doesn’t stop at achieving high accuracy from their favorite model, or at minimizing the rates of false positives and false negatives; it remains mission-critical to demonstrate the ongoing value of that great analytical or predictive model over time and across data.

Unfortunately, most data science candidates are led to believe that littering their resumes with high-accuracy projects or being an expert on every single modeling technique is what they need to be successful in data science. A model that shows great performance on data that doesn’t reflect the reality of the business is not useful and does not drive value for the organization. Additionally, can that model hold up under regulatory scrutiny? Are the predictions explainable even if the model is complex? Is it biased with respect to protected characteristics such as race, gender or religion? What happens if the data currently being captured begins to drift and reflect a completely different reality than the data the model was trained upon? Unlike a research scientist, a data scientist’s job doesn’t stop when the experiment is done and the report is written. Great data scientists and their teams generate a living, breathing animal that is constantly providing transparent and consistent value to its organization.
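To make the drift question concrete, here is a minimal sketch of one common check, the population stability index (PSI), which compares the binned distribution of a feature at training time with its distribution in newly captured data. This is an illustration rather than a method described in this post: the function name, the synthetic data, and the rule-of-thumb thresholds in the comments are all assumptions.

```python
import math
import random


def population_stability_index(expected, actual, bins=10):
    """PSI between a reference sample and a new sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic data, purely for illustration.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]      # same reality as training
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]   # the world has moved

psi_same = population_stability_index(training, same)
psi_shifted = population_stability_index(training, shifted)

# Assumed rule of thumb: below 0.1 is stable, above 0.25 suggests material drift.
print(f"PSI, no drift: {psi_same:.3f}")
print(f"PSI, shifted:  {psi_shifted:.3f}")
```

A check like this would typically run on a schedule against each important input feature, with the thresholds tuned to the business context rather than taken as fixed.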

From the business’s perspective, creating and retaining an all-star data science team is also mission-critical. The largest cost to any business is almost invariably labor, and thus retaining the team that built that living, breathing value-generating machine should be a priority. It’s no secret that retaining good data science talent is at least two-fold: empowering the team with the tools they need to be successful and providing the coaching required for growth. H2O.ai has worked with thousands of data science teams across the globe and, in our experience, the tools needed to be successful share some clearly identifiable properties. They:

  1. Are model-agnostic;
  2. Provide for experimentation, repeatability and documentation;
  3. Facilitate trust through transparency;
  4. Reflect the dynamic nature of the world we live in;
  5. Scale to meet the needs of the organization.

In a world where everyone is trying to be the smartest one in the room, emotional intelligence is often overlooked when hiring for leadership in data science. But it is exactly that quality that enables a leader to appreciate where a data scientist is in their career, understand where they want to go, and provide the coaching needed to get there. There will always be another company that will offer more money, but it’s rare to find a leader in data science who generates the trust and vision required to grow a team past the current quarter or fiscal year.



About the Author

Jon Farland

Jon Farland is a senior data scientist on the H2O.ai Solutions Engineering team. He has spent the better part of the last decade building analytical solutions at the intersection of technology, finance and energy. He has used H2O extensively to develop high-performing models, communicate findings to stakeholders and lead ROI growth from data science initiatives.
