Minimising the causes and consequences of high job turnover in data science and analytics – a UCOVI White Paper written by Ned Stratton (August 2023)
As data engineers, analysts, and scientists we have a sweet deal. We work in well-paid, interesting, high-esteem jobs and are supported by thriving user communities. Our companies sing our praises and wage bidding wars for our services in a candidate-poor job market. The learning curve is steep and demands work and patience, but with everything on YouTube or a discounted £20 Udemy course and the expectation of on-the-job learning, it’s nothing like the time, expense, and slog of a medical or law degree and qualification programme. ChatGPT has even arrived on the scene to do our donkey work for us.
So why do we get so annoyed with our companies, our IT setup, the data we work with? Why do the most talented in our industry derive joy not from actually, you know, analysing data, but from developing software tools that other data people might use? Or from writing blogs? (Guilty.) And why do we quit our jobs every 16 months?
Data analysis is a feeder mostly for business roles, and secondarily for IT, development, data science or other technical roles.
LinkedIn doesn't like web scraping, but does make it easy to build convincing burner profiles that you don’t mind getting blocked. Follow people, add education and work history, and pass a few skill assessments for good measure, and you will be fine.
Making Sankey diagrams for their own sake is a habit I need to kick. I don't know why I like them so much, because they become visual spaghetti junctions with anything over three categories, and (probably because of this) they're not supported properly by most BI tools and data viz coding libraries.
But the real gold in the data I scraped was the average tenure of a data analytics or science role (16 months), as well as the year-by-year attrition funnel for these roles:
All data jobs in sample (n=2096) - 100%
Stayed at least 1 year - 54%
Stayed at least 2 years - 27%
Stayed at least 3 years - 13%
Stayed at least 4 years - 7%
Stayed at least 5 years - 4%
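One way to read the funnel above is year on year: of the jobs that reached one milestone, what share reached the next? The sketch below works only from the published percentages, which are already rounded, so the ratios are approximate. The function and variable names are illustrative, not part of the original analysis.

```python
# Funnel percentages exactly as listed above:
# share of the 2,096 sampled jobs lasting at least N years (0 = all jobs).
funnel = {0: 100, 1: 54, 2: 27, 3: 13, 4: 7, 5: 4}


def year_on_year_retention(funnel):
    """Of the jobs that reached year N-1, the percentage that also reached year N."""
    years = sorted(funnel)
    return {
        year: round(100 * funnel[year] / funnel[prev])
        for prev, year in zip(years, years[1:])
    }


retention = year_on_year_retention(funnel)

# Share of jobs that did not reach the two-year mark.
left_before_two_years = funnel[0] - funnel[2]

for year, pct in retention.items():
    print(f"Reached year {year}, given year {year - 1} was reached: {pct}%")
print(f"Left before two years: {left_before_two_years}%")
```

On these figures, roughly half of the remaining jobs end in each successive year, and 73% end before the two-year mark.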
To keep things fair, I excluded junior data analysts from the sample (juniors naturally move onwards), and split the funnel into tenures that had ended and tenures still ongoing, to check that the two weren't different enough to skew the analysis (they weren't). I'm also aware of sampling bias here: the job tenures relate only to data analysts who've made LinkedIn profiles and joined data-focussed LinkedIn groups, so they are arguably more network-y and job-market active than the overall population of data analysts.
But roughly three quarters (73%) leaving before two years aligns with what I've seen in my 8 years in data.
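The checks described above (dropping juniors, then comparing ended tenures against ongoing ones) could be sketched roughly as follows. This is not the code behind the figures: the `Tenure` record, the `is_junior` rule, and the sample rows are all made up for illustration, and the real scraped data may be shaped quite differently.

```python
from dataclasses import dataclass


@dataclass
class Tenure:
    title: str    # job title as listed on the profile (hypothetical field)
    months: int   # length of the tenure in months
    ended: bool   # False if the person is still in the role


def is_junior(title):
    # Assumed rule: any title containing "junior" counts as a junior role.
    return "junior" in title.lower()


def funnel(tenures, max_years=5):
    """Percentage of tenures lasting at least 1..max_years years."""
    if not tenures:
        return {}
    total = len(tenures)
    return {
        years: round(100 * sum(t.months >= 12 * years for t in tenures) / total)
        for years in range(1, max_years + 1)
    }


# Made-up records for illustration only; not the scraped sample.
sample = [
    Tenure("Data Analyst", 8, True),
    Tenure("Junior Data Analyst", 10, True),
    Tenure("Data Scientist", 14, True),
    Tenure("Senior Data Analyst", 30, True),
    Tenure("Data Analyst", 5, False),
    Tenure("BI Analyst", 26, False),
    Tenure("Data Scientist", 40, True),
    Tenure("Analytics Engineer", 13, False),
]

eligible = [t for t in sample if not is_junior(t.title)]
overall = funnel(eligible)
ended = funnel([t for t in eligible if t.ended])
ongoing = funnel([t for t in eligible if not t.ended])

print("overall:", overall)
print("ended:  ", ended)
print("ongoing:", ongoing)
```

If the ended and ongoing funnels look broadly alike, pooling them (as in the funnel above) is less likely to distort the picture. With only a handful of rows, as in this toy sample, the comparison means little.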
Why do people leave and why is this bad?
This made me reflect on the negative, non-financial common denominators that led me to resign from the data jobs I have left, which were:
Frustration at unaddressed yet eminently solvable technical dysfunction, slow IT, and poor-quality data.
A feeling that nothing significant would change at the company to advance the data team's interests and needs, and therefore that my role had run its course.
The desire to run away from a quicksand project I was on as well as my association with its inevitable failure. This was usually an all-things-to-all-people marketing analytics/UX mega dashboard that mixed high-volume but low-insight data from Google Analytics with vague requirements, too many stakeholders, and certainly no scope or brief.
As I reflected on these reasons for leaving, two other feelings bubbled up. First, about my manager, who'd be unfairly associated with another one biting the dust and would have to recruit and train someone less experienced and knowledgeable about the company. Second, my own sense of unfinished business. I've left every data job with a mental list of what I saw as the most underused high-value data, the most important problems holding the data team back, and my solutions for them if only the data team had enough power in the business, IT backing, and time.
When a data analyst leaves their company after 18 months, they take with them the bank of business knowledge they have accrued in that time as well as the potential to combine it with their skill and ability to produce juicy business value. This is a loss to the business and the wider economy, which is fundamentally a collection of businesses. It's also a loss to data analysts themselves, who go back to square one in their new positions in terms of organisational knowledge, and who cheat themselves out of advanced and valuable accomplishments they can take pride in. It's even a loss to the data profession and user community in general, and it's one that is self-reinforcing.
The negative feedback loop around prematurely resigning data analysts and scientists is so broad, intertwined, and strong that it defies explanation by word or static diagram. So I've made it into an animation:
The free role
The most common methods used by companies and managers to un-resign a data person are pay rises, cosmetic promotions, and passionate sales pitches about the sunlit uplands of where the company and team will be in six months. My chart further up would seem to suggest that none of them work.
My solution takes inspiration from two sources. The first is Google's 20% Project, where developers were given a fifth of their working week to work on self-guided projects (Gmail and Google Maps are often credited to it). The second is a setting in football-manager simulation video games that you can give to creative attacking players to relieve them of any defensive responsibility or other tactical constraint. It is the free role.
The free role would be offered to high-performing data analysts and scientists after two years of employment, and would involve the following:
The data analyst agrees with their manager and director a project or corpus of projects that they believe would add value to the business and that they would like to undertake.
The data analyst works on the agreed projects full-time over the course of one year, save for some time in the week to answer questions from and provide support to less experienced team members, and check in with their manager.
At the end of the free-role year, the data analyst presents the value of their work to their manager and senior stakeholders in the business, and optionally asks for an expansion of scope dependent on success and viability.
Default ideas for a free-role project would be a series of analytical deep-dives based crucially on the analyst's null hypotheses, a full refactor and shakeup of the company's regular reporting suite, a process automation drive or data quality project, or a documentation project.
To explain the effects this could have, I've taken my negative feedback loop animation above and replaced the "data analyst resigns at 2 years" square with one for each of the above default free-role projects:
The offer of a free role has caveats. It should be made proactively based on experience, talent, good performance, and the value of the idea, not reactively after a resignation threat. The length of the free-role period and the arrangements for managing it should give the data analyst enough autonomy and time to produce meaningful, complete, creative work. They should do so without isolating the analyst from their team, and without leaving the company exposed to the unmitigated risk that a project gone sour produces no positive impact (or even does damage).
But if done right, a free-role data analyst could turn out to be a business's most likely source of data-driven innovation, and the catalyst for the data science and analytics profession to gain the traction and definition it needs to do itself justice.