The UCOVI Blog

Welcome to UCOVI's repository of data discussions and interviews.


Latest Post - Stratton's number and the pursuit of fame in data (without too much work)

The Devops-ification of data analytics (Ned Stratton: 13th June 2024)

Data beyond AI: Microsoft Fabric vs Data Contracts (Ned Stratton: 30th January 2024)

No-code data part II (Ned Stratton: 22nd November 2023)

White Paper: The free-role data analyst (Ned Stratton: 4th September 2023)

Do data analysts need to read books? (Ned Stratton: 10th May 2023)

No code data tools: the complexity placebo (Ned Stratton: 17th March 2023)

The 2023 data job market with Jeremy Wyatt (Ned Stratton: 24th January 2023)

Making up the Numbers - When Data Analysts Go Rogue (Ned Stratton: 2nd December 2022)

Data in Politics Part 2 - Votesource (Ned Stratton: 12th September 2022)

Data in Politics Part 1 - MERLIN (Ned Stratton: 2nd September 2022)

Interview: Adrian Mitchell - Founder, Brijj.io (Ned Stratton: 28th June 2022)

The Joy of Clunky Data Analogies (Ned Stratton: 14th April 2022)

Event Review - SQLBits 2022, London (Ned Stratton: 17th March 2022)

Interview: Susan Walsh - The Classification Guru (Ned Stratton: 21st February 2022)

Upskilling as a data analyst - acquiring knowledge deep, broad and current (Ned Stratton: 31st January 2022)

Beyond SIC codes – web scraping and text mining at the heart of modern industry classification: An interview with Agent and Field's Matt Childs (Ned Stratton: 8th December 2021)

Debate: Should Data Analytics teams sit within Sales/marketing or IT? (Ned Stratton: 26th October 2021)

Event Review: Big Data LDN 2021 (Ned Stratton: 27th September 2021)

The Swiss Army Knife of Data - IT tricks for data analysts (Ned Stratton: 9th September 2021)

UK Google Trends - Politics, Porn and Pandemic (Ned Stratton: 15th October 2020)

How the UK broadcast media have misreported the data on COVID-19 (Ned Stratton: 7th October 2020)

The Power BI End Game: Part 3 – Cornering the BI market (Ned Stratton: 21st September 2020)

The Power BI End Game: Part 2 – Beyond SSAS/SSIS/SSRS (Ned Stratton: 28th August 2020)

The Power BI End Game: Part 1 – From Data Analyst to Insight Explorer (Ned Stratton: 14th August 2020)

Excel VBA in the modern business - the case for and against (Ned Stratton: 13th July 2020)

An epic fail with Python Text Analysis (Ned Stratton: 20th June 2020)

Track and Trace and The Political Spectrum of Data - Liberators vs Protectors (Ned Stratton: 12th June 2020)

Defining the role of a Data Analyst (Slawomir Laskowski: 31st May 2020)

The 7 Most Common Mistakes Made in Data Analysis (Slawomir Laskowski: 17th May 2020)

COVID-19 Mortality Rates - refining media claims with basic statistics (Ned Stratton: 10th May 2020)

Ned Stratton: 25th October 2024

There are several ways in which one can become famous in data. Or if not quite famous, at least a figure of moderate notoriety and esteem.

Undisputed legends of data are once-in-a-century phenomena, and are usually legends of maths and/or computing too. I'm thinking of people like WWII computer and algorithms pioneer and Benedict-Cumberbatch-lookalike Alan Turing, or Thomas Bayes, the 18th-century English clergyman with his eponymous theorem on conditional probability.

Moderate notoriety doesn't need game-changing innovations or intellectual breakthroughs on a par with the above, but it does at least involve being quick off the mark to corner a significant market. Examples are David McCandless in the field of infographic design, or Marco Russo and Alberto Ferrari in DAX (the query language of Power BI data models), who by dint of their training material and courses on SQLBI are known simply and reverentially among DAX users as "The Italians".

But they still need to make an ongoing, relentless effort to maintain market share and relevance. The Italians can't stop sleuthing and retire on passive income, because Microsoft constantly adds new functions to DAX that followers demand new expertise on. David McCandless needs to keep producing new graphics to stay afloat in the swimathon of showing the world's problems in glossy Voronoi charts. Respected local user group organisers or widely-seen bloggers and YouTubers would lose their Microsoft MVPs if they packed it in for six months when life got in the way.

Name (or number) on the trophy

The holy grail of data fame is an association or achievement that requires minimal or no ongoing maintenance work, but which makes your name talked about with desired frequency and breadth. Not Taylor Swift or anything, but, you know, a big deal in the world of financial reporting.

I think the easiest route to this is to have a law, coefficient, distribution, paradox, or number named after you, which has roots in research or analytics but is fundamentally titillating for dinner party companions and explainable to them without PowerPoint. Let's go through these one by one, with my recommendation for trying to come up with one of your own.

Law: Intimidating, but actually within reach. Take Moore's Law (the observation that the number of transistors on a chip, and with it roughly computer power, doubles about every two years). That's pithy, accessible, and cool, right? Think of the tennis, craft-beer, or Married-At-First-Sight equivalent of this, and you're rocking. Worth trying.

Coefficient: As in Pearson's correlation coefficient, which measures the strength of linear correlation between two variables. Coming up with one of these on your own will be difficult, and probably not worth the effort, as "coefficient" has a vibe that's more stats nerd than erudite raconteur. Not worth trying.

Distribution: On the face of it, these are synonymous with the world of data given their importance to statistical probability. They are also common candidates to be named after their inventor; think Poisson, Bernoulli, Gaussian. But they're best explained using an equation or a drawing of the curve that represents them, so they lack dinner-party-anecdote potency. Inventing your own distribution could prove high-risk for low reward. Not worth trying.

Paradox: A paradox is a snappy moniker for "intriguing thing that might surprise your expectations". The obvious (and possibly only) example would be Simpson's Paradox. This is when you get one trend from your data as a whole, but when you break your data down into several categories, you get the reverse of that trend for most of those categories. Given this is paradoxical but certainly not poleaxing to the average intellect, I'd say that the paradox is a niche and untapped market that could bring untold wealth and fame to the switched-on data blogger. Definitely worth trying.

Number: Yes, everyone has a number. But not everyone has a number. Some are mathematical constants which were proven by their owners to have special properties and uses in mathematics and data science, such as Euler's Number (approximately 2.718). This is the elite, Nobel-prize end of the market, and I wouldn't advise anybody – data blogger, accomplished liar, or otherwise – to claim to their mates that 7.984 carries unique algebraic properties and applications that they have discovered and is therefore Kevin's Number.

However, a number can also be the conclusion of a serious and interesting analysis that represents a limit or threshold at which something significant occurs or changes, such as Dunbar's Number. This is the upper limit of meaningful connections that a human can maintain at once, which is 150. Its beauty is in how on point it is in a social-media-obsessed world where the laptop classes on LinkedIn claim to have 1000s of people in their networks. We can refer to it any time the topic crops up in everyday chatter, and revel in its clarity as a single, simple, divisible-by-10 number. Yes, Robin Dunbar is an Oxford-educated PhD anthropologist who arrived at this figure after vast research and analysis, and all credit to him. But sound research isn't splitting an atom or inventing fire, and an interesting number doesn't have to be one that blows people's minds. With a bit of data gathering and sound stats around something you find interesting, you could have your own number. Worth trying, but best suited to someone with an interesting surname that's also not double-barrelled or excessively posh. Mansard's Number has a better ring to it than Smith's Number or Fortescue-Barrington's Number.
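For anyone who wants to see the coefficient and the paradox from the list above in action without a dinner party, here is a minimal Python sketch. Everything in it is an illustrative assumption: the two groups, their made-up values, and the `pearson_r` helper are invented for the example and come from no real analysis. It computes Pearson's correlation coefficient within each group and then for the groups combined, which is one simple way of staging Simpson's Paradox.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations, and the sums of squared deviations for each variable.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up data (an assumption for illustration): within each group,
# y falls as x rises, but group B sits higher on both axes than group A.
group_a = ([1, 2, 3, 4], [10, 9, 8, 7])
group_b = ([6, 7, 8, 9], [20, 19, 18, 17])

r_a = pearson_r(*group_a)
r_b = pearson_r(*group_b)
r_all = pearson_r(group_a[0] + group_b[0], group_a[1] + group_b[1])

print(f"group A:  {r_a:+.2f}")
print(f"group B:  {r_b:+.2f}")
print(f"combined: {r_all:+.2f}")
```

With these invented values, each group on its own shows a negative correlation, while the combined data shows a positive one: the trend reverses once you break the data down by category.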

Stratton's Number

In the end, I decided to go with a number. With my interest in UK politics and YouTube-assembled collection of intermediate SQL, Python and Power BI as qualifications, I came up with 17.2 – Stratton's Number.

It represents – based on analysis in my Data Viz section on UK General Election data and House of Commons transcript archives (see slide 9) – the minimum number of MPs that an upstart party needs to build a critical mass of attention, credibility and influence in Parliament. Having 17 MPs guarantees that at least one MP from that party will be called to ask a question at PMQs every session. There are also 17 Select Committees in the House of Commons that scrutinise major areas of government such as the Treasury, foreign affairs, or education, which means that with 17 MPs, a minor political party can formally participate in detailed public questioning on every key political issue.

I haven't solved an earth-shatteringly acute social or scientific problem with this, but I can enlighten conversations otherwise filled with amateur and baseless political commentary on how important the Greens, Reform, or the Lib Dems are. I'll count that as a result if it means I don't need to write variants of "10 things you never knew you could do with Power BI slicers" every week until I retire.
