The UCOVI Blog



Welcome to UCOVI's repository of data discussions and interviews.

Latest Post: The Joy of Clunky Data Analogies

Previous Articles

Event Review - SQLBits 2022, London (Ned Stratton: 17th March 2022)

Interview: Susan Walsh - The Classification Guru (Ned Stratton: 21st February 2022)

Upskilling as a data analyst - acquiring knowledge deep, broad and current (Ned Stratton: 31st January 2022)

Beyond SIC codes – web scraping and text mining at the heart of modern industry classification: An interview with Agent and Field's Matt Childs (Ned Stratton: 8th December 2021)

Debate: Should Data Analytics teams sit within Sales/marketing or IT? (Ned Stratton: 26th October 2021)

Event Review: Big Data LDN 2021 (Ned Stratton: 27th September 2021)

The Swiss Army Knife of Data - IT tricks for data analysts (Ned Stratton: 9th September 2021)

UK Google Trends - Politics, Porn and Pandemic (Ned Stratton: 15th October 2020)

How the UK broadcast media have misreported the data on COVID-19 (Ned Stratton: 7th October 2020)

The Power BI End Game: Part 3 – Cornering the BI market (Ned Stratton: 21st September 2020)

The Power BI End Game: Part 2 – Beyond SSAS/SSIS/SSRS (Ned Stratton: 28th August 2020)

The Power BI End Game: Part 1 – From Data Analyst to Insight Explorer (Ned Stratton: 14th August 2020)

Excel VBA in the modern business - the case for and against (Ned Stratton: 13th July 2020)

An epic fail with Python Text Analysis (Ned Stratton: 20th June 2020)

Track and Trace and The Political Spectrum of Data - Liberators vs Protectors (Ned Stratton: 12th June 2020)

Defining the role of a Data Analyst (Slawomir Laskowski: 31st May 2020)

The 7 Most Common Mistakes Made in Data Analysis (Slawomir Laskowski: 17th May 2020)

COVID-19 Mortality Rates - refining media claims with basic statistics (Ned Stratton: 10th May 2020)


Ned Stratton: 14th April 2022

My interview two blog posts ago with Susan Walsh got me thinking. Walsh is the author of Between the Spreadsheets and the creator of the COAT methodology, as well as of a metaphor for data maturity built around the cycle of cleaning clothes (from dirty laundry as dirty data all the way to ironed, cleaned, wardrobe-arranged garments as insightful reports). It made me consider how awash (no pun intended) the conversation about data and data analytics is with analogies and metaphors, and whether or not this is useful.

Sometimes the oversimplification of an issue with a kindergarten analogy can be condescending and unhelpful. I experienced this a few years ago when I asked for flexibility in the monthly spend of a data research budget at work, to account for slower months versus months with unforeseen urgent requirements. A sales manager in temporary charge of the data team told me that "that's like a salesman saying 'I don't need to hit my target this month because I sold loads last month'". Also consider this attempt by Computerworld's blog at condensing corporate data governance into the simple management of household finances (complete with a headline straight from 1957).

But frequently in the world of data a good analogy comes in handy. Abstract and dry concepts such as normalisation, many-to-many relationships, statistical significance, and the Central Limit Theorem are as engrossing and intellectually satisfying to committed data nerds as they are utterly boring and remote to business stakeholders and non-data folk. In a situation like this, your aim as a data analyst is to get the message across in a way that keeps as much of the issue's complexity as is needed to resolve it, but without confusing, alienating or talking down to the person receiving the message. A pithy analogy achieves this by repackaging the abstractness of something like a warehouse load failure or a data science model with a high false positive rate into something everyday, amusing and ideally visual.

This depiction that uses Lego bricks to visualise the process of converting raw data into charts and reports is a fun one, as is the David McCandless flow diagram of raw data to inter-connected knowledge from his book Knowledge is Beautiful, which he explains in a talk through the analogy of atoms, cells, organs and organisms.

The host of this podcast from Half Stack Data Science perfectly encapsulated how odd it is that data analysts and scientists currently seem to specialise by coding language and statistical approach rather than by industry (think SQL Data Analyst, Power BI Consultant etc), explaining 32 minutes in that it's rather like "a builder specialising in hammers".

Paul Daniel Jones, a former Data Governance Head at Nationwide and Barclays, has impressively managed to adapt the art of data analogy formulation into a full-length book - The Data Garden and Other Data Allegories. It features six short stories with morals on how to approach data-related challenges in businesses, including fables about a data literacy driving school and a data governance hospital.

Despite their inherent cheese factor, I think data analogies are good. Getting into the habit of using them is a genuinely effective way of engaging business users in what your analysis is saying, and of explaining any limitations to your conclusions or the length and complexity of the process involved. They are a vital linguistic tool in the task of data translation.

But what is truly precious is when the moment arises to use SQL, databases and data as the analogy itself with which to explain something else.

I had this great fortune four years ago at a funeral of all occasions, when someone started a conversation about Christianity. He candidly owned up to "getting the whole God thing" but being confused about what the point of Jesus Christ was, if there is a God that is omniscient and controls the universe. Since I knew he had a solid enough grasp of databases, I used that as the basis to make The Lord Our Saviour relevant to his line of work. I told him that just as large databases hosted in SQL Server have simple front-end web interfaces for non-SQL users to interact with in day-to-day tasks such as entering new records or downloading reports, Jesus Christ acts as the conduit by which Christians (the end users of Christianity) interact with God (the database) and receive salvation from sin and purgatory (complex queries and risky update/delete statements). Rather pleasingly, he got the gist, only to then ask me on that basis "what are other religions for?". Less convincingly, I told him to think of those as MySQL, Oracle and PostgreSQL databases.



