Kiliman-data: Explaining UCOVI with an analogy about mountain climbing

Wouldn't things be great if…?

The decision makers in your business or department had instant, secure access to stats and analysis relevant to all of the business's goals and KPIs. It was delivered in a clear, appealing way that allowed drillthrough and further investigation of interesting spikes or trends. The decision makers had every confidence that the data behind the stats and analytics was accurate, and that any assumptions or uncertainty around forecasts were disclosed. Furthermore, they themselves possessed the statistical nous and acumen to actually draw the right conclusions from the analysis at their fingertips.

This is the paradigm of a data-driven business - the glorious summit of Mount Kiliman-data with all its panoramic views.

How do you get there?

You get there with UCOVI, and to understand why, let's work backwards down the mountain right back to base-camp.

Being at the top means your company is at one with its data. Your management team use it with ease to detect worrying downward trends or opportunities worth exploring, and can deploy project managers and data analysts or scientists to explore them. Every stat is set in its proper context, measured against an agreed SLA or using a test of statistical significance to see how good/bad it really is.

You are here because your data and business leaders have not only domain knowledge, but also statistical awareness. The CEO should know about sampling error so that they can judge how much certainty to place in insights produced from customer satisfaction surveys. The CMO should be aware of hypothesis testing methods so that they can judge whether changes to email copy or landing pages had a measurable and statistically significant positive impact, rather than one that could be explained by chance.
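As a rough illustration of the two ideas above, here is a minimal Python sketch. It assumes simple random samples large enough for the normal approximation to hold, and the function names and the figures in the usage note are hypothetical, made up for the example rather than taken from any real survey or campaign.

```python
import math
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Half-width of the normal-approximation confidence interval
    for a proportion p_hat observed in a sample of size n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test of H0: variants A and B convert at the same rate.
    Returns (z statistic, p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

For example, `margin_of_error(0.8, 400)` gives roughly 0.039, so an 80% satisfaction score from 400 respondents is better read as "somewhere between about 76% and 84%". Likewise, `two_proportion_z_test(100, 2000, 130, 2000)` compares a 5% and a 6.5% conversion rate; a p-value below the agreed threshold (commonly 0.05) suggests the difference is unlikely to be down to chance alone.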

An organisation that uses data well needs leaders that embrace statistical concepts as indispensable tools, and who are data-savvy enough to interpret the stats they are given.

No leadership team, even one made up of Nobel-prize-winning mathematicians, can derive insights or make decisions from reports they don't understand. The human mind has an acute sense for visual hierarchy of colour, typography and sizing. Cluttered or monochrome data visualisations, missing explanations of the stats they show, and the wrong choice of visual for the story all make the brain work overtime. If this brain happens to be attached to your CEO, the report will go in the bin and the decision will go on gut feeling.

To avoid the mountain climb failing at the last hurdle, effective visualisation of data is crucial. To achieve this, an organisation needs the right technological and human resources. Pick a BI tool that is adaptable around your source data, lets you build drillthrough pages and customise tool-tips, and provides the full range of chart types, not just tables and bar, pie and line charts. Skill your data team up in the art of design and visualisation best practice, and make the effort to correct spelling errors in report text and customise series colours and number formatting.

If presented stylishly and logically, the report will be a pleasure to use. If the report is a pleasure to use, then your leadership team will actually use it.

Okay, so you're presenting your data well, and people like it. But what if it takes a marathon data prep task in Excel to do so? And worse, what if the data you've artfully organised into Sankey flow diagrams and custom tool-tipped treemaps is horrendously inaccurate?

These pitfalls represent running out of supplies, energy or morale mid-way up the mountain. It's something you can and must prepare for, and the way you prepare is by organising your data well.

All steps of UCOVI are equally important, but this one is the longest and most expensive. It's twofold: strategic and technical. The strategy is in defining the architecture and holding structure for your data. Usually this will be in a data warehouse with Fact and Dimension tables following the Kimball methodology, and the goal is to design tables and relationships between them that store your data in a way that is memory-efficient, easy to query and protected by constraints on data-type and uniqueness so as to minimise error and duplication. The technical side is in using computers to materialise this strategy. It involves hardware procurement, setup and maintenance, coding up ETL procedures and automations, and designing queries and validation rules to keep track of the health and trustworthiness of your data.
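To make the idea of constraint-protected Fact and Dimension tables concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table names, columns and sample rows are hypothetical, and a real warehouse would be far larger and would probably run on a different database engine; the point is only to show data-type, uniqueness and relationship constraints rejecting bad rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    customer_id  TEXT NOT NULL UNIQUE,        -- no duplicate customers
    region       TEXT NOT NULL
);
CREATE TABLE fact_sales (
    sale_id      INTEGER PRIMARY KEY,
    customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
    sold_at      TEXT NOT NULL,               -- full timestamp, not just the date
    amount_pence INTEGER NOT NULL CHECK (amount_pence >= 0)
);
""")


def try_insert(sql, params):
    """Attempt an insert; return False if a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


ADD_CUSTOMER = "INSERT INTO dim_customer (customer_id, region) VALUES (?, ?)"
ADD_SALE = "INSERT INTO fact_sales (customer_key, sold_at, amount_pence) VALUES (?, ?, ?)"

first_customer = try_insert(ADD_CUSTOMER, ("C001", "North West"))      # accepted
duplicate_customer = try_insert(ADD_CUSTOMER, ("C001", "North West"))  # rejected: UNIQUE
orphan_sale = try_insert(ADD_SALE, (999, "2024-01-15T09:30:00", 1250))  # rejected: no such customer
negative_sale = try_insert(ADD_SALE, (1, "2024-01-15T09:30:00", -5))    # rejected: CHECK
valid_sale = try_insert(ADD_SALE, (1, "2024-01-15T09:30:00", 1250))     # accepted

# A simple health-check query: how many sales actually made it in?
sales_loaded = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
```

Because the bad rows are stopped at the door, the report developer further up the mountain does not have to deduplicate customers or explain negative sales after the fact.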

If you do the organisation stage well, without cutting corners, your result will be an accurate, integral and readily available dataset that a report developer can plug into their BI tool of choice with minimal effort and doubt. They can then design the interpretable reports to be found at the mountain's summit.

In the mountain-climbing analogy, you the climber drew on every sinew of material resource, skill, and stamina both physical and mental to get through the mid-way ascent. But you couldn't have done this without the right climbing kit, food supplies and clothing, and such resource would have been depleted if you covered unnecessary kilometres getting lost lower down the mountain.

In the parallel data strategy story, your crack data team took several months building the Large Hadron Collider to store and transform your data. It does the trick and runs automatically, but with vast complexity and expense, both computational and financial.

This is why good data collection practice is a vital pre-requisite for organising your data. The data itself is the food that goes in the climber's mouth, the boots he walks in, and the clothes that protect and insulate him. The better quality they are, the easier the climb.

Likewise, the cleaner and richer the data collected at source, the less labour it takes to organise it. You ensure maximum cleanness and richness at point of collection with strict input validation rules, such as limiting free-text input where possible. Use appropriate file formats (if your columns contain commas as text, avoid CSV files), and be 'future-proof'. You may only need the date now, but if an exact timestamp is available from the source data and you decide to discard it, you may have some explaining to do later…
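A minimal sketch of what validation at the point of collection might look like in Python is shown here. The field names and the list of allowed categories are hypothetical, and the function assumes each submitted record arrives as a dictionary of strings; it swaps free text for a fixed set of choices and keeps the full timestamp rather than truncating it to a date.

```python
from datetime import datetime

# Hypothetical fixed list offered to the user instead of a free-text box
ALLOWED_CATEGORIES = {"hardware", "software", "services"}


def validate_order(raw):
    """Return a cleaned record, or raise ValueError listing every problem found."""
    errors = []

    category = raw.get("category", "").strip().lower()
    if category not in ALLOWED_CATEGORIES:
        errors.append(f"unknown category: {raw.get('category')!r}")

    ordered_at = None
    try:
        # Keep the exact timestamp; a date can always be derived from it later
        ordered_at = datetime.fromisoformat(raw.get("ordered_at", ""))
    except ValueError:
        errors.append("ordered_at must be an ISO 8601 timestamp")

    quantity = None
    try:
        quantity = int(raw.get("quantity", ""))
        if quantity <= 0:
            raise ValueError
    except ValueError:
        errors.append("quantity must be a positive whole number")

    if errors:
        raise ValueError("; ".join(errors))
    return {"category": category, "ordered_at": ordered_at, "quantity": quantity}
```

A record such as `{"category": "Software", "ordered_at": "2024-03-01T14:05:09", "quantity": "3"}` passes and comes back normalised, while one with a misspelt category or a quantity of "lots" is rejected before it ever reaches the database.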

A crucial element of this is intelligent survey design, which is a delicate trade-off between collecting as many variables as possible about your samples (by asking more questions), and collecting as many samples as possible (by ensuring people actually complete the survey, which might mean asking fewer questions!). A good data gathering strategy includes meticulous survey design, with questions that only the subjects can answer themselves (such as motivation for buying the product), and not asking them things which could be calculated by other means (such as their geographical region, which you could work out from their postcode). We've seen these mistakes!
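The postcode example can be sketched in a few lines of Python. This assumes UK-style postcodes, where the leading one or two letters identify the postcode area, and uses a deliberately tiny, illustrative lookup table; a real implementation would need a complete and maintained reference dataset.

```python
import re

# Illustrative subset only; a full lookup would cover every postcode area
AREA_TO_REGION = {
    "SW": "London",
    "M": "North West",
    "EH": "Scotland",
    "CF": "Wales",
}


def region_from_postcode(postcode):
    """Derive a region from the leading letters of a UK-style postcode.
    Returns None when the postcode is malformed or the area is not in the lookup."""
    match = re.match(r"\s*([A-Za-z]{1,2})\d", postcode)
    if not match:
        return None
    return AREA_TO_REGION.get(match.group(1).upper())
```

With something like this in place, `region_from_postcode("M1 1AE")` yields the region without the respondent ever being asked for it, which frees up a survey question for something only they can tell you.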

If the data going into your database is clean and logical to start with, your database can run simpler, cheaper and quicker.

Your climbing route, the time you spend on each stage, and the kit you take with you are worked out and planned with your support team at base-camp at the foot of the mountain. In the same way, what data you will bring with you all the way up to the board-level insight stage is informed by what your organisation actually needs data for.

If as a business you don't agree on or understand what you need to know from your data, you won't know what data to collect. The base-camp stage is therefore understanding and formulating a strategy.

You need to undergo a discovery phase where you agree on what KPIs and metrics you need to track, what drives your business (price-point, product-category, customer age-range, industry-sector), and how your data maps these. By doing so you will find out what data you have and need, what you have and don't need, and what you don't have but need (which is what you need to collect).
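The three outcomes of that discovery phase map neatly onto set operations. The sketch below is a minimal Python illustration; the function name and the field names in the usage note are hypothetical, standing in for whatever your own discovery workshops produce.

```python
def data_gap_analysis(needed, available):
    """Compare the fields the business needs with the fields it already holds."""
    needed, available = set(needed), set(available)
    return {
        "have_and_need": sorted(needed & available),    # keep and maintain
        "have_but_dont_need": sorted(available - needed),  # candidates to stop storing
        "need_to_collect": sorted(needed - available),  # the collection to-do list
    }
```

For instance, `data_gap_analysis({"price_point", "product_category", "customer_age_range"}, {"price_point", "product_category", "fax_number"})` flags `customer_age_range` as the field to start collecting and `fax_number` as one the business holds but does not need.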

First, get your business users to articulate what their existing assumptions are, and what more they'd like to know. From this, evaluate whether reports or analysis already exist, then see if the organisation has the data to create or improve them. From there, decide what to collect and how.

The data mountain is scaled by a company when the data is correctly interpreted.

It can't be interpreted without being visualised.
It can't be visualised if it isn't organised.
It can't be organised without it first being collected.
It can't be collected if the purpose for doing so isn't understood.

So Understand, Collect, Organise, Visualise, Interpret.