UCOVI Stage 2: Collect

Business Strategy & People Management: ★★★★☆ Technical Work: ★★☆☆☆ Analytical/Scientific: ★★★★☆

Discipline Overview

The Collect stage is where you address the weaknesses and gaps in your data that you found in the Understand phase. There are two ways to do this.

First, ensure that analytics requirements and auditability are considered when designing the production systems that gather and create data. Make sure that historical logs are captured in enough detail to support behavioural analysis your business can act on, and that primary key identifiers from the other databases the production system will integrate with are stored alongside the data. Choose appropriate column datatypes, enforce null and uniqueness constraints, and use a consistent character encoding, so that the system itself forms a tough first line of defence against poor data quality.
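The constraints described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions, not a recommended production schema: the enquiries table, its columns, and the crm_contact_id key from another system are all hypothetical names chosen for the example.

```python
import sqlite3


def create_enquiry_table(conn: sqlite3.Connection) -> None:
    """Create a hypothetical enquiries table with basic data-quality constraints."""
    conn.execute(
        """
        CREATE TABLE enquiries (
            enquiry_id     INTEGER PRIMARY KEY,
            crm_contact_id TEXT    NOT NULL,         -- key from another (assumed) system
            email          TEXT    NOT NULL UNIQUE,  -- no blanks, no duplicates
            created_at     TEXT    NOT NULL,         -- ISO 8601 timestamp for audit history
            party_size     INTEGER NOT NULL CHECK (party_size > 0)
        )
        """
    )


def insert_enquiry(conn, crm_contact_id, email, created_at, party_size) -> None:
    """Insert one row; sqlite3.IntegrityError is raised if a constraint is violated."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO enquiries (crm_contact_id, email, created_at, party_size) "
            "VALUES (?, ?, ?, ?)",
            (crm_contact_id, email, created_at, party_size),
        )
```

With this schema, a second enquiry using an existing email, a missing email, or a party size of zero is rejected at write time with sqlite3.IntegrityError, so the bad row never reaches the analysts.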

Second, source and create new datasets that complement the analysis you can run from existing databases. Success at this covers several strategic and technical disciplines, including API querying, research for relevant and up-to-date reference datasets from trustworthy online sources, website scraping, and effective survey and user form design. Building surveys and forms that optimise data capture is often an afterthought in companies, but carefully considering the Q & A structures on your key customer data capture points could not be more important when enriching your data and reliably segmenting your audience. To achieve the holy grail of a form that's quick enough to minimise drop-offs whilst also scoring trebles on the dartboard of data gathering, cut out questions whose answers your analysts could work out from existing data, don't ask the same question twice in different formats (for example, a free-text Job Title field followed by a Job Function picklist), and put response limits on multi-select questions. If your customer says they are interested in all eight of your interest options, they might as well be interested in none of them.
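The response limit on multi-select questions could be enforced with a small server-side check along these lines. It is a sketch only: the function name, the cap of three choices, and the error wording are assumptions made for illustration.

```python
MAX_INTERESTS = 3  # assumed cap; pick a limit that suits your own form


def validate_interests(selected, allowed, max_choices=MAX_INTERESTS):
    """Return a list of error messages for a multi-select answer (empty if valid)."""
    errors = []
    # Drop repeated selections while keeping the order they were submitted in.
    unique = list(dict.fromkeys(selected))
    unknown = [choice for choice in unique if choice not in allowed]
    if unknown:
        errors.append(f"Unknown options: {', '.join(unknown)}")
    if len(unique) > max_choices:
        errors.append(f"Please choose at most {max_choices} options.")
    return errors
```

Called with four valid selections and a cap of three, the function returns a single error asking the respondent to narrow their choice, which forces them to say what they are most interested in.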

Sample questions and considerations

What does our survey sample size need to be to gain reliable insights from it?
Where can we find reliable statistical data on open schools in the UK by age range and geographic location?
Does the UTM tracking on our corporate website give us enough detail on where our leads and inbound traffic come from?
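For the first question above, a common starting point is Cochran's formula for estimating a proportion, with a finite population correction when the audience is small. The sketch below assumes simple random sampling and the most conservative proportion of 0.5; the function name and default values are illustrative choices, not part of the UCOVI framework.

```python
import math
from statistics import NormalDist


def sample_size(margin_of_error, confidence=0.95, proportion=0.5, population=None):
    """Estimate the number of completed responses needed for a survey.

    Uses Cochran's formula n0 = z^2 * p * (1 - p) / e^2 and, if a population
    size is given, the finite population correction n = n0 / (1 + (n0 - 1) / N).
    """
    # Two-sided z-score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)
```

For a 5% margin of error at 95% confidence, sample_size(0.05) gives 385 responses for a very large population, and sample_size(0.05, population=10_000) gives 370. These are completed responses, so the number of people invited needs to be higher to allow for non-response.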

People and Teams to Involve

Core data asset owners (internal or external), market research and survey design experts, compliance teams.

Discipline Overview

Collect the raw data for your analysis project or other purpose in the most efficient, clean and automatable way possible.

Common situations faced

Connecting to and calling data from third-party APIs, designing and optimising booking and enquiry forms, facing resistance from internal stakeholders when trying to access internal production databases for analytics.
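As an illustration of the first situation, the sketch below collects every record from a paginated JSON API using only the Python standard library. The endpoint, the page and per_page query parameters, and the results key are hypothetical; a real third-party API will define its own pagination scheme, authentication, and rate limits, so treat this as a pattern rather than a working client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_get_json(url):
    """Fetch a URL and decode the JSON response body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def fetch_all_records(base_url, fetch=http_get_json, page_size=100, max_pages=50):
    """Yield records page by page until the API returns an empty page.

    `fetch` is injectable so the paging logic can be tested without a network.
    `max_pages` is a safety cap against runaway loops.
    """
    for page in range(1, max_pages + 1):
        query = urlencode({"page": page, "per_page": page_size})  # assumed parameter names
        payload = fetch(f"{base_url}?{query}")
        records = payload.get("results", [])  # assumed response key
        if not records:
            break
        yield from records
```

Passing the fetch function in as a parameter keeps the collection step automatable and testable: the same loop can run against a stub during development and against the live API in a scheduled job.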

Technologies to learn and specialisms to hone

Web Scraping, Survey Design, Awareness of Data Privacy Law, Working with APIs, Communication skills with IT and Security Teams

