Applied Data · Module 6

Inference, sampling, and experiments

Inference is the art of learning about a bigger reality from limited observations.

20 min · 4 outcomes · Data · Intermediate

Why this matters

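Suppose you analyse only customers who completed a journey, because that is what is easy to track. The people who dropped off never appear, so any conclusion you draw describes the survivors, not the whole customer base.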

What you will be able to do

  • 1 Explain inference, sampling, and experiments in your own words and apply them to a realistic scenario.
  • 2 Explain why inference works when you design what you can claim and how you will test it.
  • 3 Check the assumption "Sampling is honest" and explain what changes if it is false.
  • 4 Check the assumption "Claims are bounded" and explain what changes if it is false.

Before you begin

  • Foundations-level vocabulary and concepts
  • Confidence with basic diagrams and section terminology

Inference is the art of learning about a bigger reality from limited observations. This matters because most datasets are not the full world. They are a sample, often a biased one.

Worked example. The “successful customers” dataset that hides the problem

You analyse only customers who completed a journey because that is what is easy to track. Your dashboard shows high satisfaction. The people who dropped off never appear, so the system looks healthier than it is.
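A minimal simulation can make the effect above concrete. Everything in it is invented for illustration: the satisfaction score, the rule that the chance of completing a journey equals that score, and the function name `simulate_journeys`. It is a sketch of the mechanism, not a model of any real customer base.

```python
import random


def simulate_journeys(n_customers=10_000, seed=42):
    """Simulate customers with a hidden satisfaction score in [0, 1].

    Assumption (invented for illustration): the chance of completing the
    journey equals the satisfaction score, so unhappy customers drop off
    more often than happy ones.
    """
    rng = random.Random(seed)
    customers = []
    for _ in range(n_customers):
        satisfaction = rng.random()
        completed = rng.random() < satisfaction
        customers.append((satisfaction, completed))
    return customers


def mean(values):
    values = list(values)
    return sum(values) / len(values)


customers = simulate_journeys()
everyone = mean(s for s, _ in customers)
completers_only = mean(s for s, done in customers if done)

print(f"mean satisfaction, everyone:        {everyone:.2f}")
print(f"mean satisfaction, completers only: {completers_only:.2f}")
```

Under these assumptions the true average sits close to 0.5, while the completers-only average sits close to 0.67. Nothing is wrong with the arithmetic on the dashboard. The gap comes entirely from who was allowed into the dataset.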

Common mistakes in inference

Inference failure patterns

Inference breaks when sample limitations are hidden.

  1. Observed treated as universal truth

    Sample evidence does not automatically generalise to all users or contexts.

  2. Point estimate without uncertainty

    A number reported without an interval or a sample size gives the reader no basis for judging how far to trust it.

  3. A/B tests treated as truth machines

    Instrumentation gaps and assignment bias can invalidate conclusions.
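To illustrate the second pattern, here is a small sketch that reports a proportion together with an interval. It uses the Wilson score interval, which is one common choice among several and is not something this module prescribes. The function name `wilson_interval` and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a proportion (one common choice of interval)."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / n
    denominator = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denominator
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denominator
    return centre - half_width, centre + half_width


# Same 80% point estimate, very different sample sizes (hypothetical counts).
for successes, n in [(8, 10), (800, 1000)]:
    low, high = wilson_interval(successes, n)
    print(f"{successes}/{n}: estimate={successes / n:.0%}, "
          f"95% interval=({low:.1%}, {high:.1%})")
```

Both rows report 80%, but the interval for 10 observations is far wider than the one for 1,000. The interval only reflects random sampling noise. It says nothing about bias in who was sampled.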

Verification. Ask the sampling questions

Sampling integrity checklist

Use this before publishing any experiment result.

  1. Inclusion and exclusion map

    State clearly who is represented and who is missing.

  2. Dropout mechanism check

    Document what causes people or events to disappear from the dataset.

  3. Instrumentation drift detector

    Define how you will detect measurement process changes over time.
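The first checklist item can be sketched as a small coverage report. This assumes you have some reference list of everyone who should have been in scope, which is often the hard part in practice. The records, the `signup_channel` field, and the function `coverage_by_segment` are all hypothetical.

```python
from collections import Counter


def coverage_by_segment(population, sample, key):
    """Return {segment: (population_count, sample_count, share_covered)}."""
    population_counts = Counter(key(record) for record in population)
    sample_counts = Counter(key(record) for record in sample)
    report = {}
    for segment, population_n in population_counts.items():
        sample_n = sample_counts.get(segment, 0)
        report[segment] = (population_n, sample_n, sample_n / population_n)
    return report


# Hypothetical records: (customer_id, signup_channel, completed_journey)
population = [
    (1, "web", True), (2, "web", True), (3, "web", False),
    (4, "phone", False), (5, "phone", False), (6, "phone", True),
    (7, "branch", False), (8, "branch", False),
]
# The analysed dataset only contains completed journeys.
sample = [record for record in population if record[2]]

report = coverage_by_segment(population, sample, key=lambda record: record[1])
for segment, (population_n, sample_n, share) in sorted(report.items()):
    flag = "  <- missing entirely" if sample_n == 0 else ""
    print(f"{segment:7s} population={population_n} "
          f"analysed={sample_n} covered={share:.0%}{flag}")
```

A segment with zero coverage, like the hypothetical branch customers here, is a group the analysis cannot speak for at all. Uneven coverage across segments is also a prompt for the second checklist item: what caused those records to disappear.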

Mental model

Inference needs design

Inference works when you design what you can claim, and how you will test it.

  1. Hypothesis

  2. Design

  3. Collect

  4. Analyse

  5. Claim
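One concrete piece of the Design step is deciding, before collecting anything, roughly how much data the claim needs. The sketch below uses a standard normal-approximation formula for comparing two proportions. The baseline rate, the target rate, the significance level, and the power are illustrative assumptions, and `sample_size_per_arm` is a hypothetical helper, not part of this module.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-proportion test.

    Uses the normal approximation with unpooled variances. Treat the result
    as a planning estimate, not an exact requirement.
    """
    if p_control == p_treatment:
        raise ValueError("the two rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / effect**2)


# Hypothetical plan: detect a lift in completion rate from 10% to 12%.
n_small_lift = sample_size_per_arm(0.10, 0.12)
# A larger lift, from 10% to 15%, needs far fewer observations.
n_large_lift = sample_size_per_arm(0.10, 0.15)

print(f"10% -> 12%: about {n_small_lift} per arm")
print(f"10% -> 15%: about {n_large_lift} per arm")
```

Writing this number down before collection is part of bounding the claim: if the experiment ends with far fewer observations than planned, a null result says little about whether the effect exists.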

Assumptions to keep in mind

  • Sampling is honest. If sampling is biased, inference is biased. Design matters.
  • Claims are bounded. Make claims that your data collection supports. Overclaiming breaks trust.

Failure modes to notice

  • P-hacking. Running analysis after analysis until one reaches significance produces false stories. Decide the analysis before looking at the data.
  • Ignoring confounders. If confounders are ignored, you risk claiming causation where there is only correlation.
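The first failure mode can be demonstrated with a simulation in which there is no real effect at all. Both groups share the same true rate, and the analyst checks one or many metrics, stopping at the first "significant" one. The sample sizes, the true rate, the assumption that metrics are independent, and the function names are all invented for illustration.

```python
import random
from math import erfc, sqrt


def two_sided_p_value(successes_a, n_a, successes_b, n_b):
    """Two-proportion z-test p-value using the normal approximation."""
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        return 1.0
    z = (successes_a / n_a - successes_b / n_b) / standard_error
    return erfc(abs(z) / sqrt(2))


def false_positive_share(n_metrics, n_experiments=500, n_per_arm=200,
                         true_rate=0.2, alpha=0.05, seed=7):
    """Share of null experiments where at least one metric looks significant."""
    rng = random.Random(seed)
    experiments_with_a_hit = 0
    for _ in range(n_experiments):
        for _ in range(n_metrics):
            # Both arms are drawn from the same rate: any difference is noise.
            a = sum(rng.random() < true_rate for _ in range(n_per_arm))
            b = sum(rng.random() < true_rate for _ in range(n_per_arm))
            if two_sided_p_value(a, n_per_arm, b, n_per_arm) < alpha:
                experiments_with_a_hit += 1
                break  # the analyst stops at the first "significant" metric
    return experiments_with_a_hit / n_experiments


one_planned_metric = false_positive_share(n_metrics=1)
twenty_metrics = false_positive_share(n_metrics=20)

print(f"one pre-planned metric: {one_planned_metric:.0%} of null experiments 'succeed'")
print(f"best of twenty metrics: {twenty_metrics:.0%} of null experiments 'succeed'")
```

With independent metrics, the chance of at least one false positive across 20 tests at a 0.05 threshold is 1 - 0.95^20, which is about 0.64, so the second figure should land near that. Deciding the metric in advance keeps the rate near the threshold you chose.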

Check yourself

Quick check. Inference and experiments

What is inference?

Learning about a bigger reality from limited observations.

Scenario. You analyse only customers who completed a journey. What is the risk?

Survivorship bias. You miss the people who dropped out, so you overestimate success and satisfaction.

Why does sample size matter?

Small samples produce noisy estimates and can make random variation look like a pattern.

What is one question you ask before trusting a result?

Who is included, who is missing, and why.

What quietly breaks an A/B test?

Bias in who gets assigned, changes in instrumentation, and differences in data capture between groups.
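One of the quiet failures named in the last answer, bias in who gets assigned, can sometimes be caught with a sample ratio mismatch check: compare the observed group sizes with the split you intended. The sketch below uses a chi-square goodness-of-fit test with one degree of freedom. The group counts, the 0.001 threshold, and the function name are assumptions for illustration.

```python
from math import erfc, sqrt


def sample_ratio_mismatch_p_value(n_a, n_b, expected_share_a=0.5):
    """P-value for observed group sizes against the intended split.

    Chi-square goodness-of-fit with one degree of freedom. For one degree of
    freedom the upper-tail probability equals erfc(sqrt(chi_sq / 2)).
    """
    total = n_a + n_b
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    chi_sq = ((n_a - expected_a) ** 2 / expected_a
              + (n_b - expected_b) ** 2 / expected_b)
    return erfc(sqrt(chi_sq / 2))


THRESHOLD = 0.001  # assumed strict cut-off; choose your own before looking

# Hypothetical counts for an experiment designed as a 50/50 split.
for n_a, n_b in [(5000, 4950), (5000, 4600)]:
    p_value = sample_ratio_mismatch_p_value(n_a, n_b)
    verdict = "investigate assignment" if p_value < THRESHOLD else "split looks plausible"
    print(f"A={n_a} B={n_b}: p={p_value:.4f} -> {verdict}")
```

A very small p-value does not say what went wrong, only that the split is unlikely under the intended design. Passing the check also does not rule out the other causes in the answer above, such as instrumentation changes or differences in data capture between groups.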

Artefact and reflection

Artefact

A one-page decision note with assumption, evidence, and chosen action

Reflection

Where in your work would applying inference, sampling, and experiments to a realistic scenario change a decision, and what evidence would make you trust that change?

Optional practice

Work through one scenario and justify the decision with evidence

Sources

  • DAMA DMBOK 2 (Data Management Body of Knowledge, 2nd Edition)
  • ISO/IEC 11179 metadata registries
  • ISO/IEC 27701:2025 privacy information management
  • ICO data protection principles and UK GDPR guidance