Analysis and insight
By the end of this module you will be able to:
- Distinguish descriptive, diagnostic, predictive, and prescriptive analytics
- Explain feature engineering in machine learning contexts
- Connect analytics outputs to business decisions

Real-world application · 2020
Tesco's Clubcard data predicted panic-buying three days before shelves emptied.
In March 2020, Tesco's data science team analysed Clubcard transaction data and detected a shift in purchasing patterns three days before widespread panic-buying emptied shelves. Predictive models flagged unusual increases in long-life food, cleaning products, and medicine purchases.
The insight allowed supply chain teams to adjust stock allocations before the surge hit.
Descriptive analytics showed what customers bought last week. Predictive analytics showed what they would buy next. The difference was three days of preparation time.
Analytics transforms data into decisions. The four levels of analytics maturity answer progressively more valuable questions: what happened, why, what will happen, and what should we do. Most organisations operate at levels one and two. The competitive advantage lies in levels three and four.
This module starts with the four levels of analytics, then turns to feature engineering.
15.1 Four levels of analytics
Descriptive analytics answers "what happened?" using historical data: reports, dashboards, KPIs. This is where most organisations spend 80% of their analytics effort.
Diagnostic analytics answers "why did it happen?" by drilling into root causes: segmentation, cohort analysis, anomaly investigation. For example: revenue dropped 12% in Q3, and diagnostic analysis reveals that a product recall drove 80% of the decline.
Predictive analytics answers "what will happen?" using statistical models and machine learning: forecasting, classification, regression. For example: based on current trends, Q4 revenue is forecast at £2.1M.
Prescriptive analytics answers "what should we do?" using optimisation and simulation: resource allocation, scenario planning, decision support. For example: to maximise Q4 revenue, shift 60% of marketing budget to digital channels.
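As a rough sketch of how the four questions differ in practice, the Python below asks all four of one toy dataset. Every figure, the segment breakdown, the three-point linear trend, and the `expected_uplift` response curve are assumptions invented for illustration and chosen to echo the Q3 and Q4 examples above. None of it is real data or a recommended forecasting method.

```python
from math import sqrt

# All figures are invented for illustration (revenue in £k).
revenue = {"Q1": 2500, "Q2": 2500, "Q3": 2200}
# Hypothetical breakdown of the Q2 -> Q3 change by segment.
change_by_segment = {"recalled product line": -240, "all other products": -60}

# 1. Descriptive: what happened? Revenue fell between Q2 and Q3.
drop = revenue["Q2"] - revenue["Q3"]
drop_pct = 100 * drop / revenue["Q2"]

# 2. Diagnostic: why? Attribute the decline to segments.
recall_share = (
    100 * change_by_segment["recalled product line"]
    / sum(change_by_segment.values())
)


# 3. Predictive: what will happen? A naive least-squares trend line,
#    standing in for a real forecasting model.
def linear_trend_forecast(values, steps_ahead=1):
    n = len(values)
    xs = range(1, n + 1)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n + steps_ahead)


q4_forecast = linear_trend_forecast(list(revenue.values()))


# 4. Prescriptive: what should we do? Search candidate budget splits against
#    a hypothetical diminishing-returns response curve (an assumption).
def expected_uplift(digital_pct):
    d = digital_pct / 100
    return sqrt(3 * d) + sqrt(2 * (1 - d))


best_digital_pct = max(range(0, 101, 10), key=expected_uplift)

print(f"Descriptive: revenue fell {drop_pct:.0f}% in Q3")
print(f"Diagnostic: the recall explains {recall_share:.0f}% of the decline")
print(f"Predictive: Q4 forecast is £{q4_forecast:.0f}k")
print(f"Prescriptive: allocate {best_digital_pct}% of budget to digital")
```

The first two steps only summarise data that already exists. The last two need a model of the future and a model of how actions change outcomes, which is why they are harder to reach and more valuable when they work.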
“The goal of analytics is not to produce reports. It is to produce decisions.”
Thomas Davenport, 'Competing on Analytics' (2007) - Chapter 1
Davenport's framing shifted analytics from a reporting function to a decision-support function. The distinction matters: a dashboard that nobody acts on is cost, not value.
Common misconception
“Our organisation does predictive analytics because we have a machine learning model.”
A machine learning model that sits in a Jupyter notebook and is never integrated into a decision process is not predictive analytics. It is an experiment. Predictive analytics means the model's output systematically informs decisions: routing, pricing, inventory allocation, risk scoring. The gap between a working model and an operational decision tool is where most ML projects fail.
Predictive and prescriptive analytics rely on models, and models are only as good as their inputs. The next section covers feature engineering.
15.2 Feature engineering
Feature engineering is the process of creating input variables (features) for machine learning models from raw data. A raw dataset might contain a timestamp. Feature engineering extracts: day of week, hour of day, is_weekend, days_since_last_purchase, rolling_7_day_average. These derived features often determine model performance more than algorithm choice.
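The timestamp example can be sketched in a few lines of standard-library Python. The purchase history is invented, and the definition of `rolling_7_day_average` (mean amount of earlier purchases in the preceding seven days, excluding the current one) is an assumption, since the feature could reasonably be defined in other ways.

```python
from datetime import datetime, timedelta

# Hypothetical purchase history for one customer, sorted by time:
# (timestamp, amount in £).
purchases = [
    (datetime(2020, 3, 2, 9, 15), 20.0),   # a Monday
    (datetime(2020, 3, 7, 18, 40), 35.0),  # a Saturday
    (datetime(2020, 3, 10, 12, 5), 15.0),  # a Tuesday
]


def build_features(purchases):
    """Return one feature dict per purchase, using only earlier purchases."""
    rows = []
    for i, (ts, _amount) in enumerate(purchases):
        previous = purchases[:i]
        window_start = ts - timedelta(days=7)
        # Amounts of earlier purchases that fall inside the 7-day window.
        recent = [a for t, a in previous if t > window_start]
        rows.append(
            {
                "day_of_week": ts.weekday(),  # 0 = Monday, 6 = Sunday
                "hour_of_day": ts.hour,
                "is_weekend": ts.weekday() >= 5,
                "days_since_last_purchase": (
                    (ts - previous[-1][0]).days if previous else None
                ),
                "rolling_7_day_average": (
                    sum(recent) / len(recent) if recent else None
                ),
            }
        )
    return rows


features = build_features(purchases)
for row in features:
    print(row)
```

Each feature is computed only from purchases that happened before the current one, so the model never sees information it would not have at prediction time.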
Good feature engineering requires domain knowledge. A data scientist working on fraud detection needs to understand transaction patterns. A feature like "number of transactions in the last 15 minutes from the same card" captures a meaningful pattern that raw transaction records do not directly expose.
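One way such a velocity feature might be computed is with a sliding window per card. This is a minimal sketch: the function name `add_velocity_feature`, the card IDs, and the timestamps are hypothetical, and the input is assumed to be sorted by time.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def add_velocity_feature(transactions, window=timedelta(minutes=15)):
    """For each (card_id, timestamp), count earlier transactions on the
    same card within the preceding window. Input must be time-sorted."""
    recent = defaultdict(deque)  # card_id -> timestamps still inside the window
    counts = []
    for card_id, ts in transactions:
        queue = recent[card_id]
        # Drop timestamps that have fallen out of the window.
        while queue and ts - queue[0] > window:
            queue.popleft()
        counts.append(len(queue))
        queue.append(ts)
    return counts


# Hypothetical transactions, already sorted by time.
transactions = [
    ("card_A", datetime(2024, 1, 1, 10, 0)),
    ("card_B", datetime(2024, 1, 1, 10, 2)),
    ("card_A", datetime(2024, 1, 1, 10, 5)),
    ("card_A", datetime(2024, 1, 1, 10, 12)),
    ("card_A", datetime(2024, 1, 1, 10, 21)),
]

velocity = add_velocity_feature(transactions)
print(velocity)  # [0, 0, 1, 2, 1]
```

No single raw record carries this count. It only exists once records for the same card are related to each other over time, which is the domain insight the feature encodes.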
“Applied machine learning is basically feature engineering.”
Andrew Ng, Stanford CS229 lecture notes - Lecture on practical advice for ML
Ng's observation reflects industry experience: algorithm selection matters less than feature quality. A simple logistic regression with well-engineered features often outperforms a deep learning model with raw inputs.
Common misconception
“Deep learning eliminates the need for feature engineering.”
Deep learning can learn features from raw data (images, text, audio) in some domains. But for tabular business data (the vast majority of enterprise ML), feature engineering remains critical. A 2021 Kaggle survey found that top competitors in tabular data competitions spent more time on feature engineering than on model architecture.
A retail chain produces monthly sales reports showing revenue by store and product category. The reports are emailed to regional managers who review them in their next meeting. Which analytics level is this?
A data scientist builds a churn prediction model that achieves 92% accuracy in testing. Six months after deployment, the model's accuracy drops to 71%. What is the most likely cause?
A feature engineer creates 'rolling_90d_spend' from raw order data. This feature calculates the total spent by each customer in the last 90 days. Why is this more useful than raw order amounts for predicting churn?
Key takeaways
- Four analytics levels answer progressively more valuable questions: descriptive (what happened), diagnostic (why), predictive (what will happen), and prescriptive (what should we do).
- Feature engineering transforms raw data into model inputs. For tabular business data, feature quality often determines model performance more than algorithm choice.
- A model in a notebook is not predictive analytics. Predictive analytics means the model's output systematically informs decisions. The gap between a working model and an operational decision tool is where most ML projects fail.
- Model drift degrades prediction accuracy over time as real-world patterns change. Models need periodic retraining on recent data.
Standards and sources cited in this module
Davenport, T.H. (2007). Competing on Analytics
Chapter 1
Foundational text framing analytics as a competitive advantage. Shifts analytics from reporting to decision support.
Andrew Ng, Stanford CS229 lecture notes
Practical advice for ML
Source for 'applied ML is basically feature engineering' principle.
Kaggle State of Data Science Survey (2021)
Feature engineering section
Evidence that top competitors in tabular data competitions prioritise feature engineering over model architecture.
DAMA-DMBOK2 (2017)
Chapter 14, Data Science and Business Intelligence
Industry framework for analytics capability and maturity assessment.
Tesco Clubcard analytics case study, reported by The Guardian (March 2020)
Full article
Opening case study: predictive analytics detecting panic-buying patterns three days before shelves emptied.
Module 15 of 26 · Applied Data

