CPD timing for this level
Foundations time breakdown
This is the first pass of a defensible timing model for this level, based on what is actually on the page: reading, labs, checkpoints, and reflection.
What changes at this level
Level expectations
I want each level to feel independent, but also clearly deeper than the last. This panel makes the jump explicit so the value is obvious.
Correct mental models for data, training, evaluation, and common pitfalls.
Not endorsed by a certification body. This is my marking standard for consistency and CPD evidence.
- A one page vocabulary map: features, labels, training, validation, testing, inference, and what can go wrong if each is misunderstood.
- A model failure note: one example of leakage or overfitting and the specific check you used to spot it.
- A simple risk statement for one AI use case: who is harmed if it is wrong, how you would notice, and your fallback.
AI Foundations
CPD tracking
Fixed hours for this level: 8. Time for the timed assessment is included once, on a pass.
View in My CPD
These notes are a calm path into AI. I focus on meaning first, numbers second, and practice always. The goal is not buzzwords. The goal is to build judgement you can use when you meet real systems.
This Foundations level is written to be CPD-friendly and beginner-safe, while still being technically honest. It covers skills that map well to:
- BCS Foundation Certificate in Artificial Intelligence: correct terms, responsible practice, and basic model thinking.
- CompTIA Data+ (as a supporting foundation): data basics, quality, and interpretation habits.
- NIST AI RMF 1.0 (foundational alignment): risk awareness and clear boundaries for use.
🧠What AI is and why it matters now
AI is a way of learning patterns from data so a system can make predictions, rank options, or automate decisions.
AI is powerful because it can learn patterns too complex to write as hand-written rules. It is also fragile because it can learn the wrong pattern. A model can look clever while failing quietly. The skill is to ask what it is really using as evidence.
AI matters now because systems touch decisions that used to be manual. Hiring screens, fraud checks, support routing, and medical triage all use models to move faster. That speed is useful, but it can also amplify mistakes at scale. This is why foundations matter. I need to know what the model is doing before I trust it.
Imagine a support team that uses an AI model to route urgent messages. If the model learns that certain phrases usually mean “urgent”, it may quietly miss urgent messages written in a different style. The risk is not only wrong answers. The risk is wrong priorities at speed.
In practice, the first useful question is “What happens on the model’s bad day?” If the answer is “we do not know”, the system is not ready to be trusted for anything high impact.
My calm view on hype is this. AI is not magic. It is applied statistics plus engineering. It will keep changing how work is done, and it will keep producing failures that look silly in hindsight. If you learn the basics, you can use it well without believing the marketing.
Rule based software vs model based systems
Rules are explicit. Models are learned from data.
Quick check. What is AI
Scenario: Someone says 'the AI decided'. What is the most accurate way to describe what a model actually does
Scenario: A model flags a legitimate invoice email as spam. Give one likely data reason
Scenario: You are building a spam filter. What is 'training' in plain terms
Scenario: The spam filter is live and scoring new emails. What is 'inference'
Scenario: Your model is 98% accurate but still causes harm. How can both be true
Scenario: In most products, what does an AI system actually output into the wider workflow
Why do models need monitoring after launch
What is one reason a rule based system may be preferred
What is a practical habit when reading AI claims
Scenario: A model correctly identifies 99% of emails as spam but the 1% error rate includes critical customer messages. Why is accuracy alone insufficient
Scenario: Your model performs perfectly in testing but fails on real user inputs. What is one likely cause
Why should you ask 'what happens on the model's bad day' before trusting an AI system
🧪Worked example. A spam filter that looks good on paper and fails in real life
Let’s build a mental model you can reuse. Imagine we want to classify emails as spam or not spam. We choose features that feel sensible: subject length, number of links, the sender domain, a few keywords. We train, it looks great, and everyone celebrates.
Then it ships. A week later, someone complains that customer invoices are being flagged as spam. Not because the model is “stupid”, but because it learned a shortcut. In training, spam often had many links, and invoices also have many links. So the model did what you rewarded it for: it used link count as a strong signal.
The lesson: the model does not understand what spam is. It understands what was correlated with the label in your training set. If your training set had different kinds of invoices, or if you measured the wrong thing, the model will happily optimise the wrong target.
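To make the shortcut visible, here is a tiny Python sketch with made-up numbers, assuming scikit-learn is available. It is not a real spam filter. It only shows how a model can lean on link count and then flag a link-heavy invoice.

```python
# A toy sketch with made-up numbers (assumes scikit-learn is installed).
# Features per email: [number_of_links, contains_promo_keyword (0 or 1)]
from sklearn.linear_model import LogisticRegression

X_train = [
    [8, 1], [12, 1], [9, 0], [15, 1],   # spam examples: link-heavy promotions
    [1, 0], [0, 0], [2, 0], [1, 1],     # not-spam examples: short personal mail
]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]      # 1 = spam, 0 = not spam

model = LogisticRegression().fit(X_train, y_train)

# A legitimate invoice: many links to line items, no promo keywords.
invoice = [[10, 0]]
print(model.predict(invoice))        # likely [1]: flagged as spam
print(model.predict_proba(invoice))  # and probably confident about it
```

The model is doing exactly what the training data rewarded. That is the whole lesson.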
⚠️Common mistakes I see (and how to spot them early)
🔎Verification. Can you explain the system, not just the model
- Write one sentence each for: input, output, and what a mistake costs.
- Write two examples of “bad day” inputs your training set might miss.
- Decide what the system does with low confidence outputs. Escalate to a human, ask for more information, or refuse to decide (there is a small sketch of this routing after the list).
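Here is a minimal sketch of that routing idea. The thresholds and the route names are illustrative assumptions, not a standard.

```python
# A minimal sketch of low-confidence routing. The thresholds and the
# route names are illustrative assumptions, not a standard.
def route(spam_probability: float) -> str:
    if spam_probability >= 0.95:
        return "auto-filter"     # confident enough to act automatically
    if spam_probability <= 0.05:
        return "deliver"         # confident enough it is fine
    return "human-review"        # uncertain: escalate rather than guess

for p in (0.99, 0.50, 0.02):
    print(p, "->", route(p))
```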
📝Reflection prompt
Think of one automated decision you have experienced in real life. A fraud check, a job filter, a content feed, a customer service bot. What do you think its “bad day” looks like. Who pays for it.
✅After this section you should be able to:
- Explain what a model is and why it exists in a system
- Explain what breaks when training data and real inputs diverge
- Explain the trade off between automation speed and human judgement
🧾CPD evidence prompt (copy friendly)
If you are logging CPD, keep the entry short and specific. If you can attach an artefact, do it.
📊Data and representation
Some data is structured. That means it fits neatly into rows and columns. Think customer age, number of failed logins, or time since last password reset. Other data is unstructured. That means it looks like raw text, images, audio, or long logs. It still has structure, but you have to extract it.
Models cannot understand raw text or images the way humans do. They do not see meaning. They see numbers. If you give a model a photo, it will be turned into numbers first. If you give it an email, it will be turned into numbers first. The model learns patterns in those numbers.
The intuition is simple. If two pieces of text are used in similar ways, they often end up with similar numbers. A model can then treat closeness as a hint that the meaning is related. It is not perfect. It is a useful shortcut.
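If you want to see the "similar things get similar numbers" idea in a few lines, here is a small sketch using plain word counts, assuming scikit-learn is available. Real systems use richer embeddings, but the principle is the same.

```python
# A small sketch of "text becomes numbers" using plain word counts
# (assumes scikit-learn). Real systems use richer embeddings.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

texts = [
    "please pay the attached invoice",
    "invoice attached, payment due friday",
    "win a free holiday, click the link now",
]
vectors = CountVectorizer().fit_transform(texts)

# The two invoice emails share words, so their vectors are more similar
# to each other than either is to the promotion.
print(cosine_similarity(vectors).round(2))
```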
In a real system, representation choices show up as behaviour. If you represent a customer only by “spend last month”, the model may miss that a loyal customer is having a temporary issue. If you represent them by richer behaviour signals, the model may be more useful but also harder to explain.
Suppose you build a model using “postcode” as a feature because it predicts outcomes well. In practice, that can become a proxy for protected attributes. The representation can silently encode social patterns you did not intend to automate.
Bad data creates bad models. If the labels are wrong, the model learns the wrong lesson. If the data is missing whole groups of people, the model will fail on those groups. If the data reflects old behaviour, the model will struggle when the world changes. This is why data work is not busywork. It is the foundation.
Splitting data matters because we want honest feedback. Training data is what the model learns from. Validation data is what you use to make choices during building. Test data is the final check you keep separate until the end. If you test on the same data you trained on, you are grading your own homework with the answer sheet open.
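Here is a minimal sketch of an honest split, assuming scikit-learn. The data is a stand-in and the proportions are just an example; the point is that the test set is carved off first and kept separate.

```python
# A minimal sketch of an honest split (assumes scikit-learn). The data
# here is a stand-in; the proportions are just an example.
from sklearn.model_selection import train_test_split

X = [[i] for i in range(100)]     # stand-in features
y = [i % 2 for i in range(100)]   # stand-in labels

# Carve off the final test set first, then split the rest for validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))   # 60 20 20
```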
From raw data to numbers a model can learn from
Turn messy inputs into a numeric representation.
Quick check. Data and representation
Scenario: A team says 'we have loads of data' but it is mostly outdated logs. In AI terms, what does data mean
Scenario: Your dataset is a mix of spreadsheet rows and customer emails. What is the difference between structured and unstructured data
Scenario: In a spam filter, 'number of links' is used by the model. What is that
Scenario: In training, each email is marked spam or not spam. What is that mark called
Why can models not understand raw text or images directly
Scenario: A model only accepts numbers. After processing, your email becomes a list of numbers. What is that representation called
Scenario: You want similar documents to sit near each other for search. What is an embedding for
Why do similar things often end up with similar numbers
Why do we split data into training, validation, and test sets
Scenario: The model performs well in a demo but fails for a real user group. Name one way bad data creates bad models
What is data leakage in simple terms
Why is a single metric rarely enough
🧪Worked example. When “postcode” becomes a shortcut feature
I want you to feel this in your bones because it shows up everywhere. Imagine we build a model to predict whether a customer will miss a payment. Someone suggests adding postcode because it improves the score. The model gets better on the spreadsheet, and the temptation is to ship it.
Here is the uncomfortable truth: postcode can act as a proxy for things we should not automate in a crude way. You might not be explicitly using protected attributes, but proxies still bake in social history. If you do not check this carefully, you are not “data driven”, you are laundering old bias through a new system.
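One rough check you can actually run: see how strongly the candidate feature predicts a protected attribute on its own. The data and names below are made up; the habit is what matters.

```python
# A rough proxy check with made-up rows. If a feature largely predicts a
# protected attribute on its own, it is carrying that information for you.
from collections import Counter

# (postcode_area, protected_group) - both values are hypothetical
rows = [
    ("AB1", "group_x"), ("AB1", "group_x"), ("AB1", "group_y"),
    ("CD2", "group_y"), ("CD2", "group_y"), ("CD2", "group_y"),
]

by_postcode = {}
for postcode, group in rows:
    by_postcode.setdefault(postcode, Counter())[group] += 1

for postcode, counts in by_postcode.items():
    share = max(counts.values()) / sum(counts.values())
    print(postcode, f"largest group share = {share:.0%}")
    # A high share means postcode is quietly encoding group membership.
```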
⚠️Common mistakes in data and representation
- Treating identifiers as numbers. If it is a label, store it as a label.
- Confusing “missing” with “zero”. In many datasets, missingness is its own signal.
- Letting the model see the future by accident (leakage). It makes you feel clever right before it humiliates you in production. There is a small sketch of this after the list.
- Using one representation because it is easy, then being surprised when the model cannot learn what you care about.
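To see leakage in numbers, here is a small synthetic sketch, assuming scikit-learn. The "leaky" feature is derived from the outcome itself, which is exactly what happens when a column is only filled in after the event you are trying to predict.

```python
# A synthetic leakage sketch (assumes scikit-learn). The leaky feature
# is built from the outcome, so it smuggles the answer into training.
import random
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

random.seed(0)
y = [random.randint(0, 1) for _ in range(200)]               # the outcome
honest = [[random.random()] for _ in y]                      # no real signal
leaky = [[h[0], label + random.random() * 0.1] for h, label in zip(honest, y)]

print(cross_val_score(LogisticRegression(), honest, y).mean())  # roughly chance
print(cross_val_score(LogisticRegression(), leaky, y).mean())   # near perfect: too good to be true
```

A score that jumps from "roughly chance" to "near perfect" when you add one convenient column is a prompt to ask when that column is actually known.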
🔎Verification. A small checklist before you trust any dataset
- What does each feature mean in plain English, and how is it measured.
- What is missing, and who is missing.
- What is the label, and who decided it.
- What would make the label wrong. Be specific.
- If the world changes, which features will drift first.
📝Reflection prompt
Write down one feature from your work or life that feels “predictive”. Now write down the uncomfortable question: what is it a proxy for.
✅After this section you should be able to:
- Explain why representation choices change what a model can learn
- Explain what breaks when labels, groups, or context are missing from data
- Explain the trade off between simple features and richer embeddings
🎓Supervised and unsupervised learning
When we say a model learns, we mean it changes its internal settings so it can make better guesses. It is not learning like a person learns. It is closer to practice. You show examples, it adjusts, and it gets less wrong over time.
In supervised learning, you give the model an input and an answer. The model tries to guess the answer, then it is corrected. Over many examples, it learns a pattern that can generalise to new cases.
Email spam filtering is a classic supervised example. You have emails, and you have labels like spam and not spam. Image classification is another. You have images, and you have labels like cat, dog, or receipt. House prices are supervised too, but the answer is a number. The same pattern applies. Inputs in, answer attached, model learns to predict.
There are two common supervised shapes: classification, where the answer is a category, and regression, where the answer is a number.
The difference matters because the mistakes feel different. A wrong category can block a real email. A wrong price can cost real money.
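Here is a small sketch of both shapes with made-up numbers, assuming scikit-learn. Same pattern both times: inputs in, answer attached, model learns to predict.

```python
# A small sketch of both supervised shapes with made-up numbers
# (assumes scikit-learn is installed).
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: inputs plus a category label (1 = spam, 0 = not spam).
emails = [[0], [1], [2], [8], [12], [15]]      # number of links per email
labels = [0, 0, 0, 1, 1, 1]
spam_model = LogisticRegression().fit(emails, labels)
print(spam_model.predict([[10]]))              # output is a category

# Regression: inputs plus a numeric answer (a price in thousands).
houses = [[50], [80], [120], [200]]            # floor area
prices = [150, 210, 300, 480]
price_model = LinearRegression().fit(houses, prices)
print(price_model.predict([[100]]))            # output is a number
```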
In unsupervised learning there are no answers attached. Instead of asking "what is the right label", you ask "what patterns exist". This is useful when labels are missing, expensive, or not even well defined.
Grouping customers by behaviour is a common unsupervised example. You might discover that one group buys weekly and another group buys once a year. Topic discovery in documents is another. You might find clusters of themes in support tickets without anyone labelling them by hand. Anomaly detection is a third. You look for unusual behaviour that might signal fraud or intrusion.
Unsupervised learning is harder to evaluate because there is no single correct answer waiting in a spreadsheet. If you change your settings, the groupings can change. Sometimes both results are reasonable. You have to judge usefulness, not just score points.
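Here is a minimal clustering sketch with made-up customer numbers, assuming scikit-learn. Notice there is no answer column, and changing the number of clusters changes the story.

```python
# A minimal clustering sketch with made-up customer numbers
# (assumes scikit-learn). Changing k changes the grouping.
from sklearn.cluster import KMeans

# (purchases per month, average basket value)
customers = [[4, 20], [5, 25], [1, 200], [1, 180], [12, 15], [10, 18]]

for k in (2, 3):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(customers)
    print(k, "clusters ->", list(model.labels_))
# Both groupings are defensible. Which one is useful is a judgement call.
```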
Imagine a bank clustering transactions to find “normal” behaviour. If the system learns that weekend spending is “unusual” for a certain group, it might flag normal customers as fraud. Unsupervised results still need human judgement and context.
In practice, teams use clustering to create segments and then make decisions based on those segments. That means errors in the clustering can become policy, pricing, or access decisions. Treat cluster labels as hypotheses, not truth.
Here are a few beginner misconceptions to avoid. First, more data is not always better data. If it is biased or messy, you scale the problem. Second, unsupervised learning is not a free shortcut. It still needs careful interpretation. Third, a model learning a pattern does not mean it understands a reason. It means it found a shortcut that worked on the training data.
Two ways models learn from data
Supervised has answers. Unsupervised searches for structure.
🧪Worked example. A classifier that is “accurate” and still useless
Suppose only 1% of transactions are fraud. A lazy model that always predicts “not fraud” gets 99% accuracy. It is also worthless. This is why I keep saying: one metric can lie to you.
In real systems, you usually care about questions like: how many real fraud cases did we catch (recall), how many innocent people did we annoy (false positives), and what is the operational cost of review.
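Here is that lazy model in a few lines of plain arithmetic, with made-up counts.

```python
# The "99% accurate and worthless" model, in plain arithmetic.
# 1,000 transactions, 10 are fraud, and the model always says "not fraud".
y_true = [1] * 10 + [0] * 990   # 1 = fraud
y_pred = [0] * 1000             # the lazy model

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
caught = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = caught / sum(y_true)

print(f"accuracy = {accuracy:.0%}")   # 99%
print(f"recall   = {recall:.0%}")     # 0%: every fraud case was missed
```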
⚠️Common mistakes in learning paradigms
- Thinking supervised learning means “truth” because labels exist. Labels can be wrong.
- Treating cluster names as real categories. They are just a grouping choice.
- Forgetting that the model learns what you reward. If you reward one metric, it will optimise that metric even if it harms everything else.
🔎Verification. Prove you understand the evaluation question
- For a chosen example (spam, fraud, triage), write which error hurts more and why.
- Decide what should happen with borderline cases.
- Explain the difference between “the model is uncertain” and “the model is confidently wrong”.
📝Reflection prompt
Pick a decision you would never fully automate. Why. What information would you want a model to provide to help a human decide instead.
Quick check. Supervised and unsupervised learning
Scenario: After training, the model gets better at predicting from examples. In plain language, what changed
Scenario: You accidentally train and test on the same emails. Why is that a problem
Scenario: You have thousands of emails labelled spam or not spam. What makes this supervised
Scenario: You have no labels, but you want to group customers by behaviour. What makes this unsupervised
Scenario: A model predicts 'spam' vs 'not spam'. What type of task is that
Scenario: A model predicts a delivery time in minutes. What type of task is that
Scenario: Why is unsupervised learning harder to evaluate
Scenario: A clustering tool produces five clusters with neat names. What is a practical risk
Scenario: You train a classification model and it learns patterns. What does 'training' actually do in technical terms
Scenario: A model that predicts fraud has high precision but low recall. What does that mean in plain terms
Scenario: Why do we use validation data separately from training data
Scenario: You discover your unsupervised clustering groups customers differently each time you run it. Is this normal
✅After this section you should be able to:
- Explain when supervised learning is appropriate and what it optimises for
- Explain when unsupervised learning is useful and why evaluation is harder
- Explain what breaks when people treat model outputs as understanding
⚖️Responsible AI basics and limitations
AI systems can cause harm even when everybody is trying to do the right thing. The harm is not always dramatic. Sometimes it is quiet and personal. Someone is incorrectly flagged as suspicious. Someone does not get offered an opportunity. Someone is pushed into a bubble of content that makes them angrier and more certain.
The core reason is simple. Models learn patterns, not truth. They learn what tends to follow what in the data they were given. They do not know what is fair. They do not know what is lawful. They do not know what is kind. They only know what was rewarded during training.
Bias can enter through the data, through the labels, and through the choices we make. If your training data under represents some faces, facial recognition errors show up first in those groups. If your labels reflect past human decisions, you teach the model those decisions as if they were objective truth. If you optimise only for speed, you often trade away care.
In a real system, a “small” bias can become a big harm because automation runs every day. Imagine an AI triage tool that consistently underestimates risk for one group. Even if the average performance looks fine, the harm concentrates on real people.
In practice, the fix is not only better metrics. The fix is process: clear accountability, human review for high impact cases, and monitoring that triggers action when outcomes drift or complaints rise.
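A per-group check is one of the simplest monitoring signals you can keep. Here is a small sketch with made-up results: the overall error rate looks fine, the breakdown does not.

```python
# A per-group check with made-up results: (group, prediction_was_wrong).
from collections import defaultdict

results = [("a", 0)] * 90 + [("a", 1)] * 2 + [("b", 0)] * 4 + [("b", 1)] * 4

by_group = defaultdict(list)
for group, wrong in results:
    by_group[group].append(wrong)

overall = sum(wrong for _, wrong in results) / len(results)
print(f"overall error rate = {overall:.0%}")   # 6%: looks fine on average
for group, flags in by_group.items():
    print(f"group {group} error rate = {sum(flags) / len(flags):.0%}")
# Group a sits around 2%, group b at 50%. The average hides the harm.
```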
Fairness is not a single magic number. It is a set of decisions. What harm do we want to prevent. Which groups matter in this context. What trade offs are acceptable. It belongs with humans, not only with metrics.
Explainability matters most when decisions affect real lives. An automated decision system that cannot explain itself becomes hard to challenge. People then learn to distrust the whole process, or worse, they stop trying to challenge it at all.
Bigger models do not automatically mean better or safer. They can be more capable and still be more confusing. They can also be more expensive to run, which encourages shortcuts. If it costs too much to monitor and evaluate, teams quietly stop doing it.
🎯What this optimises for
- Automation speed and consistency for low risk tasks
- Scaled assistance when humans have time and context
⚠️What this makes harder
- Accountability if oversight is vague
- Safe rollback when models drift or fail
AI confidence is not the same as correctness. A model can be very confident and still be wrong. This shows up most painfully when people over trust AI answers. A model that sounds fluent can still invent details. It is good at sounding plausible. It is not automatically good at being true.
This is why human judgement must stay in control. Use AI to help you think, draft, or explore. Do not let it quietly become the decision maker for high impact outcomes. If a system can deny a benefit, flag a person, or change a life, it needs clear oversight and a way to appeal.
Drift is not rare. It is normal. People change behaviour. Fraud patterns adapt. Language evolves. Even a simple recommendation system can drift into a bubble because the system is changing the data it then learns from. You cannot evaluate once and declare victory.
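Here is a deliberately crude drift check in plain Python, with made-up values. Real monitoring uses better statistics, but the habit of comparing a recent window to the training window is the point.

```python
# A deliberately crude drift check with made-up values.
from statistics import mean, stdev

training_window = [2.1, 1.9, 2.0, 2.3, 1.8, 2.2, 2.0, 2.1]  # e.g. links per email
recent_window   = [4.8, 5.1, 4.9, 5.3, 5.0, 4.7, 5.2, 4.9]

baseline = mean(training_window)
shift = abs(mean(recent_window) - baseline)

# Crude rule: flag if the mean moves by more than two training standard deviations.
if shift > 2 * stdev(training_window):
    print(f"possible drift: mean moved from {baseline:.1f} to {mean(recent_window):.1f}")
```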
The AI lifecycle and where risks appear
Risk shows up at every step. Oversight wraps the loop.
🧪Worked example. Hallucination meets a real policy decision
Imagine a team uses a chat assistant to draft a customer policy response. The model writes something confident, polite, and wrong. It invents a rule that does not exist. A tired human skims it, copies it into an email, and the customer now has written confirmation of a policy the company does not have.
This is not a “prompt engineering” problem first. This is a governance problem. Who owns the output. What gets checked. What is the escalation path when the model produces something risky.
⚠️Common mistakes in responsible AI
- Calling something “fair” because you computed one fairness metric once.
- Treating a model’s confidence as permission to stop thinking.
- Shipping without monitoring, then acting surprised when drift happens.
- Using a model for a high impact decision without an appeal path.
🔎Verification. A lightweight governance checklist you can actually run
- Define the use case and the harm you are trying to prevent.
- Define who can approve changes, and who is accountable when it goes wrong.
- Decide what evidence you will keep: evaluation results, monitoring signals, and user feedback.
- Decide how you roll back. Not “if needed”, but “how, in practice”.
Quick check. Responsible AI basics and limitations
Scenario: A well-meaning automation quietly denies opportunities to a subset of people. Why can AI cause harm even when well intentioned
Scenario: A model repeats what historically happened, even if it was unfair. What does it mean that models learn patterns, not truth
Scenario: Name two places bias can enter an AI system
Scenario: Facial recognition fails more often for some groups. Why is that a fairness issue
Scenario: You upgrade to a bigger model and stop doing monitoring because it is expensive. Why is bigger not automatically safer
Scenario: The model sounds certain but is wrong. Why is AI confidence not the same as correctness
Scenario: The system invents a policy that does not exist and presents it as fact. What is that called
Scenario: Performance drops over three months as users change behaviour. What is model drift
Scenario: When must humans stay in control
Scenario: You want a safe default for high impact decisions. What should it be
Scenario: An incident happens and you need to explain what the model was for and where it fails. What is one governance artefact you should keep
Scenario: Why is it important to define who is accountable when an AI system causes harm
✅After this section you should be able to:
- Explain why models do not understand intent or truth and what breaks if you trust them blindly
- Explain why evaluation and monitoring are continuous, not a one time step
- Explain the trade off between AI automation and human oversight in high impact decisions
You now have the foundations to talk clearly about data, learning, and limits. Intermediate will build on this with evaluation, deployment, and system thinking. Foundations are about how you think, not which tool you use.
