CPD timing for this level

Foundations time breakdown

This is the first pass of a defensible timing model for this level, based on what is actually on the page: reading, labs, checkpoints, and reflection.

Reading
42m
6,332 words · base 32m × 1.3
Labs
135m
9 activities × 15m
Checkpoints
20m
4 blocks × 5m
Reflection
32m
4 modules × 8m
Estimated guided time
3h 49m
Based on page content and disclosed assumptions.
Claimed level hours
8h
Claim includes reattempts, deeper practice, and capstone work.
The claimed hours are higher than the current on-page estimate by about 4h. That gap is where I will add more guided practice and assessment-grade work so the hours are earned, not declared.

What changes at this level

Level expectations

I want each level to feel independent, but also clearly deeper than the last. This panel makes the jump explicit so the value is obvious.

Anchor standards (course wide)
NIST AI Risk Management Framework (AI RMF 1.0) · ISO/IEC 23894 (AI risk management)
Assessment intent
Foundations

Correct mental models for data, training, evaluation, and common pitfalls.

Assessment style
Format: mixed
Pass standard
Coming next

Not endorsed by a certification body. This is my marking standard for consistency and CPD evidence.

Evidence you can save (CPD friendly)
  • A one page vocabulary map: features, labels, training, validation, testing, inference, and what can go wrong if each is misunderstood.
  • A model failure note: one example of leakage or overfitting and the specific check you used to spot it.
  • A simple risk statement for one AI use case: who is harmed if it is wrong, how you would notice, and your fallback.

AI Foundations

Level progress 0%

CPD tracking

Fixed hours for this level: 8. Timed assessment time is included once on pass.

Progress minutes
0.0 hours

These notes are a calm path into AI. I focus on meaning first, numbers second, and practice always. The goal is not buzzwords. The goal is to build judgement you can use when you meet real systems.

CPD and certification alignment (guidance, not endorsed):

This Foundations level is written to be CPD-friendly and beginner-safe, while still being technically honest. It covers skills that map well to:

  • BCS Foundation Certificate in Artificial Intelligence: correct terms, responsible practice, and basic model thinking.
  • CompTIA Data+ (as a supporting foundation): data basics, quality, and interpretation habits.
  • NIST AI RMF 1.0 (foundational alignment): risk awareness and clear boundaries for use.
How to use Foundations
If you are non-technical, I want you to understand what AI is without being talked down to. If you are technical, I want you to respect the definitions.
Good practice
After each section, explain it out loud in your own words. If you cannot, it is not learned yet. It is only recognised.

🧠

What AI is and why it matters now

Concept block
AI system boundary
A model is one component inside a system. Decisions, guardrails, and feedback live outside the model.
Assumptions
The model is not the product
There is a fallback path
We can observe outcomes
Failure modes
Fluent but wrong outputs
Automation bias
Silent drift

AI is a way of learning patterns from data so a system can make predictions, rank options, or automate decisions.

When people say AI, they often mean a system that takes input, applies a learned pattern, and produces an output. The learned pattern is the model. The act of learning that pattern is training. Using the trained model to produce results is inference.
The difference between training and inference matters because the risks are different. Training is where you bake in assumptions from the training data. Inference is where the model meets reality. If reality changes, the model can behave badly even if training looked perfect.
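
To make the split concrete, here is a minimal sketch of training and inference as two separate steps. The features, numbers, and labels are invented for illustration only; they are not part of the labs.

    # Toy sketch: training learns a pattern, inference applies it to new input.
    # The features and labels below are invented purely for illustration.
    from sklearn.linear_model import LogisticRegression

    # Inputs: [number_of_links, message_length]; labels: 1 = spam, 0 = not spam
    X_train = [[9, 120], [7, 80], [1, 400], [0, 650], [8, 95], [2, 500]]
    y_train = [1, 1, 0, 0, 1, 0]

    model = LogisticRegression()
    model.fit(X_train, y_train)            # training: assumptions get baked in here

    # Inference: the trained model meets a new, unseen email.
    new_email = [[3, 300]]
    print(model.predict(new_email))        # a prediction, for example [0] meaning "not spam"
    print(model.predict_proba(new_email))  # confidence, which is not the same as correctness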

AI is powerful because it can learn patterns too complex to write as hand made rules. It is also fragile because it can learn the wrong pattern. A model can look clever while failing quietly. The skill is to ask what it is really using as evidence.

AI matters now because systems touch decisions that used to be manual. Hiring screens, fraud checks, support routing, and medical triage all use models to move faster. That speed is useful, but it can also amplify mistakes at scale. This is why foundations matter. I need to know what the model is doing before I trust it.

Imagine a support team that uses an AI model to route urgent messages. If the model learns that certain phrases usually mean “urgent”, it may quietly miss urgent messages written in a different style. The risk is not only wrong answers. The risk is wrong priorities at speed.

In practice, the first useful question is “What happens on the model’s bad day?” If the answer is “we do not know”, the system is not ready to be trusted for anything high impact.

My calm view on hype is this. AI is not magic. It is applied statistics plus engineering. It will keep changing how work is done, and it will keep producing failures that look silly in hindsight. If you learn the basics, you can use it well without believing the marketing.

Rules based software vs model based systems

Rules are explicit. Models are learned from data.

Rules: if condition then action
Model: input -> model -> output
Training builds the model. Inference uses it.
Rules are easier to explain. Models need monitoring and care.
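
As a small illustration of that difference, here is a toy sketch of my own, not part of the labs: the rule is a threshold a human wrote down, while the "learned" threshold is chosen by counting mistakes on labelled examples.

    # A hand written rule: a human picked the threshold and can explain it.
    def rule_flags_spam(link_count):
        return link_count > 10

    # A "learned" threshold: pick the cut-off that makes the fewest mistakes
    # on labelled examples. The examples are invented, purely illustrative.
    examples = [(2, False), (3, False), (12, True), (15, True), (1, False), (9, True)]

    def learn_threshold(labelled_examples):
        best_threshold, best_errors = None, None
        for candidate in range(0, 20):
            errors = sum((links > candidate) != is_spam for links, is_spam in labelled_examples)
            if best_errors is None or errors < best_errors:
                best_threshold, best_errors = candidate, errors
        return best_threshold

    print(rule_flags_spam(12))        # True, because a human said "more than 10"
    print(learn_threshold(examples))  # whatever cut-off best fits these labelled examples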

Quick check. What is AI

Scenario: Someone says 'the AI decided'. What is the most accurate way to describe what a model actually does

Scenario: A model flags a legitimate invoice email as spam. Give one likely data reason

Scenario: You are building a spam filter. What is 'training' in plain terms

Scenario: The spam filter is live and scoring new emails. What is 'inference'

Scenario: Your model is 98% accurate but still causes harm. How can both be true

Scenario: In most products, what does an AI system actually output into the wider workflow

Why do models need monitoring after launch

What is one reason a rule based system may be preferred

What is a practical habit when reading AI claims

Scenario: A model correctly identifies 99% of emails as spam but the 1% error rate includes critical customer messages. Why is accuracy alone insufficient

Scenario: Your model performs perfectly in testing but fails on real user inputs. What is one likely cause

Why should you ask 'what happens on the model's bad day' before trusting an AI system

🧪

Worked example. A spam filter that looks good on paper and fails in real life

Let’s build a mental model you can reuse. Imagine we want to classify emails as spam or not spam. We choose features that feel sensible: subject length, number of links, the sender domain, a few keywords. We train, it looks great, and everyone celebrates.

Then it ships. A week later, someone complains that customer invoices are being flagged as spam. Not because the model is “stupid”, but because it learned a shortcut. In training, spam often had many links, and invoices also have many links. So the model did what you rewarded it for: it used link count as a strong signal.

The lesson: the model does not understand what spam is. It understands what was correlated with the label in your training set. If your training set had different kinds of invoices, or if you measured the wrong thing, the model will happily optimise the wrong target.
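
One way to catch a shortcut like this early is to look at which features the model actually leans on. Here is a minimal sketch with invented numbers, where link count is deliberately made dominant so the inspection step has something to find.

    # Sketch: inspect which features a simple model relies on.
    # The data is invented so that link_count dominates, mimicking the shortcut.
    from sklearn.linear_model import LogisticRegression

    feature_names = ["link_count", "subject_length"]
    X = [[12, 40], [10, 35], [11, 50], [1, 45], [0, 38], [2, 60],
         [9, 42], [13, 55], [1, 30], [0, 70]]
    y = [1, 1, 1, 0, 0, 0, 1, 1, 0, 0]   # 1 = spam in this toy set

    model = LogisticRegression().fit(X, y)

    for name, weight in zip(feature_names, model.coef_[0]):
        print(f"{name}: {weight:+.2f}")
    # If link_count carries nearly all the weight, ask what else has many links.
    # Invoices do. That is the shortcut failing quietly in production.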

⚠️

Common mistakes I see (and how to spot them early)

Common mistakes (and what I do instead)
Mistake: worshipping accuracy
High accuracy can hide the mistakes you actually care about. I always ask what the false positives and false negatives cost in real life.
Mistake: no bad day plan
Mistake: treating the model output as the decision
Mistake: pretending errors are symmetric

🔎

Verification. Can you explain the system, not just the model

  • Write one sentence each for: input, output, and what a mistake costs.
  • Write two examples of “bad day” inputs your training set might miss.
  • Decide what the system does with low confidence outputs. Escalate to a human, ask for more information, or refuse to decide (a small sketch of one option follows this list).
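
Here is a minimal sketch of that last point, assuming the model exposes a confidence score. The 0.75 threshold and the route names are my own illustrative choices, not a standard.

    # Sketch: route low confidence predictions to a human instead of acting on them.
    # The threshold and route names are illustrative assumptions, not a standard.
    def route_prediction(label, confidence, threshold=0.75):
        if confidence >= threshold:
            return f"auto:{label}"          # low risk path: act on the prediction
        return "escalate:human_review"      # bad day path: a person decides

    print(route_prediction("urgent", 0.92))  # auto:urgent
    print(route_prediction("urgent", 0.51))  # escalate:human_review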

📝

Reflection prompt

Think of one automated decision you have experienced in real life. A fraud check, a job filter, a content feed, a customer service bot. What do you think its “bad day” looks like. Who pays for it.

After this section you should be able to:

  1. Explain what a model is and why it exists in a system
  2. Explain what breaks when training data and real inputs diverge
  3. Explain the trade off between automation speed and human judgement

🧾

CPD evidence prompt (copy friendly)

If you are logging CPD, keep the entry short and specific. If you can attach an artefact, do it.

CPD note template
What I studied
Core AI vocabulary, the difference between training and inference, and why AI systems fail when data and reality diverge.
What I practised
What changed in my practice
Evidence artefact

📊

Data and representation

Concept block
From event to feature
Data becomes a model input through choices about meaning, encoding, and measurement.
Assumptions
Features have stable meaning
Training data matches use
We treat leakage as a defect
Failure modes
Proxy features
Encoding surprises
Data drift

In AI, the word data sounds fancy, but it is usually boring. It is clicks, purchases, support tickets, photos, sensor readings, and text. Data always comes with context. Where did it come from. Who produced it. What is missing. What was measured badly. If you ignore that, you build a confident model on shaky ground.

Some data is structured. That means it fits neatly into rows and columns. Think customer age, number of failed logins, or time since last password reset. Other data is unstructured. That means it looks like raw text, images, audio, or long logs. It still has structure, but you have to extract it.

To train a model, we usually separate inputs from the answer we want. A feature is an input signal. A label is the outcome we want the model to learn to predict.

Models cannot understand raw text or images the way humans do. They do not see meaning. They see numbers. If you give a model a photo, it will be turned into numbers first. If you give it an email, it will be turned into numbers first. The model learns patterns in those numbers.

The simplest numeric form is a vector. For text, we first break it into a sequence of tokens. Then we map those tokens into an embedding.

The intuition is simple. If two pieces of text are used in similar ways, they often end up with similar numbers. A model can then treat closeness as a hint that the meaning is related. It is not perfect. It is a useful shortcut.
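
Here is a minimal sketch of the text to numbers step, using simple word counts instead of a learned embedding. The sentences are invented; the only point is that similar wording ends up with similar numbers.

    # Sketch: text -> tokens -> a vector of numbers. Word counts stand in for
    # a learned embedding; real systems usually learn denser representations.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    texts = [
        "please pay the attached invoice",
        "invoice attached, payment due this week",
        "team lunch is booked for friday",
    ]

    vectorizer = CountVectorizer()
    vectors = vectorizer.fit_transform(texts)

    print(vectorizer.get_feature_names_out())         # the tokens that became columns
    print(cosine_similarity(vectors[0], vectors[1]))  # the two invoice texts sit closer
    print(cosine_similarity(vectors[0], vectors[2]))  # the unrelated text sits further away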

In a real system, representation choices show up as behaviour. If you represent a customer only by “spend last month”, the model may miss that a loyal customer is having a temporary issue. If you represent them by richer behaviour signals, the model may be more useful but also harder to explain.

Suppose you build a model using “postcode” as a feature because it predicts outcomes well. In practice, that can become a proxy for protected attributes. The representation can silently encode social patterns you did not intend to automate.

Bad data creates bad models. If the labels are wrong, the model learns the wrong lesson. If the data is missing whole groups of people, the model will fail on those groups. If the data reflects old behaviour, the model will struggle when the world changes. This is why data work is not busywork. It is the foundation.

Splitting data matters because we want honest feedback. Training data is what the model learns from. Validation data is what you use to make choices during building. Test data is the final check you keep separate until the end. If you test on the same data you trained on, you are grading your own homework with the answer sheet open.
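
Here is a minimal sketch of making that split in practice. The 60/20/20 proportions are an illustrative choice, not a rule.

    # Sketch: carve out training, validation, and test sets.
    # The proportions are an illustrative choice, not a standard.
    from sklearn.model_selection import train_test_split

    X = [[i] for i in range(100)]       # stand-in features
    y = [i % 2 for i in range(100)]     # stand-in labels

    # First split off the final test set and do not touch it until the end.
    X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    # Then split the remainder into training and validation.
    X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=42)

    print(len(X_train), len(X_val), len(X_test))  # 60 20 20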

From raw data to numbers a model can learn from

Turn messy inputs into numeric representation.

Raw text or images
Cleaning and preparation
Features chosen from the data
Numeric representation as vectors and embeddings
Model input

Quick check. Data and representation

Scenario: A team says 'we have loads of data' but it is mostly outdated logs. In AI terms, what does data mean

Scenario: Your dataset is a mix of spreadsheet rows and customer emails. What is the difference between structured and unstructured data

Scenario: In a spam filter, 'number of links' is used by the model. What is that

Scenario: In training, each email is marked spam or not spam. What is that mark called

Why can models not understand raw text or images directly

Scenario: A model only accepts numbers. After processing, your email becomes a list of numbers. What is that representation called

Scenario: You want similar documents to sit near each other for search. What is an embedding for

Why do similar things often end up with similar numbers

Why do we split data into training, validation, and test sets

Scenario: The model performs well in a demo but fails for a real user group. Name one way bad data creates bad models

What is data leakage in simple terms

Why is a single metric rarely enough

🧪

Worked example. When “postcode” becomes a shortcut feature

I want you to feel this in your bones because it shows up everywhere. Imagine we build a model to predict whether a customer will miss a payment. Someone suggests adding postcode because it improves the score. The model gets better on the spreadsheet, and the temptation is to ship it.

Here is the uncomfortable truth: postcode can act as a proxy for things we should not automate in a crude way. You might not be explicitly using protected attributes, but proxies still bake in social history. If you do not check this carefully, you are not “data driven”, you are laundering old bias through a new system.

⚠️

Common mistakes in data and representation

  • Treating identifiers as numbers. If it is a label, store it as a label.
  • Confusing “missing” with “zero”. In many datasets, missingness is its own signal (see the sketch after this list).
  • Letting the model see the future by accident (leakage). It makes you feel clever right before it humiliates you in production.
  • Using one representation because it is easy, then being surprised when the model cannot learn what you care about.
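
To make the missing versus zero point concrete, here is a minimal sketch with invented values. Filling gaps with zero and keeping a separate "was it missing" flag tell the model two very different stories.

    # Sketch: "missing" is not the same as "zero". Values are invented.
    import pandas as pd

    spend_last_month = pd.Series([120.0, None, 0.0, 55.0])

    # Option A: silently replace missing with 0. A customer with no data now
    # looks identical to a customer who genuinely spent nothing.
    filled = spend_last_month.fillna(0.0)

    # Option B: keep a missingness flag so the model can treat the cases differently.
    was_missing = spend_last_month.isna().astype(int)

    print(pd.DataFrame({"filled": filled, "was_missing": was_missing}))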

🔎

Verification. A small checklist before you trust any dataset

  • What does each feature mean in plain English, and how is it measured.
  • What is missing, and who is missing.
  • What is the label, and who decided it.
  • What would make the label wrong. Be specific.
  • If the world changes, which features will drift first.

📝

Reflection prompt

Write down one feature from your work or life that feels “predictive”. Now write down the uncomfortable question: what is it a proxy for.

After this section you should be able to:

  1. Explain why representation choices change what a model can learn
  2. Explain what breaks when labels, groups, or context are missing from data
  3. Explain the trade off between simple features and richer embeddings

🎓

Supervised and unsupervised learning

Concept block
Choosing a learning setup
The learning paradigm follows from what you have, what you want, and what error costs you.
Assumptions
Success can be defined
You can tolerate mistakes
Failure modes
Optimising the wrong target
False certainty

When we say a model learns, we mean it changes its internal settings so it can make better guesses. It is not learning like a person learns. It is closer to practice. You show examples, it adjusts, and it gets less wrong over time.

In supervised learning, you give the model an input and an answer. The model tries to guess the answer, then it is corrected. Over many examples, it learns a pattern that can generalise to new cases.

Email spam filtering is a classic supervised example. You have emails, and you have labels like spam and not spam. Image classification is another. You have images, and you have labels like cat, dog, or receipt. House prices are supervised too, but the answer is a number. The same pattern applies. Inputs in, answer attached, model learns to predict.

There are two common supervised shapes. Classification predicts a category, such as spam or not spam. Regression predicts a number, such as a house price or a delivery time in minutes.

The difference matters because the mistakes feel different. A wrong category can block a real email. A wrong price can cost real money.
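
Here is a minimal sketch of the two shapes side by side, with invented numbers. The classifier returns a category; the regressor returns a number.

    # Sketch: classification returns a category, regression returns a number.
    # The data is invented purely to show the two output shapes.
    from sklearn.linear_model import LogisticRegression, LinearRegression

    X = [[1], [2], [3], [4], [5], [6]]

    # Classification: the labels are categories.
    spam_labels = [0, 0, 0, 1, 1, 1]
    classifier = LogisticRegression().fit(X, spam_labels)
    print(classifier.predict([[5.0]]))   # a category, here 1 meaning "spam"

    # Regression: the labels are numbers.
    prices = [100.0, 150.0, 210.0, 260.0, 330.0, 380.0]
    regressor = LinearRegression().fit(X, prices)
    print(regressor.predict([[3.5]]))    # a number, here roughly 238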

In unsupervised learning, instead of asking "what is the right label", you ask "what patterns exist". This is useful when labels are missing, expensive, or not even well defined.

Grouping customers by behaviour is a common unsupervised example. You might discover that one group buys weekly and another group buys once a year. Topic discovery in documents is another. You might find clusters of themes in support tickets without anyone labelling them by hand. Anomaly detection is a third. You look for unusual behaviour that might signal fraud or intrusion.

Unsupervised learning is harder to evaluate because there is no single correct answer waiting in a spreadsheet. If you change your settings, the groupings can change. Sometimes both results are reasonable. You have to judge usefulness, not just score points.
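
Here is a minimal sketch of clustering with invented spending data. The cluster numbers are arbitrary labels: with a different seed or different settings, the same customers can land in differently numbered groups.

    # Sketch: group customers by behaviour without labels. Numbers are invented.
    from sklearn.cluster import KMeans

    # [purchases_per_month, average_basket_value]
    customers = [[8, 20], [9, 22], [7, 18],     # frequent buyers, small baskets
                 [1, 150], [2, 170], [1, 160]]  # rare buyers, large baskets

    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
    print(kmeans.labels_)  # for example [0 0 0 1 1 1]; the numbers themselves mean nothing
    # The grouping is a hypothesis to investigate, not a fact about customers.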

Imagine a bank clustering transactions to find “normal” behaviour. If the system learns that weekend spending is “unusual” for a certain group, it might flag normal customers as fraud. Unsupervised results still need human judgement and context.

In practice, teams use clustering to create segments and then make decisions based on those segments. That means errors in the clustering can become policy, pricing, or access decisions. Treat cluster labels as hypotheses, not truth.

Here are a few beginner misconceptions to avoid. First, more data is not always better data. If it is biased or messy, you scale the problem. Second, unsupervised learning is not a free shortcut. It still needs careful interpretation. Third, a model learning a pattern does not mean it understands a reason. It means it found a shortcut that worked on the training data.

Two ways models learn from data

Supervised has answers. Unsupervised searches for structure.

Supervised: inputs + labels -> predict a known outcome
Unsupervised: inputs only -> group, compress, or flag unusual patterns
Supervised output: category or number
Unsupervised output: clusters, topics, or anomaly scores

🧪

Worked example. A classifier that is “accurate” and still useless

Suppose only 1% of transactions are fraud. A lazy model that always predicts “not fraud” gets 99% accuracy. It is also worthless. This is why I keep saying: one metric can lie to you.

In real systems, you usually care about questions like: how many real fraud cases did we catch (recall), how many innocent people did we annoy (false positives), and what is the operational cost of review.
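
Here is a minimal sketch of that trap with invented labels: the lazy "always not fraud" model scores high on accuracy and catches nothing.

    # Sketch: high accuracy, zero usefulness. Labels are invented: 1% fraud.
    from sklearn.metrics import accuracy_score, precision_score, recall_score

    y_true = [1] * 1 + [0] * 99   # 1 fraud case in 100 transactions
    y_lazy = [0] * 100            # the lazy model: always predict "not fraud"

    print(accuracy_score(y_true, y_lazy))                     # 0.99, looks impressive
    print(recall_score(y_true, y_lazy, zero_division=0))      # 0.0, caught no fraud at all
    print(precision_score(y_true, y_lazy, zero_division=0))   # 0.0, it never flagged anything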

⚠️

Common mistakes in learning paradigms

  • Thinking supervised learning means “truth” because labels exist. Labels can be wrong.
  • Treating cluster names as real categories. They are just a grouping choice.
  • Forgetting that the model learns what you reward. If you reward one metric, it will optimise that metric even if it harms everything else.

🔎

Verification. Prove you understand the evaluation question

  • For a chosen example (spam, fraud, triage), write which error hurts more and why.
  • Decide what should happen with borderline cases.
  • Explain the difference between “the model is uncertain” and “the model is confidently wrong”.

📝

Reflection prompt

Pick a decision you would never fully automate. Why. What information would you want a model to provide to help a human decide instead.

Quick check. Supervised and unsupervised learning

Scenario: After training, the model gets better at predicting from examples. In plain language, what changed

Scenario: You accidentally train and test on the same emails. Why is that a problem

Scenario: You have thousands of emails labelled spam or not spam. What makes this supervised

Scenario: You have no labels, but you want to group customers by behaviour. What makes this unsupervised

Scenario: A model predicts 'spam' vs 'not spam'. What type of task is that

Scenario: A model predicts a delivery time in minutes. What type of task is that

Scenario: Why is unsupervised learning harder to evaluate

Scenario: A clustering tool produces five clusters with neat names. What is a practical risk

Scenario: You train a classification model and it learns patterns. What does 'training' actually do in technical terms

Scenario: A model that predicts fraud has high precision but low recall. What does that mean in plain terms

Scenario: Why do we use validation data separately from training data

Scenario: You discover your unsupervised clustering groups customers differently each time you run it. Is this normal

After this section you should be able to:

  1. Explain when supervised learning is appropriate and what it optimises for
  2. Explain when unsupervised learning is useful and why evaluation is harder
  3. Explain what breaks when people treat model outputs as understanding

⚖️

Responsible AI basics and limitations

Concept block
Responsible use is a loop
Responsibility is not a document. It is an operating loop with measurement and correction.
Assumptions
Someone owns the decision
The system can pause
Failure modes
No monitoring
Policy lives only in words

AI systems can cause harm even when everybody is trying to do the right thing. The harm is not always dramatic. Sometimes it is quiet and personal. Someone is incorrectly flagged as suspicious. Someone does not get offered an opportunity. Someone is pushed into a bubble of content that makes them angrier and more certain.

The core reason is simple. Models learn patterns, not truth. They learn what tends to follow what in the data they were given. They do not know what is fair. They do not know what is lawful. They do not know what is kind. They only know what was rewarded during training.

Bias can enter through the data, through the labels, and through the choices we make. If your training data under represents some faces, facial recognition errors show up first in those groups. If your labels reflect past human decisions, you teach the model those decisions as if they were objective truth. If you optimise only for speed, you often trade away care.

In a real system, a “small” bias can become a big harm because automation runs every day. Imagine an AI triage tool that consistently underestimates risk for one group. Even if the average performance looks fine, the harm concentrates on real people.

In practice, the fix is not only better metrics. The fix is process: clear accountability, human review for high impact cases, and monitoring that triggers action when outcomes drift or complaints rise.

Fairness is not a single magic number. It is a set of decisions. What harm do we want to prevent. Which groups matter in this context. What trade offs are acceptable. It belongs with humans, not only with metrics.

Explainability matters most when decisions affect real lives. An automated decision system that cannot explain itself becomes hard to challenge. People then learn to distrust the whole process, or worse, they stop trying to challenge it at all.

Bigger models do not automatically mean better or safer. They can be more capable and still be more confusing. They can also be more expensive to run, which encourages shortcuts. If it costs too much to monitor and evaluate, teams quietly stop doing it.

🎯

What this optimises for

  1. Automation speed and consistency for low risk tasks
  2. Scaled assistance when humans have time and context

⚠️

What this makes harder

  1. Accountability if oversight is vague
  2. Safe rollback when models drift or fail

AI confidence is not the same as correctness. A model can be very confident and still be wrong. This shows up most painfully when people over trust AI answers. A model that sounds fluent can still invent details. It is good at sounding plausible. It is not automatically good at being true.

This is why human judgement must stay in control. Use AI to help you think, draft, or explore. Do not let it quietly become the decision maker for high impact outcomes. If a system can deny a benefit, flag a person, or change a life, it needs clear oversight and a way to appeal.

Drift is not rare. It is normal. People change behaviour. Fraud patterns adapt. Language evolves. Even a simple recommendation system can drift into a bubble because the system is changing the data it then learns from. You cannot evaluate once and declare victory.
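
Here is a minimal sketch of one crude drift check, assuming you log a numeric feature over time. The comparison and the threshold are my own illustrative choices, not a standard method.

    # Sketch: a crude drift check on one feature. Data and threshold are invented.
    import statistics

    reference = [4.1, 3.9, 4.0, 4.2, 4.1, 3.8, 4.0]   # feature values around launch
    this_week = [5.2, 5.5, 5.1, 5.4, 5.6, 5.0, 5.3]   # feature values now

    shift = abs(statistics.mean(this_week) - statistics.mean(reference))
    spread = statistics.stdev(reference)

    if shift > 2 * spread:   # illustrative threshold, tune it for your own context
        print("Possible drift: review the model before trusting this week's outputs")
    else:
        print("No obvious shift in this feature")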

The AI lifecycle and where risks appear

Risk shows up at every step. Oversight wraps the loop.

Data collection -> Training -> Evaluation
Deployment -> Monitoring -> Improvement
Human oversight wraps the lifecycle with review, policy, and clear accountability.

🧪

Worked example. Hallucination meets a real policy decision

Imagine a team uses a chat assistant to draft a customer policy response. The model writes something confident, polite, and wrong. It invents a rule that does not exist. A tired human skims it, copies it into an email, and the customer now has written confirmation of a policy the company does not have.

This is not a “prompt engineering” problem first. This is a governance problem. Who owns the output. What gets checked. What is the escalation path when the model produces something risky.

⚠️

Common mistakes in responsible AI

  • Calling something “fair” because you computed one fairness metric once.
  • Treating a model’s confidence as permission to stop thinking.
  • Shipping without monitoring, then acting surprised when drift happens.
  • Using a model for a high impact decision without an appeal path.

🔎

Verification. A lightweight governance checklist you can actually run

  • Define the use case and the harm you are trying to prevent.
  • Define who can approve changes, and who is accountable when it goes wrong.
  • Decide what evidence you will keep: evaluation results, monitoring signals, and user feedback.
  • Decide how you roll back. Not “if needed”, but “how, in practice”.

Quick check. Responsible AI basics and limitations

Scenario: A well-meaning automation quietly denies opportunities to a subset of people. Why can AI cause harm even when well intentioned

Scenario: A model repeats what historically happened, even if it was unfair. What does it mean that models learn patterns, not truth

Scenario: Name two places bias can enter an AI system

Scenario: Facial recognition fails more often for some groups. Why is that a fairness issue

Scenario: You upgrade to a bigger model and stop doing monitoring because it is expensive. Why is bigger not automatically safer

Scenario: The model sounds certain but is wrong. Why is AI confidence not the same as correctness

Scenario: The system invents a policy that does not exist and presents it as fact. What is that called

Scenario: Performance drops over three months as users change behaviour. What is model drift

Scenario: When must humans stay in control

Scenario: You want a safe default for high impact decisions. What should it be

Scenario: An incident happens and you need to explain what the model was for and where it fails. What is one governance artefact you should keep

Scenario: Why is it important to define who is accountable when an AI system causes harm

After this section you should be able to:

  1. Explain why models do not understand intent or truth and what breaks if you trust them blindly
  2. Explain why evaluation and monitoring are continuous, not a one time step
  3. Explain the trade off between AI automation and human oversight in high impact decisions

You now have the foundations to talk clearly about data, learning, and limits. Intermediate will build on this with evaluation, deployment, and system thinking. Foundations are about how you think, not which tool you use.

