Foundations · Module 1

Understanding AI

Let me start with what AI is not.


Why this matters

Clear definitions of artificial intelligence, machine learning, and deep learning let you judge claims about capability instead of absorbing hype.

What you will be able to do

  1. Define artificial intelligence, machine learning, and deep learning in plain English.
  2. Explain what neural networks do, without pretending they think like humans.
  3. Spot common hype and make calmer, more accurate claims about capability.

Before you begin

  • No previous technical background required
  • Read the section explanation before using tools

Common ways people get this wrong

  • Spurious patterns. A model can learn shortcuts that look correct in training data, then fail badly in the real world.
  • Distribution shift. When the world changes, yesterday's patterns stop being reliable.

Main idea at a glance

The Three Layers of AI

From broadest to most specific: artificial intelligence is the broadest category, machine learning is a subset of it, and deep learning is a subset of machine learning. Each layer is defined below.

I think AI is fundamentally about systems that can make decisions or generate outputs based on learned patterns rather than hard-coded rules.

1.1.1 What is Artificial Intelligence?

Let me start with what AI is not. AI is not a conscious being. It does not think the way you think. It does not have feelings, desires, or goals of its own. Understanding this from the start will help you build better systems and avoid common mistakes.

The Human Analogy

Think of AI like teaching a child to recognise cats. You do not give the child a rulebook saying "cats have four legs, whiskers, and pointy ears." Instead, you show them thousands of pictures of cats and non-cats until they develop an internal understanding of what makes something a cat.

This is precisely how modern AI works. Instead of programming explicit rules (if whiskers AND four legs AND pointy ears THEN cat), we show the AI millions of examples and let it discover the patterns itself.
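The difference between the two approaches can be shown in a few lines of Python. The features, dataset, and perceptron below are toy illustrations for the cat example, not how production systems are built:

```python
# Toy contrast: explicit rules vs. learning a rule from examples.
# Features per animal: (has_whiskers, leg_count, has_pointy_ears) as numbers.

def rule_based_is_cat(whiskers, legs, pointy_ears):
    """The 'rulebook' approach: a human writes the conditions."""
    return whiskers == 1 and legs == 4 and pointy_ears == 1

def train_perceptron(examples, labels, epochs=20, lr=0.1):
    """The 'show examples' approach: weights are adjusted until
    predictions match the labels. This is what 'training' means."""
    weights = [0.0, 0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            pred = 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0
            error = y - pred  # -1, 0, or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# Tiny labelled dataset: two cats, a dog, a bird.
examples = [(1, 4, 1), (1, 4, 1), (0, 4, 0), (0, 2, 0)]
labels   = [1, 1, 0, 0]

weights, bias = train_perceptron(examples, labels)
print(predict(weights, bias, (1, 4, 1)))  # a cat-like input -> 1
```

Notice that nobody wrote a cat rule in the second approach: the weights that separate cats from non-cats were discovered from the examples.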

Artificial Intelligence

Any system that appears to exhibit intelligence. This includes everything from simple rule-based systems ("if temperature above 25 degrees, turn on air conditioning") to systems that can generate creative content like essays and images.

Machine Learning

A subset of AI where systems improve through experience. Instead of explicit programming, we provide data and the system learns patterns from it.

Deep Learning

A subset of machine learning that uses artificial neural networks with multiple layers. These "deep" networks can learn complex patterns in large datasets.

What AI can and cannot do

I think it is important to be honest about current capabilities. AI has made remarkable progress, but it also has significant limitations.

1.1.2 Key Terminology

Before we go further, let me define some terms you will encounter throughout this course.

Model

The trained AI system itself. Think of it as a brain that has learned from examples. When you use a chat assistant, you are interacting with a model.

Training

The process of teaching an AI using examples. During training, the model adjusts millions (or billions) of internal numbers to get better at its task.

Inference

When the AI uses what it learned to respond to new inputs. This is what happens when you ask a model a question. The training is already done. The model is now making inferences.

Parameters

The internal "knowledge" of a model, stored as numbers. Large models can contain billions of parameters. These numbers collectively encode the patterns the model has learned.

Context Window

How much information the AI can consider at once. If a model has a 100,000 token context window, it can "see" roughly 75,000 words of text at a time.

Token

A piece of text, roughly 4 characters or 3/4 of a word on average. AI models process text as tokens, not as letters or words.

Hallucination

When AI generates plausible-sounding but incorrect information. The model is not lying. It genuinely does not know the difference between what it invented and what is true.
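Two of the terms above, token and context window, come with rough numbers you can compute with. This sketch assumes the approximate 4-characters-per-token figure; real tokenizers vary by model, so treat the result as a planning heuristic, not an exact count:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token average.
    Real tokenizers split text differently per model, so this is only
    a heuristic for budgeting, never an exact count."""
    return max(1, round(len(text) / 4))

def fits_context(text: str, context_window_tokens: int) -> bool:
    """Check whether a text is likely to fit a given context window."""
    return estimate_tokens(text) <= context_window_tokens

sample = "AI models process text as tokens, not as letters or words."
print(estimate_tokens(sample), fits_context(sample, 100_000))
```

A 100,000-token window at roughly 3/4 of a word per token gives the "about 75,000 words" figure used above.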

1.1.3 Common Misconceptions

I want to address some things you may have heard about AI that are not quite accurate.

Common mistake

AI understands what it reads

Reality. AI identifies patterns in text. It processes language mathematically, not semantically. When you ask an AI about a book, it is not recalling the experience of reading. It is predicting what text should come next based on patterns.

Common mistake

AI is conscious or sentient

Reality. AI is sophisticated pattern matching. It has no subjective experience, desires, or consciousness. It may simulate emotions in text, but it does not feel them.

Common mistake

AI will replace all human jobs

Reality. AI augments human capabilities. Tasks, not entire jobs, get automated. New roles emerge. The Industrial Revolution did not eliminate work. It transformed it. AI is likely to do the same.

Common mistake

More parameters equals better AI

Reality. Architecture, training data quality, and fine-tuning matter more than raw size. A well-trained smaller model can outperform a poorly trained larger one on specific tasks.

1.1.4 Choosing a model without fooling yourself

The field moves too fast for memorising a leaderboard to be a sensible strategy. A better habit is to compare models against the same questions every time a new release appears.

What to compare when model names change

Start with task fit and risk, not marketing.

  1. Task fit

    Check what the model must actually do: summarise, code, retrieve, classify, reason, or use tools.

  2. Data sensitivity

    Decide whether you can send the data to a hosted API or whether you need stronger control over where it runs.

  3. Latency and cost

    Measure the real request path under your workload. Do not trust headline numbers without your own test cases.

  4. Operational controls

    Check logging, rate limits, tool permissions, model versioning, and rollback options before you build around a provider.
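The checklist above can be turned into a small, repeatable harness you rerun against the same questions whenever a new model appears. In this sketch, `ask_model` is a hypothetical placeholder for whatever client you actually call, and the test cases and latency budget are illustrative assumptions:

```python
import time

# A fixed evaluation set you reuse for every new model release.
# Each case pairs a prompt with a simple pass/fail check.
TEST_CASES = [
    {"prompt": "Summarise this paragraph in one sentence: ...",
     "check": lambda out: len(out) > 0},
    {"prompt": "Classify the sentiment of 'great product': positive or negative?",
     "check": lambda out: "positive" in out.lower()},
]

def evaluate(ask_model, cases, latency_budget_s=2.0):
    """Run the same cases against a model and record pass rate and latency.
    `ask_model` is a placeholder: any function from prompt string to
    response string (hosted API, local model, or a stub)."""
    results = []
    for case in cases:
        start = time.perf_counter()
        output = ask_model(case["prompt"])
        latency = time.perf_counter() - start
        results.append({
            "prompt": case["prompt"],
            "passed": case["check"](output),
            "within_budget": latency <= latency_budget_s,
        })
    passed = sum(r["passed"] for r in results)
    return {"pass_rate": passed / len(results), "results": results}

# Stub model for demonstration; swap in a real client here.
def stub_model(prompt):
    return "positive summary text"

report = evaluate(stub_model, TEST_CASES)
print(report["pass_rate"])
```

Measuring latency inside the harness, under your own workload, is what step 3 means by not trusting headline numbers.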

Open source vs closed source models

Open-weight or self-hosted models give you more control over deployment, privacy boundaries, and fine-tuning. Hosted proprietary models can reduce operational burden and may offer stronger managed tooling. Neither is inherently better. The right choice depends on your privacy requirements, budget, latency needs, and operational maturity.

Mental model

Learning from patterns

Modern AI mostly learns patterns from data, then uses them to predict what comes next.

  1. Training data
  2. Objective and loss
  3. Model
  4. Predictions
  5. Feedback loop
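The five steps above can be sketched as a minimal training loop. The toy task, a one-parameter model learning y = 2x by gradient descent, and the learning rate and step count are illustrative assumptions, not how large models are actually trained:

```python
# The pipeline as a loop: data -> loss -> model update -> predictions.
# Toy task: learn y = 2x from examples by gradient descent on one weight.

training_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # 1. training data

def loss(w):
    """2. Objective and loss: mean squared error of the predictions."""
    return sum((w * x - y) ** 2 for x, y in training_data) / len(training_data)

w = 0.0    # 3. the model: here, a single parameter (real models have billions)
lr = 0.05  # learning rate: how far each feedback step moves the parameter

for step in range(200):
    # Gradient of the loss with respect to w, computed analytically.
    grad = sum(2 * (w * x - y) * x for x, y in training_data) / len(training_data)
    w -= lr * grad  # 5. feedback loop: nudge the parameter to reduce the loss

prediction = w * 10.0  # 4. predictions on a new input the model never saw
print(round(w, 4), round(prediction, 2))
```

Training is the loop; inference is the single `w * 10.0` line at the end. That is the whole distinction from the terminology section, in miniature.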

Assumptions to keep in mind

  • Data reflects reality. If the training data is biased or stale, the model learns the wrong lesson with great confidence.
  • The metric matches the goal. If you optimise the wrong metric, you get the wrong behaviour faster.


Key terms

Agent
An agent is a loop that can plan, act with tools, observe what happened, then decide what to do next. The key idea is the loop, not the chat.
Large language model (LLM)
A large language model predicts the next token of text based on patterns in training data. It can be useful without being a mind.
Token
A token is a chunk of text. Models process tokens, not words, so the same sentence can take different space depending on its wording.
Context window
The context window is how much text the model can consider at once. Exact limits vary by provider and release, and long contexts still do not guarantee consistent attention across the whole input.
Tool
A tool is a controlled capability the agent can call. Tools are where real actions happen, so they need clear inputs, safe defaults, and strong validation.
Verification
Verification is how I check whether an output is safe and correct enough to use. It can be a test, a source, a calculation, or a second opinion.
Mixture of Experts (MoE)
An architecture where only a fraction of the model's parameters activate for each input. This lets a model keep large total capacity without paying the full compute cost on every token.
Context engineering
The discipline of designing systems that give the model the right information at the right time. It is broader than writing one clever prompt because it includes retrieval, memory, summaries, tool outputs, and token budgeting.
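The Mixture of Experts entry above can be sketched in a few lines. The router and experts here are toy stand-ins (in a real MoE both are small learned networks inside each layer); the point is only that each input runs k of the experts rather than all of them:

```python
# Toy sketch of Mixture of Experts routing: for each input, a router
# scores all experts but only the top-k actually run. Total capacity
# is all experts; compute cost per input is just k of them.

def make_expert(scale):
    """A stand-in 'expert': a trivial function of the input."""
    return lambda x: scale * x

experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]

def router_scores(x):
    """Stand-in router: assigns each expert a score for this input."""
    return [(i + 1) * x for i in range(len(experts))]

def moe_forward(x, k=2):
    scores = router_scores(x)
    # Pick the k highest-scoring experts for this input...
    top_k = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    # ...and run only those, combining their outputs.
    outputs = [experts[i](x) for i in top_k]
    return sum(outputs) / len(outputs), top_k

output, used = moe_forward(1.0, k=2)
print(output, used)  # only 2 of the 4 experts ran for this input
```

This is why an MoE model's parameter count overstates its per-token compute: most of the parameters sit idle on any given input.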

Check yourself

Quick check. Understanding AI


What is the simplest definition of modern AI in one sentence?

A system that learns patterns from data so it can make predictions, rankings, or decisions.

Why does it matter that a model can sound confident even when it is wrong?

Because people trust fluent language. A confident tone can hide weak evidence, which is risky when the stakes are real.

Scenario. A tool claims it 'understands' your document. What is the safer interpretation?

It matches patterns in text and predicts a response. It does not have human style understanding or lived experience.

Name two common limitations you should assume by default.

It can hallucinate and it can fail in novel situations where training data did not cover the edge case.

Artefact and reflection

Artefact

A short glossary in your own words.

Reflection

Where in your work would being able to define artificial intelligence, machine learning, and deep learning in plain English change a decision, and what evidence would make you trust that change?

Optional practice

Translate an AI headline into a concrete description of inputs and outputs.