Data Foundations · Module 3
Units, notation, and the difference between percent and probability
Data work goes wrong when people are casual about units.
Why this matters
If one dataset records energy in kWh and another records energy in MWh, then the same physical quantity will appear with numbers that differ by a factor of 1000.
What you will be able to do
- 1 Explain units, notation, and the difference between percent and probability in your own words, and apply them to a realistic scenario.
- 2 Explain how units and notation stop data from lying through ambiguity.
- 3 Check the assumption "Units are written where used" and explain what changes if it is false.
- 4 Check the assumption "Conversions are controlled" and explain what changes if it is false.
Before you begin
- No previous technical background required
- Read the section explanation before using tools
Common ways people get this wrong
- Unit mismatch. Two systems can each be correct on their own and wrong when combined. Mismatched units are a common cause.
- Ambiguous notation. If notation is inconsistent, people misread values and build wrong logic.
Data work goes wrong when people are casual about units. Units are not decoration. Units are the meaning. This is why I teach it early and I teach it bluntly.
Worked example. kWh and MWh are both “energy” and still not the same number
If one dataset records energy in kWh and another records energy in MWh, then the same physical quantity will appear with numbers that differ by a factor of 1000. A join can be perfectly correct and the final answer can be perfectly wrong.
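Here is a minimal sketch of that failure in Python. The records, the field names (`site`, `energy`, `unit`), and the readings are made up for illustration. Both records describe the same physical quantity, the join key matches, and the raw numbers still disagree until both are converted to one unit.

```python
# Factors for converting each unit to kWh (1 MWh = 1000 kWh).
KWH_PER_UNIT = {"kWh": 1, "MWh": 1000}

# Hypothetical records for the same site from two systems.
meter_reading = {"site": "A", "energy": 2500.0, "unit": "kWh"}
billing_reading = {"site": "A", "energy": 2.5, "unit": "MWh"}

def in_kwh(record):
    """Return the record's energy expressed in kWh."""
    return record["energy"] * KWH_PER_UNIT[record["unit"]]

# The join key matches, so the join itself is "correct".
same_site = meter_reading["site"] == billing_reading["site"]

# Comparing raw numbers ignores the unit and reports a mismatch.
naive_match = meter_reading["energy"] == billing_reading["energy"]

# Comparing after normalizing to kWh reports the truth.
normalized_match = in_kwh(meter_reading) == in_kwh(billing_reading)

print(same_site, naive_match, normalized_match)  # True False True
```

The point is not the arithmetic. The point is that the unit has to travel with the number, or the comparison has nothing to normalize against.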
A small cheat sheet you can reuse
Notation cheat sheet
Keep this close when you compare dashboards or datasets.
- Percent. Out of 100. Example: 12% means 12 out of 100.
- Probability. Out of 1. Example: 0.12 means 12 out of 100.
- Rate. Per unit time. Example: 3 requests per second.
- Count. How many. Example: 3 outages.
- Amount. Quantity with a unit. Example: 3 kWh.
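The percent and probability rows are the ones that bite most often, so here is a small sketch of what happens when they get mixed. The function names and the baseline of 200 are my own assumptions for illustration.

```python
def percent_to_probability(percent):
    """Convert a value written out of 100 (12 means 12%) to one written out of 1."""
    return percent / 100

def probability_to_percent(probability):
    """Convert a value written out of 1 (0.12) to one written out of 100."""
    return probability * 100

failure_rate_percent = 12  # stored as "12", meaning 12%
failure_probability = percent_to_probability(failure_rate_percent)

devices = 200  # hypothetical fleet size

# Correct: multiply the count by the value written out of 1 (about 24 devices).
expected_failures = devices * failure_probability

# Wrong: multiplying by the percent figure directly overstates the result 100 times.
overstated_failures = devices * failure_rate_percent
```

If a result is off by a factor of exactly 100, suspect this mix-up before anything else.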
Verification. Spot the three most common confusion traps
Unit and notation checks
Run this before accepting any trend claim.
- Check percentage storage. Confirm whether percentages are stored as 12 or 0.12 and document it.
- Check timestamp standard. Confirm whether timestamps are UTC or local time and state the time zone.
- Check magnitude against unit. If a number looks wrong, validate unit conversion before debating the trend.
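The first two checks can be partly automated. The sketch below is a rough heuristic under my own assumptions, not a substitute for documentation: the function names are hypothetical, and a column of genuinely small percentages (such as 0.5%) would be guessed as the 0 to 1 scale, so treat the result as a prompt to go and confirm.

```python
from datetime import datetime, timezone

def guess_percentage_scale(values):
    """Guess whether values look like they are stored out of 1 or out of 100.

    Heuristic only: it inspects the range of the values and cannot
    tell small percentages apart from probabilities.
    """
    if not values:
        return "unknown"
    if all(0 <= v <= 1 for v in values):
        return "0-1"
    if all(0 <= v <= 100 for v in values):
        return "0-100"
    return "out of range"

def is_timezone_aware(ts):
    """Return True if the datetime carries an explicit UTC offset."""
    return ts.tzinfo is not None and ts.utcoffset() is not None

print(guess_percentage_scale([0.12, 0.5, 0.03]))  # 0-1
print(guess_percentage_scale([12, 50, 3]))        # 0-100
print(is_timezone_aware(datetime(2024, 1, 1, 9, 0)))                       # False
print(is_timezone_aware(datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)))  # True
```

A timestamp that fails the second check is not wrong, it is unlabelled. You still have to find out which time zone it was recorded in.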
Mental model
Units protect meaning
Units and notation are how you stop data from lying through ambiguity.
- 1 Value
- 2 Unit
- 3 Context
- 4 Meaning
Assumptions to keep in mind
- Units are written where used. Units should be visible in dashboards, schemas, and docs, not hidden in a meeting note.
- Conversions are controlled. Conversions should be deliberate and tested. Silent conversions create drift.
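One way to make both assumptions hold in code is to keep the unit attached to the value and to allow only conversions you have listed. This is a sketch under my own assumptions: the `Energy` class and its `to` method are hypothetical, and the table covers only kWh and MWh.

```python
from dataclasses import dataclass

# The only units this sketch accepts; each factor converts to kWh.
_KWH_PER_UNIT = {"kWh": 1.0, "MWh": 1000.0}

@dataclass(frozen=True)
class Energy:
    """An energy value that always carries its unit."""
    value: float
    unit: str

    def __post_init__(self):
        # Reject unknown units at creation instead of guessing later.
        if self.unit not in _KWH_PER_UNIT:
            raise ValueError(f"Unsupported energy unit: {self.unit!r}")

    def to(self, unit):
        """Return a new Energy in the requested unit via an explicit conversion."""
        if unit not in _KWH_PER_UNIT:
            raise ValueError(f"Unsupported energy unit: {unit!r}")
        value_in_kwh = self.value * _KWH_PER_UNIT[self.unit]
        return Energy(value_in_kwh / _KWH_PER_UNIT[unit], unit)

reading = Energy(3, "MWh")
print(reading.to("kWh"))  # Energy(value=3000.0, unit='kWh')
```

Because every conversion goes through one function, there is one place to test and no place for a silent conversion to hide.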
Check yourself
Quick check. Units and notation
Why are units not decoration?
Units are the meaning. Without them, a number cannot be interpreted safely.
What is the difference between 12% and 0.12?
They represent the same proportion, but one is written out of 100 and the other is written out of 1. Mixing them causes errors.
Give one common timestamp trap.
Time zones. UTC and local time can shift day boundaries and make numbers disagree.
What is a quick first check when a value looks wrong?
Confirm the unit and definition before arguing about trends or blaming the pipeline.
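The timestamp trap in the quick check is easy to see with one event. In this sketch the local zone is a hypothetical fixed UTC+10 offset chosen for illustration. Real zones with daylight saving need a proper time zone database.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
LOCAL = timezone(timedelta(hours=10))  # hypothetical fixed UTC+10 offset

# One event, recorded late in the evening UTC.
event_utc = datetime(2024, 3, 1, 22, 30, tzinfo=UTC)

# The same instant, expressed in local time.
event_local = event_utc.astimezone(LOCAL)

print(event_utc == event_local)  # True: same instant
print(event_utc.date())          # 2024-03-01
print(event_local.date())        # 2024-03-02
```

A daily count grouped by UTC date puts this event on 1 March. A daily count grouped by local date puts it on 2 March. Neither report is broken, and the two will still disagree.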
Artefact and reflection
Artefact
A short module note with one key definition and one practical example
Reflection
Where in your work would being explicit about units, notation, and the difference between percent and probability change a decision, and what evidence would make you trust that change?
Optional practice
Complete one guided exercise and explain your decision in plain language