Data Foundations stage test

No governed timed route exists for this stage yet, so this page gives you an untimed end-of-stage check built from the published question bank.

Format: Untimed self-check
Questions: 20
Best time to use it: After the stage modules and practice

Question 1

What is the most accurate definition of data?

  1. Data is any spreadsheet stored on a computer
  2. Data is recorded observations such as numbers, text, timestamps, or images that need context to have meaning
  3. Data is the same as information once it is written down
  4. Data only exists in digital form

Correct answer: Data is recorded observations such as numbers, text, timestamps, or images that need context to have meaning

Question 2

In the DIKW hierarchy, what separates information from knowledge?

  1. Information is digital and knowledge is analogue
  2. Information is structured data with context, and knowledge is information enriched by experience and reasoning
  3. Knowledge is always more accurate than information
  4. Information is raw and knowledge is a chart

Correct answer: Information is structured data with context, and knowledge is information enriched by experience and reasoning

Question 3

Why does the difference between a kilobyte (1,000 bytes) and a kibibyte (1,024 bytes) matter?

  1. It only matters for storage manufacturers who want to advertise larger drives
  2. Because mixing up base-10 and base-2 units causes capacity planning errors in real systems
  3. There is no practical difference between the two
  4. Kibibytes are an outdated standard that nobody uses

Correct answer: Because mixing up base-10 and base-2 units causes capacity planning errors in real systems
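To see the size of the gap in practice, here is a minimal sketch of the arithmetic. The 4 TB drive is a hypothetical example, and `tb_to_tib` is an illustrative helper name, not part of any library.

```python
# Base-10 (SI) and base-2 (IEC) unit sizes, in bytes.
KB = 1000           # kilobyte
KIB = 1024          # kibibyte
TB = 1000 ** 4      # terabyte
TIB = 1024 ** 4     # tebibyte


def tb_to_tib(terabytes: float) -> float:
    """Convert terabytes (base-10) to tebibytes (base-2)."""
    return terabytes * TB / TIB


# Hypothetical example: a drive advertised as 4 TB.
advertised_tb = 4
usable_tib = tb_to_tib(advertised_tb)

# Share of capacity that "disappears" if you read TB as TiB.
shortfall_pct = (1 - TB / TIB) * 100

print(f"{advertised_tb} TB is about {usable_tib:.2f} TiB")
print(f"Shortfall at this scale: about {shortfall_pct:.1f}%")
```

The gap is only 2.4% at the kilobyte level but grows with each prefix, which is why a plan that mixes the two units can come up short by roughly 9% at terabyte scale.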

Question 4

Which statement best compares JSON and CSV as data formats?

  1. JSON is always better than CSV because it supports nesting
  2. CSV is better because it uses less storage space
  3. JSON supports hierarchical structures and mixed types while CSV is flat and tabular, and the best choice depends on the use case
  4. CSV and JSON store data in exactly the same way

Correct answer: JSON supports hierarchical structures and mixed types while CSV is flat and tabular, and the best choice depends on the use case
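The trade-off can be illustrated with a short sketch using only Python's standard library. The `customer` record and the column names are made up for the example.

```python
import csv
import io
import json

# Hypothetical record: one customer with a nested list of orders.
customer = {
    "name": "Ada",
    "orders": [{"id": 1, "total": 9.5}, {"id": 2, "total": 20.0}],
}

# JSON keeps the nesting and the number types on a round trip.
restored = json.loads(json.dumps(customer))

# CSV is flat, so the record has to become one row per order.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "order_id", "total"])
writer.writeheader()
for order in customer["orders"]:
    writer.writerow(
        {"name": customer["name"], "order_id": order["id"], "total": order["total"]}
    )

# Reading the CSV back returns every value as a string.
buffer.seek(0)
rows = list(csv.DictReader(buffer))

print(restored["orders"][0]["total"], type(restored["orders"][0]["total"]))
print(rows[0]["total"], type(rows[0]["total"]))
```

Neither result is wrong. The flat CSV opens directly in a spreadsheet, while the JSON preserves structure that a tabular tool would have to rebuild.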

Question 5

What problem does character encoding (like UTF-8) solve?

  1. It compresses files to save storage space
  2. It maps characters to numbers so that different systems represent text the same way
  3. It encrypts sensitive text data
  4. It converts images into text

Correct answer: It maps characters to numbers so that different systems represent text the same way
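A small sketch shows the mapping and what happens when two systems disagree about it. The word used is an arbitrary example.

```python
text = "café"

# Each character maps to a number (its Unicode code point).
code_points = [ord(ch) for ch in text]

# UTF-8 turns those numbers into bytes; "é" needs two bytes.
utf8_bytes = text.encode("utf-8")

# A system that assumes a different encoding misreads the same bytes.
wrong = utf8_bytes.decode("latin-1")

# A system that agrees on UTF-8 recovers the original text.
right = utf8_bytes.decode("utf-8")

print(code_points)
print(utf8_bytes)
print(wrong)
print(right)
```

The bytes never change between the two decodes. Only the agreed mapping differs, which is the problem a shared encoding solves.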

Question 6

Why is ISO 8601 preferred for representing dates in data systems?

  1. Because it was the first date standard ever created
  2. Because it is shorter than other date formats
  3. Because the YYYY-MM-DD format sorts correctly, avoids day-month ambiguity, and works across time zones with offset notation
  4. Because it is required by law in all countries

Correct answer: Because the YYYY-MM-DD format sorts correctly, avoids day-month ambiguity, and works across time zones with offset notation
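The sorting and offset behaviour can be checked with a minimal sketch. The dates and the +05:30 offset are arbitrary examples.

```python
from datetime import datetime, timezone

# The same three dates written in ISO 8601 and in day/month/year form.
iso_dates = ["2024-11-03", "2023-02-14", "2024-01-25"]
dmy_dates = ["03/11/2024", "14/02/2023", "25/01/2024"]

# Plain text sorting puts ISO 8601 dates in chronological order...
iso_sorted = sorted(iso_dates)

# ...but sorts day/month/year strings by day first, which is not chronological.
dmy_sorted = sorted(dmy_dates)

# An ISO 8601 timestamp with an offset identifies one exact moment,
# so it converts to UTC without guesswork.
stamp = datetime.fromisoformat("2024-03-10T09:30:00+05:30")
utc = stamp.astimezone(timezone.utc)

print(iso_sorted)
print(dmy_sorted)
print(utc.isoformat())
```

Note also that "03/11/2024" reads as 3 November in some countries and 11 March in others, while "2024-11-03" has only one reading.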

Question 7

What do the FAIR principles stand for?

  1. Fast, Accurate, Integrated, Reliable
  2. Findable, Accessible, Interoperable, Reusable
  3. Free, Available, Independent, Reproducible
  4. Formal, Automated, Indexed, Regulated

Correct answer: Findable, Accessible, Interoperable, Reusable

Question 8

What is the key difference between open data and FAIR data?

  1. Open data must be FAIR, but FAIR data does not have to be open
  2. They are the same thing with different names
  3. Open data is about access rights, while FAIR data is about discoverability and reuse regardless of access restrictions
  4. FAIR data is always free, but open data can cost money

Correct answer: Open data is about access rights, while FAIR data is about discoverability and reuse regardless of access restrictions

Question 9

When is a bar chart a better choice than a pie chart?

  1. When you want to show exactly two categories
  2. When you need to compare values across more than three categories and the differences are small
  3. Pie charts are always better because they show proportions
  4. Bar charts are only for time-series data

Correct answer: When you need to compare values across more than three categories and the differences are small

Question 10

Which of these is NOT one of the six core dimensions of data quality?

  1. Accuracy
  2. Completeness
  3. Speed
  4. Consistency

Correct answer: Speed

Question 11

A database records a customer address as '123 Main St' in one table and '123 Main Street' in another. Which data quality dimension is violated?

  1. Accuracy
  2. Timeliness
  3. Consistency
  4. Completeness

Correct answer: Consistency
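One way to surface this kind of mismatch is to normalise both values before comparing them. The sketch below is a simplified illustration: `normalise_address` and the abbreviation table are hypothetical, and real address matching has to handle many more cases (for example, "St" can also mean "Saint").

```python
# Hypothetical lookup of common street-type abbreviations.
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue"}


def normalise_address(address: str) -> str:
    """Lowercase, drop full stops, and expand known abbreviations."""
    words = address.lower().replace(".", "").split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


table_a = "123 Main St"
table_b = "123 Main Street"

# A direct comparison treats the two records as different addresses.
raw_match = table_a == table_b

# After normalisation they are recognised as the same address.
normalised_match = normalise_address(table_a) == normalise_address(table_b)

print(raw_match, normalised_match)
```

Both values may be accurate descriptions of the same place. The issue is that the two tables represent it differently, which is why the dimension at stake is consistency and not accuracy.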

Question 12

What are the typical stages in a data lifecycle?

  1. Create, Store, Delete
  2. Collect, Process, Analyse, Archive, Destroy
  3. Plan, Collect, Process, Store, Share, Analyse, Archive, Destroy
  4. Input, Output, Feedback

Correct answer: Plan, Collect, Process, Store, Share, Analyse, Archive, Destroy

Question 13

Why is a data retention policy important?

  1. It makes databases run faster
  2. It defines how long data is kept and when it must be deleted, helping comply with regulations like GDPR
  3. It prevents anyone from ever deleting data
  4. It is only relevant for government organisations

Correct answer: It defines how long data is kept and when it must be deleted, helping comply with regulations like GDPR

Question 14

What is the primary role of a data steward?

  1. Writing SQL queries for the analytics team
  2. Ensuring data quality, enforcing standards, and acting as a bridge between business and technical teams
  3. Approving all data access requests personally
  4. Building data pipelines and ETL processes

Correct answer: Ensuring data quality, enforcing standards, and acting as a bridge between business and technical teams

Question 15

Under GDPR, what is the difference between a data controller and a data processor?

  1. The controller owns the hardware, the processor writes the software
  2. The controller decides why and how personal data is processed, the processor handles it on the controller's behalf
  3. They are different names for the same role
  4. The processor has more legal responsibility than the controller

Correct answer: The controller decides why and how personal data is processed, the processor handles it on the controller's behalf

Question 16

Why is informed consent important in data ethics?

  1. It is a legal formality with no practical impact
  2. Because people have the right to know what data is collected about them, why, and how it will be used before agreeing
  3. It only applies to medical data
  4. Consent is only needed when selling data to third parties

Correct answer: Because people have the right to know what data is collected about them, why, and how it will be used before agreeing

Question 17

What is algorithmic bias in the context of data ethics?

  1. When an algorithm runs slowly on certain hardware
  2. When the data used to train a model contains patterns that lead to unfair or discriminatory outcomes
  3. When an algorithm produces different results each time it runs
  4. When programmers intentionally write biased code

Correct answer: When the data used to train a model contains patterns that lead to unfair or discriminatory outcomes

Question 18

Which of these is the best example of structured data?

  1. A customer review paragraph on a product page
  2. A relational database table with defined columns for name, email, and order date
  3. A PDF scan of a handwritten letter
  4. An audio recording of a meeting

Correct answer: A relational database table with defined columns for name, email, and order date

Question 19

A temperature sensor records 38.5 degrees Celsius. What level of the DIKW hierarchy does this represent?

  1. Data, because it is a raw recorded value with no interpretation
  2. Information, because the number is meaningful
  3. Knowledge, because we know the temperature
  4. Wisdom, because we can act on it

Correct answer: Data, because it is a raw recorded value with no interpretation

Question 20

What is the primary risk of using a truncated Y-axis on a bar chart?

  1. The chart becomes harder to print
  2. It exaggerates small differences and can mislead the audience about the true scale of change
  3. The chart loads more slowly in a browser
  4. There is no risk as long as you label the axis

Correct answer: It exaggerates small differences and can mislead the audience about the true scale of change
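The exaggeration can be put into numbers with a minimal sketch. The values 50 and 52 and the axis start of 49 are hypothetical, and `drawn_height_ratio` is an illustrative helper, not a charting-library function.

```python
def drawn_height_ratio(value_a: float, value_b: float, axis_start: float = 0.0) -> float:
    """How many times taller bar B is drawn than bar A for a given axis start."""
    return (value_b - axis_start) / (value_a - axis_start)


# Hypothetical values that differ by 4%.
value_a, value_b = 50.0, 52.0

# With the axis starting at zero, bar B is drawn 4% taller than bar A.
honest = drawn_height_ratio(value_a, value_b)

# With the axis starting at 49, bar B is drawn three times as tall.
truncated = drawn_height_ratio(value_a, value_b, axis_start=49.0)

print(f"Axis from 0:  B is drawn {honest:.2f}x the height of A")
print(f"Axis from 49: B is drawn {truncated:.2f}x the height of A")
```

Bar length is what the eye compares, so a 4% difference drawn as a threefold difference misleads even when the axis is labelled.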