1. Computing Machinery and Intelligence
October 1950 | Artificial intelligence | Paradigm shift
There was no rigorous framework for discussing machine intelligence. The question 'Can machines think?' seemed philosophical rather than scientific. No criteria existed for evaluating claims about machine intelligence.
Alan Turing proposed the 'imitation game' (later called the Turing Test) as an operational definition of machine intelligence. Rather than asking 'Can machines think?', Turing reframed the question in terms of observable behaviour: can a machine's responses be indistinguishable from a human's?
Turing published 'Computing Machinery and Intelligence' in the journal Mind in October 1950. The paper addressed objections to machine intelligence and proposed the imitation game as a practical test. It is considered one of the founding documents of artificial intelligence. [1]
2. Dartmouth Summer Project
18 June 1956 to 17 August 1956 | Artificial intelligence | Paradigm shift
Research on machine intelligence was scattered across different disciplines with no unifying identity. There was no common terminology, no shared research agenda, and no recognition of a distinct field.
The Dartmouth Summer Research Project on Artificial Intelligence established AI as a distinct academic discipline. The term 'artificial intelligence', coined in the workshop proposal, gave the field its name. Key researchers gathered to define the field's scope and approach, creating a research community that would shape the following decades.
John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon proposed a two-month workshop at Dartmouth College. The proposal, submitted in August 1955, outlined ambitious goals including language use, abstraction, and self-improvement. The workshop ran in summer 1956, though attendance was sporadic. [2, 1]
3. Lighthill Report, first winter
1974 to 1993 | Artificial intelligence | Paradigm shift
Early AI research promised rapid progress towards human-level intelligence. Government and industry invested heavily based on optimistic predictions. Initial successes in narrow domains fuelled expectations of general breakthroughs.
Two major 'AI winters' saw dramatic reductions in funding and interest. The first (mid-1970s) followed the Lighthill Report and DARPA cuts. The second (late 1980s to early 1990s) followed the collapse of the expert systems market. Research continued, but with reduced resources and tempered expectations.
The 1973 Lighthill Report criticised AI's failure to achieve its ambitious goals, leading to UK funding cuts. DARPA reduced AI funding after projects failed to meet milestones. The second winter followed the collapse of companies selling specialised AI hardware (Lisp machines) and disillusionment with the limitations of expert systems. [3, 4]
4. Backpropagation revival
9 October 1986 | Artificial intelligence | Publication
Minsky and Papert's book Perceptrons (1969) had demonstrated the limitations of single-layer neural networks, contributing to reduced interest in connectionist approaches. Multi-layer networks could theoretically overcome these limitations, but no efficient training algorithm was widely known.
Rumelhart, Hinton, and Williams published a clear description of backpropagation for training multi-layer neural networks. While the algorithm had been discovered earlier, this paper made it accessible and demonstrated its power, reviving interest in neural networks.
The paper 'Learning representations by back-propagating errors' was published in Nature in October 1986. It showed how to efficiently compute gradients through multiple layers using the chain rule, enabling networks to learn internal representations. The clear exposition and compelling results sparked renewed interest in connectionism. [5, 6, 7]
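The chain-rule computation described above can be illustrated with a minimal sketch. It assumes a toy network with one input, one hidden unit, and one output, sigmoid activations, and a squared-error loss; the function names and starting weights are hypothetical choices for illustration, not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    """Toy two-layer scalar network: x -> hidden h -> output y."""
    h = sigmoid(w1 * x)   # hidden activation
    y = sigmoid(w2 * h)   # output activation
    return h, y

def loss(x, target, w1, w2):
    _, y = forward(x, w1, w2)
    return 0.5 * (y - target) ** 2

def backward(x, target, w1, w2):
    """Gradients of the squared error w.r.t. w1 and w2 via the chain rule."""
    h, y = forward(x, w1, w2)
    # Error signal at the output unit's pre-activation.
    delta_out = (y - target) * y * (1.0 - y)
    grad_w2 = delta_out * h
    # Propagate the error signal back through w2 and the hidden sigmoid.
    delta_hidden = delta_out * w2 * h * (1.0 - h)
    grad_w1 = delta_hidden * x
    return grad_w1, grad_w2

def train(x, target, w1=0.5, w2=-0.3, lr=1.0, steps=500):
    """Plain gradient descent on a single example (illustrative only)."""
    for _ in range(steps):
        g1, g2 = backward(x, target, w1, w2)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2
```

The hidden-layer gradient reuses the output-layer error signal rather than recomputing it, which is what keeps the cost of the backward pass comparable to the forward pass.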
5. Deep Blue defeats Kasparov
11 May 1997 | Artificial intelligence | Major incident
Chess had been considered a benchmark for machine intelligence since the field's inception. Despite decades of progress, no computer had defeated a reigning world champion in a match. Many believed human intuition and creativity gave an insurmountable advantage.
IBM's Deep Blue defeated Garry Kasparov, the reigning world chess champion, in a six-game match. This was the first time a computer won a match against a reigning world champion under standard tournament conditions. The victory demonstrated that machines could outperform humans in complex cognitive tasks.
Deep Blue was a specialised chess computer capable of evaluating 200 million positions per second. It used alpha-beta search with sophisticated evaluation functions refined by grandmasters. After losing to Kasparov in 1996, the team improved both hardware and software. The 1997 rematch ended 3.5-2.5 in Deep Blue's favour. [8]
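Alpha-beta search, mentioned above, can be sketched generically as follows. This is a textbook formulation, not Deep Blue's implementation: the game tree is a hypothetical nested list in which numbers are leaf scores, and the `counter` argument is an illustrative device for counting how many leaves are evaluated.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, counter=None):
    """Minimax value of `node`, skipping branches that cannot affect the result.

    A node is either a number (leaf score) or a list of child nodes.
    """
    if not isinstance(node, list):      # leaf: return its static score
        if counter is not None:
            counter[0] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, counter))
            alpha = max(alpha, best)
            if alpha >= beta:           # the minimising player will avoid this branch
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, counter))
        beta = min(beta, best)
        if alpha >= beta:               # the maximising player will avoid this branch
            break
    return best

# Hypothetical depth-2 tree: the root maximises, its children minimise.
tree = [[3, 5], [2, 9], [0, 1]]
leaves_seen = [0]
value = alphabeta(tree, True, counter=leaves_seen)
print(value, leaves_seen[0])  # prints "3 4": two of the six leaves are pruned
```

Pruning returns the same value as exhaustive minimax while examining fewer positions, which is why it suits engines that must search very large trees.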
6. AlexNet wins ImageNet
30 September 2012 | Artificial intelligence | Paradigm shift
Computer vision relied on hand-crafted features (SIFT, HOG) combined with classifiers. Progress on image recognition had plateaued. Neural networks were considered too slow to train on large datasets. Error rates in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) were improving only incrementally.
AlexNet, a deep convolutional neural network, won the 2012 ImageNet challenge with a top-5 error rate of 15.3%, compared to 26.2% for the second-place entry. This dramatic improvement demonstrated the power of deep learning and GPU-accelerated training, triggering a revolution in AI research.
Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton trained a deep CNN on 1.2 million images using two GTX 580 GPUs. Key innovations included ReLU activations (which mitigate vanishing gradients and speed up training), dropout regularisation, and data augmentation. The clear victory pushed the computer vision community to adopt deep learning. [9, 10]
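Two of the techniques named above, ReLU and dropout, are simple enough to sketch in a few lines. The dropout shown here is the 'inverted' variant that rescales surviving values during training; this is a common modern formulation chosen for brevity and is an assumption, not necessarily the formulation used in AlexNet. The function names and signatures are illustrative.

```python
import random

def relu(values):
    """Rectified linear unit: pass positive values through, zero out the rest."""
    return [max(0.0, v) for v in values]

def dropout(values, rate, rng, training=True):
    """Inverted dropout: zero each value with probability `rate`, rescale survivors.

    Rescaling by 1 / (1 - rate) keeps the expected activation unchanged,
    so no adjustment is needed at inference time.
    """
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(0)                       # seeded for repeatability
activations = relu([-1.0, 0.0, 2.5])         # [0.0, 0.0, 2.5]
dropped = dropout(activations, 0.5, rng)     # each entry is either 0.0 or doubled
```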
7. Attention Is All You Need
12 June 2017 | Artificial intelligence | Paradigm shift
Sequence models (RNNs, LSTMs) processed input sequentially, limiting parallelisation and making it difficult to capture long-range dependencies. Training on long sequences was slow and gradient flow was problematic. Machine translation quality had plateaued.
The Transformer architecture replaced recurrence with self-attention, enabling parallel processing of entire sequences. This dramatically improved training speed and model quality. The architecture became the foundation for GPT, BERT, and virtually all modern large language models.
Researchers at Google published 'Attention Is All You Need' in June 2017. The paper introduced multi-head self-attention, positional encoding, and the encoder-decoder Transformer structure. The model achieved state-of-the-art translation quality while training faster than RNN-based systems. [11, 12]
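The core operations named above can be sketched in plain Python. This is a simplified, single-head illustration under stated assumptions: it omits the learned projection matrices, the multiple heads, and the rest of the encoder-decoder stack, and it uses lists of floats rather than tensors. The function names are illustrative.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs

def positional_encoding(position, d_model):
    """Sinusoidal encoding: sine on even dimensions, cosine on odd ones."""
    return [
        math.sin(position / 10000 ** (i / d_model)) if i % 2 == 0
        else math.cos(position / 10000 ** ((i - 1) / d_model))
        for i in range(d_model)
    ]
```

Each query's output depends only on dot products with all keys, so every position can be computed independently; this is the property that allows whole sequences to be processed in parallel rather than step by step.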
8. Generative pre-training
June 2018 to November 2022 | Artificial intelligence | Paradigm shift
NLP systems required task-specific architectures and training. Transfer learning was limited. No single model could handle diverse language tasks. Conversational AI remained stilted and narrow.
Large language models (LLMs) demonstrated that scaling Transformer models on vast text corpora yields emergent capabilities. GPT-3 (2020) showed few-shot learning across diverse tasks. ChatGPT (2022) made conversational AI accessible to the public, triggering widespread AI adoption and debate.
OpenAI released GPT (2018), GPT-2 (2019), and GPT-3 (2020), each dramatically larger than its predecessor. With 175 billion parameters, GPT-3 showed remarkable few-shot capabilities. Google's BERT (2018) demonstrated bidirectional pretraining. ChatGPT (November 2022) combined a GPT-3.5 model with reinforcement learning from human feedback (RLHF), achieving unprecedented public adoption and sparking a global conversation about AI. [13]