Name: How LLMs Actually Work
Price: 29 USD
Availability: InStock

Question 1

Do I need a maths background to understand how LLMs work?

Accepted Answer

No. This course builds intuition for how transformers and attention work without requiring calculus or linear algebra. We use diagrams, analogies, and plain-language explanations, so you come away with accurate mental models of what is happening rather than mathematical proofs. If you want to go deeper into the maths afterwards, the academic papers we link to will be much more accessible once the intuitions are in place.

Question 2

What is a context window and why does it matter?

Accepted Answer

A context window is the maximum amount of text, measured in tokens, that a language model can process in a single request. It covers both the input you send and the output the model generates. Models with larger context windows can see more of a conversation, document, or codebase at once. Context size affects both capability (can the model see the whole document?) and cost (longer contexts cost more to process). Understanding context limits is essential for building reliable AI applications.
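As a rough illustration of budgeting against a context limit, the sketch below estimates token counts from character length and checks whether a prompt plus a reserved output allowance fits. The 4-characters-per-token ratio is only a heuristic for English text, the function names are hypothetical, and the window size in the usage lines is a made-up number; a real application would count tokens with the model's own tokeniser.

```python
import math

# Assumption: about 4 characters per token for English text.
# Real tokenisers vary, so treat this as an estimate only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (not a real tokeniser)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(prompt: str, reserved_output_tokens: int, context_window: int) -> bool:
    """Check whether the estimated prompt size plus the output allowance fits.

    Assumes the prompt and the generated output share one window.
    """
    return estimate_tokens(prompt) + reserved_output_tokens <= context_window


# Hypothetical usage: a 400-character prompt against a made-up 150-token window.
prompt = "a" * 400
print(estimate_tokens(prompt))
print(fits_in_context(prompt, reserved_output_tokens=50, context_window=150))
```

Reserving room for the output up front is the design choice worth noting: a prompt that fills the window leaves the model no space to answer.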

Question 3

Why do AI models hallucinate?

Accepted Answer

Hallucination happens because language models are fundamentally pattern completion systems, not knowledge retrieval systems. They generate each next token by picking from a probability distribution over plausible continuations of the context, without checking the result against ground truth. This makes them excellent at fluent text generation but unreliable for facts that were rare or absent in their training data. Understanding this distinction helps you design prompts and applications that work around hallucination rather than fall prey to it.
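The idea of picking a next token by probability alone can be sketched with a toy example. The scores below are invented for illustration and do not come from any real model; the point is only that a plausible but wrong continuation keeps a real chance of being chosen, and nothing in the loop checks facts.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores.values())
    exps = {tok: math.exp(s - highest) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def sample_next_token(scores, rng):
    """Pick one token at random, weighted by its probability."""
    probs = softmax(scores)
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up scores for continuing "The capital of Australia is ...".
toy_scores = {"Canberra": 2.0, "Sydney": 1.5, "Melbourne": 0.2}

rng = random.Random(0)  # fixed seed so repeated runs match
samples = [sample_next_token(toy_scores, rng) for _ in range(1000)]
counts = {tok: samples.count(tok) for tok in toy_scores}
print(counts)  # the wrong answers appear too, just less often
```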

Question 4

What is the difference between fine-tuning and RAG?

Accepted Answer

Fine-tuning means continuing to train a base model on your own data to alter its behaviour and knowledge. RAG (Retrieval-Augmented Generation) means retrieving relevant documents at inference time and including them in the prompt context. Fine-tuning is expensive and requires significant data; RAG is cheaper, easier to keep up to date, and better for precise factual recall. Most production applications use RAG for domain knowledge and fine-tuning for style or format consistency.
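The retrieval half of RAG can be sketched in a few lines. This minimal version ranks documents by shared words with the question, which is a stand-in for the embedding-based search a production system would more likely use. The documents, function names, and prompt wording are all made up for illustration.

```python
import re


def words(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=2):
    """Return the k documents sharing the most words with the query."""
    query_words = words(query)
    ranked = sorted(documents, key=lambda d: len(query_words & words(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Place the retrieved documents in the prompt ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical knowledge base and question.
docs = [
    "The refund window is 30 days from purchase.",
    "Support hours are Monday to Friday.",
    "The course has six video modules.",
]
print(build_prompt("How long is the refund window?", docs, k=1))
```

Because the facts live in `docs` rather than in model weights, updating the knowledge base is just a matter of editing the documents; no retraining is involved.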

Question 5

What is tokenisation and how does it affect AI behaviour?

Accepted Answer

Tokenisation is how models split text into chunks (tokens) before processing. A token is roughly 3–4 characters of English text. Usage costs are typically calculated per token, context limits are measured in tokens, and less common text (code, non-English languages, rare terms) often tokenises inefficiently, using more tokens for the same number of characters. Understanding tokenisation helps you write more cost-effective prompts and understand why models sometimes struggle with character counting or unusual word forms.
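To make the efficiency point concrete, here is a toy tokeniser that greedily matches the longest entry in a tiny, made-up vocabulary and falls back to single characters. Real tokenisers learn their vocabularies from data (for example with byte-pair encoding) and work differently in detail, so this is only an illustration of why familiar words cost few tokens and rare ones cost many.

```python
# Made-up vocabulary for illustration; real vocabularies hold tens of thousands of entries.
TOY_VOCAB = {"the", "model", "token", "is", "ation", " "}


def toy_tokenise(text, vocab):
    """Split text by greedy longest match, falling back to single characters."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate starting at position i first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry matched: emit one character as its own token.
            tokens.append(text[i])
            i += 1
    return tokens


print(toy_tokenise("the model", TOY_VOCAB))
print(toy_tokenise("tokenisation", TOY_VOCAB))
print(toy_tokenise("xylyl", TOY_VOCAB))
```

Note that the model would see "tokenisation" as a handful of multi-character pieces rather than as individual letters, which is one intuition for why counting characters can be hard for it.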
