nov 2023 · andrej karpathy · 1-hour talk · distilled

How to think
about LLMs.

First principles, in metaphors. The mental models that still hold up — even after the field tripled in scale.

Source: [1hr Talk] Intro to Large Language Models · pairs with the architecture of intelligence for the 2026 picture.

01 / two files

An LLM is two files.

That's it. A blob of weights and a few hundred lines of inference code. No internet required.

That's the entire deliverable. Llama 2 70B = 140 GB of float16 weights + ~500 lines of C. Karpathy's point: the inference layer is trivial. The cost and the magic both live in how you got the weights.

first principle Inference is cheap and well-understood. Training is expensive and partly mysterious. When you complain about an LLM, you're complaining about choices made during training — not run.c.

02 / compression

The weights are a zip of the internet.

Lossy compression, not a database. Roughly 70× smaller, with everything fuzzy.

Lossy, not lossless. Karpathy: "you're never 100% sure if what it comes up with is hallucination or correct." The weights remember the gestalt of the training set, not its bytes. When the model has to reconstruct a fact, it's interpolating.

interpretation Treat the model's "knowledge" the way you'd treat a colleague's recall of a paper they read once five years ago. Often right, occasionally inventive, never the source of truth.

03 / the dream

The base model dreams the internet.

No questions, no answers — just plausible documents. Form is real, content is invented.

The base model is a "document simulator." It learned the shape of code, of product listings, of encyclopedia entries — and fills in the slots with plausible noise. Sometimes the noise is real (memorized); sometimes it's invented. There's no flag distinguishing the two.

why this matters The format-confidence trap: a model that produces a perfectly formatted citation, ISBN, function signature, or court ruling can still be making it up. Format ≠ truth.

04 / inscrutable

Knowledge is stored weirdly.

The reversal curse: A → B works, but B → A may fail. Direction matters.

The reversal curse, viral 2023. Models trained on "A is B" don't automatically learn "B is A." Knowledge in the weights is direction-sensitive — a clue that what we call "knowledge" inside an LLM is closer to "patterns of co-occurrence" than to a database.

first principle LLMs are mostly inscrutable artifacts — empirical, not engineered. We can measure behavior; we can't open the black box and read the facts.

05 / the tutor

From document simulator to assistant.

Same algorithm, different data. Quality replaces quantity. ~$2M becomes ~one day.

Stage 2 hires labelers (Scale AI etc.) to write ~100k high-quality Q&A pairs following detailed labeling guidelines. The base model's knowledge stays — fine-tuning just changes what shape it outputs in (helpful-assistant format).

first principle Stage 1 is rare and expensive — typically once a year. Stage 2 is cheap and iterative — weekly. When a model "improves overnight," it's almost always Stage 2.

06 / rlhf

Compare beats generate.

Stage 3 of fine-tuning swaps writing answers for picking the better of two — and that small move unlocks a lot.

Reinforcement Learning from Human Feedback. Karpathy's pithy version: "compare > generate." Writing the perfect answer is hard; ranking four candidates is easy. The policy gets nudged toward whatever the reward model learned humans prefer.

interpretation Most of what you experience as model "personality" — terseness, hedging style, helpfulness, refusal behavior — is shaped here. Same base, different RLHF run, different model.

07 / tools

Don't ask the model to multiply.

Give it Python. The model decides when to call out.

The model is not the system; the model orchestrates the system. Karpathy's framing: imagine an LLM with arms — calculator for arithmetic, Python for transformations, browser for fresh facts, vision for charts, file system for memory. Modern frontier models are trained to route to tools rather than answer from weights.

for your work For tasks the model fails at (large arithmetic, current news, exact data lookups), the answer is usually "give it the tool" — not "find a stronger model." Tool use is cheaper than a 10× scaling step.

08 / system 1 vs 2

One forward pass vs a tree of thoughts.

Kahneman's frame, applied to neural networks. Today's LLMs are System 1. The frontier wants System 2.

Karpathy's prediction (2023): the field's next frontier is letting models think slowly. He was right — by 2024–2026, "reasoning models" with internal chain-of-thought (o1, Claude extended thinking, DeepSeek R1, Gemini Deep Think) are now the default for hard problems.

for your work Match the system to the task. For codegen of a CRUD endpoint: System 1 is plenty. For "find the bug in this 800-line function": pay for System 2 — it's worth the latency.

09 / the LLM os

Imagine an operating system emerging.

Karpathy's central analogy. The LLM is the CPU. Everything else — context, tools, memory — is the architecture around it.

The hardware analogy isn't loose. Context window = RAM (fast but limited). Embedding store = disk (slow but unbounded). Tools = peripherals (extend capability beyond CPU). Multimodal encoders = video/sound cards. The kernel orchestrates which tool to call when.

first principle Designing an LLM application is computer architecture, not API integration. Where do you put working memory? When do you page out to disk? When do you call out to a peripheral? These are old questions in new clothes.

10 / security

Three doors. Watch all three.

A new computing paradigm comes with a new attack surface.

Three different attackers, three different mitigations. Jailbreaks: better RLHF + classifiers. Prompt injection: treat retrieved content as untrusted; whitelist tools per request; strip Markdown in egress. Data poisoning: provenance audits on training data; trigger-detection probes.

for your work If your app retrieves anything from the web or accepts file uploads, prompt injection is your default threat model. The fix isn't a smarter model — it's treating retrieved tokens as code, not data. Sandbox the actions the model can take.

11 / takeaways

The mental models that still hold.

Eight things to remember, even after the field tripled in scale.

substrate An LLM is two files. The cost is in producing parameters.bin. The interesting question is always "what data shaped these weights?"
compression The weights are a lossy zip of the internet. Memorized facts and confabulated facts share the same surface. Format ≠ truth.
stages Pre-training is once a year. Fine-tuning is weekly. When a model gets better, it's usually post-training. Same for "personality" changes.
rlhf Compare beats generate. Most of what you experience as model "voice" came from someone ranking 4 candidates against a rubric.
tools Don't ask the model to do what a tool can do. Calculator for math, browser for facts, Python for transformations. Tool-use is cheaper than a 10× scaling step.
system 2 Match speed to problem. Hard problems get reasoning models; easy ones don't. Pay for slow thinking only when it pays back.
os You're doing computer architecture. Context window = RAM. Embeddings = disk. Tools = peripherals. The system around the model matters more than the model.
security Treat retrieved content as code, not data. Prompt injection is the default threat model the moment your app browses, reads files, or processes uploads.