LLM Attention And Inference

6 questions · ~5 min · intermediate

Six questions on the mechanics that matter most in practice: tokenisation, attention, position, KV cache, quantisation, and why plausible text is not the same thing as guaranteed truth.

Why do modern LLMs usually use subword tokenisation rather than raw characters or whole words?
