Attention — Lumo glossary

Attention is how a transformer decides which parts of its input matter for which part of its output. Each token computes how much it should listen to each other token — a learned, soft, weighted glance.

In plain language

In AI and machine learning, you will run into this term whenever someone talks about how transformer-based models are built or used. If you are new to the field, the simplest mental model is this: attention is the mechanism that lets a model focus selectively. Read the entry once with that frame in mind, then come back and read it again; that is usually enough for the rest to make sense.
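For readers who like to see the idea in code, here is a minimal sketch of the "weighted glance" in plain Python. It is an illustration under stated assumptions, not how any particular model implements it: the function names softmax and attend and the three toy token vectors are invented for this example, and the same vectors stand in for queries, keys, and values, whereas a real transformer derives each of those from learned projections.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this token's query against every token's key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        # Softmax makes the scores "soft": every token gets some weight.
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        blended = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(blended)
        all_weights.append(weights)
    return outputs, all_weights

# Three toy tokens with made-up 2-dimensional vectors (assumption:
# real models use learned projections to produce separate Q, K, V).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attend(tokens, tokens, tokens)

for row in weights:
    print([round(w, 2) for w in row])
```

Each printed row shows how much one token listens to every token, including itself, and each row sums to 1. The first token gives equal weight to itself and the third token, and less to the second, because its vector overlaps with the first and third but not the second.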

FIG. 1. Attention: the mechanism that lets a model focus selectively.

An everyday picture

Think of Attention less like a thinking person and more like someone who has read an enormous amount and now finishes other people's sentences for a living. They have absorbed the shape of the work; they have not memorised any one page.

Where it shows up

Attention tends to sit inside products that need to read, write, or recognise without a hard-coded rule — assistants, search, document tools, voice apps. It is rarely the only moving part, but it is often the part the user feels.

A small example

Picture a chatbot in a customer service portal that reads a question and returns a draft reply. Several AI ideas (model, prompt, context) are at work behind the single button you see. Attention's role is the one its blurb describes: the mechanism that lets a model focus selectively, here by weighing which parts of the question matter for each part of the reply.

Common misunderstanding

MYTH

It is easy to assume Attention 'understands' the way a person does. It does not. It learns patterns, and patterns can be fooled — confident answers are not the same thing as correct ones.

One line to take with you

Attention is statistics worn well. Useful for patterns; double-check it for facts.

One letter a week, lasting understanding.
