Plain-English definitions for the terms that keep coming up. 34 entries, sorted alphabetically.
Agent loop: The cycle where an AI model takes an action, observes the result, then decides what to do next.
Attention: The core operation inside a transformer model that lets it weigh how much each word relates to every other word.
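A minimal sketch of scaled dot-product attention in plain Python, using tiny made-up vectors; real models apply this to large matrices with learned query, key, and value projections.

```python
import math

def attention(queries, keys, values):
    """Scaled dot-product attention over a tiny toy sequence.

    Each token's query is compared against every key; the softmaxed
    scores weight how much of each value flows into the output.
    """
    d = len(keys[0])  # key dimension, used to scale the dot products
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        # Softmax turns scores into weights that sum to 1
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output is the weighted mix of all value vectors
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs
```

With one query that matches the first key more closely than the second, the output leans toward the first value vector.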
Benchmark: A standardised test used to compare model performance.
Chain-of-thought prompting: A technique where the model is asked to reason step by step before giving a final answer.
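A sketch of the difference in practice; the question and the exact nudge wording are invented for illustration.

```python
question = "A cafe sells 3 muffins for $5. How much do 9 muffins cost?"

# Direct prompt: asks for the answer immediately
direct = f"Q: {question}\nA:"

# Chain-of-thought prompt: nudges the model to show its reasoning
# before committing to an answer
chain_of_thought = f"Q: {question}\nA: Let's think step by step."
```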
Context window: The maximum amount of text a model can hold in its working memory at once.
Continual learning: A training approach that lets AI models learn new tasks over time without overwriting previously acquired knowledge.
Embedding: A way of representing text as numbers so a computer can work with meaning, not just characters.
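A toy sketch of why this is useful: texts with similar meanings end up as nearby vectors. The three-dimensional values below are invented; real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """How aligned two embedding vectors are: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional embeddings (invented values for illustration)
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.2, 0.05]
invoice = [0.0, 0.1, 0.95]

# "cat" sits much closer to "kitten" than to "invoice" in this space
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # → True
```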
Fine-tuning: Training an existing model on new data to change its behaviour.
Foundation model: A large model trained on massive data that can be adapted for many tasks.
Grounding: Connecting model outputs to real, verifiable information.
Hallucination: When a model generates confident-sounding text that is factually wrong.
In-context learning: Teaching a model how to behave by giving examples in the prompt itself, without changing any weights.
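A sketch of a few-shot prompt assembled by hand; the reviews and labels are invented for illustration.

```python
# The examples below teach the task and output format entirely
# in-context; no weights change.
examples = [
    ("The movie was wonderful.", "positive"),
    ("I want my money back.", "negative"),
]
query = "Best purchase I've made all year."

prompt = "Classify the sentiment of each review.\n\n"
for text, label in examples:
    prompt += f"Review: {text}\nSentiment: {label}\n\n"
# End on an open slot so the model completes the pattern
prompt += f"Review: {query}\nSentiment:"
```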
Inference: Running a trained model to generate outputs.
Knowledge graph: A structured representation of facts as entities and relationships.
LangGraph: An open-source framework for building AI agents as explicit state graphs, making control flow visible and debuggable.
Large language model (LLM): A neural network trained to predict and generate text at scale.
Latency: How long it takes from sending a prompt to receiving the first token.
Mixture of experts: An architecture where only a subset of the model's parameters activate for each input.
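A toy sketch of the routing idea, with plain functions standing in for expert networks and hand-picked gate scores.

```python
def moe_layer(x, experts, gate_scores, k=1):
    """Route the input to only the top-k experts by gate score.

    Only the selected experts run, which is why a mixture-of-experts
    model can hold many parameters while spending little compute per
    input. Everything here is a toy stand-in for real networks.
    """
    ranked = sorted(range(len(experts)),
                    key=lambda i: gate_scores[i], reverse=True)
    active = ranked[:k]
    total = sum(gate_scores[i] for i in active)
    # Weighted mix of the chosen experts' outputs only
    return sum(gate_scores[i] / total * experts[i](x) for i in active)

# Two toy "experts"; with k=1 only the higher-scoring one ever runs
experts = [lambda x: x * 2, lambda x: x * 10]
result = moe_layer(3.0, experts, gate_scores=[0.9, 0.1], k=1)  # → 6.0
```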
Model Context Protocol (MCP): An open standard for connecting AI models to external tools, databases, and data sources through a universal interface.
Multimodal: A model that handles more than one type of input or output, like text plus images.
Prompt engineering: The practice of crafting inputs to get better outputs from a model.
Quantisation: Compressing a model by reducing the precision of its numbers.
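A minimal sketch of 8-bit quantisation with a single scale factor; real schemes add refinements like per-channel scales and calibration.

```python
def quantize_int8(weights):
    """Map floats onto the int8 range [-127, 127] via one scale factor."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate floats from the quantised integers."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Restored values are close to, but not exactly, the originals;
# that small error is the price of the ~4x memory saving vs float32.
```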
Reinforcement learning from human feedback (RLHF): A training method where humans rate model outputs to guide the model toward better behaviour.
Retrieval-augmented generation (RAG): A technique that grounds LLM responses in external data by retrieving relevant documents before generating a reply.
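A dependency-free sketch of the retrieve-then-generate pattern; the word-overlap retriever and the documents are invented stand-ins for embedding search over a real corpus.

```python
def retrieve(query, documents, top_k=1):
    """Toy retriever: rank documents by word overlap with the query.

    Real systems rank by embedding similarity; overlap keeps the
    sketch dependency-free.
    """
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

documents = [
    "The refund policy allows returns within 30 days.",
    "Our office is closed on public holidays.",
]
query = "What is the refund policy?"
context = retrieve(query, documents)[0]
# The retrieved passage is prepended so the model answers from it
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```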
Sampling: The process of picking the next token during generation.
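A sketch of top-k sampling over a made-up next-token distribution; greedy decoding falls out as the top_k=1 special case.

```python
import random

def sample_next_token(probs, top_k=None, seed=None):
    """Pick the next token from a probability distribution.

    Greedy decoding (top_k=1) always takes the most likely token;
    a larger top_k samples among the k most likely, trading
    determinism for variety.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    if top_k:
        ranked = ranked[:top_k]
    tokens = [t for t, _ in ranked]
    weights = [p for _, p in ranked]
    rng = random.Random(seed)
    return rng.choices(tokens, weights=weights)[0]

probs = {"cat": 0.6, "dog": 0.3, "axolotl": 0.1}
print(sample_next_token(probs, top_k=1))  # → cat (greedy: always the argmax)
```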
Sparse attention: A transformer efficiency technique where each token attends only to a relevant subset of other tokens, enabling near-linear scaling with context length.
System prompt: Instructions given to a model before the user's message, usually hidden from the user.
Temperature: A parameter that controls how random the model's outputs are.
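A sketch showing how temperature reshapes the distribution the model samples from; the logits below are invented.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw model scores into probabilities, scaled by temperature.

    Low temperature sharpens the distribution toward the top score;
    high temperature flattens it toward uniform.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.1)
hot = softmax_with_temperature(logits, 10.0)
# cold puts almost all probability on the top logit; hot is nearly uniform
```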
Test-time training: A method where a model runs gradient updates during inference, using its current input as training data to adapt on the fly.
Tokenisation: The process of splitting text into tokens, the chunks a model actually processes.
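A toy greedy longest-match tokeniser with a hand-written vocabulary; real tokenisers (BPE and friends) learn their vocabularies from data.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenisation against a fixed vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        for end in range(len(text), i, -1):  # try the longest piece first
            piece = text[i:end]
            if piece in vocab:
                tokens.append(piece)
                i = end
                break
        else:
            tokens.append(text[i])  # unknown character becomes its own token
            i += 1
    return tokens

vocab = {"token", "ization", "un", "break", "able"}
print(tokenize("tokenization", vocab))  # → ['token', 'ization']
```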
Tool use: The ability of a model to call external functions or APIs to complete a task.
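A sketch of the host application's side of tool use, assuming the model emits a JSON call like the one below; the tool name and call format are invented, not any particular vendor's schema.

```python
import json

# Hypothetical tool registry: the model emits a JSON "call" and the
# host application dispatches it to a real function.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def handle_model_output(model_output):
    """Parse a model's tool-call request and execute the named tool."""
    call = json.loads(model_output)
    tool = TOOLS[call["tool"]]
    return tool(**call["arguments"])

# What the model might emit when asked about the weather (illustrative)
model_output = '{"tool": "get_weather", "arguments": {"city": "Oslo"}}'
print(handle_model_output(model_output))  # → Sunny in Oslo
```

The result is then fed back to the model so it can finish its reply, closing one turn of the agent loop.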
Vector database: A database built for storing and searching embeddings.
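A brute-force sketch of the core lookup; a real vector database adds approximate indexes (HNSW, IVF) so search stays fast at millions of vectors.

```python
import math

def nearest(query_vec, index, top_k=2):
    """Return the top_k stored items closest to the query embedding."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return sorted(index, key=lambda item: dist(query_vec, item[1]))[:top_k]

# (id, embedding) pairs; the vectors are toy values for illustration
index = [
    ("doc-a", [0.1, 0.9]),
    ("doc-b", [0.8, 0.2]),
    ("doc-c", [0.15, 0.85]),
]
results = nearest([0.12, 0.88], index, top_k=2)
print([doc_id for doc_id, _ in results])  # → ['doc-a', 'doc-c']
```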
Vibe coding: Writing software by describing what you want to an AI and iterating on the output.
Zero-shot prompting: Asking a model to do something without any examples in the prompt.