A context window refers to the amount of text (or tokens) that an AI language model can process and “remember” at one time. It defines how much surrounding information the model considers when generating a response, influencing both accuracy and coherence.
Think of it as the model's short-term memory: the larger the context window, the more the model can recall and relate to when forming an answer.
When you interact with an AI system, your input is broken down into tokens, which are small units of text. The model processes these tokens within its defined context window. If the total number of tokens in a conversation exceeds this limit, older parts of the conversation are “forgotten” as new ones are added.
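This sliding-window behaviour can be sketched in a few lines, assuming a simplified whitespace tokeniser and a fixed-size window (real models use subword tokenisers such as BPE, but the forgetting mechanism is analogous):

```python
from collections import deque

def make_window(max_tokens):
    # A deque with maxlen discards the oldest items as new ones arrive,
    # mimicking how tokens beyond the context limit are "forgotten".
    return deque(maxlen=max_tokens)

def add_message(window, text):
    # Crude whitespace tokenisation stands in for a real subword tokeniser.
    for token in text.split():
        window.append(token)

window = make_window(max_tokens=8)
add_message(window, "the quick brown fox jumps over the lazy dog")
print(list(window))  # nine tokens were added, so the oldest ("the") was dropped
```

Because the window holds only eight tokens, the ninth token pushes the very first one out, just as early turns of a long conversation fall outside a model's context.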
For example, a model with an 8,000-token window can process roughly 6,000 words at once. Modern models such as GPT-4 Turbo support 128,000 tokens, allowing them to reference dozens of pages of context in a single interaction.
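The token-to-word conversion above relies on the common rule of thumb that one token is about 0.75 English words; a quick back-of-the-envelope check (the 0.75 ratio is an approximation, not a fixed property of any particular tokeniser):

```python
def approx_words(tokens, words_per_token=0.75):
    # Rule of thumb: ~0.75 English words per token.
    return int(tokens * words_per_token)

print(approx_words(8_000))    # -> 6000
print(approx_words(128_000))  # -> 96000
```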
The size of the context window affects how accurately a model can generate, summarise, or analyse information. A small window might miss earlier references, while a larger one helps preserve continuity and accuracy across extended text.
Despite larger capacities, context windows still have constraints that affect model behaviour and performance.
Developers use techniques like Retrieval-Augmented Generation (RAG) and memory persistence to simulate longer-term recall. These methods allow models to access relevant external data while keeping token usage efficient.
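A toy illustration of the retrieval step in RAG, assuming a hypothetical in-memory document store and simple keyword-overlap scoring (production systems use vector embeddings and a dedicated retriever, but the principle of fetching only relevant chunks is the same):

```python
def score(query, doc):
    # Keyword overlap stands in for embedding similarity.
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query, docs, k=2):
    # Keep only the k most relevant chunks so the final prompt
    # stays within the model's context window.
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

docs = [
    "context windows limit how many tokens a model can process",
    "retrieval augmented generation fetches relevant external data",
    "the office is open monday to friday",
]

relevant = retrieve("how do context windows limit tokens", docs)
prompt = "Answer using:\n" + "\n".join(relevant)
print(relevant[0])  # the chunk about context windows scores highest
```

Only the retrieved chunks are placed into the prompt, so the model can draw on a large external corpus while spending context-window tokens on just the passages that matter.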
Learn more: Understanding context windows helps businesses design AI systems that balance performance, accuracy, and cost. Shipshape Data helps organisations fine-tune context length and optimise retrieval workflows for real-world efficiency.