Large language models process information through context windows. These windows function as temporary storage, similar to human working memory. When either system receives more information than it can actively process, performance degrades in measurable ways.
Research from Stanford University and Carnegie Mellon reveals patterns in how AI systems handle information overload. The patterns mirror what cognitive scientists observe in human attention and memory systems. Both hit limits. Both show predictable failure modes when those limits are exceeded.
How Context Windows Work
A context window measures how much information an AI can consider at once. The measurement uses tokens, units roughly equivalent to words or word fragments. Current models vary widely in capacity: some smaller or older models handle around 8,000-32,000 tokens, whilst advanced models like Claude offer 200,000 tokens (about 150,000 words). Google’s Gemini models represent the current frontier: Gemini 1.5 Pro and 2.0 offer 1 million token context windows as standard, experimental versions have been tested at 2 million tokens, and Google has successfully run research experiments at up to 10 million tokens.
Every word in your conversation occupies space in this window. Your questions, the AI’s responses, your instructions, all of it adds up. The window has a hard limit.
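To get an intuition for how quickly the window fills, you can estimate token counts from text length. A minimal sketch, using the common rule of thumb of roughly four characters per English token; real tokenisers vary by model, so treat the numbers as illustrative rather than exact:

```python
def estimate_tokens(text: str) -> int:
    # Rough rule of thumb: ~4 characters of English per token.
    # Real tokenisers vary by model; this is only an approximation.
    return max(1, len(text) // 4)

def window_usage(turns: list[str], window_size: int = 200_000) -> float:
    # Fraction of the context window consumed by the conversation so far.
    used = sum(estimate_tokens(t) for t in turns)
    return used / window_size

conversation = [
    "Please plan a business presentation for our quarterly review.",
    "Company background: " + "lorem ipsum " * 20_000,  # a long pasted document
]
print(f"Window used: {window_usage(conversation):.1%}")  # ~30% of a 200k window
```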
Human working memory operates under similar constraints. Studies show that working memory holds about 4-7 items at once. When you try to remember a phone number while following directions and listening to a conversation, something gets dropped. The mental scratchpad fills up.
Students with ADHD typically have smaller working memory capacity than their peers. Give them too many instructions at once, and the system overloads: they forget steps, miss details, and fall back on generic responses.
The Lost in the Middle Effect
Stanford researchers documented what they call the “lost in the middle” effect. AI systems struggle with information placed in the middle of long contexts, even when total information stays well within their context window capacity.
Teachers observe the same pattern in students with ADHD. First instruction? Remembered. Last instruction? Usually remembered. Everything between? Fuzzy.
The 2023 study showed models performing significantly worse when important information sat in the middle of long contexts. Total information volume wasn’t the issue. Position was.
Say you’re asking Claude to plan a business presentation. You explain company background, target audience, main message, presentation style, time constraints, slide preferences, budget limitations, then ask for recommendations. By the time the AI generates your outline, details about audience or message may have faded, just from being buried in the middle of everything else.
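You can observe the effect directly with a simple position test, in the spirit of the Stanford experiments: hide one key fact at different depths in filler text and check whether the model retrieves it. A minimal sketch, assuming you supply your own call_model function wrapping whatever chat API you use (the function name and filler text are hypothetical):

```python
def build_context(key_fact: str, position: float, n_filler: int = 200) -> str:
    # Place one key fact at a relative depth (0.0 = start, 1.0 = end)
    # inside a long run of filler sentences.
    filler = [f"Background note {i}: nothing important here." for i in range(n_filler)]
    filler.insert(int(position * n_filler), key_fact)
    return "\n".join(filler)

def position_test(call_model, positions=(0.0, 0.25, 0.5, 0.75, 1.0)) -> dict:
    # Ask the same question with the answer buried at different depths.
    fact = "The project budget is exactly 48,500 dollars."
    results = {}
    for pos in positions:
        context = build_context(fact, pos)
        answer = call_model(f"{context}\n\nWhat is the project budget?")
        results[pos] = "48,500" in answer  # did the model find the fact?
    return results
```

Runs of this kind of probe typically show the lowest retrieval rates around the 0.5 position, matching the published findings.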
Context Overload Symptoms
When AI systems receive more information than they can effectively process (even within technical limits), they produce what users call “generic garbage”. The term is harsh. The pattern is real.
LangWatch, which monitors AI performance for businesses, sees this consistently. Customer support chatbots handle simple questions well. Ask them to consider multiple customer issues, company policies, and conversation history simultaneously, and responses become unhelpful and vague.
The symptoms match what educators observe in overwhelmed ADHD students:
Earlier instructions or information forgotten
Shorter, less detailed responses
Focus on irrelevant details whilst missing important ones
Contradictory information
Generic responses instead of specific, helpful ones
Why Multiple Questions Create Problems
AI attention mechanisms decide which parts of input to focus on. The term “attention mechanism” describes a technical process, not understanding.
Think of it as a spotlight. One clear question allows intense focus on relevant information. Multiple complex questions split that spotlight across many areas. Everything gets dimmer.
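The spotlight can be made concrete with softmax, the function attention mechanisms use to turn raw relevance scores into weights that sum to one. The numbers below are purely illustrative, not taken from a real model, but they show the dilution:

```python
import math

def softmax(scores: list[float]) -> list[float]:
    # Convert raw relevance scores into weights that sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# One clear question: almost all of the weight lands on it.
print(softmax([5.0, 1.0, 1.0]))       # ~[0.96, 0.02, 0.02]

# Three equally demanding questions: the spotlight splits three ways.
print(softmax([5.0, 5.0, 5.0, 1.0]))  # ~[0.33, 0.33, 0.33, 0.01]
```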
The mathematics behind this involves quadratic scaling. Double the information, and computational effort increases by four times. Triple the information, and effort increases by nine times. The quadratic growth quickly overwhelms the system’s ability to maintain focus.
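You can verify the scaling by counting the pairwise comparisons standard self-attention performs: every token is scored against every other token, an n-by-n matrix. A quick illustration:

```python
def attention_pairs(n_tokens: int) -> int:
    # Standard self-attention scores every token against every token:
    # an n-by-n matrix, so n**2 comparisons.
    return n_tokens ** 2

for n in (1_000, 2_000, 3_000):
    print(f"{n:>6} tokens -> {attention_pairs(n):>12,} comparisons")
# Doubling the input (1,000 -> 2,000) quadruples the work;
# tripling it (1,000 -> 3,000) multiplies the work by nine.
```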
One analysis drawing on over 1,500 peer-reviewed papers found that context overload can degrade AI performance by up to 70%. Users across platforms report identical experiences: AI systems work well for simple tasks but become unreliable for complex, multi-part requests.
Cognitive Load Parallels
People with ADHD experience cognitive overload faster than neurotypical individuals. Too much information or too many simultaneous demands overwhelm their working memory systems, leading to decreased performance.
Research in Frontiers in Human Neuroscience shows high cognitive load worsens attention problems in people with ADHD. When students must remember information whilst filtering distractions, their performance drops more significantly than non-ADHD peers. The working memory system can’t handle both storage and attention management simultaneously.
AI systems show nearly identical patterns. Given large contexts to maintain whilst generating coherent responses, performance degrades predictably. The errors match those of overwhelmed students: details forgotten, responses generic rather than specific, overall task goals lost.
Teachers learn to break complex instructions into smaller, manageable chunks for students with ADHD. Users need to structure AI interactions similarly. Instead of overwhelming the system with everything at once, effective AI use requires working within these attention limitations.
Practical Examples
A marketing manager asks Claude: “Analyse our competitor’s pricing strategy, suggest improvements to our product positioning, create three campaign concepts, estimate budget requirements for each concept, identify our target audience segments, and write sample copy for our best concept.”
This single request contains at least six complex tasks. Each requires focus on different information types and reasoning approaches.
The typical result: Claude provides generic overviews instead of detailed, actionable insights. Bullet points about competitor analysis without specific data. Vague positioning improvements without concrete recommendations. Campaign concepts lacking creative detail.
Educational settings show the same pattern. Teachers ask ChatGPT to “create a lesson plan for 8th grade students on the Revolutionary War that includes learning objectives, activities for different learning styles, assessment rubrics, homework assignments, and technology integration options, whilst also considering students with special needs and English language learners.”
This request consistently produces generic, template-like responses rather than detailed, personalised lesson plans.
Software developers report AI coding assistants solve specific programming problems well. Ask them to consider multiple files, dependencies, and requirements simultaneously, and they struggle.
Healthcare professionals find AI systems provide excellent information about individual medical conditions. Ask them to consider multiple symptoms, patient history, and treatment interactions all at once, and reliability drops.
Research Evidence
Carnegie Mellon researchers found AI agents complete focused, hour-long tasks with reasonable reliability. Performance drops dramatically on longer, more complex workflows that require maintaining attention across multiple steps and information sources.
Studies published in Nature and other journals consistently show AI systems exhibit “attention bias”. Like humans with ADHD who struggle to filter relevant from irrelevant information, AI systems often focus on wrong details when given too much context. A due diligence assistant processing a long financial document might focus on minor formatting whilst missing critical risk information buried in the middle.
The “lost in the middle” phenomenon has been replicated across multiple studies and AI systems. Even when models have context windows large enough to theoretically handle vast amounts of information, practical performance degrades significantly when important details aren’t placed at the beginning or end of input. This pattern holds across different task types: question answering, document summarisation, creative writing.
Data Science Dojo researchers found AI systems designed to handle million-word contexts often perform worse on complex reasoning tasks when given just a few thousand words of mixed, multi-topic information.
Practical Strategies
Break complex requests into smaller, focused chunks. Instead of asking your AI to handle multiple tasks simultaneously, try sequential requests that build on each other.
For the marketing manager example: “Analyse our main competitor’s pricing strategy for their premium product line. Focus specifically on how their pricing compares to ours and what advantages or disadvantages this creates.”
Once you get a detailed response to this focused question, follow up with: “Based on that competitor analysis, suggest three specific improvements to our product positioning that would address the pricing disadvantages you identified.”
This sequential approach allows the AI to use its full attention capacity for each individual task rather than splitting focus across multiple demands. Teachers use exactly the same strategy with ADHD students: breaking complex assignments into smaller steps and ensuring each step is completed before moving to the next.
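Expressed as code, the sequential approach is a short pipeline in which each call has one job and later calls reuse earlier answers. A sketch of the pattern, again assuming a call_model function that wraps your preferred chat API rather than any specific vendor’s SDK:

```python
def sequential_analysis(call_model) -> dict[str, str]:
    # Run the marketing example as focused, dependent steps
    # instead of one overloaded mega-prompt.
    results = {}
    results["pricing"] = call_model(
        "Analyse our main competitor's pricing strategy for their "
        "premium product line, compared with ours."
    )
    results["positioning"] = call_model(
        "Based on this analysis, suggest three specific positioning "
        f"improvements:\n\n{results['pricing']}"
    )
    results["campaigns"] = call_model(
        "Create three campaign concepts that express these positioning "
        f"improvements:\n\n{results['positioning']}"
    )
    return results
```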
Context pruning helps. Just as teachers remove distracting information from students’ workspaces, you can help your AI by being selective about the information you include. Instead of providing complete background documents, company histories, and detailed specifications all at once, give the AI only the information directly relevant to the specific task.
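A minimal version of context pruning scores candidate passages for relevance before including them. The sketch below uses plain keyword overlap as the score; embedding similarity would do a better job in practice, but the shape of the technique is the same:

```python
def prune_context(task: str, passages: list[str], keep: int = 3) -> list[str]:
    # Keep only the passages most relevant to the task, scored here
    # by simple word overlap (a stand-in for embedding similarity).
    task_words = set(task.lower().split())

    def overlap(passage: str) -> int:
        return len(task_words & set(passage.lower().split()))

    ranked = sorted(passages, key=overlap, reverse=True)
    return ranked[:keep]

docs = [
    "Company history since 1987...",
    "Premium product line pricing tiers and discounts...",
    "Office relocation plans for next year...",
]
print(prune_context("analyse premium pricing strategy", docs, keep=1))
# -> only the pricing passage survives
```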
Timing matters. Research shows AI systems, like students with ADHD, perform better when they’re not juggling multiple information threads simultaneously. Instead of one long conversation covering many topics, consider starting fresh conversations for different task types. This gives the AI a clean mental workspace and prevents earlier conversations from creating cognitive interference.
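Structurally, starting fresh just means not carrying old history into the new task. A sketch, assuming the common messages-list chat format:

```python
SYSTEM = {"role": "system", "content": "You are a helpful assistant."}

def fresh_conversation(task_prompt: str) -> list[dict]:
    # Begin each distinct task with a clean history instead of
    # appending to one ever-growing thread.
    return [SYSTEM, {"role": "user", "content": task_prompt}]

# One thread per task type, so earlier topics can't crowd the window.
marketing_thread = fresh_conversation("Analyse our competitor's pricing.")
lesson_thread = fresh_conversation("Draft a Revolutionary War lesson plan.")
```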
Current Limitations and Future Directions
Researchers are actively working on solutions to these attention limitations. Some promising approaches draw directly from ADHD research and educational psychology: better working memory management, attention filtering mechanisms, and the ability to maintain focus on relevant information whilst ignoring distractors.
For users today, the practical approach involves working with these limitations rather than against them. This means adopting strategies that successful teachers and ADHD specialists have been using: clear, focused instructions; breaking complex tasks into manageable steps; providing relevant information without overwhelming detail; and checking for understanding before moving to the next step.
Understanding these limitations helps explain many frustrating experiences users have with AI systems. The generic responses, forgotten instructions, and inconsistent performance that many users report aren’t necessarily signs that AI is “getting worse”; they’re often signs that the system is experiencing context overload. Recognising these symptoms and adjusting interaction strategies accordingly produces much better, more reliable results.
AI attention, like human attention, is a limited resource that must be managed carefully. The most successful AI users are those who learn to work within these constraints, just as the most successful teachers are those who understand and accommodate their students’ cognitive limitations.
Phil
References
AI Context Windows and Technical Details:
[1] Nelson Liu et al. (2023). “Lost in the Middle: How Language Models Use Long Contexts” - Stanford University study documenting the “lost in the middle” effect where LLMs struggle with information placed in the middle of long contexts https://arxiv.org/abs/2307.03172
[2] LangWatch (2024). “The 6 Context Engineering Challenges Stopping AI From Scaling in Production” - Analysis of context overload patterns in production AI systems https://langwatch.ai/blog/the-6-context-engineering-challenges-stopping-ai-from-scaling-in-production
[3] Data Science Dojo (2024). “The LLM Context Window Paradox” - Research on performance degradation despite large context windows https://datasciencedojo.com/blog/the-llm-context-window-paradox/
[4] Google Cloud Documentation. “Long Context | Gemini” - Technical specifications showing Gemini’s 1-2 million token context windows https://cloud.google.com/vertex-ai/generative-ai/docs/long-context
[5] IBM Research. “Attention Mechanism in AI” - Explanation of how transformer attention mechanisms work https://www.ibm.com/think/topics/attention-mechanism
[6] GeeksforGeeks. “Transformer Attention Mechanism in NLP” - Technical overview of attention mechanisms and quadratic scaling https://www.geeksforgeeks.org/nlp/transformer-attention-mechanism-in-nlp/
ADHD and Working Memory Research:
[7] Pievsky MA, McGrath RE (2018). “The Neurocognitive Profile of Attention-Deficit/Hyperactivity Disorder: A Review of Meta-Analyses” - Review showing working memory deficits in ADHD https://pmc.ncbi.nlm.nih.gov/articles/PMC6688548/
[8] Kofler MJ et al. (2020). “Working Memory and Information Processing in ADHD” - Research on working memory capacity limitations in ADHD https://pmc.ncbi.nlm.nih.gov/articles/PMC7483636/
[9] Orban SA et al. (2020). “Working Memory Training for ADHD” - Overview of working memory limitations and interventions https://pmc.ncbi.nlm.nih.gov/articles/PMC7318097/
[10] ATTN Center NYC. “Understanding Working Memory in ADHD” - Clinical explanation of how working memory overload affects ADHD students https://attncenter.nyc/understanding-working-memory-in-adhd-how-to-remember-not-to-forget/
[11] Bredemeier K, Berenbaum H (2021). “Cross-sectional and longitudinal relations between working memory, cognitive load, and psychopathology” - Study on cognitive load effects in ADHD https://www.frontiersin.org/journals/human-neuroscience/articles/10.3389/fnhum.2021.771711/full
[12] Pelletier MF et al. (2020). “Cognitive Load and ADHD” - Research on how high cognitive load worsens attention problems in ADHD https://onlinelibrary.wiley.com/doi/10.1111/ejn.16201
User Reports and Industry Analysis:
[13] Reddit ClaudeAI Community. “Claude’s Context Length Limit Issues” - User reports of context overload symptoms https://www.reddit.com/r/ClaudeAI/comments/1jjskcp/claudes_context_length_limit_has_been_massively/
[14] OpenAI Community Forum. “Experiencing Decreased Performance with ChatGPT-4” - User experiences with performance degradation https://community.openai.com/t/experiencing-decreased-performance-with-chatgpt-4/234269
[15] Ferguson N (2024). “Even AI Gets Bored at Work: The Attention Issue” - LinkedIn analysis of AI attention limitations in enterprise settings https://www.linkedin.com/pulse/even-ai-gets-bored-work-attention-issue-nick-ferguson-x42qc