Make Stateful AI Great Again: The Context Layer I’m Building in 2026

Every conversation with an AI starts the same way.

A blank screen. A blinking cursor. And a model that has no idea who you are, what you were working on yesterday, or why this conversation even matters.

You start fresh. Again. Every time.

That reset button – the one Andrej Karpathy called “a developer convenience that shouldn’t still exist” – is arguably the biggest unsolved problem in production AI right now. And nobody has fixed it. Not really.

This is the problem EvaanNucleus is designed to solve.

–

The Realisation

I Thought Routing Was Enough. I Was Wrong.

Building EVAAN – a multi-LLM workspace where Claude, Grok, and Gemini work together under an intelligent routing layer – I started with a simple belief: if I could get the routing right, everything else would follow.

Route the right prompt to the right model. Score the response. Reroute if quality falls short. The Conductor handles all of it automatically.

It works. But two weeks into using it properly, something uncomfortable surfaced.

The moment I switched models – even mid-project, even with clear intent – context collapsed. The new model had no idea what we’d just been building together. Every handoff was a cold start. The routing was elegant. The memory was nonexistent.

I went deep into the research. 76+ sources: academic papers, VC theses, developer surveys, and competitive data as of April 2026. What I found wasn’t reassuring.

“Active, agentic context management – something I haven’t really seen at all so far.”
– Andrej Karpathy

The sharpest minds in AI are converging on the same unsolved problem. And the existing solutions are, for the most part, workarounds dressed up as answers.

–

The Problem

Three Ways Context Fails – and Why Bigger Windows Don’t Fix Them

Weaviate mapped the three critical failure modes as context scales. They’re worth naming clearly, because each one is distinct and each one is quietly destroying production agents right now.

Context poisoning. Irrelevant or contradictory information gets stuffed into the window and degrades response quality. The model doesn’t know what to ignore – so it doesn’t.

Context distraction. Even accurate information, if poorly ordered or weighted, pulls the model’s attention away from what actually matters. Signal drowns in noise.

Context clash. When multiple sources of truth collide inside a single window – instructions from different sessions, conflicting system prompts, outdated user preferences – the model has no reliable way to resolve the conflict.

None of these are solved by a bigger context window. They’re structural failures that require active management – not passive accumulation.

LangChain outlined four strategies that address this: Write, Select, Compress, Isolate. Each one is conceptually sound. All four remain an unsolved engineering challenge for production agents.

That gap – between knowing what the strategies are and actually having a production-ready system that executes them – is exactly where EvaanNucleus lives.

–

The Landscape

What Exists Today. And Why None of It Closes the Gap.

The market isn’t ignoring this. There are smart people working on pieces of it.

Mem0 raised $24M and powers memory for AWS agents. But it’s single-model by design. The context stays inside one ecosystem.

Letta goes deeper – it’s a full agentic runtime with persistent state. But adopting it means locking into an entire stack. You’re not adding a layer; you’re replacing your foundation.

OpenAI’s memory works well inside ChatGPT. It’s also ChatGPT-only. The moment you switch to Claude or Gemini, you start over.

Most routing tools – the ones being positioned as “the intelligent layer” – are, at their core, dumb pipes. They move prompts between models. They don’t carry meaning.

The honest picture: there is no production-ready, model-portable context layer that works across LLMs without forcing you into someone else’s runtime.

That’s the whitespace. That’s what EvaanNucleus is being built to fill.

–

The Research Signal

What the Smartest Capital in AI Is Saying

a16z named “How AI agents navigate the context problem” as a major unsolved 2026 opportunity. Not a solved problem with interesting extensions. An unsolved problem with significant commercial upside.

The March 2026 update to the ACE Paper – one of the most cited frameworks in production agent design – treats contexts as “evolving playbooks” that self-refine without labeled supervision. The concept is compelling. The implementation is still research-grade, not production-ready.

The gap between the research result and the working system is precisely where infrastructure companies get built.

Builders who solve this in 2026 won’t just have a better product. They’ll own the layer that every multi-LLM workflow runs on top of.

–

The Vision

EvaanNucleus: A Persistent, Intelligent Context OS

The simplest way I can explain what EvaanNucleus is meant to do:

You have three robot toys – Claude, Grok, and Gemini. Every time you switch toys, the new one forgets everything. It acts like it’s the first time you’ve met. EvaanNucleus is the magic backpack that stays with you. It remembers the important rules and stories, quietly chooses exactly what the next robot needs to know, and even tells you how many batteries you’ve used so you don’t run out unexpectedly. The game continues smoothly, no matter which robot you pick up next.

More precisely, EvaanNucleus is being designed to deliver three core capabilities that production agents currently don’t have access to in a single portable layer.

Active context management. Not passive storage – active curation. For every conversation, the system writes what matters, selects what’s relevant, compresses what’s stale, and isolates what would introduce noise. The four LangChain strategies, automated and running continuously.

Model-portable memory. Your context follows you. Switch from Claude to Grok to Gemini and back – your working memory, your user preferences, your project history, your established patterns – all of it transfers. No cold starts. No re-explaining who you are.

Cost observability. Token spend, made visible. Per user, per session, per model. In a world where multi-LLM workflows can compound costs unpredictably, this isn’t a nice-to-have. It’s the feature that makes production deployment defensible.

–

The Honest Part

This Is Still a Research and Planning Phase

I want to be precise about where EvaanNucleus actually is right now, because I’ve read enough founder posts that blur the line between “we’re building this” and “this is live and working.”

EvaanNucleus is in rigorous research and planning. The problem is well-defined. The architecture is taking shape. The competitive landscape has been mapped. The gap it’s designed to fill is real and validated across multiple independent sources.

What it isn’t yet: production software. That’s coming. But the reason I’m writing about it now is that the research phase is itself generative – and the builders dealing with this problem daily deserve to know that someone is working on the infrastructure layer, not just writing about why it’s hard.

If you’re building agents or multi-LLM workflows and you’re hitting context collapse between model switches – I’d genuinely like to hear what that looks like in your stack. The design of EvaanNucleus should be shaped by real friction, not just theoretical architecture.

–

The Bigger Picture

The Next Frontier Isn’t Better Models

Every major model release gets the headlines. GPT-5. Claude 4. Gemini Ultra. Each one is more capable than the last. Each one still forgets who you are the moment the conversation ends.

Capability without continuity is impressive in a demo. It’s frustrating in production.

The builders who will define how AI actually gets used – not showcased, but used, reliably, at scale – are the ones solving the infrastructure problems that don’t make the press release. Context persistence. Memory portability. Cost visibility. State management across model boundaries.

These aren’t glamorous problems. They’re load-bearing ones.

The next frontier isn’t better models. It’s better memory.

EvaanNucleus is my attempt to build that layer – to make every LLM stateful, without asking you to rebuild your stack around it.

The goal is a world where your context follows the model. Where the model never has to chase the context.

We’re not there yet. But the research is clear, the gap is real, and I’m in it.

–

What context management pain are you seeing in your own agent or multi-LLM builds right now? I’m in the design phase and genuinely want to understand the real friction.

– D