Practical AI

Karpathy Puts Context at the Core of AI Coding

When Andrej Karpathy, former OpenAI scientist and former Tesla Senior Director of AI, first posted on X, "There's a new kind of coding I call 'vibe coding', where you fully give in to the vibes, embrace exponentials, and forget that the code even exists," it's likely he didn't foresee the firestorm he was about to spark.

Following that February 2025 post, "vibe coding" rapidly gained momentum across the AI landscape, sparking debates among commentators and analysts over "prompt engineering" versus "context engineering" that ultimately left the community confused.

Let’s Put it in Context

Karpathy weighed in on the subject on June 25 in a post on X, favoring context engineering. "People associate prompts with short task descriptions you'd give an LLM in your day-to-day use," he wrote. "When in every industrial-strength LLM app, context engineering is the delicate art and science of filling the context window with just the right information for the next step."

For many users, the chat feature in ChatGPT, Claude, Gemini, Perplexity, DeepSeek, and other LLM-based products is AI in its entirety. Still, chat is primarily intended to fill the "personal assistant" role that many feel comfortable using AI for.

Clearly speaking to those working to develop complete, practical systems, Karpathy expanded further on his explanation, adding, "Too little or of the wrong form and the LLM doesn't have the right context for optimal performance. Too much or too irrelevant, and the LLM costs might go up, and performance might come down. Doing this well is highly non-trivial."

Karpathy also noted that "context engineering is just one small piece of an emerging thick layer of non-trivial software that coordinates individual LLM calls (and a lot more) into full LLM apps." On top of context engineering itself, he wrote, "an LLM app has to:"

  • break up problems just right into control flows
  • pack the context windows just right
  • dispatch calls to LLMs of the right kind and capability
  • handle generation-verification UIUX flows
  • a lot more: guardrails, security, evals, parallelism, prefetching, …

In a later comment, he reinforced the idea that "people's use of 'prompt' tends to (incorrectly) trivialize a rather complex component. You prompt an LLM to tell you why the sky is blue. But apps build contexts (meticulously) for LLMs to solve their custom tasks."

The Competition Continues to Heat Up

OpenAI recently introduced Codex to compete with Claude Code and platforms like Cursor, Replit, Lovable, and others.

These platforms continue to put the focus on prompt engineering. Clever phrasing, carefully structured instructions, and a knack for coaxing just the right answer are the prized skills of early adopters. The prompt is king, and if you're good at writing them, you should be able to make the AI dance.

But prompts are turning out to be only half the story. As applications become more ambitious, models gain larger context windows, and retrieval and memory systems are layered in, a different discipline emerges, the one that Karpathy has now identified as the real key to AI application development: context engineering.

From Prompting to Contextualization

Prompt engineering and context engineering are not opposites; they can and have been viewed as two sides of the same coin. The prompt sets the stage, but context builds the scenery, props, and lighting. A brilliant prompt alone can’t succeed if the model is standing on an empty stage. Conversely, a simple prompt, supported by a carefully designed context, can deliver extraordinary results.

As Karpathy puts it: context engineering is "the delicate art and science of filling the context window with just the right information for the next step." The difference is that prompts instruct what to do now, while context shapes how the model understands the world in which it must act.

Defining Context Engineering

At its core, context engineering is the practice of selecting, organizing, and managing all the information that surrounds a model’s prompt so that it can produce the most relevant, reliable, and effective output.

That information may include conversation history, distilled into summaries rather than verbatim transcripts. It may be external knowledge retrieved from a database or a web search. It may be examples that illustrate the required style or structure. It may be role definitions and system rules that align the model with an organization’s standards. And it nearly always involves reductive editing, deciding what not to include so the context window isn’t wasted on noise.

Simply put, prompt engineering asks, "How do I phrase the question?" Context engineering asks, "What must the model know before I even ask?"
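The assembly described above can be sketched in code. The function below is a hypothetical illustration, not from Karpathy or any specific platform: it combines system rules, a summarized history, retrieved facts, and examples into one context, applying "reductive editing" by dropping the lowest-priority sections until the result fits a token budget.

```python
# Minimal, hypothetical sketch of context assembly: combine the pieces
# named in the article and trim the lowest-priority ones first so the
# context window isn't wasted on noise. Word count stands in for a real
# tokenizer, which a production system would use instead.

def assemble_context(system_rules, history_summary, retrieved_facts,
                     examples, question, budget_tokens=1000):
    """Build a context window, dropping the least important parts first."""
    # (name, text, priority): rules and the question are always kept;
    # examples, then retrieved facts, are the first to be edited out.
    sections = [
        ("rules", system_rules, 0),
        ("question", question, 0),
        ("history", history_summary, 1),
        ("facts", "\n".join(retrieved_facts), 2),
        ("examples", "\n".join(examples), 3),
    ]
    count = lambda text: len(text.split())  # crude token estimate

    # Reductive editing: pop lowest-priority sections until we fit.
    kept = sorted(sections, key=lambda s: s[2])
    while sum(count(t) for _, t, _ in kept) > budget_tokens and len(kept) > 2:
        kept.pop()  # removes the current lowest-priority section

    # Restore presentation order before joining.
    order = {"rules": 0, "history": 1, "facts": 2, "examples": 3, "question": 4}
    kept.sort(key=lambda s: order[s[0]])
    return "\n\n".join(t for _, t, _ in kept if t)
```

With a generous budget everything survives; with a tight one, the examples and then the retrieved facts are sacrificed before the rules and the question ever are.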

Why Context is Central

Several shifts have promoted context engineering from its earlier supporting role to being the main event. Applications are no longer one-off experiments but are now expected to be persistent systems that must perform reliably over time.

Models now accept far larger context windows, but bigger isn't always better: stuffing them with everything you can find only dilutes the model's focus and wastes expensive tokens. Retrieval-augmented generation (RAG) introduces external knowledge at runtime, but if retrieval is sloppy, the model can easily be derailed. And as AI agents act (calling APIs, making decisions, orchestrating tools), the quality of those actions depends entirely on the quality of their context.
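Why sloppy retrieval derails a model is easiest to see in a toy example. The ranking below is a deliberate oversimplification (real RAG systems use embeddings, not word overlap), but it illustrates the principle: score candidate snippets against the query and keep only the top few, rather than stuffing every document into the window.

```python
# Toy retrieval sketch (illustrative only, not a production RAG system):
# score snippets by word overlap with the query and return the top-k,
# discarding anything with no overlap at all.

def retrieve(query, snippets, k=2):
    """Return the k snippets sharing the most words with the query."""
    qwords = set(query.lower().split())
    scored = [(len(qwords & set(s.lower().split())), s) for s in snippets]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored[:k] if score > 0]
```

A query about contract remedies would rank a snippet on breach damages first and drop an unrelated snippet entirely, which is exactly the filtering step that, done sloppily, buries the details that matter.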

This is why so many failures in AI today can be traced back not to the prompt itself, but to poorly engineered context.

When Context Engineering Fails

Consider three common pitfalls:

Context poisoning: An error introduced early in a conversation—perhaps a mistaken assumption or a hallucinated fact—gets carried forward in memory. Because it’s reintroduced in every new step, the model treats it as truth and compounds the mistake. A financial-planning assistant that once misinterprets a retirement account balance may end up producing an entire investment strategy built on false numbers.
Context distraction: Too much irrelevant information can overwhelm the model. A legal AI, for example, might be asked about a narrow point of contract law but fed with pages of unrelated case summaries. Important details get buried, and the model drifts off-topic or delivers shallow generalizations.
Context clash: Conflicting instructions can sabotage even the most carefully written prompt. If a system prompt directs the model to always write in formal legal language, but the user context contains an instruction to “use casual, friendly tone,” the model’s responses may wobble between both, leaving the application incoherent and untrustworthy.

Each of these failures illustrates the same point: the model wasn’t broken, but the environment in which it was asked to operate was.

Conquering the Context Engineering Craft

A skilled context engineer avoids these traps. They know how to manage memory so that past exchanges inform without overwhelming. They design retrieval pipelines that surface the most relevant facts at the moment they are needed. They enforce clarity in instruction, so the model doesn’t face contradictory demands. They curate context ruthlessly, ensuring that every token contributes to performance rather than diluting it.
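The memory-management habit described above, letting past exchanges inform without overwhelming, can be sketched simply. This is a hypothetical pattern, not a specific product's API: keep the most recent turns verbatim and collapse older ones into a short digest (in practice the digest would be generated by an LLM summarization call, stubbed out here).

```python
# Hypothetical sketch of conversation-memory curation: recent turns stay
# verbatim, older turns are collapsed into a one-line digest. The digest
# string is a stand-in for an actual LLM-generated summary.

def compact_history(turns, keep_recent=3):
    """Summarize old turns; keep the last `keep_recent` verbatim."""
    if len(turns) <= keep_recent:
        return list(turns)
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    digest = f"[summary of {len(older)} earlier turns]"
    return [digest] + list(recent)
```

Curating memory this way keeps an early mistake from being replayed verbatim forever, one defense against the context-poisoning pitfall described earlier.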

The result is an AI that appears smarter, more reliable, and more consistent—not because its underlying intelligence has changed, but because its environment has been engineered to support success.

What This Means for Developers

For developers working with Codex, Claude Code, Cursor, Replit, Lovable, and the expanding ecosystem of AI development platforms, Karpathy’s insight carries a challenge. Prompt engineering will always have a role, but it’s no longer the discipline that defines application performance. The new core skill is context engineering.

Citizen developers must think less like poets of the prompt and more like information architects. Professional developers must design context pipelines every bit as carefully as they once designed APIs. Platform providers must deliver tools that make context visible, manageable, and tunable, rather than hidden inside a black box.

Beyond the AI Whisperer

If prompt engineering was about learning how to whisper to the AI, context engineering is about designing its world.

Karpathy has given us a clear warning: prompts matter, but context rules. And in this next era of AI, success will not be measured by how cleverly you phrase your requests, but by how well you engineer the reality in which those requests are made.

About the Author

Technologist, creator of compelling content, and senior "resultant" Howard M. Cohen has been in the information technology industry for more than four decades. He has held senior executive positions in many of the top channel partner organizations and he currently writes for and about IT and the IT channel.
