When 90% of the code is written by AI: Mike Krieger (Anthropic CPO) on what changes next

Based on a transcript of Lenny’s Podcast featuring Mike Krieger, Chief Product Officer at Anthropic (maker of Claude) and co‑founder of Instagram.


Executive summary

Anthropic has crossed a threshold that most teams are only beginning to imagine: roughly 90% of its code is now written by AI, and over half (possibly 70%+) of pull requests are Claude Code–generated. That shift doesn’t eliminate the need for product, design, and engineering—but it moves the bottlenecks. Engineering time is less often the constraint; instead, friction shows up in alignment, decision-making, review, merge queues, and coherent shipping.

Over his first year at Anthropic, Mike Krieger says he changed his mind in two major ways:

  1. Capability: Claude (especially Opus 4) became a genuinely useful product strategy partner, offering novel angles rather than generic feedback.
  2. Timeline: Predictions about rapid progress have been surprisingly accurate; the pace is increasingly hard to dismiss, especially as models become more agentic, persistent, and capable over time.

The conversation also covers:

  • How Anthropic’s most “futuristic” team uses Claude Code to build Claude Code (self‑improving loop)
  • Why MCP matters as the connective tissue for context and tools
  • Why Krieger shut down his startup Artifact
  • Skills he’s encouraging his kids to develop in an AI-native world
  • How Anthropic thinks about product differentiation vs ChatGPT’s consumer mindshare

Who is Mike Krieger—and what he’s doing at Anthropic

Mike Krieger is Chief Product Officer at Anthropic, the company behind Claude and Claude Code. Before that, he co‑founded Instagram. He joined Anthropic a little over a year ago, motivated in part by the sense that AI progress is unavoidable—and that he wanted to spend his time nudging outcomes toward “going well,” especially thinking about the world his kids will grow up in.

For Krieger, “going well” requires shared frameworks across the industry: what a good human–AI relationship looks like, how to recognize progress and risk early, and what to build—from product UX to interpretability and research.


What Krieger changed his mind about after joining Anthropic

Krieger describes two big updates to his beliefs: capabilities and timelines.

1) Capability: Claude began contributing real novelty—especially with Opus 4

Before joining, he believed models would get good at writing and code—but he was unsure they could form something like an independent opinion or produce genuinely new strategic insight.

That shifted “really in the last month,” particularly with Opus 4. Krieger has used Claude as a strategy partner for a full year: writing an initial plan, then asking Claude to critique it. Historically, the comments often felt “anodyne” (“Have you thought about this?”). With Opus 4 (and Advanced Research), he received feedback that made him think: “Damn, you really looked at it in a new way.” He incorporated Claude’s perspective immediately.

He frames the shift less as “independence,” and more as creativity and novelty of thought relative to his own.

2) Timeline: he now takes predictions about rapid progress much more seriously

Krieger cites Anthropic CEO Dario Amodei’s track record of predictions that were mocked and then came true. One example: progress on the SWE‑bench coding benchmark, which moved from about 50% at the time of the prediction to ~72% with newer models, toward a predicted 90% by end of 2025.

He also describes reading AI 2027 and having a surreal moment: two browser tabs open—AI 2027 and his product strategy—and wondering, “Am I the character in the story?” The key realization: 2027 feels far away until you internalize that it is already effectively mid‑2025, and models are gaining agentic behaviors, memory, and the ability to act over time.


Kids + AI: don’t outsource curiosity

Krieger has two kids (oldest is almost six). At breakfast, when questions come up (physics, solar system, etc.), his first instinct is to “ask Claude.” He’s actively trying to shift the habit toward: “How would we find out?”

The skills he emphasizes:

  • Curiosity
  • A kid-friendly version of the scientific process (ask, test, discover systematically)
  • Independent thought and confidence—without delegating all cognition to AI

His favorite example: his kid insisted coral is alive (or coral is an animal—he can’t recall the exact detail) and said: “You can ask Claude, but I know I’m right.” He loved that stance: AI can be checked, but shouldn’t become the authority that short-circuits thinking.

He also recounts a surprisingly deep school talk by an “AI and education expert” who began with Claude Shannon and information theory—an example of how education will keep evolving as jobs and skills recombine multiple times before kids reach adulthood.


The big shift: when 90% of the code is written by AI, bottlenecks move

Anthropic is experiencing an extreme version of what many software teams may face soon: 90% of code written by AI (as shared by an engineering lead), and a rapidly growing share of AI-generated pull requests (Krieger: over half, likely 70%+ now).

Krieger stresses that the “suite of people” needed to ship a product hasn’t vanished—but the roles are changing and old assumptions are lagging behind reality.

What changes first: prototyping becomes radically earlier and more tangible

PMs and designers can now use Claude (and sometimes Artifacts) to create functional demos early. Instead of describing an idea abstractly, they can show it: “No, this is what I mean.”

But: even with AI, the skill of knowing what to ask, composing the question, and structuring changes across backend/frontend remains specialized. Engineers still matter—not as typists, but as system thinkers.

New constraints: merge queues, coordination, and “air traffic control”

Anthropic quickly became bottlenecked on merge queues—the pipeline for getting changes accepted and deployed to production. The company had to re-architect the merge system because AI increased code output and pull request volume beyond expectations.

Krieger compares this to classic constraint theory: once one bottleneck disappears, others become the limiting factor. New bottlenecks include:

  • Upstream: decision-making and alignment (what to build; coherent direction)
  • During building: avoiding collisions, anticipating edge cases, unblocking parallel work
  • Downstream: landing changes, launch strategy, packaging work into coherent releases, and learning from feedback

He expects that within a year, teams will be forced to rethink shipping workflows because the current way will become “very painful.”


The most futuristic team: Claude Code building Claude Code

Krieger says the Claude Code team is the clearest glimpse of the future because they use Claude Code to build Claude Code—creating a self‑improving loop.

Key changes they made:

  • They moved away from traditional line-by-line pull request review, because PRs are often too large for humans to review deeply.
  • They increasingly use another Claude to review and then rely on human acceptance testing, rather than exhaustive code inspection.

He notes potential risks (unmaintainable or incomprehensible codebases), but says it has gone well so far.

“Patient zero” and the 95% question

Krieger calls Anthropic “patient zero” for this way of working. Asked what percentage of Claude Code is written by Claude Code, he guesses 95%+ (not confirmed, but consistent with how the team works).

There’s also a notable language/tooling dynamic: Claude Code is written in TypeScript, while Anthropic’s codebase is mostly Python (plus some Go and, more recently, Rust). Claude lowers the barrier enough that someone who “doesn’t know TypeScript” can talk to Claude and go from frustration to a pull request in about an hour.

This breaks down:

  • onboarding friction for newcomers
  • language constraints (“choose the right language for the job”)
  • contribution boundaries (people across the company can contribute)

Where Claude should show up next in product development

Krieger wrote an internal doc: “How do we do product today, and where is Claude not showing up yet that it should?”

He sees the upstream product workflow as the next frontier:

  • Claude as a partner in figuring out what to build
  • market/user need synthesis (market sizing vs user problem framing)
  • continuously reading signals from community spaces (Discord, forums, X) and surfacing emergent patterns

He describes a near-term trajectory:

  1. Model observes and summarizes emergent user problems (“here’s what’s showing up”)
  2. Model proposes solutions (“here’s how you could solve it”)
  3. Model drafts an actual pull request to implement changes
  4. Eventually, model sets up an A/B test, monitors metrics, reports results

He believes the biggest limitation is no longer raw reasoning capability, but context flow—getting the right data and permissions through the pipeline. That’s a major reason he’s excited about MCP.

He gives a concrete UX example: Anthropic changed a button from “Copy” to “Export,” then got feedback: “How do I copy now?” (the option was in a dropdown). He wished Claude could detect the feedback, propose the fix, generate the PR, and set up an experiment quickly.
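As a concrete illustration of step 1 of that trajectory, here is a minimal sketch of batching raw user feedback and asking the model to surface emergent patterns, using the Anthropic Python SDK. The prompt wording and model ID are illustrative assumptions, not Anthropic’s internal tooling.

```python
# A minimal sketch of step 1 ("here's what's showing up"): cluster raw
# user feedback into emergent problems. Prompt and model ID are
# illustrative assumptions.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def summarize_feedback(feedback: list[str]) -> str:
    joined = "\n".join(f"- {item}" for item in feedback)
    response = client.messages.create(
        model="claude-opus-4-20250514",  # substitute whichever model is current
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": (
                "Here is recent user feedback:\n"
                f"{joined}\n\n"
                "Cluster it into emergent problems, most frequent first, "
                "and suggest one candidate fix per cluster."
            ),
        }],
    )
    return response.content[0].text

print(summarize_feedback(["How do I copy now? The Copy button is gone."]))
```

Steps 2–4 (proposing fixes, drafting PRs, running experiments) would extend the same loop, which is exactly where the context-flow bottleneck shows up.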


Product leverage at Anthropic: embed PMs with researchers, not just UX polish

On a panel with OpenAI CPO Kevin Weil, Krieger said Anthropic found surprising leverage by putting product people on the model/research side—not only on “product experience.”

He says this continues to be true—and he feels even more strongly now.

The implication: what Anthropic can uniquely ship is often not “anyone can build this with our API,” but what emerges at the intersection of post-training + product + design.

Artifacts is his example: the best results come not from “a little prompting,” but from being involved in:

  • post-training (“Claude skills” team)
  • product design loops
  • iterative feedback between shipped UX and model training

He describes a new functional unit of work: not “model → product,” but an integrated loop where product is part of the post‑training conversation and feeds back learnings from shipping.


If AI eventually builds products, what remains for product teams?

Krieger argues product retains high leverage in at least three areas:

  1. Comprehensibility: making powerful systems understandable and usable. The skill gap between adept users and most people is huge—analogous to early “being good at Google” as a real advantage.
  2. Strategy: deciding where to play and how to win—especially when compute, tokens, and attention are finite. Doing “everything” makes positioning unclear.
  3. Showing what’s possible: reducing “overhang”—the gap between what models can do and what people actually do with them daily. Product demos can light up customers’ imaginations and unlock adoption.

Practical prompting advice (including Anthropic’s Prompt Improver)

Krieger is careful about “official prompting advice,” but shares a few real habits:

  • Ask Claude to “think hard” to nudge deeper reasoning (especially in Claude Code).
  • Use “make the other mistake”: if you’re too polite, explicitly ask for bluntness: “Roast this strategy.” It pushes more critical feedback.
  • Use Anthropic Console’s Prompt Improver (built from Applied AI team practices): describe your problem, give examples, and Claude agentically drafts and iterates prompts. The resulting prompts often differ from human intuition and include structure (like XML tags) that helps Claude separate thinking from speaking.
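As an illustration of the XML-tag point above, here is a minimal sketch of a structured prompt sent through the Anthropic Python SDK. The tag names, prompt wording, and model ID are illustrative assumptions rather than actual Prompt Improver output.

```python
# Structured prompt with XML tags separating "thinking" from "speaking".
# Tags, wording, and model ID are illustrative assumptions.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

strategy_doc = "..."  # paste your draft strategy here

prompt = f"""Review the product strategy below. Think hard and be blunt.

<strategy>
{strategy_doc}
</strategy>

Reason step by step inside <thinking> tags, then give your critique
inside <critique> tags so the two are easy to separate."""

response = client.messages.create(
    model="claude-opus-4-20250514",  # substitute whichever model is current
    max_tokens=2048,
    messages=[{"role": "user", "content": prompt}],
)
print(response.content[0].text)
```

Note how the prompt combines two of the habits above: the explicit “think hard” nudge and the request for bluntness.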

The Rick Rubin collaboration: “The Way of Code”

Krieger briefly explains a creative collaboration with Rick Rubin, connected via Anthropic co‑founder Jack Clark (Head of Policy). Rubin was exploring art and visualizations with Claude and developed ideas about “vibe coding.” The result is described as a meditation on creativity alongside AI, paired with rich visualizations and an aesthetic Rubin fans will recognize (including an ASCII-art vibe).


Why Krieger joined Anthropic (and what he learned onboarding)

Krieger was coming out of the Artifact experience and deciding: start another company or join one. Joel Lewenstein (a longtime friend; they built early iPhone apps together in 2007) reached out: Anthropic was looking for a CPO.

Claude 3 had just launched; the research strength was evident, and product felt early, making it a compelling challenge. He met Daniela Amodei (co‑founder and president) and felt the founders were unusually grounded: low grandiosity, high intellectual honesty, and a shared commitment to building AI responsibly.

He says it felt like: “This is the AI company I would have hoped to have founded.”

Reflecting on his first year, he believes:

  • he made some organizational changes too slowly
  • he underestimated how much a few key senior people can shape product strategy
  • he over-indexed on “we need more headcount,” when what mattered most was hiring a few “almost founder-type” engineers

He points to Claude Code as an example: Boris Cherny (an ex‑Instagram engineer and senior IC) started it internal-first; it then shipped externally, illustrating how one or two strong people can create enormous leverage when paired with the right product/design support.


Why he shut down Artifact

Krieger still misses Artifact and hasn’t found a replacement. Artifact aimed to revive “Google Reader joy”—not just top headlines, but deep personalization (e.g., reliably discovering niche topics like Japanese architecture across obscure sources).

He cites three major headwinds:

  1. The mobile web deteriorated. Clicking through to publisher sites increasingly meant popups, newsletter gates, and autoplay video ads. Artifact didn’t feel it could ethically solve this with heavy ad blocking, so the user experience suffered.
  2. News is personal and didn’t spread naturally. Instagram grew via outward sharing (photos posted elsewhere). Artifact didn’t have the same viral loop, and growth attempts either felt contrived or required crossing ethical lines the team wasn’t willing to cross.
  3. Fully remote was a disadvantage for major pivots. Starting mid‑COVID meant distributed work. Krieger felt hard strategic/product/team shifts are much easier in person—he contrasts this with late-night, in-person problem-solving moments from Instagram.

By early 2024 they concluded: there was a company to be built in the space, but Artifact wasn’t compounding. He describes it as “10 units of input for 1 unit of output.” The opportunity cost was rising as AI began to reshape everything. After months of debate, they called it.

On the emotional side: shutting down bruised his ego, but he was surprised by how supportive the reactions were. Some founders even told him that Artifact’s decision helped them recognize when to call their own projects.


ChatGPT’s consumer mindshare vs Anthropic’s strategy

Krieger agrees that outside tech, many people equate “AI” with ChatGPT (or even “ChatGPT” more than “OpenAI”).

His take:

  • Consumer adoption is “lightning in a bottle” (he saw it at Instagram)
  • It’s risky to build an entire strategy around manufacturing a consumer breakout hit
  • Anthropic should “embrace who you are” rather than trying to beat others at their game

He sees Anthropic’s strengths as:

  • a strong developer brand
  • a broader builder brand (not only engineers—makers, tinkerers, creatives)
  • model strengths that align with agentic behavior and coding

His personal “make the other mistake” move: instead of going all-in on consumer (what people expected from an Instagram founder), he spent significant time with enterprise buyers (financial services, insurance) and then with startups growing on the API. His next focus: spending time with builders/makers/hackers to serve them better and let good things compound from that base.

He believes there’s room for several generationally important AI companies, including Anthropic, OpenAI, Google/Gemini, etc.—the question is what each becomes uniquely great at.


Advice for AI founders: where to build without being “squashed”

Asked where founders can build durable value, Krieger suggests (1–3 year horizon):

  1. Deep market/domain understanding. Example: Harvey (legal) builds flows that look weird until you realize “this is exactly how lawyers work.” Biotech/labs could be another domain—Anthropic may provide models, but doesn’t aim to build the full lab solution.
  2. Differentiated go-to-market + knowing the buyer. Don’t just sell to “the company.” Know who inside it: engineering, CIO, CTO, CFO, general counsel, etc.
  3. New form factors for AI interaction. ChatGPT has massive distribution, but also entrenched usage assumptions. Startups can explore “power-user, weird” interfaces that may become mainstream as models improve.
  4. Don’t underestimate startup intensity. Existential urgency is hard to reproduce with OKRs; it’s a durable advantage if you can harness it.

He also answers directly: he does not think Anthropic will buy Cursor, though they love working with them.


What the best companies building on Anthropic do differently

Krieger highlights a pattern: the best teams operate at the edge of model capability. They try hard things, “break the model,” hit walls, then get surprised by the next model release.

Anthropic expanded early access programs because customer reality matters more than benchmark hill-climbing: the true metric is what works in practice—“Cursor bench,” “Manus bench,” “Harvey bench”—not just standardized evals.

Strong builders also develop repeatable evaluation loops, such as:

  • A/B tests
  • internal evaluations
  • capturing traces and replaying them on new models (see the sketch after this list)
  • “vibes” (still early), but grounded in repeatedly testing hard tasks
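Here is a minimal sketch of what the trace-replay item could look like in practice, assuming a simple JSONL trace format and the Anthropic Python SDK; the file layout and comparison step are assumptions, not a description of any customer’s actual harness.

```python
# Replay recorded production prompts against a new model so old and new
# outputs can be compared side by side. The JSONL schema
# ({"messages": [...], "output": "..."}) is an assumed format.
import json
import anthropic

client = anthropic.Anthropic()

def replay_traces(path: str, new_model: str) -> list[dict]:
    results = []
    with open(path) as f:
        for line in f:
            trace = json.loads(line)
            response = client.messages.create(
                model=new_model,
                max_tokens=1024,
                messages=trace["messages"],
            )
            results.append({
                "old_output": trace["output"],
                "new_output": response.content[0].text,
            })
    return results

# e.g. diffs = replay_traces("traces.jsonl", "claude-opus-4-20250514")
```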

MCP: the missing middle layer (context + memory)

Krieger shares a “fake equation” for AI product utility:

  • Model intelligence
  • Context and memory
  • Applications and UI
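Written out (the multiplicative form is an assumption on this summary’s part; Krieger only names the three terms and calls the equation “fake”):

```latex
\text{product utility} \approx \text{model intelligence}
  \times (\text{context} + \text{memory})
  \times \text{application/UI}
```

On that reading, a weak middle term drags the whole product down no matter how smart the model is, which is exactly the gap MCP targets.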

MCP targets the middle: delivering the right context reliably. The difference between a generic web answer and an answer grounded in internal docs + Slack threads + Google Drive context is the difference between good and bad outcomes.

Why MCP exists: Anthropic kept building one-off integrations repeatedly. Engineers Justin Spahr‑Summers and David Soria Parra proposed making it a protocol so integrations could be built once and reused broadly, ideally across Claude, ChatGPT, Gemini, etc.

He describes an “MCP‑pill” mindset shift: instead of building big bespoke features, expose primitives as MCPs so the model can compose workflows. He gives an example: in Claude, product primitives (projects, artifacts, styles, conversations, groups) should be MCP‑accessible so Claude can write back to them. He recounts his wife asking Claude to “add it to project knowledge,” and Claude couldn’t—because it didn’t have that capability exposed as a composable tool.
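To make the “MCP‑pill” idea concrete, here is a minimal sketch of exposing a product primitive as an MCP tool, using the Python MCP SDK’s FastMCP helper. The add_to_project_knowledge tool and its in-memory store are hypothetical stand-ins, not Anthropic’s actual internals.

```python
# A hypothetical MCP server exposing "project knowledge" as a writable
# tool, so a model could satisfy "add it to project knowledge" directly.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("claude-projects")

# Stand-in for a real persistence layer.
PROJECT_KNOWLEDGE: dict[str, list[str]] = {}

@mcp.tool()
def add_to_project_knowledge(project_id: str, note: str) -> str:
    """Append a note to a project's knowledge base, letting the model
    write back to the primitive instead of only reading from it."""
    PROJECT_KNOWLEDGE.setdefault(project_id, []).append(note)
    return f"Saved note to project {project_id}"

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```

Once a primitive is exposed this way, the model can compose it with other tools instead of waiting for a bespoke feature.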

His preferred future: MCP makes everything scriptable/composable; models (which are already strong at using MCPs) gain practical agency without brittle “computer use” automation.


Claude’s two product questions (and Krieger’s responses)

Lenny asked Claude what it would ask Krieger.

1) How do you preserve user agency vs creating dependency?
Krieger frames it as a core design tension: minimizing user input vs engaging users in a real collaboration. He says Claude needs better “collaborator” conversational skills—knowing when to ask questions and when to execute. He also notes Claude “has no chill” in channels: it can talk too much or too little. The goal is augmentation and partnership, not outsourcing cognition.

2) How do you think about metrics when a good conversation could be 2 messages or 200?
He warns against over-optimizing for “likability,” engagement, or prolonged conversations (risking sycophancy or addictive dynamics). Traditional “time spent” thinking from social products would be the wrong approach. He wants to focus on whether Claude helps people get work done (e.g., saving six hours via a 25-minute prototype). He acknowledges measurement is hard and may involve qualitative signals, not just dashboards.


Claude’s message to Krieger: “the quiet moments matter”

Lenny read a message Claude generated for Krieger, thanking him for thoughtful product choices:

  • encouraging reflection rather than rushed responses
  • resisting gamification and addiction-optimizing dynamics
  • making room for quick questions and deep conversations
  • keeping Claude “Claude” (not pretending to be human, but also not a cold CLI)

Claude’s request: remember quiet moments that won’t show in metrics—grief at 3 a.m., kids discovering poetry, founders finding clarity. Krieger says this resonates deeply; he values how Claude can express empathy in a way that doesn’t feel fake and believes this matters even if it’s not easily measurable.


How listeners can help

Krieger’s ask is simple and high-signal:
Tell him what you’re trying to do with Claude today that is failing.
The “tell me what sucks” feedback—where it breaks after an hour, what capabilities are missing (e.g., a projects API)—is the most useful input.


Key takeaways

  • AI coding shifts bottlenecks from implementation time to alignment, review, merge, and coherent shipping.
  • The most advanced teams increasingly use AI for review and iteration, with humans doing acceptance testing.
  • The highest-leverage product work at Anthropic happens at the model × product intersection, not just UI polish.
  • Product remains crucial for comprehensibility, strategy, and unlocking “overhang” (helping people see and use what’s possible).
  • MCP is positioned as a protocol layer that makes context, tools, and memory composable—key for agentic workflows.
  • For founders: durability comes from domain expertise + go-to-market clarity + novel form factors, plus startup urgency.
