Updated 24 April 2026 · 14 min read

Starting a new repo with AI is easy. Coming back three weeks later is the hard part.

AI re-onboarding is the real productivity killer. A METR study found devs are 19% slower with AI on familiar codebases. Here's how I handle 30 projects.

Nathan Schram

I opened Untether last Tuesday (March 11, 2026) for the first time in 11 days. I’d been heads-down on Viewpo - a completely different codebase, different language, different platform. Within 5 minutes, Claude renamed resolve_engine to get_engine. I corrected it. Two files later, it did it again.

The function is called resolve_engine because it handles fallbacks and retries across five different AI coding tools, not just a simple lookup. That distinction matters, and it’s exactly the kind of thing no amount of code reading will tell you.
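
A minimal sketch of that naming distinction, under loud assumptions: Untether's real code is not shown in this post, so the registry shape, the probe callables, and every parameter below are hypothetical. The only point is that a "get" is a lookup, while a "resolve" retries and falls back.

```python
import time
from typing import Callable, Dict, List

# Hypothetical registry: engine name -> health probe returning True when usable.
EngineProbe = Callable[[], bool]


def get_engine(engines: Dict[str, EngineProbe], name: str) -> str:
    """What the rename implies: a plain lookup that never checks availability."""
    if name not in engines:
        raise KeyError(name)
    return name


def resolve_engine(
    engines: Dict[str, EngineProbe],
    preferred: str,
    fallbacks: List[str],
    retries: int = 2,
    delay_s: float = 0.0,
) -> str:
    """Try the preferred engine with retries, then walk the fallbacks in order."""
    for name in [preferred, *fallbacks]:
        probe = engines.get(name)
        if probe is None:
            continue  # unknown engine: move on to the next candidate
        for _attempt in range(retries + 1):
            if probe():
                return name
            time.sleep(delay_s)  # pause before the next attempt
    raise RuntimeError("no engine available")


# Illustrative engines: the preferred one is down, the fallback is up.
engines = {"claude": lambda: False, "codex": lambda: True}
print(get_engine(engines, "claude"))                    # prints "claude"
print(resolve_engine(engines, "claude", ["codex"]))     # prints "codex"
```

The lookup happily returns an engine that is down. The resolver does not, which is the behaviour the original name is advertising.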

That’s the re-onboarding tax. I had two problems, not one. The agent didn’t know my conventions. And honestly, after 11 days away, neither did I.

TL;DR:

  • A METR study found experienced devs are 19% slower with AI on familiar codebases - while believing they’re 20% faster. The perception gap is real.
  • For solo devs juggling multiple projects, every session is a cold start. The AI forgets everything. You forget where you left off.
  • What works: lean context files that tell the agent only what it can’t figure out from your code, paired with feature scoping that stops it from refactoring things you didn’t ask it to touch.

The numbers in context. METR’s July 2025 randomised controlled trial tested 16 experienced open-source developers on 246 real issues in repositories they knew well. Before the study, developers predicted AI would make them 24% faster. After the tasks, they believed AI had made them 20% faster. The measured result: AI made them 19% slower. METR has since acknowledged methodology issues (Feb 2026) and is redesigning the experiment - later data on the same developers shows -18%, on new recruits -4%. The 19% number isn’t the last word. But the perception gap - predicted, believed, actual all telling different stories - is what every follow-up study keeps confirming.

The maintenance phase nobody talks about

Building in public has a visibility problem. The content that gets shared - the Show HN launches, the “$0 to $10K MRR” posts, the “I built this in a weekend” threads - is almost entirely about starting. The first commit. The first user. The first dollar.

In my experience across 30 active projects, the starting part accounts for maybe 5% of the total time. The other 95% is maintenance - adding features, fixing bugs, updating dependencies, and context switching between codebases that all need attention at the same time.

Nobody talks about the Tuesday three weeks later when you need to add webhook retry logic to a project you haven’t thought about since the launch post. How long does it take you to feel productive on a project you haven’t touched in two weeks?

I run 30 active projects through Untether, a Telegram bridge that lets me send tasks to AI coding agents from my phone. Viewpo (SwiftUI), Untether (Python), PitchDocs (TypeScript plugin), Outlook Assistant (Node.js MCP server), and twenty-six others at various stages. Some days I touch three of them (I’ve written about the voice-coding workflow from my phone that makes this practical). The launching part was fun. The maintaining part is where I actually live.

A developer on Dev.to described the same problem: “Each context rebuild takes 10-15 minutes, and switching between projects multiple times a day adds up to a significant amount of time lost to repetitive explanations.” That’s one person’s experience, not a study. Here’s the study: an Atlassian survey of 3,500 developers found that while 99% say AI tools save them time, 68% still lose more than 10 hours per week to organisational friction - and context switching is a major driver. The same report found 63% of developers say leadership doesn’t understand their pain points (up from 44% the year before). When you’re the solo dev, you are the leadership gap. Every switch is a cold start. Not just for me - for the AI too.

Two cold starts at the same time

When I pick up a project after a week away, two things are broken.

I’m broken. I don’t remember where I left off. I’m not sure if I finished that refactor or left it halfway. I can’t remember whether I decided to use polling or websockets for that feature, or why. My mental model of the codebase is stale and I don’t trust it.

The agent is broken too, in a different way. It starts completely fresh every session - no memory of what we did last time, no sense of what’s half-finished, no knowledge of the conventions I’ve built up over months. It sees the code. It doesn’t see the reasoning behind it.

The two cold starts fail in opposite directions, which is what makes them hard to notice.

I have fuzzy memories that are sometimes wrong. The AI has zero memories and fills the gap with confident guesses based on generic best practices. I need a reminder of where I stopped. The AI needs project-specific rules it can’t infer from the code. Every guide I’ve read treats these as separate problems. They’re not. They’re the same problem from different sides, and the same file fixes both.

A METR randomised controlled trial (preprint on arXiv) tested 16 experienced developers on 246 real issues in their own repositories - large open-source projects averaging 22,000+ stars and over a million lines of code. The developers expected AI to make them 24% faster. After using it, they believed it had made them 20% faster.

The actual result: they were 19% slower.

[Figure: METR 2025, the perception gap. Predicted before task: +24%. Believed after task: +20%. Actually measured: -19%. 39 percentage point gap between post-task belief and the measured result. Source: METR 2025 RCT · 16 devs · 246 issues · arXiv:2507.09089 · preprint]

That’s a 39 percentage point gap between perception and reality. And these were developers working on codebases they knew deeply. They weren’t switching between projects. They weren’t coming back after a week away. They had full context and AI still slowed them down.
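
The arithmetic behind that figure, worked through with the three numbers quoted above. The 39 points compare what developers believed afterwards with what was measured. Comparing the up-front forecast instead gives a wider gap.

```python
predicted = 24   # % speed-up developers forecast before the tasks
believed = 20    # % speed-up developers reported after the tasks
measured = -19   # % change METR measured (negative means slower)

belief_gap = believed - measured      # 20 - (-19) = 39 percentage points
forecast_gap = predicted - measured   # 24 - (-19) = 43 percentage points

print(belief_gap, forecast_gap)       # prints "39 43"
```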

Switching projects without context files is like walking into a meeting 20 minutes late. You’ll ask questions that were already answered and suggest ideas that were already rejected. Now imagine that happening when both you AND the AI are cold on the project.

[Figure: AI re-onboarding time (minutes). Without context file (cold start, confident wrong guesses): 30-60 min. With lean AGENTS.md (under 120 lines, undiscoverable signals only): < 5 min. Source: author's measurement across 30 active projects (Python, TypeScript, SwiftUI)]

What the AI gets wrong (and what I get wrong)

The AI’s mistakes on an unfamiliar codebase follow a pattern. It renames things that were deliberately named that way. Suggests abstractions you considered and rejected months ago. Why is it restructuring your directory layout? Because it doesn’t know you had reasons for the current one. And somewhere in a file it hasn’t read yet, the utility it just wrote already exists.

These aren’t bugs. The AI is making reasonable suggestions based on what it can see. The problem is what it can’t see - the history. The decisions. The trade-offs.

My mistakes are different. I open a project and I can’t remember which branch I was working on. I grep for a feature I’m sure I built and find it half-implemented with a TODO comment from three weeks ago. I start making changes and break something because I forgot about a constraint I discovered the hard way last month.

What’s the first thing your AI agent gets wrong when it starts fresh on a project? For me it’s always naming. The obvious solution is to write everything down. Give the AI a context file with all your conventions. Leave yourself notes about where you stopped.

Easy. Right?

Wrong. An ETH Zurich study tested context files against four coding agents and found that LLM-generated context files - the kind that dump everything about a project into one file - actually reduced task success rates while increasing inference costs by 20-23%. The agents spent more tokens processing redundant context and solved fewer problems. I wrote about this in detail if you want the full breakdown.

The fix isn’t more context. It’s the right context.

How I actually pick up a project after a week away

After testing this across Python, TypeScript, and SwiftUI codebases, I’ve settled into a two-step workflow that takes about 5 minutes instead of 45. It solves both cold starts - mine and the AI’s - in sequence.

Step 1: Re-onboard the agent.

Every project has an AGENTS.md file containing conventions the agent can’t discover from reading the code. Naming patterns that break language defaults. Build environment quirks. Testing rules that contradict framework conventions. Mistakes I’ve made before - including the edge-case bugs that tests can’t reach - and don’t want repeated. This file stays under 120 lines and contains only what I call “undiscoverable signals” - the stuff the agent will get wrong unless you tell it.

Claude Code loads this automatically at session start through an @AGENTS.md import in CLAUDE.md. The file being there isn’t enough on its own, though. It also needs to be accurate. After the skill count incident I described in my last post - where three context files were wrong for 4 days and I didn’t notice - I built hooks that verify context files against the actual codebase at session start. If the file says “15 skills” and the code has 16, it flags the drift before the AI starts working with stale information.
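
A drift check of that kind can be sketched roughly as follows. This is not the actual hook: the "N skills" phrasing, the one-directory-per-skill layout, and all function names are assumptions made for illustration.

```python
import re
from pathlib import Path
from typing import Optional


def claimed_skill_count(context_text: str) -> Optional[int]:
    """Return N from the first 'N skills' phrase in the context file, if any."""
    match = re.search(r"\b(\d+)\s+skills\b", context_text)
    return int(match.group(1)) if match else None


def actual_skill_count(skills_dir: Path) -> int:
    """Count skills on disk, assuming one sub-directory per skill (hypothetical)."""
    return sum(1 for entry in skills_dir.iterdir() if entry.is_dir())


def check_skill_drift(context_file: Path, skills_dir: Path) -> Optional[str]:
    """Return a warning when the documented count disagrees with the code."""
    claimed = claimed_skill_count(context_file.read_text(encoding="utf-8"))
    if claimed is None:
        return None  # the file makes no claim, so there is nothing to verify
    actual = actual_skill_count(skills_dir)
    if claimed != actual:
        return f"{context_file.name} says {claimed} skills, found {actual}"
    return None
```

Run at session start, a non-empty return value is the signal to fix the context file before the agent reads it.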

That’s ContextDocs, a tool I built to manage context files across tools and catch drift. The health scoring runs 13 checks including stale path detection, line budget compliance, and consistency between AGENTS.md and tool-specific bridge files.
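
Two of those checks, line budget and stale paths, are simple enough to sketch. This is an illustration rather than ContextDocs' implementation: the function names are hypothetical, and it assumes file paths in the context file are written in backticks and contain at least one slash.

```python
import re
from pathlib import Path
from typing import List

LINE_BUDGET = 120  # the post's limit: the file should stay under this

# Assumption: paths appear in backticks, e.g. `src/engine.py`.
PATH_PATTERN = re.compile(r"`([\w./-]+/[\w./-]+)`")


def over_budget(context_text: str, budget: int = LINE_BUDGET) -> bool:
    """True when the file is no longer under the line budget."""
    return len(context_text.splitlines()) >= budget


def stale_paths(context_text: str, repo_root: Path) -> List[str]:
    """Return backticked paths in the file that no longer exist on disk."""
    mentioned = PATH_PATTERN.findall(context_text)
    return [p for p in mentioned if not (repo_root / p).exists()]
```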

Step 2: Scope the work.

This is the part that stops the AI from going rogue. When I say “add webhook retry logic to Untether,” a general-purpose AI agent interprets that as permission to touch anything it thinks is related. It might refactor the error handling module while it’s at it. Or rename variables for “consistency.” Or decide the directory layout would be cleaner restructured.

PitchDocs - another tool I built - handles this. It generates repository documentation, and the side effect I care about most is that it gives the agent a verified map of what’s already in the codebase.

I use its evidence-based feature extraction to anchor the work. The docs-audit command scans the codebase and maps what exists, backed by file paths. When the agent can see exactly what’s already built and documented, it’s much less likely to reinvent things that are already there. The feature extraction covers 10 signal categories, including commands, exports, routes, and config options, so the agent has a clear picture of what the project already does before it starts adding to it.

Why AI context files and documentation are two different tools

I used to think documentation generation and context file management were the same problem. They both involve files that describe your project. How different can they be?

Pretty different, it turns out.

| | Documentation | Context files |
| --- | --- | --- |
| Audience | Humans (users, contributors) | AI agents (Claude, Codex, Cursor) |
| Update trigger | Releases, major changes | Every feature, renamed command, new convention |
| Examples | README, CHANGELOG, user guides | AGENTS.md, CLAUDE.md, .cursorrules |
| Staleness cost | Confused users | Wrong AI output that looks right |

If I add a skill to PitchDocs, the README can wait until the next release. The context files need to update now, or the next AI session starts with wrong information.

I kept these together in one tool for too long. PitchDocs v1.x handled both, and the result was that documentation updates triggered context file checks and context file updates triggered documentation warnings. They fought each other. In March 2026 I split PitchDocs v2.0.0 into two separate tools - PitchDocs for repository documentation, ContextDocs for AI context files. 39 files changed, but the separation has been clean. Each tool does one job.

What I’d do if I only had 15 minutes

Not everyone wants to install plugins. If you’re juggling multiple projects with AI coding tools and want to cut your re-onboarding time from 45 minutes to 5, here’s what I’d prioritise:

  1. Start with an AGENTS.md of your top 5 gotchas. Not an architecture overview. Not a directory tree. Just the things the AI will get wrong if you don’t tell it. Naming conventions, build commands, testing rules, past mistakes. Keep it under 120 lines. I wrote a full guide on what to include.

  2. Leave a “where I left off” note. At the top of your context file, add a line like “Current work: webhook retry logic, branch feature/retry, tests passing except integration”. Update it when you stop for the day. This is for you, not the AI - but the AI benefits from it too.

  3. Pick one context file and reference it from the others. If you use Claude Code and Cursor, write your conventions in AGENTS.md and have CLAUDE.md import it with @AGENTS.md. Have .cursorrules say “Read AGENTS.md for project conventions.” Two bridge files, no duplication, one source of truth.

  4. Scope before you start. Before asking the AI to add a feature, spend 2 minutes telling it what NOT to touch. “Add webhook retry logic to the event handler. Don’t refactor the error handling module, don’t rename existing functions, don’t restructure the directory.” This saves more time than any plugin.
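
The scoping step in item 4 can be made repeatable with a small helper. This is a sketch only: scoped_prompt is a hypothetical name, and the "Out of scope" wording is one choice among many.

```python
from typing import List


def scoped_prompt(task: str, do_not_touch: List[str]) -> str:
    """Build a task prompt that states the off-limits changes up front."""
    if not do_not_touch:
        return task
    limits = "\n".join(f"- Don't {item}." for item in do_not_touch)
    return f"{task}\n\nOut of scope for this task:\n{limits}"


prompt = scoped_prompt(
    "Add webhook retry logic to the event handler.",
    [
        "refactor the error handling module",
        "rename existing functions",
        "restructure the directory",
    ],
)
print(prompt)
```

Keeping the exclusions as a list makes it easy to reuse the same guardrails across sessions on the same project.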

The non-obvious lesson here is that the context file is as much for future-you as it is for the AI. Both of you are going to be cold on this project next Tuesday. The 10 minutes you spend writing down the gotchas today is an investment in every future session.


Common questions about AI and project switching

Why is AI slower on existing codebases than new ones?

A greenfield repo is a blank canvas. No implicit rules, no historical baggage. The AI picks sensible defaults and they’re usually fine. An established codebase is full of invisible constraints - why this module uses a different pattern, why that dependency is pinned to an old version, why the test suite skips a specific integration test on CI. The METR researchers found that experienced developers already knew these constraints intuitively. The AI didn’t, and the time spent correcting its confident-looking mistakes ate more than the time it saved writing code. The perception gap - believing you’re faster while actually being slower - suggests developers don’t notice most of the corrections they’re making in real time.

Can I use AI context files with Cursor or Copilot, not just Claude Code?

Yes, and this is why AGENTS.md matters as the canonical source. Codex CLI, Gemini CLI, and OpenCode read it at startup without any configuration. For tools that have their own format - Cursor wants .cursorrules, Copilot wants .github/copilot-instructions.md, Windsurf wants .windsurfrules - you create a one-line bridge: “Read AGENTS.md for project conventions.” Then add only the tool-specific bits that differ. I covered the full setup and line budgets in a companion post if you want the details.
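
Creating those bridge files can be scripted. The sketch below uses the file names and the bridge sentence from this post. The write_bridges helper and its skip-if-present behaviour are assumptions, chosen so that tool-specific additions already in a file are not overwritten.

```python
from pathlib import Path
from typing import Dict, List

BRIDGE_LINE = "Read AGENTS.md for project conventions."

# Bridge targets named in the post; CLAUDE.md uses the @AGENTS.md import instead.
BRIDGES: Dict[str, str] = {
    "CLAUDE.md": "@AGENTS.md",
    ".cursorrules": BRIDGE_LINE,
    ".github/copilot-instructions.md": BRIDGE_LINE,
    ".windsurfrules": BRIDGE_LINE,
}


def write_bridges(repo_root: Path, overwrite: bool = False) -> List[str]:
    """Create missing bridge files; leave existing ones alone unless asked."""
    written = []
    for rel_path, content in BRIDGES.items():
        target = repo_root / rel_path
        if target.exists() and not overwrite:
            continue  # keep whatever the file already contains
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content + "\n", encoding="utf-8")
        written.append(rel_path)
    return written
```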

How long does it take to re-onboard an AI agent on a project you haven’t touched in weeks?

Without any context file, expect 30-60 minutes of accumulated friction. Not one big failure - lots of small ones. A renamed variable here, a duplicated utility there, a test that uses mocking when your project deliberately avoids it. Each correction takes 30 seconds. They add up to an hour before the agent is reliably producing code that fits your project’s norms. With a lean AGENTS.md (under 120 lines, only undiscoverable signals), I’ve gotten that down to under 5 minutes. The agent loads the file at session start and immediately knows the guardrails.

What is the difference between re-onboarding yourself and re-onboarding the AI?

The practical difference is in how you fix each one. For yourself, you need breadcrumbs - a “current work” note that says which branch, which feature, which decisions are still open. It takes 30 seconds to write at the end of a session and saves 20 minutes the next time you pick it up. For the agent, you need guardrails - naming conventions, build quirks, testing rules that contradict framework defaults. The elegant part is that both can live in the same file. I put my breadcrumbs at the top of AGENTS.md and the project rules below. One file, both cold starts addressed.
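
Keeping the breadcrumb current is easier if it is one command away. A minimal sketch, assuming the note is a single line starting with "Current work:" (set_current_work is a hypothetical helper, not part of any tool mentioned here):

```python
from pathlib import Path

PREFIX = "Current work:"


def set_current_work(context_file: Path, note: str) -> None:
    """Put one 'Current work:' line at the top, replacing any older one."""
    lines = context_file.read_text(encoding="utf-8").splitlines()
    rules = [line for line in lines if not line.startswith(PREFIX)]
    updated = [f"{PREFIX} {note}", *rules]
    context_file.write_text("\n".join(updated) + "\n", encoding="utf-8")
```

Calling it at the end of a session with something like "webhook retry logic, branch feature/retry, tests passing except integration" rewrites the first line and leaves the project rules below it untouched.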


About the author: Nathan Schram is a solo developer at Little Bear Apps building open-source tools from Melbourne, Australia. 13 years in tech, currently building things he actually uses. Find him on GitHub, Bluesky, or get in touch.

Last reviewed: 2026-04-24. All statistics in this post link to their primary sources. The METR study is a randomised controlled trial with 16 developers and 246 issues; METR has since acknowledged methodology questions and is redesigning the experiment. The ETH Zurich study tested four coding agents on 138 Python tasks. PitchDocs and ContextDocs are both open source.

Written by Nathan

Dog dad, curious soul - building open-source tools & helping businesses get found online. Tech at littlebearapps.com, consulting at nathanschram.com.