<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: tdd</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/tdd.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-02-24T12:30:05+00:00</updated><author><name>Simon Willison</name></author><entry><title>First run the tests</title><link href="https://simonwillison.net/guides/agentic-engineering-patterns/first-run-the-tests/#atom-tag" rel="alternate"/><published>2026-02-24T12:30:05+00:00</published><updated>2026-02-24T12:30:05+00:00</updated><id>https://simonwillison.net/guides/agentic-engineering-patterns/first-run-the-tests/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;&lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/"&gt;Agentic Engineering Patterns&lt;/a&gt; &amp;gt;&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;Automated tests are no longer optional when working with coding agents.&lt;/p&gt;
&lt;p&gt;The old excuses for not writing them - that they're time-consuming and expensive to constantly rewrite while a codebase is rapidly evolving - no longer hold when an agent can knock them into shape in just a few minutes.&lt;/p&gt;
&lt;p&gt;They're also &lt;em&gt;vital&lt;/em&gt; for ensuring AI-generated code does what it claims to do. If the code has never been executed it's pure luck if it actually works when deployed to production.&lt;/p&gt;
&lt;p&gt;Tests are also a great tool to help get an agent up to speed with an existing codebase. Watch what happens when you ask Claude Code or similar about an existing feature - the chances are high that they'll find and read the relevant tests.&lt;/p&gt;
&lt;p&gt;Agents are already biased towards testing, but the presence of an existing test suite will almost certainly push the agent into testing new changes that it makes.&lt;/p&gt;
&lt;p&gt;Any time I start a new session with an agent against an existing project I'll start by prompting a variant of the following:
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;First run the tests&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
For my Python projects I have &lt;a href="https://til.simonwillison.net/uv/dependency-groups"&gt;pyproject.toml set up&lt;/a&gt; such that I can prompt this instead:
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Run &amp;quot;uv run pytest&amp;quot;&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;
These four-word prompts serve several purposes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;It tells the agent that there is a test suite and forces it to figure out how to run the tests. This makes it almost certain that the agent will run the tests in the future to ensure it didn't break anything.&lt;/li&gt;
&lt;li&gt;Most test harnesses will give the agent a rough indication of how many tests there are. This can act as a proxy for how large and complex the project is, and also hints that the agent should search the tests themselves if it wants to learn more.&lt;/li&gt;
&lt;li&gt;It puts the agent in a testing mindset. Having run the tests it's natural for it to then expand them with its own tests later on.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Similar to &lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/red-green-tdd/"&gt;"Use red/green TDD"&lt;/a&gt;, "First run the tests" provides a four-word prompt that encompasses a substantial amount of software engineering discipline that's already baked into the models.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tdd"&gt;tdd&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="testing"/><category term="tdd"/><category term="ai"/><category term="llms"/><category term="coding-agents"/><category term="ai-assisted-programming"/><category term="generative-ai"/><category term="agentic-engineering"/></entry><entry><title>Red/green TDD</title><link href="https://simonwillison.net/guides/agentic-engineering-patterns/red-green-tdd/#atom-tag" rel="alternate"/><published>2026-02-23T07:12:28+00:00</published><updated>2026-02-23T07:12:28+00:00</updated><id>https://simonwillison.net/guides/agentic-engineering-patterns/red-green-tdd/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;&lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/"&gt;Agentic Engineering Patterns&lt;/a&gt; &amp;gt;&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;"&lt;strong&gt;Use red/green TDD&lt;/strong&gt;" is a pleasingly succinct way to get better results out of a coding agent.&lt;/p&gt;
&lt;p&gt;TDD stands for Test Driven Development. It's a programming style where you ensure every piece of code you write is accompanied by automated tests that demonstrate the code works.&lt;/p&gt;
&lt;p&gt;The most disciplined form of TDD is test-first development. You write the automated tests first, confirm that they fail, then iterate on the implementation until the tests pass.&lt;/p&gt;
&lt;p&gt;This turns out to be a &lt;em&gt;fantastic&lt;/em&gt; fit for coding agents. A significant risk with coding agents is that they might write code that doesn't work, or build code that is unnecessary and never gets used, or both.&lt;/p&gt;
&lt;p&gt;Test-first development helps protect against both of these common mistakes, and also ensures a robust automated test suite that protects against future regressions. As projects grow the chance that a new change might break an existing feature grows with them. A comprehensive test suite is by far the most effective way to keep those features working.&lt;/p&gt;
&lt;p&gt;It's important to confirm that the tests fail before implementing the code to make them pass. If you skip that step you risk building a test that passes already, hence failing to exercise and confirm your new implementation.&lt;/p&gt;
&lt;p&gt;That's what "red/green" means: the red phase watches the tests fail, then the green phase confirms that they now pass.&lt;/p&gt;
&lt;p&gt;Every good model understands "red/green TDD" as a shorthand for the much longer "use test driven development, write the tests first, confirm that the tests fail before you implement the change that gets them to pass".&lt;/p&gt;
&lt;p&gt;Example prompt:
&lt;div&gt;&lt;markdown-copy&gt;&lt;textarea&gt;Build a Python function to extract headers from a markdown string. Use red/green TDD.&lt;/textarea&gt;&lt;/markdown-copy&gt;&lt;/div&gt;&lt;/p&gt;
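&lt;p&gt;As a rough illustration of what the red/green cycle for that prompt might look like, here is a minimal sketch. The names &lt;code&gt;check_extract_headers&lt;/code&gt;, &lt;code&gt;extract_headers_stub&lt;/code&gt; and &lt;code&gt;extract_headers&lt;/code&gt; are hypothetical, the test is a plain function rather than a pytest test so the block runs on its own, and the regular expression only handles simple ATX-style headers (it ignores fenced code blocks and Setext headers). An agent's actual output will differ.&lt;/p&gt;

```python
import re

# Hypothetical names throughout: this is an illustrative sketch of the
# red/green cycle, not code taken from the original post.


def check_extract_headers(extract_headers):
    """The test, written first. Returns True if every assertion passes."""
    markdown = "# Title\n\nSome text\n\n## Section one\nBody\n### Deep dive ##\n"
    try:
        assert extract_headers(markdown) == [
            (1, "Title"),
            (2, "Section one"),
            (3, "Deep dive"),
        ]
        assert extract_headers("no headers here") == []
        return True
    except (AssertionError, NotImplementedError):
        return False


def extract_headers_stub(markdown):
    """Red phase: the function exists but is not implemented yet."""
    raise NotImplementedError


# One to six '#' characters, whitespace, the header text, then an optional
# closing run of '#' characters. A simplification of real Markdown rules.
HEADER_RE = re.compile(r"^(#{1,6})[ \t]+(.*?)[ \t]*#*[ \t]*$")


def extract_headers(markdown):
    """Green phase: return (level, text) for each ATX-style header line."""
    headers = []
    for line in markdown.splitlines():
        match = HEADER_RE.match(line)
        if match:
            headers.append((len(match.group(1)), match.group(2)))
    return headers


red = check_extract_headers(extract_headers_stub)  # the test should fail here
green = check_extract_headers(extract_headers)     # and pass here
print("red phase passed:", red)
print("green phase passed:", green)
```

&lt;p&gt;The point of running the test against the stub first is to see it fail: a test that already passes before the implementation exists is not exercising the new code.&lt;/p&gt;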
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tdd"&gt;tdd&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="testing"/><category term="tdd"/><category term="coding-agents"/><category term="ai-assisted-programming"/><category term="agentic-engineering"/></entry><entry><title>Quoting Catherine Wu</title><link href="https://simonwillison.net/2025/Feb/24/catherine-wu/#atom-tag" rel="alternate"/><published>2025-02-24T23:48:37+00:00</published><updated>2025-02-24T23:48:37+00:00</updated><id>https://simonwillison.net/2025/Feb/24/catherine-wu/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://news.ycombinator.com/item?id=43163011#43164561"&gt;&lt;p&gt;We find that Claude is really good at test driven development, so we often ask Claude to write tests first and then ask Claude to iterate against the tests.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://news.ycombinator.com/item?id=43163011#43164561"&gt;Catherine Wu&lt;/a&gt;, Anthropic&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/tdd"&gt;tdd&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;&lt;/p&gt;



</summary><category term="tdd"/><category term="testing"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="anthropic"/><category term="claude"/></entry><entry><title>Test-Driven Heresy</title><link href="https://simonwillison.net/2009/Jun/24/heresy/#atom-tag" rel="alternate"/><published>2009-06-24T11:03:44+00:00</published><updated>2009-06-24T11:03:44+00:00</updated><id>https://simonwillison.net/2009/Jun/24/heresy/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.tbray.org/ongoing/When/200x/2009/06/23/TDD-Heresy"&gt;Test-Driven Heresy&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Tim Bray advocates TDD for maintenance development, but argues that it may not be as useful during the exploratory, greenfield development phase of a project.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/tdd"&gt;tdd&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tim-bray"&gt;tim-bray&lt;/a&gt;&lt;/p&gt;



</summary><category term="tdd"/><category term="testing"/><category term="tim-bray"/></entry><entry><title>Quoting Titus Brown</title><link href="https://simonwillison.net/2007/Feb/25/titus/#atom-tag" rel="alternate"/><published>2007-02-25T14:44:13+00:00</published><updated>2007-02-25T14:44:13+00:00</updated><id>https://simonwillison.net/2007/Feb/25/titus/#atom-tag</id><summary type="html">
    &lt;blockquote cite="http://www.jacobian.org/writing/2007/feb/23/pycon/"&gt;&lt;p&gt;I don't do test driven development. I do stupidity driven testing... I wait until I do something stupid, and then write tests to avoid doing it again.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="http://www.jacobian.org/writing/2007/feb/23/pycon/"&gt;Titus Brown&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/pycon"&gt;pycon&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tdd"&gt;tdd&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/titusbrown"&gt;titusbrown&lt;/a&gt;&lt;/p&gt;



</summary><category term="pycon"/><category term="tdd"/><category term="testing"/><category term="titusbrown"/></entry></feed>