AI-First TDD: Write Tests, Let AI Implement

Module 03: TDD with AI | Expansion Guide

Back to Module 03

The Problem

You want the safety of TDD but AI makes implementation so fast that writing tests first feels backwards. Why write tests when the AI can generate both code and tests together? So you skip TDD, ship fast, and then production breaks in ways your missing tests would have caught.

Traditional TDD says: Write test, implement, refactor. With AI, this feels slow because the AI can implement faster than you can think. But skipping tests entirely means you lose the design benefits and safety net that TDD provides.

The real issue: You need a TDD workflow designed for AI collaboration, not against it.

The Core Insight

Tests are the best way to communicate intent to AI. Write them first, let AI figure out the implementation.

When you write a test, you're creating a specification: "Given this input, produce this output. Handle these edge cases." That's exactly what AI needs. The test becomes your prompt, and the implementation becomes AI's job.

The AI-First TDD cycle:

Step | Traditional TDD | AI-First TDD
1. Red | Write failing test | Write failing test expressing intent
2. Green | Write minimal code to pass | Give test to AI, validate implementation
3. Refactor | Improve code quality | AI refactors, you verify tests still pass
The Walkthrough

Step 1: Write Tests That Express What You Want

Start with the interface and behavior, not the implementation. Your test should read like a specification.

# Example: URL shortener service
# test_url_shortener.py
import pytest

# UrlShortener and NotFoundError don't exist yet - the AI will implement them
from url_shortener import UrlShortener, NotFoundError

def test_shorten_url_returns_short_code():
    """
    Given a valid URL,
    When I shorten it,
    Then I get back a 6-character alphanumeric code
    """
    shortener = UrlShortener()
    short_code = shortener.shorten("https://example.com/very/long/path")

    assert len(short_code) == 6
    assert short_code.isalnum()

def test_same_url_returns_same_code():
    """Shortening the same URL twice should return the same code"""
    shortener = UrlShortener()
    code1 = shortener.shorten("https://example.com")
    code2 = shortener.shorten("https://example.com")

    assert code1 == code2

def test_retrieve_original_url():
    """
    Given a short code,
    When I retrieve the original URL,
    Then I get back the full URL
    """
    shortener = UrlShortener()
    short_code = shortener.shorten("https://example.com/path")
    original_url = shortener.retrieve(short_code)

    assert original_url == "https://example.com/path"

def test_invalid_short_code_raises_error():
    """Retrieving a non-existent code should raise NotFoundError"""
    shortener = UrlShortener()

    with pytest.raises(NotFoundError):
        shortener.retrieve("BADCODE")

Why This Works

These tests tell AI exactly what the class needs to do without dictating how. The docstrings provide context. The assertions define success criteria. AI has everything it needs.

Step 2: Give Tests to AI for Implementation

Now prompt the AI with your tests as the specification:

# Prompt to AI:
"Implement the UrlShortener class that makes these tests pass.

Requirements:
- Use an in-memory dictionary for storage (production would use DB)
- Generate short codes using base62 encoding
- Handle collisions with a retry mechanism
- Raise NotFoundError for invalid codes

Here are the tests:
[paste tests above]"

The AI implements based on your specification. Run the tests. If they pass, you're done with this iteration. If they fail, the AI misunderstood - refine your tests to be clearer.
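To make the loop concrete, here is one implementation the AI might plausibly produce from that prompt. The class and method names come from the tests above; the module layout, the `secrets`-based code generation, and the reverse map for deduplication are assumptions, not the only valid answer - any code that passes the tests is acceptable.

```python
import secrets
import string

# 62 alphanumeric characters, matching the base62 requirement in the prompt
BASE62 = string.ascii_letters + string.digits


class NotFoundError(Exception):
    """Raised when a short code has no stored URL."""


class UrlShortener:
    def __init__(self):
        self._code_to_url = {}  # short code -> original URL
        self._url_to_code = {}  # original URL -> code, so repeats dedupe

    def shorten(self, url: str) -> str:
        # Same URL always returns the same code
        if url in self._url_to_code:
            return self._url_to_code[url]
        # Retry on the (unlikely) collision, per the prompt's requirements
        while True:
            code = "".join(secrets.choice(BASE62) for _ in range(6))
            if code not in self._code_to_url:
                break
        self._code_to_url[code] = url
        self._url_to_code[url] = code
        return code

    def retrieve(self, code: str) -> str:
        if code not in self._code_to_url:
            raise NotFoundError(f"No URL stored for code {code!r}")
        return self._code_to_url[code]
```

Run the test file against this and all four tests pass - which is your signal that this iteration is complete.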

Step 3: Validate Edge Cases

AI implementations often miss edge cases. Add more tests for scenarios you think of:

# ValidationError is new - the updated implementation must provide it
def test_empty_url_raises_validation_error():
    """Empty URLs should not be accepted"""
    shortener = UrlShortener()

    with pytest.raises(ValidationError):
        shortener.shorten("")

def test_malformed_url_raises_validation_error():
    """Invalid URL format should be rejected"""
    shortener = UrlShortener()

    with pytest.raises(ValidationError):
        shortener.shorten("not-a-url")

def test_very_long_url_still_works():
    """URLs up to 2048 characters should work"""
    shortener = UrlShortener()
    long_url = "https://example.com/" + "x" * 2000
    short_code = shortener.shorten(long_url)

    assert shortener.retrieve(short_code) == long_url

Run these new tests. They'll likely fail. Give failing tests back to AI: "These tests are failing. Update the implementation to handle these cases."

Step 4: Refactor Through Tests

Once all tests pass, you can safely refactor. Ask AI to improve performance, extract methods, add type hints - as long as tests keep passing, you're safe.

# Prompt to AI:
"Refactor the UrlShortener class to:
1. Use SHA-256 hash instead of random generation (deterministic codes)
2. Add type hints to all methods
3. Extract URL validation into a separate method

All existing tests must still pass."
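A sketch of what item 1 of that refactor might look like: hash the URL with SHA-256, then base62-encode the digest so the same URL always yields the same 6-character code. Collision handling and storage are omitted here; the exact encoding scheme is an assumption, not the only correct answer.

```python
import hashlib

BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def deterministic_code(url: str, length: int = 6) -> str:
    """Derive a stable base62 short code from a URL's SHA-256 digest."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    n = int.from_bytes(digest, "big")
    chars = []
    for _ in range(length):
        n, remainder = divmod(n, 62)
        chars.append(BASE62[remainder])
    return "".join(chars)
```

Because the existing tests only assert on behavior (length, alphanumeric, same-URL-same-code), they pass unchanged after this refactor - which is exactly the safety net TDD promised.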

Coverage Goals

Not all code needs the same test coverage. Use this guide:

Code Type | Coverage Target | Focus Areas
Business Logic | 90%+ | All edge cases, error paths
Public APIs | 85%+ | Contract tests, input validation
Data Processing | 80%+ | Boundary conditions, null handling
UI Components | 60%+ | User interactions, error states
Glue Code | 40%+ | Integration points only

Failure Patterns

1. Tests Too Coupled to Implementation

Symptom: Tests break when you refactor, even though behavior is unchanged.

Fix: Test behavior, not implementation. Use public interfaces only.

# BAD: Tests internal state
def test_shortener_stores_in_dict():
    shortener = UrlShortener()
    shortener.shorten("https://example.com")
    assert len(shortener._url_map) == 1  # Don't test private state!

# GOOD: Tests observable behavior
def test_shortener_remembers_urls():
    shortener = UrlShortener()
    code = shortener.shorten("https://example.com")
    assert shortener.retrieve(code) == "https://example.com"

2. AI Generates Code That Passes Wrong Tests

Symptom: Tests pass but the feature doesn't work right.

Fix: Your test assertions are too weak. Make them specific.

# BAD: Weak assertion
def test_shorten_returns_something():
    shortener = UrlShortener()
    result = shortener.shorten("https://example.com")
    assert result  # AI could return literally anything truthy

# GOOD: Specific assertions
def test_shorten_returns_valid_code():
    shortener = UrlShortener()
    result = shortener.shorten("https://example.com")
    assert len(result) == 6
    assert result.isalnum()

3. Writing Tests After Implementation

Symptom: You ask AI for "code and tests" and get both together.

Fix: Resist the temptation. Write tests first, always. The test-first approach catches design issues.

The False Confidence Trap

When AI generates tests to match existing code, those tests will pass but they're useless. They test what the code does, not what it should do. Write tests first or you lose TDD's main benefit: tests that validate requirements.
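To make the trap concrete, here is a hypothetical example: a buggy implementation, and the test an AI might generate from it. The test passes, but it only enshrines the bug - it was derived from the code's behavior, not from a requirement.

```python
# Hypothetical buggy implementation the AI was shown
def shorten(url: str) -> str:
    return url[:6]  # bug: truncates the URL instead of generating a code


# A test generated FROM that code. It passes, and it's worthless:
# it validates what the code does, not what a short code should be.
def test_shorten():
    assert shorten("https://example.com") == "https:"
```

A test written first ("the result is a 6-character alphanumeric code that round-trips through retrieve") would have rejected this implementation immediately.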

Quick Reference

AI-First TDD Workflow:

  1. Write failing test that expresses desired behavior
  2. Give test to AI with context: "Make this test pass"
  3. Run tests - if they pass, done; if not, clarify test or prompt
  4. Add edge case tests, repeat step 2-3
  5. Refactor via AI, verify tests still pass

Good Test Characteristics:

  - Reads like a specification (Given/When/Then docstrings)
  - Tests behavior through the public interface, never private state
  - Makes assertions specific enough that wrong code cannot pass
  - Covers edge cases: empty, malformed, and boundary-size inputs

Prompt Template for Implementation:

"Implement [class/function] that makes these tests pass.

Context: [brief description of what this does]

Requirements:
- [requirement 1]
- [requirement 2]

Tests:
[paste test code]

Use [language/framework specific guidance] patterns."