Living document

The Working-with-AI Playbook.

The depth behind Day 18. Eight sections on how to actually collaborate with AI tools as a builder. Skim once now; come back to specific sections as you hit specific friction.

Day 18 names the moves. This is the depth. It is a living document, updated as we learn what actually works in 2026's AI workflows.

If you are reading this without having taken the course, the short version of who this is for: you are someone building real software with AI tools as your collaborator, you are not a professional software engineer, and you have run into the friction that makes the difference between "got a demo working" and "actually shipped something I trust." The Playbook is the operational manual for that friction.

1. Context engineering, in depth

Bad output from AI is almost always a context problem, not a prompt problem.

Most people, when AI produces something they do not like, react by rewriting the prompt. That is the wrong instinct. The right instinct is to ask, "what context would have made this question answerable?" Then provide that context.

There are three layers of context to think about.

Project-level context (the file)

Most modern AI tools support a project-level context file:

Claude Code: a CLAUDE.md at the root of your project.
Cursor: a .cursorrules file (the older, single-file format) or rule files in .cursor/rules/*.mdc for more detail.
Aider: a CONVENTIONS.md or similar.
Chat tools: a custom system instruction or saved "project" with files attached.

The shape of a great project context file:

What we are building (one paragraph). Goal, who it is for, what success looks like.
Voice and conventions. How you want code structured, how files should be named, what patterns you use, what patterns you do not. The constraints the AI cannot guess.
Examples and anti-examples. "Components look like this." "We do not use class components." Show, do not just tell.
Stack and dependencies. What tools and libraries are in play. Versions if they matter.
What has already been tried that does not work. This one is the most under-used and the highest-leverage. The AI does not have a memory of your past attempts; the file is the memory.
What is out of scope. What you do not want the AI to touch or refactor without asking.

A good CLAUDE.md is 100 to 400 lines. Shorter than that and you are leaving leverage on the table. Longer than that and the model stops reading it carefully.
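If you want a quick sanity check on your own file, here is a minimal sketch in Python. It assumes the 100 to 400 line range above and a hypothetical list of topic keywords loosely based on the checklist; the function name and keywords are illustrative, not part of any tool.

```python
from pathlib import Path

# Hypothetical keywords, one per item in the checklist above. Adjust to your own headings.
EXPECTED_TOPICS = ["building", "conventions", "examples", "stack", "tried", "out of scope"]


def check_context_file(text: str, min_lines: int = 100, max_lines: int = 400) -> list[str]:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    warnings = []
    line_count = len(text.splitlines())
    if line_count < min_lines:
        warnings.append(f"{line_count} lines: shorter than {min_lines}, likely missing context")
    elif line_count > max_lines:
        warnings.append(f"{line_count} lines: longer than {max_lines}, consider trimming")

    lowered = text.lower()
    for topic in EXPECTED_TOPICS:
        # Crude substring check: it only tells you a topic is never mentioned.
        if topic not in lowered:
            warnings.append(f"no mention of '{topic}'")
    return warnings


if __name__ == "__main__":
    path = Path("CLAUDE.md")  # assumed location: the project root
    if path.exists():
        for warning in check_context_file(path.read_text(encoding="utf-8")):
            print(warning)
```

A keyword check cannot tell you whether the file is good. It only catches the case where a whole category of context was never written down.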

Task-level context (what you bring to the conversation)

When you ask the AI to do a specific thing, bring with you:

The file or files most relevant to the task.
An example of a similar thing done well (in your project or elsewhere).
An anti-example if one exists ("not like this, the previous attempt did this and it broke X").
Any constraints specific to this task that are not already in the project context.

"Like this one but different in these ways" beats "make a new one" every time.

Living context (the maintenance part)

The file you wrote on day one is not the file you should still be using on day thirty. As you discover what works and what does not, update the context. The most-skipped maintenance task in any AI workflow is keeping the context current.

A useful rhythm: every few sessions, review what kept going wrong and add a line to the context file that prevents it next time.

2. Tool selection, in depth

There is no single best AI tool. Each surface is good at different things. Picking the wrong surface for the task is the source of about half the friction.

A rough map:

Surface	Best for	Avoid for
Chat (Claude.ai, ChatGPT)	Exploring, learning, one-shot questions, thinking out loud	Anything that needs to touch a real codebase, anything iterative across files
Editor-integrated (Cursor, Copilot)	Line-by-line editing in a real codebase, refactoring within a file	Multi-file architectural changes, long-running tasks
Terminal AI (Claude Code)	Multi-file refactors, broader changes that span the project, tasks that need to read AND write files	Quick conceptual questions, exploring a new idea
Agentic / Cowork	Long-running autonomous tasks where you can step away, batch work	Tasks where you want to be in the loop, anything subtle

A useful heuristic: pick the surface that gives you the right level of human-in-the-loop. Too autonomous and you lose visibility into what's happening. Too hands-on and you lose the speed advantage.

Switch surfaces deliberately, not by accident. If you find yourself fighting a tool, ask "would a different surface make this easier?" before blaming the model.

3. The iteration loop, in depth

AI work is iterative. The first output is rarely the final output. Productive iteration follows three principles.

Specific feedback beats vague feedback

"Make this better" is wasted breath. The AI does not know what better means without context.

Useful feedback patterns:

The before/after pattern. "This output does X. I want it to do Y instead."
The reason pattern. "The naming is wrong because in our project this concept is always called Z."
The constraint pattern. "Whatever you propose has to also work with [existing constraint]."
The example pattern. "More like [other file], less like [this file]."

Be the editor, not the cheerleader. The AI will produce what you ask for; if you ask vaguely it produces vaguely.

Verify before you trust

The most important habit in working with AI is to actually check the output before believing it.

Run the code. Does it actually do what was claimed?
Read the diff. Did the change touch things you did not expect?
Look at the actual output. Not the AI's summary of the output, the output itself.
Check the references. If the AI mentions a function or library, confirm it exists.

This is the single biggest difference between people who ship with AI and people who ship bugs with AI.
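The "check the references" habit can be partly automated. The sketch below assumes a Python project with the library already installed; reference_exists is a hypothetical helper name. It only confirms that a name exists, not that it behaves the way the AI claimed.

```python
import importlib


def reference_exists(module_name: str, attribute_path: str = "") -> bool:
    """Return True if the module imports and the dotted attribute path resolves on it."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False
    parts = attribute_path.split(".") if attribute_path else []
    for part in parts:
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


print(reference_exists("json", "dumps"))       # True: a real standard-library function
print(reference_exists("json", "to_yaml"))     # False: plausible-sounding, but invented
print(reference_exists("not_a_real_pkg_xyz"))  # False: the module itself does not exist
```

Note that importing a module runs its top-level code, so use this only on packages you already trust enough to install.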

Push back vs reset vs do it yourself

When something is not working, three options:

Push back. Tell the AI specifically what is wrong with the current output and ask it to try again. Use this when you can articulate exactly what is off.
Reset. Start a fresh conversation with the right context. Use this when the conversation has gotten tangled or the AI is anchored on a wrong assumption.
Do it yourself. Open the file, make the change, move on. Use this when the task is small and the AI is fighting you on something simple.

The mistake is to push back forty-seven times when you should have reset on attempt three.

4. The kill rule, in depth

If you have asked for the same thing five times and it still is not right, the AI is not going to get there on attempt six.

Signs you are in the death spiral:

The AI keeps "fixing" the thing in different but equally wrong ways.
Each iteration introduces a new bug while supposedly fixing the old one.
You are repeating the same correction in different words.
You feel personally annoyed at the AI.

The reset move:

End the current conversation.
Re-read your project context file. Is something missing that would prevent this confusion?
Start a new conversation with the updated context.
Reframe the task. "Here is what I want, here is the constraint, here is an example."

If three resets in a row don't help, do it yourself. The AI is not the bottleneck on every task.

5. Failure modes catalog

Six patterns to learn to recognize. Each one calls for a different response.

Hallucination

The AI invents libraries, syntax, functions, or APIs that do not exist. Especially common with newer technologies or libraries the model has not seen much of.

Spot it: the code references something you cannot find in the actual library docs, the function call signature looks plausible but throws "is not a function," the import path is wrong.

Fix: bring the actual library docs (or relevant source files) into the context. Ask the AI to verify the API surface before writing the code.

AI slop

Verbose, hedging, generic output. The text sounds correct but commits to nothing. Lots of "you might want to consider" and "it depends on your specific needs."

Spot it: if you removed the AI output, would your decision be the same? If yes, the output had no opinion.

Fix: ask for a specific recommendation. "Pick one. If it is wrong, I will tell you, but pick one."

Mock implementations

TODO comments, placeholder data, fake responses. The thing "works" in a demo and falls over in production.

Spot it: search the diff for TODO, FIXME, placeholder, mock, example. Look for hardcoded values that should come from data.

Fix: ask explicitly for a real implementation. If the AI says it cannot, ask why; usually a missing piece of context is the answer.
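The diff search described above is easy to script. This is a rough sketch that assumes a unified diff (the format git diff produces) and uses the marker words listed above; the function name is hypothetical. It matches substrings, so it errs on the side of false positives.

```python
import re

# Marker words from the list above; case-insensitive substring match.
MARKERS = re.compile(r"TODO|FIXME|placeholder|mock|example", re.IGNORECASE)


def find_placeholder_lines(diff_text: str) -> list[tuple[int, str]]:
    """Return (line number within the diff, text) for added lines containing a marker."""
    hits = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Added lines start with "+"; the "+++" file header is not an added line.
        is_added = line.startswith("+") and not line.startswith("+++")
        if is_added and MARKERS.search(line):
            hits.append((number, line[1:].strip()))
    return hits


sample_diff = """\
--- a/api.py
+++ b/api.py
@@ -1,2 +1,4 @@
 def get_user(user_id):
-    return db.fetch(user_id)
+    # TODO: call the real database
+    return {"id": user_id, "name": "Example User"}
"""

for number, text in find_placeholder_lines(sample_diff):
    print(number, text)
```

On the sample, both added lines are flagged: one for the TODO, one for the hardcoded "Example User". Hardcoded values without a marker word still need a human read.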

Over-engineering

Abstractions you did not ask for. Three classes where one function would have done. Premature flexibility.

Spot it: the output is longer than the problem warranted. Names like BaseAbstractFactory show up.

Fix: state the principle in the project context: "default to inline, simple, direct code. Add abstraction only when there is a second use site."

Scope creep

The AI does more than you asked. You asked for a bug fix, you got a refactor of three other files.

Spot it: the diff has more files than the task warranted.

Fix: be explicit about scope. "Only change file X. Do not touch anything else."
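You can also check scope mechanically after the fact. The sketch below assumes a git-style unified diff with "--- a/" and "+++ b/" file headers; the function names and file paths are hypothetical. It lists every file the diff touches that you did not explicitly allow.

```python
def files_in_diff(diff_text: str) -> set[str]:
    """Collect paths from the '--- a/<path>' and '+++ b/<path>' headers of a git-style diff."""
    paths = set()
    for line in diff_text.splitlines():
        for prefix in ("--- a/", "+++ b/"):
            # Assumption: real content lines rarely start with these exact prefixes.
            if line.startswith(prefix):
                paths.add(line[len(prefix):].strip())
    return paths


def out_of_scope_files(diff_text: str, allowed: set[str]) -> set[str]:
    """Return the touched files that are not in the allowed set."""
    return files_in_diff(diff_text) - allowed


sample = """\
--- a/src/cart.py
+++ b/src/cart.py
@@ -1 +1 @@
-total = price * qty
+total = price * quantity
--- a/src/utils.py
+++ b/src/utils.py
@@ -1 +1 @@
-import os
+import os, sys
"""

print(sorted(out_of_scope_files(sample, allowed={"src/cart.py"})))  # ['src/utils.py']
```

An empty result means the change stayed inside the files you named. It says nothing about whether the change inside those files stayed on task, so reading the diff is still required.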

Confidently wrong

Looks right. Reads right. Is wrong. The hardest failure mode to catch.

Spot it: you can only spot this by running the thing and verifying. There is no shortcut.

Fix: the verify-before-trust habit. Build the muscle. Pay the tax every time.

6. Verification practices

Three habits to make automatic.

Run the thing. Whatever the AI built, execute it. Click the button, hit the endpoint, render the page. "It should work" is not "it works."

Read the diff. Before accepting any AI change, look at the actual file diffs. The AI's summary of what it changed is not always accurate.

Don't ship code you have not read. This is the operational expression of the verify-before-trust habit. If you could not comfortably explain what every line does, do not deploy it.

7. Cost awareness

AI is not free. API calls in production add up fast, especially if the AI is in a user-facing loop.

The patterns that bite:

A user-facing AI feature with no per-user cap, and one user (or a bot) hammers it.
A background job that calls the model in a loop with no exit condition.
Long context windows on every call when you only need short ones.
Premium models for tasks that smaller models handle fine.

The fixes:

Set a per-user usage cap (requests per day, tokens per day, or dollars per day).
Set a budget alert at your AI provider. Day 12's billing alert applies here.
Use the smallest model that does the job. Reserve the premium models for tasks that need them.
Watch the cost dashboard the first few weeks after shipping; you will be surprised.
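A per-user cap does not need much code. Here is a minimal sketch of a requests-per-day cap, assuming a single server process and in-memory counts; DailyUsageCap is a hypothetical name. A real deployment would more likely keep the counts in a database or cache so they survive restarts and work across multiple servers.

```python
from collections import defaultdict
from datetime import date
from typing import Optional


class DailyUsageCap:
    """In-memory per-user request counter that resets when the calendar day changes."""

    def __init__(self, max_requests_per_day: int):
        self.max_requests_per_day = max_requests_per_day
        self._day = date.today()
        self._counts = defaultdict(int)

    def allow(self, user_id: str, today: Optional[date] = None) -> bool:
        """Record one request and return True, or return False if the user is over the cap."""
        today = today or date.today()
        if today != self._day:
            # New day: forget yesterday's counts.
            self._day = today
            self._counts.clear()
        if self._counts[user_id] >= self.max_requests_per_day:
            return False
        self._counts[user_id] += 1
        return True


cap = DailyUsageCap(max_requests_per_day=2)
print([cap.allow("alice") for _ in range(3)])  # third request is refused
print(cap.allow("bob"))                        # other users are unaffected
```

Call allow() before every model call and return a friendly "limit reached" message when it is False. The same shape works for tokens or dollars per day: add the cost of each call to the count instead of adding 1.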

8. When NOT to use AI

Not every task is an AI task. The places AI is the wrong tool:

Math and deterministic lookups. Use code, not a model. AI will get arithmetic wrong in surprising ways.
Anywhere precision is mandatory. Search bars that need exact matches, financial calculations, currency conversions. Use the actual tool, not a model approximation of it.
Anywhere the cost of a confidently-wrong answer is high. Legal advice, medical diagnosis, financial decisions. AI can assist a qualified human; it should not be the final word.
Trivial tasks where the AI overhead costs more than the task. Renaming a single variable, fixing a typo. Just do it.
Anywhere the user expects "the system" to be authoritative. If the user assumes your output is correct because it came from your product, you are now responsible for being correct, and AI is not reliable enough to bear that responsibility alone.
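To make "use code, not a model" concrete for the currency case, here is a small sketch using Python's standard decimal module. The amount, the rate, and the convert helper are made up for illustration; the point is that the result is exact and repeatable, which a model's answer is not.

```python
from decimal import Decimal, ROUND_HALF_UP


def convert(amount: str, rate: str) -> Decimal:
    """Multiply amount by rate exactly, then round to two decimal places (half up)."""
    exact = Decimal(amount) * Decimal(rate)
    return exact.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Binary floats cannot represent 0.1 exactly, so even plain code needs the right type.
print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True

# Hypothetical amount and exchange rate, passed as strings to avoid float rounding.
print(convert("19.99", "1.0850"))                         # 21.69
```

The rounding mode here is an assumption; check which rule your domain actually requires before using it for real money.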

Knowing when AI is the wrong tool is itself a meta-skill, and it is the one that most distinguishes builders who use AI well from builders who use it everywhere.

This page grows. Most recent additions are listed at the top. If you have a pattern that should be in here, mention it and it goes in.