Better B

High output

I used to work with this young leetcoder. He was smart and fast, and could whip out code in double-quick time. Sometimes, looking at his PRs, I was in awe: in awe at how a requirement could be followed to the letter and still do the wrong thing 😭 The angles from which his code could be incorrect caught me off guard.

Today's AI coding agents remind me of him: one moment exceeding expectations by one-shotting an entire feature, the next stumbling into the weirdest rabbit hole. We are in the era of high output programmers ^[1] ^[2] ^[3] ^[4] ^[5], so everyone gets to work with my young leetcoder now:

Reading all their code all the time is unsustainable and quickly becomes a bottleneck
Not verifying is even more reckless; what is clearly X to you might be clearly Y to them

We need all the help we can get to verify quickly & sustainably.

Triage

What did they say they’re going to do? If that’s wrong, interrupt and course correct. 1 unit of my time & effort spent; 0 of theirs.

Best time to fix a bug is before it’s written.

AI coding agents are really good at this: explaining their plan and verbalizing what they’re doing. They are basically the model pair programmer interviewees. Puts us human programmers to shame.

Pro-tip is to have them write down their plan in a tasks/{pending|doing|done}-{task}.md file. It leaves good documentation, helps reel them in when they go down test-fix rabbit holes losing the plot, and even lets us switch between AI models & humans easily.

The finishing touch is to have them [x] check off the sub-tasks inside the file as we verify their progress.

Can it compile? If not, short circuit; no need to look at anything else until this is fixed. 1 unit of time & effort spent.

Good type systems in the category of “if it compiles, it works” really shine when wielded by high output coders.

Make bugs into type errors^[6].

The more we can express in the type system, the more we can catch super fast, and the less we need to verify by running code (the last step below; orders of magnitude more time & effort).

Do the db schema, type definition, and type signature changes make sense? If not, short circuit. 1-5 units of time & effort spent. ^[7]

Anything else is more verbose, less precise, and less trustworthy than these gems.

Look ma, no schema? Implicit function arguments? Such ergonomic conveniences now hurt more than help. Just pass the arguments explicitly, and let the type system do the rest.

As Fred Brooks put it, “Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.”

What do the PR description and inline comments say or show? If that’s wrong, short circuit. 5-10 units of time & effort spent.

Again, AI coding agents put most human Pull Request descriptions to shame.

Finding a problem here is often faster than finding a problem from test scenarios.

Are tests passing and do the scenarios cover what’s needed? If not, short circuit. 20-100 units of time & effort.

Humans: are there even tests?

Run the code in our head and/or on a machine. 100-1000 units of time & effort.
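To make the “make bugs into type errors” and “just pass the arguments explicitly” points above concrete, here is a minimal Rust sketch. The domain and every name in it (`UserId`, `OrderId`, `refund_label`) are hypothetical, made up for illustration; the technique is the plain newtype pattern, not anything specific to the linked article.

```rust
// Hypothetical example: all names here are illustrative assumptions.

/// Two ids that are both "just a u64" at runtime, but distinct types at
/// compile time, so one can never be passed where the other is expected.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct UserId(pub u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct OrderId(pub u64);

/// Every input is an explicit, typed argument; nothing is pulled from
/// ambient or implicit state, so the signature tells the reviewer what
/// the function depends on.
pub fn refund_label(user: UserId, order: OrderId) -> String {
    format!("refund order {} for user {}", order.0, user.0)
}

// Swapping the arguments is rejected by the compiler instead of
// becoming a runtime bug:
//
//     refund_label(OrderId(7), UserId(42)); // error[E0308]: mismatched types
```

With plain `u64` arguments, the swapped call would compile and only show up in the most expensive triage step, running the code. With the wrappers, it is caught at the “can it compile?” step.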

Getting from A to B

I know many teams prefer to use a language & framework to get from A to B fast: it is easy to hire for, easy to learn, easy to write & read. That makes sense when you need to staff up quickly, or when your engineers are already familiar with some tech stack.

Moving forward, we are all augmented with new age high output programmers that don’t have these problems: they already know the languages, write them quickly, translate them effortlessly, document them even better. What they need from us is to verify their work quickly & sustainably.

In this new world, are there legitimate reasons to hold on to a codebase

that doesn’t^[8] have sound type checking^[9]?
that cannot^[10] do “parse, don’t validate”^[11] or make illegal states unrepresentable^[12]?
that isn’t memory safe^[13]?
that is memory hungry^[14]?
that is slow^[15]?

Do we prefer something that is good to hire for? Or something that is good to work with?
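For readers who haven't met the two phrases in the questions above, here is a small Rust sketch of “parse, don’t validate” and “make illegal states unrepresentable”. The `Email` and `Account` types, the `describe` function, and the deliberately simplistic email check are all assumptions invented for this example.

```rust
// Hypothetical example: types, names, and the email rule are illustrative.

/// An email address that has already been checked. The field is private,
/// so code outside this module can only obtain an `Email` via `parse`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Email(String);

impl Email {
    /// Parse once at the boundary and return a type that carries the proof.
    /// The rule (non-empty text on both sides of the first '@') is a toy.
    pub fn parse(raw: &str) -> Result<Email, String> {
        let trimmed = raw.trim();
        match trimmed.split_once('@') {
            Some((local, domain)) if !local.is_empty() && !domain.is_empty() => {
                Ok(Email(trimmed.to_string()))
            }
            _ => Err(format!("not an email address: {:?}", raw)),
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Each variant holds exactly the data that is valid for that state.
/// A pending account with a verification time, or a verified account
/// without one, cannot be constructed.
#[derive(Debug, PartialEq, Eq)]
pub enum Account {
    Pending { email: Email },
    Verified { email: Email, verified_at_unix: u64 },
}

/// Downstream code takes the parsed types and never re-checks the string.
pub fn describe(account: &Account) -> String {
    match account {
        Account::Pending { email } => format!("{} (pending)", email.as_str()),
        Account::Verified { email, verified_at_unix } => {
            format!("{} (verified at {})", email.as_str(), verified_at_unix)
        }
    }
}
```

The payoff for review is that the type definitions carry most of the meaning: if the `Account` enum looks right, a whole class of “what if this field is missing?” questions never needs to be asked of the code that uses it.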

Getting to a better B

We are finally truly able to walk the talk of choosing the best tool for the job. I am once again^[16] looking forward to getting to “a good B” fast, instead of only getting from A to B fast.

[1] Anthropic Claude Code

[2] Cursor

[3] Codeium Windsurf

[4] GitHub Copilot - The Agent Awakens

[5] Aider Chat

[6] Make Bugs Into Type Errors

[7] Historically, when my leetcoder gets the wrong idea, this is usually where I really find out.

[8] JavaScript

[9] TypeScript

[10] In Go, the best you can do is make it hard (but never impossible) to produce an invalid value.

[11] Parse Don’t Validate

[12] Make Illegal States Unrepresentable

[13] C++

[14] Java

[15] Ruby

[16] Getting to a Better B