The Code You Can't Read

Vibecoding does not create technical debt. It creates comprehension debt. The difference will kill you.

A medical practice in Switzerland fired up a coding agent, fed it a prompt, and shipped a patient management system to the open internet. After thirty minutes of probing, a security researcher had full read and write access to every patient’s records. Audio recordings of appointments were being forwarded to two US-based AI services for transcription. The practice had no idea any of this was happening.

The response to the vulnerability report was generated by AI.

This is not a story about bad actors. The people who built that system were not negligent in any way they could have recognized. They used a coding agent. They shipped working software. They did what every tutorial on the internet told them to do. And they had no more visibility into what they had built than a passenger has into the engine of a 787.

The Old World and the New

In 2001, you could buy FrontPage and publish a website without knowing what HTTP was. The resulting HTML was atrocious, but the failure modes were bounded: your site looked bad, maybe a table collapsed on Netscape 4. The code was wrong, but it was legible. A developer could open the source, understand it, and fix it.

Dreamweaver brought the same model to a wider audience. PHP/MySQL hosting bundles made “build a web app” accessible to anyone who could follow a tutorial. The failure modes were worse: SQL injections, plain-text password storage, XSS holes you could drive a truck through. But again, the code was there. You could hire a developer to read it, audit it, fix it. The debt was visible.

Vibecoding is categorically different. The code it produces is polished, syntactically correct, often structured with apparent sophistication. It uses real frameworks. It follows patterns. It compiles. The surface presentation of professional software, minus the actual professional who wrote it.

The passenger who does not understand the engine has one option when the plane starts shaking: pray. The developer who shipped a patient management system with no authentication layer has the same option when a security researcher sends them a curl payload demonstrating full database access. The difference is that in the second case, the people whose data was exposed also get to pray.

What Comprehension Debt Looks Like

The Go transaction linter that Léon H. built after shipping a transaction bug is the clearest technical illustration of this problem I have seen in months.

The pattern is straightforward in retrospect. A Go service uses a repository pattern with GORM. Database operations that need atomicity get wrapped in a Transaction call that passes a transaction-scoped repository to a callback. The callback receives a parameter tx of type models.Repo and is supposed to use tx for every database operation within the transaction boundary. The outer repository, s.repo, is still in scope.

The bug: a developer calls s.repo.GetUser(...) inside the callback instead of tx.GetUser(...). The read executes on the regular database connection, outside the transaction, so it gets none of the transaction’s isolation guarantees. Under concurrent load, the value it returns can already be stale by the time the write happens. The subsequent write uses tx.SaveUser(...), which commits atomically. The result is a race condition that produces silently corrupt data: transactions that appear to succeed but contain reads from the wrong universe.
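The mistake can be reproduced in a toy form. The sketch below is not Léon's code: it replaces GORM with a hypothetical in-memory store (memRepo, Service, and AddTwice are invented for illustration; only the names GetUser, SaveUser, and Transaction come from the description above). It also swaps the concurrent scenario for a deterministic variant of the same error, in which a read through the outer repository cannot see the transaction's own uncommitted write.

```go
package main

import "fmt"

// User is a hypothetical record; the essay does not show the real model.
type User struct {
	ID      int
	Balance int
}

// Repo mirrors the method names mentioned in the essay.
type Repo interface {
	GetUser(id int) User
	SaveUser(u User)
}

// memRepo is a toy in-memory stand-in for a GORM-backed repository.
type memRepo struct {
	committed map[int]User // data visible to every connection
	pending   map[int]User // uncommitted writes; nil outside a transaction
}

func (r *memRepo) GetUser(id int) User {
	if u, ok := r.pending[id]; ok {
		return u
	}
	return r.committed[id]
}

func (r *memRepo) SaveUser(u User) {
	if r.pending != nil {
		r.pending[u.ID] = u
		return
	}
	r.committed[u.ID] = u
}

// Transaction runs fn with a transaction-scoped repo and commits on success.
func (r *memRepo) Transaction(fn func(tx Repo) error) error {
	tx := &memRepo{committed: r.committed, pending: map[int]User{}}
	if err := fn(tx); err != nil {
		return err // roll back: pending writes are discarded
	}
	for id, u := range tx.pending {
		r.committed[id] = u
	}
	return nil
}

type Service struct{ repo *memRepo }

// AddTwice credits the user twice inside one transaction.
// With useOuter set, the second read goes through s.repo, which is the bug.
func (s *Service) AddTwice(id, amount int, useOuter bool) error {
	return s.repo.Transaction(func(tx Repo) error {
		u := tx.GetUser(id)
		u.Balance += amount
		tx.SaveUser(u)

		var again User
		if useOuter {
			again = s.repo.GetUser(id) // BUG: bypasses the transaction
		} else {
			again = tx.GetUser(id)
		}
		again.Balance += amount
		tx.SaveUser(again)
		return nil
	})
}

// finalBalance starts from a balance of 100 and credits 10 twice.
func finalBalance(useOuter bool) int {
	s := &Service{repo: &memRepo{committed: map[int]User{1: {ID: 1, Balance: 100}}}}
	_ = s.AddTwice(1, 10, useOuter)
	return s.repo.GetUser(1).Balance
}

func main() {
	fmt.Println("using tx:    ", finalBalance(false)) // 120
	fmt.Println("using s.repo:", finalBalance(true))  // 110: one credit is lost
}
```

Both versions return nil and both commit. The only visible difference is the final number, which is why nothing in the call site hints that something went wrong.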

The linter Léon built uses Go’s go/analysis framework to detect this pattern statically. It identifies Transaction calls, tracks the transaction parameter through the callback, recurses into helper functions, and flags any call that touches the outer repository instead of the transaction-scoped one. On first run, it found multiple violations in an otherwise well-tested codebase.
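To give a sense of what such a check involves, here is a much cruder sketch that uses only the standard library's go/ast and go/parser rather than go/analysis. It is not Léon's linter. It assumes a naming convention (a field called repo, a method called Transaction), does no type checking, and does not recurse into helper functions, so it would miss most of what the real tool is described as catching. The function name findOuterRepoCalls and the sample source are invented for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sample is hypothetical input containing one outer-repository call.
const sample = `package svc

func (s *Service) Rename(id int, name string) error {
	return s.repo.Transaction(func(tx Repo) error {
		u := s.repo.GetUser(id)
		u.Name = name
		tx.SaveUser(u)
		return nil
	})
}
`

// findOuterRepoCalls reports calls shaped like x.repo.Method(...) that appear
// inside a function literal passed to a method named Transaction.
func findOuterRepoCalls(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}
	var findings []string
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || sel.Sel.Name != "Transaction" {
			return true
		}
		for _, arg := range call.Args {
			lit, ok := arg.(*ast.FuncLit)
			if !ok {
				continue
			}
			// Walk only the callback body, looking for <expr>.repo.<Method>(...).
			ast.Inspect(lit.Body, func(m ast.Node) bool {
				inner, ok := m.(*ast.CallExpr)
				if !ok {
					return true
				}
				method, ok := inner.Fun.(*ast.SelectorExpr)
				if !ok {
					return true
				}
				if recv, ok := method.X.(*ast.SelectorExpr); ok && recv.Sel.Name == "repo" {
					line := fset.Position(inner.Pos()).Line
					findings = append(findings,
						fmt.Sprintf("line %d: outer repo call %s", line, method.Sel.Name))
				}
				return true
			})
		}
		return true
	})
	return findings, nil
}

func main() {
	findings, err := findOuterRepoCalls(sample)
	if err != nil {
		panic(err)
	}
	for _, f := range findings {
		fmt.Println(f) // line 5: outer repo call GetUser
	}
}
```

Matching on names alone is brittle: rename the field and the check goes silent. A type-aware pass is what lets a tool follow the transaction parameter through helpers, which is presumably why the real linter is built on go/analysis.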

Here is the part that should disturb you: the bug was not caught by code review. It was not caught by tests, because the tests did not exercise the concurrency needed to surface a race condition. In a pull request, the code looks correct. The transaction wrapper is there. The callback structure is correct. The one wrong method call (s.repo instead of tx) is invisible unless you are specifically looking for it, and even then you would have to trace into every helper function the callback touches.

A senior engineer shipping that code was not being careless. They were being human. They reviewed the transaction boundary. They saw the wrapper. They missed the one line that violated it. This is what comprehension debt looks like in a codebase you actually wrote: you are looking at the code and the code is lying to you because you are pattern-matching rather than reading.

Now imagine the code was generated by an AI. The transaction pattern might have been synthesized from several different examples. The specific combination of repository pattern + GORM + transaction callback might not have existed in any single training example. The AI assembled it from fragments. The resulting code might work correctly. Or it might have exactly this kind of subtle boundary violation. The developer reviewing it cannot distinguish between “this looks correct because it is correct” and “this looks correct because I do not know what I am looking at.”

The Privacy Violation You Did Not Know You Committed

Back to the Swiss medical practice. The security researcher who reported the vulnerability listed three distinct violations in his disclosure:

  1. No authentication on a managed database service. Not “weak authentication.” None.
  2. Patient data stored on a US-based server without a Data Processing Agreement, in violation of Swiss nDSG.
  3. Audio recordings forwarded to US-based AI APIs without consent, disclosure, or any legal basis.

Item 3 is the one that should keep every engineering manager awake tonight. The developers who built this system did not set out to violate Swiss privacy law. They set out to automate note-taking. The AI coding agent suggested using transcription APIs. The agent built the integration. No one in the loop understood that appointment recordings were leaving the country, crossing at least one transatlantic link to be processed by third-party AI services with their own data retention policies.
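One way to make this kind of data flow an explicit decision is to force every outbound HTTP call through an allowlist. The sketch below is an assumption about how that could look in Go, not a description of the practice's system: the type names and both hostnames are hypothetical, and the inner transport is a test double that never touches the network.

```go
package main

import (
	"fmt"
	"net/http"
)

// allowlistTransport refuses any outbound request whose host has not been
// explicitly approved. The approved set is a hypothetical example.
type allowlistTransport struct {
	allowed map[string]bool
	next    http.RoundTripper
}

func (t *allowlistTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	host := req.URL.Hostname()
	if !t.allowed[host] {
		return nil, fmt.Errorf("egress blocked: %q is not an approved data processor", host)
	}
	return t.next.RoundTrip(req)
}

// recordingTransport is a test double: it records hosts and returns 200
// without opening a connection.
type recordingTransport struct{ hosts []string }

func (r *recordingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	r.hosts = append(r.hosts, req.URL.Hostname())
	return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Request: req}, nil
}

func tryGet(client *http.Client, url string) error {
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// demo sends one request to an approved host and one to an unapproved host.
func demo() (allowedErr, blockedErr error, reached []string) {
	rec := &recordingTransport{}
	client := &http.Client{Transport: &allowlistTransport{
		allowed: map[string]bool{"records.example.ch": true},
		next:    rec,
	}}
	allowedErr = tryGet(client, "https://records.example.ch/patients")
	blockedErr = tryGet(client, "https://transcribe.example.com/v1/audio")
	return allowedErr, blockedErr, rec.hosts
}

func main() {
	allowedErr, blockedErr, reached := demo()
	fmt.Println("approved host error:  ", allowedErr)
	fmt.Println("unapproved host error:", blockedErr)
	fmt.Println("hosts actually reached:", reached)
}
```

A wrapper like this only covers clients that are built with it, so a generated integration that constructs its own http.Client would bypass it. Network-level egress rules are the stronger control. The point of the in-code version is that adding a new third-party host requires a human to edit a list.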

This is the shape of comprehension debt at scale. The system does not have one bug. The system does things that no one in the organization authorized, designed, or understood. The “developer” did not make these decisions. The developer made a series of prompts. The AI made all the actual decisions. And there is no one at the company whose job it is to know what the system does, because that person’s job was supposed to be done by a developer.

When the report came in, the response was an AI-generated thank-you note with assurances that basic authentication had been added and access keys had been rotated. The researcher noted that the data remained accessible for some time after this response while the “fix” was being implemented. The AI that wrote the response did not know whether the fix had actually been applied.
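A claim that authentication has been added is cheap to verify by sending the same unauthenticated request again. The sketch below shows the idea in-process with net/http/httptest. It is illustrative only: the handler, the path, and the bearer-token scheme are assumptions, since the disclosure does not say what mechanism the practice actually used.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// recordsHandler stands in for the real patient-records endpoint (hypothetical).
func recordsHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "patient records")
	})
}

// requireToken rejects any request that lacks the expected bearer token.
func requireToken(token string, next http.Handler) http.Handler {
	want := []byte("Bearer " + token)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := []byte(r.Header.Get("Authorization"))
		if subtle.ConstantTimeCompare(got, want) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// probe sends one in-process request and reports the status code.
// An empty authHeader means the request carries no credentials.
func probe(h http.Handler, authHeader string) int {
	req := httptest.NewRequest(http.MethodGet, "/patients", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	open := recordsHandler()
	fixed := requireToken("example-token", recordsHandler())
	fmt.Println("before fix, no credentials:", probe(open, ""))
	fmt.Println("after fix, no credentials: ", probe(fixed, ""))
	fmt.Println("after fix, wrong token:    ", probe(fixed, "Bearer guess"))
	fmt.Println("after fix, right token:    ", probe(fixed, "Bearer example-token"))
}
```

The useful habit is the probe, not the middleware: a fix is confirmed when the request that used to succeed is observed to fail, not when someone writes that it has been applied.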

The Testing Gap

Unit tests do not catch comprehension debt. The violations in Léon’s codebase surfaced because the linter flagged them, not because tests failed. The test suite for that code passed. The code behaved correctly in the test environment. Race conditions require concurrent execution, and most test suites exercise one request at a time. This is a fundamental mismatch between how we test and how production systems fail.

Integration tests catch symptoms. They do not catch boundary violations in code you shipped. The only thing that catches comprehension debt is comprehension. And the only path to comprehension is reading the code you shipped, which means you must be able to read the code you shipped.

What Actually Works

Corbin’s “Vibecoding Challenge” on Lobsters is the most intellectually honest treatment of this problem I have encountered. The challenge presents a set of problems, mathematical and programmatic, and scores participants on how effectively they can use AI tools to produce correct answers. The conclusion after several rounds is not flattering to the AI tooling.

The pattern is consistent: AI coding tools are excellent at producing code that solves a problem once you have described the problem precisely. They are poor at producing code that solves the problem you actually have when your description of the problem is wrong or incomplete. They cannot tell the difference between these two cases. The developer who cannot read the code cannot tell the difference either.

The answer is not “stop using AI tools.” The answer is “become someone who can read the code.” This was always good advice. Now it is the only advice that matters.

The Old Joke, Updated

There is a joke that circulated in the PHP community in the mid-2000s: your first PHP site has a SQL injection in it. Your second PHP site has two SQL injections, but you know where they are. By your third PHP site, you have learned to structure your queries properly and the SQL injection is gone, but now you have an XSS vulnerability you did not have before, because you added a feature that reflects user input.

The joke was about the gap between understanding and shipping. You fix one class of bug and discover another. You learn the shape of your own ignorance slowly, over years of production incidents.

Vibecoding compresses this into a single release cycle. The developer ships a complete system that has every class of bug simultaneously: injection vulnerabilities, authentication gaps, data leakage, privacy violations, race conditions. All of them. At once. With no idea any of them are there, because the code was not written by anyone who could have known.

The plane is already in the air. The passenger is already in the seat. The engine is already on fire. The only thing missing is the person who knows how to read the instruments.

What To Do

Audit your own codebases before someone else’s audit, AI-driven or otherwise, finds the problems for you. Audit not for correctness but for comprehension. Open the file that was most recently generated or most recently modified by an AI tool. Read it line by line. Not to review it. To understand it. If you cannot explain what every function does and why it does it that way, you have comprehension debt, and it is accruing at the rate of your next deployment.

This is not a call to abandon AI coding tools. It is a call to remain the engineer. The tools are genuinely useful. The productivity gains are real. The code being produced is often correct, sometimes excellent. But the correct mental model is not “I am directing an army of junior developers.” It is “I have a very fast typist who transcribes my intentions into code, sometimes accurately, and I am responsible for everything they write.”

That is the same responsibility you always had. The difference is that now the typist is also making decisions about architecture, library versions, and integration patterns, and you may not have reviewed any of them.

Read the code. Hire people who read code. Review code you did not write with the same rigor you apply to code you did. Do not let the existence of an AI review step substitute for a human who understands the system.

The medical practice in Switzerland had no one who could have caught what they had built. They will almost certainly catch it now, after the disclosure. But “after the disclosure” is a poor time to discover that your developer was a prompt.


Primary Sources: