A researcher spent $2,283 and a week using Claude Opus 4.6 to build a working remote code execution chain against Google Chrome. The target was not Chrome itself. It was Discord, which still bundles Chrome 138 even though Google had already shipped the fix for CVE-2026-5873 in Chrome 147. The researcher popped calc. The vulnerability was not a zero-day. It was an N-day. That distinction matters more than the exploit.
Anthropic built a cyberweapon with its Mythos model and spent months trying to put the genie back in the bottle. That story was about frontier model safety testing. This story is about what happens when the commercially available model demonstrates the same capability, when the barrier to entry drops to the cost of a week and a few thousand dollars in API credits, and when the actual attack surface has nothing to do with AI capability and everything to do with the structural failure of how software updates propagate through a framework ecosystem.
The chain was not simple. Mohan Pedhapati (s1r1us) at Hacktron described it in detail. It started with CVE-2026-5873, a bounds-check elimination flaw in V8’s Turboshaft compiler. When V8 tiers up a WebAssembly module from the Liftoff compiler to Turboshaft, a specific code pattern involving i64 parameters truncated to i32 causes a bounds check to be eliminated incorrectly. The resulting out-of-bounds (OOB) primitive lets you read and write past the bounds of a Wasm ArrayBuffer. From there, the exploit sprays 64 ArrayBuffers of 64KB each, every one tagged with a unique marker. After warmup-triggered compilation, the OOB primitive scans for those markers to determine the relative offset between Wasm linear memory and the V8 sandbox cage base. You corrupt a victim ArrayBuffer’s backing_store and byte_length to get full 4GB R/W within the sandbox. An addrof primitive leaks compressed pointers. That covers Phases 0 through 7.
Phase 10 is where it gets architectural. Read/write inside the V8 sandbox is still confined to the sandbox, not the rest of the process. To escape to the full 64-bit address space, the exploit uses a use-after-free in the WasmCodePointerTable (WCPT). When a Wasm module imports a JS function, V8 creates a WasmDispatchTable holding a WCPT handle and a shared_ptr controlling the entry’s lifetime. The exploit grows a table to trigger a free of the WCPT entry, then redirects the dispatch handle to an import dispatch table. A forged CanonicalSig lets you reinterpret a Wasm struct pointer as a raw i64. The result is arbitrary read/write across the full 64-bit address space. Phase 11 locates the V8 Isolate, finds pointers to the macOS dyld cache, derives system() from a known printf delta, redirects the WCPT base in the ExternalReferenceTable to a fake table pointing at system(), and calls a Wasm function. The generic JS-to-Wasm wrapper jumps to the fake WCPT entry. Calculator opens.
The entire process consumed 2.3 billion tokens, $2,283 in API costs, and approximately 20 hours of Pedhapati unsticking the model from dead ends. Claude Opus required a human driver throughout. It lost track of failed attempts in long sessions, guessed at offsets instead of verifying them with LLDB, and could not recover from logic loops without redirection. The model is not autonomous. The model is a power tool.
But look at the cost comparison. Google and Discord bug bounty programs pay $10,000 to $15,000 for a working RCE chain. The total cost of AI-assisted development was $6,000 including human labor. The math already breaks in favor of attackers for anyone who can afford an API key and a week of their time. That is not a future concern. That is the present.
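The comparison can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above. Its one assumption, flagged in the comments, is that the $6,000 total is API spend plus labor for the 20 intervention hours; that breakdown is not spelled out here, so the implied hourly rate is illustrative only.

```python
# Figures quoted in this article.
API_COST_USD = 2_283
TOTAL_COST_USD = 6_000            # all-in figure, including human labor
HUMAN_HOURS = 20
BOUNTY_RANGE_USD = (10_000, 15_000)

# ASSUMPTION: total = API spend + labor for the 20 intervention hours.
# The article does not break the $6,000 down, so treat this as a guess.
implied_labor_usd = TOTAL_COST_USD - API_COST_USD
implied_hourly_rate = implied_labor_usd / HUMAN_HOURS

# How far the quoted bounty range sits above the all-in cost.
margins = [bounty - TOTAL_COST_USD for bounty in BOUNTY_RANGE_USD]
ratios = [bounty / TOTAL_COST_USD for bounty in BOUNTY_RANGE_USD]

print(f"implied labor: ${implied_labor_usd:,} (${implied_hourly_rate:.2f}/hour)")
print(f"margin over cost: ${margins[0]:,} to ${margins[1]:,}")
print(f"payout-to-cost ratio: {ratios[0]:.2f}x to {ratios[1]:.2f}x")
```

Under that assumption the labor share comes to $3,717, and even the low end of the quoted bounty range clears the all-in cost by $4,000.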
The real vulnerability is not the code. The real vulnerability is the gap between what Google patched and what end users are actually running.
Discord ships Chrome 138. Google has shipped Chrome 147 with the fix for CVE-2026-5873. That is a nine-major-version lag. Discord has 150 million monthly users. None of them chose to run a vulnerable browser. None of them received a notification that their chat application bundles a browser with a publicly known, publicly patched RCE vulnerability. The patch exists. The fix is in the commit. The N-day is live because Electron does not automatically sync with every Chromium security release.
Electron is not alone. Slack, Notion, VS Code, and dozens of other applications are built on Electron and lag behind Chromium releases by varying intervals. This is not a new problem. The Equifax breach in 2017 was enabled by a known Apache Struts CVE that had been patched months before attackers exploited it. The vulnerability was public. The patch was public. The exploitation was still successful because update propagation is not instantaneous and organizations do not patch instantly.
The browser ecosystem has had a decade to build automated update mechanisms that work. Chrome updates itself silently. Firefox has a rapid release cycle. But Electron applications update on their own schedules, which means every Electron-based application is a browser security surface that updates on the application vendor’s timeline rather than the browser engine vendor’s timeline.
This is the XZ Utils problem at browser scale. The XZ Utils incident was a supply chain attack in which a malicious actor spent years building trust as a contributor before inserting a backdoor. The vulnerability here has the same structural shape: an intermediary layer between the fix and the end user. In XZ Utils, the intermediary was a maintainer who introduced malicious code. In Electron, the intermediary is a release cycle that lags behind upstream security patches by weeks to months. The attacker in both cases benefits from the delay.
Pedhapati’s point about public commits in open-source projects is worth dwelling on. V8 is open source. When Google patches a vulnerability in V8 and commits that fix to the public repository, every line of the patch is visible to anyone with an internet connection. An AI model can analyze that patch, identify the vulnerable code pattern, determine which versions are affected, and construct a working exploit. The patch commit is a starting gun. The question is not whether AI can read patches and build exploits. Claude Opus did it for $2,283. The question is how wide the window is between the commit and the deployment of the fix to end-user software.
The AI capability question is settled. Opus is already there. The question is what the security ecosystem does about the structural conditions that make an N-day viable months after the patch ships. Automated security updates for Electron applications would close most of this gap. An inventory of Electron dependencies with monitoring for Chromium security releases would surface the lag proactively. Vercel’s security incident today, which involved unauthorized access to internal systems and a recommendation that customers rotate environment variables, is a reminder that deployment infrastructure is itself a security surface that compounds across every application running on it.
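The inventory idea is simple enough to sketch. The example below is a minimal illustration, not a real monitoring tool: the `BundledApp` record, the `find_lagging` helper, and the sample inventory are all hypothetical, and the build numbers after the major version are placeholders. In practice the bundled Chromium version would come from each application itself (Electron exposes it as `process.versions.chrome`), and the fixed major version would come from a feed of Chromium security releases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BundledApp:
    """Hypothetical inventory record for one Electron-based application."""
    name: str
    chromium_version: str  # e.g. the value of process.versions.chrome


def chromium_major(version: str) -> int:
    """Extract the major version from a dotted Chromium version string."""
    return int(version.split(".", 1)[0])


def find_lagging(apps, fixed_major):
    """Return (name, major_versions_behind) for apps below the fixed major.

    Results are sorted with the largest lag first.
    """
    lagging = []
    for app in apps:
        lag = fixed_major - chromium_major(app.chromium_version)
        if lag > 0:
            lagging.append((app.name, lag))
    return sorted(lagging, key=lambda pair: pair[1], reverse=True)


# Sample inventory. Only the major versions 138 and 147 come from the
# article; the remaining digits and "example-app" are placeholders.
inventory = [
    BundledApp("discord", "138.0.0.0"),
    BundledApp("example-app", "147.0.0.0"),
]
FIXED_MAJOR = 147  # Chrome release carrying the CVE-2026-5873 fix

for name, lag in find_lagging(inventory, FIXED_MAJOR):
    print(f"{name}: {lag} major versions behind the fix")
```

Comparing only major versions is a deliberate simplification. A real check would also need to account for security fixes backported into older release lines.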
The irony of the Claude Opus exploit chain is that it needed a human driver throughout. The model could not figure out it was stuck. It could not recover from dead ends. It guessed rather than verified. The 20 hours of human intervention suggest that current frontier models are sophisticated tools for experts rather than autonomous weapons. That distinction will not survive another generation of capability improvements. The cost is already dropping. The autonomy is increasing. The question is not whether this class of attack becomes commodity. It is how long the structural lag persists, and whether it forces defenders to close the update propagation gap before the gap itself becomes irrelevant.