Jun 16, 2026

We scanned 27 vibecoded apps from the outside. Every one had something open.

Probe scanned 27 recently shipped vibecoded apps from the outside. Every one had at least one finding.

We want to be careful before we start: this post is not a dunk on anyone. It's not a breach claim. It's not a census of every AI-built app on the internet.

It's one scan set of 27 recently shipped apps from solo builders and small teams, most of them squarely in vibecoded territory. Probe only looked at public surfaces, the same things any stranger can see from outside the app: response headers, the JavaScript shipped to the browser, obvious admin and debug routes, how the API behaves with no session, what the webhook endpoints accept, and the usual framework leftovers that survive a fast build.

The result was clean enough to write down: 27 out of 27 had at least one finding. That's 55 issues in total: 8 critical, 18 high, and 29 medium.

What got our attention was how repetitive the shape was across completely unrelated products. The same handful of security gaps showed up over and over:

missing browser security headers
permissive CORS on sensitive endpoints
secret-like values sitting in browser JavaScript
debug and admin routes answering in production
LLM endpoints reachable without auth or any rate limit
webhooks with no visible signature check
source maps served straight off production

That repetition is the whole story. These aren't 27 individual mistakes. They're one blind spot showing up 27 times. The app looks finished, onboarding works, the Stripe button works, the demo feels good, and all the while the deployed surface is quietly saying things the builder never meant to say.

One app had what looked like a server-side credential sitting in the JavaScript it shipped to browsers. Probe redacted the value and we aren't naming the app, because the builder's identity isn't the lesson. The lesson is that browser JavaScript is public. If a paid API key, an email key, a service-role key, or any internal token lands there, treat it as already exposed, because it is.
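Here's a minimal sketch of what "look for secret-like values in browser JavaScript" can mean in practice. It is not how Probe works internally. The three patterns are illustrative assumptions based on well-known public key formats, and the function names are ours. Like the scan above, it redacts whatever it finds.

```python
import re

# Illustrative patterns only (an assumption, not Probe's rule set).
# Real scanners use far larger and more careful rules.
SECRET_PATTERNS = {
    "stripe_live_secret_key": re.compile(r"sk_live_[0-9a-zA-Z]{16,}"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def redact(value: str, keep: int = 8) -> str:
    """Keep a short prefix so the finding is recognizable, hide the rest."""
    return value[:keep] + "..." if len(value) > keep else "..."


def find_secret_like_values(js_source: str) -> list[tuple[str, str]]:
    """Return (pattern_name, redacted_match) pairs found in bundle text."""
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(js_source):
            findings.append((name, redact(match.group(0))))
    return findings
```

Run it over the text of every script your deployed pages load, not over your repo. A match is a reason to look, not proof of a leak, and a clean result proves nothing about formats the patterns don't cover.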

The admin and debug routes were a softer version of the same problem. A route called admin or debug answering in production doesn't automatically mean someone can take over the app. Sometimes it's a harmless frontend fallback. But it's still worth closing, because a public admin-shaped route is a magnet. It tells a stranger where to start poking, and it tells the builder that production routing needs its own review, separate from whether the product works.
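One way to separate the harmless frontend fallback from a route that really answers is to compare it against a path that can't exist. The sketch below assumes you have already fetched both responses with no session. The labels and the function names are hypothetical, and the logic is deliberately coarse.

```python
import hashlib


def body_fingerprint(body: bytes) -> str:
    """Hash a response body so two responses can be compared cheaply."""
    return hashlib.sha256(body).hexdigest()


def classify_route(status: int, body: bytes, fallback_body: bytes) -> str:
    """Classify one unauthenticated response to an admin-shaped path.

    fallback_body is what the same host returns for a path that cannot
    exist (for example a random string), which is how a single-page app's
    catch-all route shows up.
    """
    if status in (401, 403):
        return "guarded"
    if status == 404:
        return "absent"
    if 200 <= status < 300:
        if body_fingerprint(body) == body_fingerprint(fallback_body):
            return "frontend-fallback"
        return "answers-in-production"
    # Redirects, 5xx, and anything else need a human to look.
    return "inconclusive"
```

Only "answers-in-production" is worth chasing right away, and even then it means "look at this", not "this is exploitable". Pages that embed a per-request nonce or timestamp will defeat the exact-hash comparison, so treat a mismatch as a prompt for a manual check.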

The LLM findings were the most vibecoded of the set. Probe flagged endpoints that looked callable without auth, and others where hammering the same endpoint never hit a rate limit. That's not a claim that anyone lost money or leaked data. It's the more boring point, which is usually the more useful one. If an endpoint can trigger paid model work, the builder should be able to answer three things before sending it traffic. Who can call it. How many times. And what stops one bug or one bored person on the internet from turning it into a bill.
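Those three questions map onto a small gate that runs before any paid model call. This is a sketch under assumptions: an in-memory sliding-window limiter keyed by a user id, with hypothetical names throughout. A real deployment with more than one server process would need shared storage for the counters.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Optional


class SlidingWindowLimiter:
    """Allow at most max_calls per caller within window_seconds (in-memory)."""

    def __init__(self, max_calls: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock
        self._calls = defaultdict(deque)  # caller_id -> recent call timestamps

    def allow(self, caller_id: str) -> bool:
        now = self.clock()
        calls = self._calls[caller_id]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True


def guard_llm_request(user_id: Optional[str], limiter: SlidingWindowLimiter) -> int:
    """Return the HTTP status to send before any paid model call is made."""
    if user_id is None:
        return 401  # who can call it: only authenticated callers
    if not limiter.allow(user_id):
        return 429  # how many times: capped per caller
    return 200
```

The per-caller cap answers the bored-person case. It doesn't answer the bug case on its own, so a global spend cap at the model provider is still worth setting alongside it.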

The CORS findings sat in the same bucket. CORS is easy to wave off because the app still loads. It's also easy to set too broadly while moving fast. A sensitive endpoint answering with wide-open cross-origin access is worth a second look. Not panic, just a look.
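A second look can be as small as sending one request with a made-up Origin header and reading what comes back. The sketch below covers only the reading part. The finding labels and the example origin are our assumptions, and it takes the response headers as a plain mapping.

```python
from typing import Mapping, Optional


def cors_finding(response_headers: Mapping[str, str],
                 probe_origin: str) -> Optional[str]:
    """Inspect the response to a request sent with an untrusted Origin.

    probe_origin is the made-up origin the request carried, for example
    "https://untrusted.example". Returns a short finding, or None.
    """
    headers = {k.lower(): v.strip() for k, v in response_headers.items()}
    allow_origin = headers.get("access-control-allow-origin")
    allow_credentials = (
        headers.get("access-control-allow-credentials", "").lower() == "true"
    )

    if allow_origin is None:
        return None  # no cross-origin access granted
    if allow_origin == probe_origin and allow_credentials:
        return "reflects arbitrary origin with credentials"
    if allow_origin == probe_origin:
        return "reflects arbitrary origin"
    if allow_origin == "*":
        return "wildcard origin"
    return None
```

The first case is the one to fix first. Browsers refuse to pair a wildcard origin with credentialed requests, so an endpoint that echoes back whatever origin it's given and also allows credentials is the more serious pattern on a sensitive route.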

The most common finding was the least dramatic. Missing security headers, 25 times. Nobody signs up because your headers are tasteful, and headers will never improve a demo. But missing them is a reliable sign that nobody has done the boring outside-in review yet. They're usually the first loose thread, not the whole sweater.
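Checking for that first loose thread takes a few lines. This sketch compares a response's headers against a baseline list. The list is our assumption of a common baseline, since this post doesn't enumerate which headers Probe expects.

```python
from typing import Mapping

# An assumed baseline; adjust to your own policy.
EXPECTED_HEADERS = (
    "Content-Security-Policy",
    "Strict-Transport-Security",
    "X-Content-Type-Options",
    "X-Frame-Options",
    "Referrer-Policy",
)


def missing_security_headers(response_headers: Mapping[str, str]) -> list[str]:
    """Return expected headers absent from a response (case-insensitive)."""
    present = {name.lower() for name in response_headers}
    missing = [h for h in EXPECTED_HEADERS if h.lower() not in present]
    # A CSP with a frame-ancestors directive covers what X-Frame-Options does.
    csp = next((value for name, value in response_headers.items()
                if name.lower() == "content-security-policy"), "")
    if "frame-ancestors" in csp.lower() and "X-Frame-Options" in missing:
        missing.remove("X-Frame-Options")
    return missing
```

Feed it the headers from your production URL, not localhost, because hosting platforms and CDNs add and strip headers on the way out. Presence is all it checks. A header that's set to a weak value still passes.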

That's why we think AI coding moves where the review happens more than it removes the need for one. Cursor, Claude Code, Bolt, Lovable, and Replit are genuinely good at getting you from idea to working software. They don't get you from working software to reviewed software, because they work from your code and your prompt. They don't go look at what actually deployed. They won't tell you which JavaScript went public, which routes respond, which headers are missing, which endpoints answer with no session, or which defaults survived the sprint. That last mile is the one place the tools that built the app can't see, and it's where these mistakes keep hiding.

So the fix isn't to stop using AI tools. That would be silly. The fix is to stop treating "the demo works" as the finish line. Before you put real users, real payments, real uploads, or real API spend behind an AI-built app, run one outside-in pass and check that:

paid API calls happen server-side
no secret-like values are in your browser JavaScript
admin and debug routes aren't open in production
auth actually guards your LLM, billing, upload, and webhook endpoints
anything costing money has a rate limit
webhooks verify signatures
CORS is tight on sensitive routes
your security headers are there

And if you're on Supabase, Firebase, or object storage, check that your storage and database policies aren't wide open.
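Of the items on that checklist, "webhooks verify signatures" is the one that most often gets left as a TODO, so here's the general shape. This is a generic HMAC-SHA256 sketch with hypothetical function names, assuming the sender puts a hex digest of the raw body in a header. Real providers define their own header format and often sign a timestamp too, so follow their documentation or use their official library.

```python
import hashlib
import hmac


def sign(payload: bytes, secret: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_webhook(payload: bytes, signature_header: str, secret: bytes) -> bool:
    """Return True only if the header matches the body's expected signature.

    payload must be the raw bytes as received, before any JSON parsing,
    because re-serializing the body can change it.
    """
    expected = sign(payload, secret).encode("ascii")
    received = signature_header.strip().lower().encode("utf-8")
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, received)
```

Reject the request before doing any work if this returns False. An endpoint that parses the body, updates a record, and only then checks the signature has the check in the wrong place.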

None of this is glamorous and almost none of it improves the demo, which is exactly why it gets skipped. That gap, the stretch right after an AI-built app works but before strangers start stress-testing everything you forgot to check, is the reason Probe exists.

Run a scan with Probe. Better to find the boring stuff before the internet does.