Hedge Your Bet on AGI: Why a Hybrid Approach to AI Vibe Coding Just Makes More Sense

Adam Ginsburg
Dec 2, 2025
8 min read

Every week, another headline promises that AI is about to write all our code, ship all our features, and make human developers optional. Depending on who you listen to, we’re anywhere from 3 months to 100 years away from AI doing “essentially all” software development.

As someone who lives in the middle of this storm—building with AI every day, shipping production apps, and running an AI-powered no-code platform—I’m excited about what’s coming.

I’m also… cautious.

This is a contrarian (or maybe just realistic) take: If you’re betting your product on “AI will write all the code”, you’re probably over-exposed. A more pragmatic, safer bet is a hybrid model: let AI do what it’s good at (fast iteration, structure, UX, boilerplate), and let a trusted execution layer handle the parts where correctness, security and scalability actually matter.

Let’s unpack why.

The AGI Hype: What the AI Legends Are Saying

First, it’s worth seeing just how wide the range of predictions is from people at the centre of AI:

Person	When could AI build full business software with no devs?
Demis Hassabis Co-founder & CEO, Google DeepMind	Links true “no-dev” software to AGI, which he puts at ~5–10 years away; expects AI could do most economically valuable work (incl. coding), but doesn’t explicitly say developers go to zero.
Sam Altman CEO, OpenAI	Talks about AGI able to do “most economically valuable jobs” and widespread AI agents by late 2020s. Implies very heavy automation of software work, but still with humans in the loop rather than zero devs.
Dario Amodei Co-founder & CEO, Anthropic	Most aggressive: says in 3–6 months AI could write ~90% of code devs write today and in ~12 months could write “essentially all of the code” in many domains – closest to a true “no-dev coding” claim.
Andrej Karpathy AI researcher; ex-Tesla, ex-OpenAI	“Software 2.0” view: over the next decade+ much logic is learned by neural nets rather than hand-coded. Doesn’t give a date or say devs vanish, but expects fewer traditional coders and more people shaping data/specs.
Geoffrey Hinton Turing Award–winning AI researcher	Warns AI could become smarter than humans and automate many jobs, including programming, within ~20 years. Sees full automation as plausible but focuses more on risk than on a specific “no-dev” date.
Yann LeCun Chief AI Scientist, Meta; Professor, NYU	Strongly skeptical of near-term AGI from current LLMs; says human-level AI is years, if not decades, away and explicitly pushes back on claims that programmers will soon be fully replaced.
Stuart Russell Professor of Computer Science, UC Berkeley	Focuses on “provably beneficial AI”. Accepts that AI could in principle control complex systems but argues that fully autonomous control over critical software should be tightly constrained, not rushed – so “no-dev” is framed as risky, not a goal.
Jensen Huang Founder & CEO, NVIDIA	Says kids “don’t need to learn to code” because AI + natural language will be the main programming interface. Implies many business apps can be built without specialist devs, likely within this decade, while experts still handle lower layers.
Bill Gates Co-founder, Microsoft	Argues that programming won’t be fully replaced even in 100 years. Expects AI to radically change how devs work but not to eliminate developers or deliver a pure “no-dev” world.
Mustafa Suleyman CEO, Microsoft AI; co-founder, DeepMind	Predicts AGI in ~5–10 years, with AI doing most human knowledge work, including coding. Still advocates a “humanist” approach where humans set goals and constraints – heavy automation, but not literally zero human involvement.

So: some say “soon”, others say “not in our lifetime”. But for you, building real products for real customers, the more important question is:

Regardless of when AGI arrives, can we trust AI-generated software in production today?

Where AI Coding Is Already Amazing

We use AI coding tools every day. Across different models and agents, the progress is genuinely incredible:

Core models are getting smarter and more “agentic” They can reason across large codebases, understand patterns, and suggest non-trivial refactors.
Agentic flows can now jump between tasks Tools that can read the spec → generate code → run tests → fix failures → iterate are no longer science fiction.
Benchmark charts look great If you plot model performance on code benchmarks, it’s a steep upward curve. (Insert your favourite benchmark chart here.)

And in the right hands (aka Developers), “vibe coding” with AI—having a back-and-forth conversational flow to generate and refine code—is an absolute productivity cheat code.

But there are still some hidden traps.

The Real Problem Starts Before the Code

People often talk about AI as if the hard part is just “writing good code.” In practice:

Your specs are probably wrong. Most projects don’t fail because the code is syntactically wrong. They fail because the requirements were incomplete, ambiguous, or misunderstood.
The AI’s interpretation is fuzzy. Any vibe coder knows the feeling: you explain what you want, the AI confidently builds something almost right—but subtly wrong. You go two or three rounds, and it’s better, but not quite what you meant.
Alignment between intent and implementation is fragile. There’s no guarantee the AI understood the business rule buried in that one bullet point on slide 17.

Let me share a real example.

A True Story: When the AI “Passed” the Test… by Cheating

We recently added a new, fairly complex, feature and decided to vibe-code it end-to-end using AI.

We did everything “right”:

Started with a solid specification
Included detailed examples and test scenarios
Used multiple models (Claude, OpenAI, etc. via multiple tools Devin.ai, Cursor…)
Let the AI write the code, the tests, and even run the tests

After a few iterations, we had what looked like a fully working, tested solution. All green. The AI even created screenshots of the passing test runs.

Magic, right?

Except… when we reviewed the pull request, we spotted a huge flaw:

The AI had effectively hard-coded to the test scenarios instead of implementing a truly generic solution.

To an untrained eye—or a non-developer relying on AI—that code base would have looked “correct” and “fully tested”. In reality, it was brittle and seriously misleading.

This is where trust becomes the real issue.

Trust Is About More Than Logic

Even if AI writes “correct” logic for a happy path, production-grade software has a lot more moving parts:

Architecture
Scalability
Performance
Security & compliance
Data lifecycle
Monitoring & observability
Upgrade and maintenance strategy

Each of these areas has deep nuances, trade-offs, and long-term implications. It’s not enough that the AI can “get something working.” For business software, you need to know:

Is this system actually doing what it’s meant to do, securely, reliably, and over time?

And that’s before we even get to the sci-fi flavour of “what happens when AI is controlling most of the world’s software systems?”

Even if you’re optimistic about AGI, you probably don’t want to blindly trust a black box to design, implement, and maintain your entire stack.

The Small Gaps Are the Hardest to Close

The reality I’m seeing is:

AI can do an enormous amount already, especially for boilerplate, scaffolding, UI, refactors, and tests.
But the remaining gaps are subtle—corner cases, security edge conditions, compliance nuances, scaling patterns, weird latency spikes, integration edge cases, etc.

These “last mile” details are exactly where trust is earned or lost.

That’s why I’m sceptical that “AI writes 100% of your business app, end-to-end, safely” is right around the corner.

It might come. I hope it does. But in the meantime, there’s a safer, more practical way to leverage AI at full speed—without handing it the keys to everything.

A More Realistic Goal: Hybrid by Design

Here’s the alternative I think is both achievable now and far more robust:

Instead of asking AI to generate all the code for every app, split the problem into two layers:

AI-Generated App Definition A rich, structured metadata model of your application—a super-charged product spec that describes:
- Data model & relationships
- Security rules & permissions
- Business logic & workflows
- UX layout, styling, and component structure
- Integration points and external services
A Set of Trusted, Reusable Code Building Blocks A human-engineered (and AI-assisted) framework or runtime that:
- Implements data access, auth, security, performance patterns, etc.
- Encapsulates best-practice architecture
- Is heavily tested, audited, and reused across many apps

In this setup:

The App Definition is where AI shines—translating intent into structure.
The Trusted Runtime / Framework is where you bake in reliability, security, and scalability once, then reuse it everywhere.

You still get the speed and iteration benefits of AI, but:

You’re not trusting AI to reinvent the entire stack for every project.
You’re constraining the blast radius of AI mistakes to the app definition layer, which is easier to inspect, tweak, and override.

The Maintenance Problem: Where Pure AI Codegen Really Hurts

People often focus on v1 of the app: “Can AI build this from scratch?”

But most of the cost of software is not in building version 1. It’s in:

Upgrades
Security patches
Dependency updates
Feature changes
Refactors
New integrations

If your app is a giant blob of AI-generated, one-off code:

It will rely on dozens of evolving packages and frameworks.
Keeping it secure and up-to-date becomes a continuous, non-trivial effort.
You’ll probably want to use AI to maintain it too—which means repeatedly asking a black box to rewrite large swathes of the repo.

Now imagine doing that across dozens or hundreds of apps. That’s a lot of surface area to trust to a system that can still hallucinate.

In a hybrid model:

Your core code (runtime, infra patterns, auth, security, data access, etc.) is:
- Centralised
- Heavily tested
- Updated once, then reused everywhere
Your App Definitions reference that core layer, so:
- Changes to the core propagate safely across many apps
- You can automate regression tests and validation at the framework level
- Humans can reason about the system because the “moving parts” are structured, not just raw, scattered code

You’re essentially compressing the risk into a narrow, well-understood layer.

Human in the Loop, But No-Code Instead of Raw Code

One important caveat: teaching AI to generate good App Definitions is not easy.

We’ve been working on this for a while at Buzzy, and it’s still not perfect. You still need a human in the loop.

The difference is:

Instead of “human in the loop” meaning editing thousands of lines of brittle code,
It becomes humans tweaking a structured definition or configuration:
- Adjusting a workflow
- Changing a validation rule
- Adding a field or permission
- Re-arranging UI components

That’s dramatically safer and more accessible. A broader group of people—PMs, designers, domain experts—can participate meaningfully without needing to become full-time developers.

So How Should You Hedge Your Bet?

Here’s my pragmatic recommendation if you’re building serious software today:

Use AI aggressively where it’s strong. Scaffolding, boilerplate, documentation, tests, UI, refactors, code review helpers, agents that run and fix tests—lean in hard.
Don’t outsource your entire architecture and runtime to a black box. Keep a human-designed, well-tested core that you understand and can reason about.
Structure your intent. Whether you call it an App Definition, a schema, a DSL, or something else—give AI a structured target to aim at, not just a raw code dump.
Design for maintainability from day one. Assume that any code the AI generates will need to be updated, patched, and audited many times over its life.
Keep humans in the loop—but at the right layer. Let them review semantics, business rules, and high-level architecture—not just chase down off-by-one errors and missing null checks.

Fully trusted, fully automated AI-built software may arrive. I hope it does. But until then, the safer, more realistic path is a hybrid approach that combines:

AI for speed and creativity
Human-designed, reusable infrastructure for trust

That’s the bet I’m hedging.

About the Author

Adam Ginsburg is the CEO and Founder of Buzzy, an AI-powered “vibe no-coding” platform that turns natural-language ideas and/or Figma designs into working web and mobile applications using a hybrid model of AI-generated app definitions on top of trusted, reusable code infrastructure.