
Why agents are actual game changers

ICYMI: AI isn't bad. You're just using it wrong.

I don't mean that as an insult. I mean it literally — the way most people use AI makes it the wrong tool for the job, and it's not obvious why until you see the alternative.

Here's what's actually happening when you use ChatGPT or Claude in a normal chat window:

You type something. It predicts the most statistically likely next tokens. It outputs text. Done.

No verification. No feedback. No way of knowing if what it just told you is right.

That's it. That's the whole thing. A very sophisticated autocomplete.

And when you ask that autocomplete "when's the cheapest time to fly to Lisbon?" — it doesn't search flights. It doesn't check prices. It recalls patterns from training data and generates text that sounds like an answer. It has no mechanism to know if it's wrong.

(Yes, newer versions of these tools can do things like search the web — and that's genuinely useful. But if you tried ChatGPT a year or two ago and walked away unimpressed, this is what you were dealing with. And even with web search bolted on, the core problem remains: it still can't verify its own work.)

So yeah, of course it hallucinates. Of course it confidently gives you the wrong answer. It has no feedback loop. It can't course-correct.


Now let me show you what changes when you add one.

I build with AI constantly. A while back I was testing something: I asked Claude to build me a small web app, one-shot, no tools, just a prompt.

"Here's your finished site!"

Opened it. Errors everywhere. Blank screen.

Same task, different setup: I gave it a browser tool — the ability to actually open the page and see what's on screen. I gave it the ability to run tests.

This time it didn't finish in 10 seconds. It spent time. It built something, opened it, saw the error, fixed it, checked again. Iterated.

It shipped something that worked.

The code wasn't the constraint. The feedback loop was.


This is what "agentic AI" actually means. Not AI that's smarter. Not a better model. Just AI with a loop:

Do → Check → Correct → Repeat
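That loop is simple enough to sketch in a few lines. This is a toy illustration, not any real framework: `flaky_propose` is a stub standing in for a model call, and in a real agent `check` would run tests or open the page in a browser.

```python
def check(answer, target):
    """Return (ok, feedback). The feedback is what makes this a loop."""
    if answer == target:
        return True, "looks good"
    return False, f"expected {target!r}, got {answer!r}"

def agent_loop(propose, target, max_iters=5):
    feedback = None
    for attempt in range(1, max_iters + 1):
        answer = propose(feedback)            # Do
        ok, feedback = check(answer, target)  # Check
        if ok:
            return answer, attempt
        # Correct: feedback flows into the next attempt -> Repeat
    return None, max_iters

# Toy stand-in for a model: improves once it has seen feedback twice.
def flaky_propose(feedback, state={"n": 0}):
    state["n"] += 1
    return "fixed" if state["n"] >= 3 else f"broken-{state['n']}"

result, attempts = agent_loop(flaky_propose, target="fixed")
```

The whole point is in the `feedback` variable: without it, the first wrong answer is also the final answer.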

When you give an AI agent the tools to verify its own output — to know when it's wrong, and by how much — you stop getting autocomplete and start getting something that can actually do work.

Flight search. Postage calculations. Code that runs. The examples don't matter. What matters is: can it check itself?

If yes: game changer. If no: fancy autocomplete.


This is what most devs and founders I talk to are missing.

They've tried Copilot. They've pasted code into ChatGPT. They've walked away thinking "it's impressive but I can't trust it."

They're right — they can't trust a token generator with no feedback loop. But they've written off the wrong thing.

The agentic version of this isn't a different vibe. It's a different category of tool.


If you're a dev: stop asking AI to one-shot things. Give it unit tests. Give it a browser. Let it be wrong and fix itself.
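Concretely, "give it unit tests" just means wrapping your check in something that returns a pass/fail plus the error text. A minimal sketch — the `pytest` command in the comment is the assumed real-world case; the inline check here is only so the example runs anywhere:

```python
import subprocess
import sys

def run_check(cmd):
    """Run a verification command; return (ok, feedback_text)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

# In a real project this might be ["pytest", "tests/", "-x"] (assumed).
# Here: a deliberately failing inline check, so we can see the feedback.
ok, feedback = run_check(
    [sys.executable, "-c", "assert 1 + 1 == 3, 'math is broken'"]
)
```

Whatever lands in `feedback` goes back into the model's next attempt. That text is the loop.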

If you're a founder: the thing you imagine AI doing for you — autonomously, correctly, at scale — that version exists. It just requires the loop.

Start there.


