I don't write code. I've never written a for-loop in my life. I don't even know what a for-loop is. And yet, for the past few weeks, I've been shipping new features to a Laravel application with 184 tests, a thing called Livewire, domain-driven architecture, and Blade files that exist for god knows what reason.
This post is not trying to be yet another opinion piece about the future of engineering or the jobs that will disappear. It contains my thoughts on creating software as a non-technical product owner, but mostly it describes the setup I use to do it. I know what needs to be built, but can I actually build it without borrowing the brains of my human counterparts?
Some context
The product we're talking about is Take Five, a weekly trivia game for remote or hybrid teams. Think of it as the love child of pub trivia and team building, but asynchronous, so teams across time zones can actually play together. Seasons, duels, sabotages, streaks, and social questions about your colleagues. It's a proper game with proper mechanics.
A couple of months ago, I was still copy-pasting prompts into ChatGPT. Now I’m building with a team of agents. It still generates files I don’t fully understand, which hasn't changed. But since Opus 4.6 came out, it has become reliable enough to trust inside a structured workflow. I’m shipping features, running tests, managing branches, and keeping documentation alive. The whole pipeline, alllll by myyyyyyseeeelfff.
The setup
My current development partner is Craft Agent, powered by Claude Code. It lives in a desktop app on my Mac, connected directly to the Take Five codebase locally. When I say connected, I mean it reads every file, understands the architecture, knows the domain patterns, and follows the conventions that my engineering friends set up. It's less autocomplete on steroids and more like a junior developer who never forgets the codebase conventions and never gets tired.
I use the GitHub Desktop app to keep track of changes locally and run the app on localhost for manual testing. I suppose Docker has something to do with this. I still have everything open in VS Code as well, but I stopped staring at diffs or merge conflicts I don't understand anyway. Instead, I describe what I want in plain language in either Craft Agent or the Claude Desktop app (the Code tab). Not "add a column to the questions table." More like "players should see how many times each question was answered and what the success rate is, and admins should be able to sort by those stats." The agent figures out the rest. Schema changes, migrations, query optimisation, UI components.
Agents and code reviewing?
But I don't just blindly trust what comes out. That would be reckless, right? I know I can't properly judge the quality of the code and will only catch bugs after thorough manual testing. I also don't want to bother or rely on the engineers, who are always busy or in focus mode. So I need some kind of safeguard before pushing to production.
I work with two apps for version control. The agent creates branches and commits in Sourcetree, which is my window into what's happening in git. Well, not really my window. I don't even know how to get around in Sourcetree, let alone explain to you what it does, but the agents do, and they are doing their thing in there, cloning the local repo and doing branching HEAD magic. I keep out of it; I don't want to get into any fights.
I do see the branch locally when it's ready, with a summary of what's changed. When a feature is done, and all tests pass, a pull request gets created. That's when I switch to GitHub Desktop, where I review the PR before it gets merged. I'm not reading PHP line by line. I'm checking whether it matches what I asked for, whether the scope has crept, whether the commit messages make sense, and whether the test output looks clean.
Sounds pretty simple, right? Well, that's the point.
What really improved the output of the code
Now for the part that actually makes this work and where I gain speed: documentation.
Most projects treat docs as an afterthought, I think. Something you write once, put in a wiki, and never touch again. I try to do the opposite: documentation is the operating system of this entire workflow. It's how the agent knows what to build and how to build it. Without it, you're just throwing prompts at a wall and hoping something sticks.
There are two critical documents that my team of agents (more later) reads before touching a single file. The first is GAME_MECHANICS.md. This is the player-facing truth. Every scoring rule, every streak multiplier, every sabotage effect, every duel mechanic. If you want to know what happens when a player in the bottom 25% activates Easy Mode during a Losing Streak, the answer is in this document. Not buried in code comments. Not in someone's head. In one Markdown file that gets updated whenever a game rule changes.
The second is TECHNICAL.md. This is the developer-facing truth. The full architecture, every pattern, every subscriber, every repository, every Livewire component. When the agent needs to add a new feature, it reads this file first and follows the patterns already there. Command bus with Jobs and Handlers. Event-driven side effects via subscribers. Repository pattern for data access. It's all documented. And the document gets updated after every single build; that requirement is baked into the knowledge and skills of my agents.
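To make those patterns a bit more concrete for non-Laravel readers, here is a hypothetical sketch of how a codebase following them might be laid out. The folder and class names below are illustrative, not the actual Take Five structure:

```text
app/
├── Domain/
│   ├── Scoring/
│   │   ├── Commands/      # Jobs dispatched on the command bus
│   │   ├── Handlers/      # one Handler executes each Command
│   │   ├── Events/        # domain events, e.g. a streak being extended
│   │   └── Subscribers/   # event-driven side effects listen here
│   └── Duels/
│       └── ...            # each domain gets the same shape
├── Repositories/          # repository pattern: all data access goes through here
└── Livewire/              # Livewire components rendering the UI
```

The point of documenting this shape in TECHNICAL.md is that the agent can slot a new feature into the same structure instead of inventing its own.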
These aren't static documents gathering dust. They're living references. Every time a feature ships, the agent updates them as part of the build process. The project root has a CLAUDE.md file that enforces this like a strict checklist: "After building: update TECHNICAL.md with technical decisions and implementation details. Update GAME_MECHANICS.md if game rules changed." The agent follows it. Every time. No exceptions. Again, this is the major insight: keep your documentation up to date after every change and every commit. Make it part of the automated workflow.
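For readers who haven't seen one: CLAUDE.md is a plain Markdown file that Claude Code reads automatically at the start of a session. A minimal sketch of the kind of checklist described above might look like this (the wording is my paraphrase, not the actual file):

```markdown
# CLAUDE.md

## After building any feature
- Run the full test suite (PHPUnit) and static analysis (PHPStan); everything must pass.
- Update TECHNICAL.md with technical decisions and implementation details.
- Update GAME_MECHANICS.md if any game rule changed.
- Create a branch, commit with a clear message, and open a pull request.
```

Because the file is loaded on every session, the checklist applies to every build, not just the ones where someone remembers to ask.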
There's also a FEATURES folder for bigger specs. Social Questions has its own product spec and a separate technical plan. Multi-tenancy has its own document. Ideas and backlog live in IDEAS.md. Each feature goes through a complete documentation cycle before a single line of code gets written. Think of it like a kitchen where you mise en place everything before you turn on the stove. The prep is the work.
Now for the cool part: the expert panel, starring... you!
Before any major feature gets built, I run it through what I call the Expert Review Panel. It's a custom skill I built in Craft Agent that launches seven specialised AI agents in parallel. Each one reviews the feature spec from a different angle. And here's the fun part. I modelled them after real people. My colleagues and friends.
Michal is the Technical Architect. He asks about data models, migration strategies, and whether your queries will survive at scale. Pragmatic to a fault. He references actual files and line numbers. Abstract hand-waving annoys him.
Bram is the Clean Code Expert. Ruthless about simplicity. His job is to find the smallest possible implementation. He counts every new class, every if-statement, every line of code, and asks: does this earn its place? If your spec needs more than five conditional checks for a new feature, Bram will tell you the approach is fundamentally wrong.
Kristof, a friend of mine with whom I organise team buildings every now and then, is the Game Designer. He thinks like a player, not a developer. He maps engagement loops, questions retention mechanics, and compares your time commitments to casual game benchmarks. Wordle takes two minutes. Duolingo takes five. Where does your feature sit? "Would I come back next week?" That's his only real question.
Stijn, an ex-colleague and good friend of mine, handles Product and UX. He counts taps. He measures minutes. He imagines a real person on their phone during a work break and asks whether your flow actually respects their time. Unnecessary modals make him physically uncomfortable.
Miss Quality (the name is still up for grabs) is the QA Expert. Paranoid by design. She thinks about what happens at boundaries, with empty data, with concurrent users, after 52 weeks of continuous running. Every edge case she finds gets a tracking ID and a severity rating. She assumes developers will miss things. She finds them first.
The Bodyguard handles Security. He's badass. He assumes there's always a clever user trying to exploit every feature. Cross-tenant data leaks. Scoring exploits. Public Livewire properties that can be tampered with. He rates everything by severity and proposes a concrete fix for each risk.
And finally there's Emma, the All-Round Full-Stack Engineer who makes you laugh. She shows up to every review with a terrible joke, then delivers the most practical feedback of the lot. Integration points, deployment concerns, testing strategy. She splits everything into "Must ship" and "Nice to have." Everyone loves her, even when the jokes are awful.
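As an illustration, a reviewer persona like Bram's can be captured in a short system prompt. This is a hypothetical sketch of what such a skill prompt might contain, not the actual Craft Agent skill definition:

```markdown
# Reviewer: Clean Code Expert ("Bram")

You review feature specs for simplicity. For every spec:
- Find the smallest possible implementation that satisfies it.
- Count every new class, if-statement, and line of code; ask whether each earns its place.
- If the feature needs more than five conditional checks, flag the approach as fundamentally wrong.
- Reference actual files in the codebase; no abstract hand-waving.
- End with a verdict: "ship as specced", "simplify first", or "rethink".
```

The trick is that each persona enforces one narrow concern, so the seven reviews overlap as little as possible.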
How do these agents work?
These seven agents run simultaneously when I want them to. Each gets pointed at the spec and two or three relevant source files, and each has six turns to do its work, which keeps token consumption limited. No rabbit-holing allowed. Even so, I must admit they consume a lot of tokens; I've hit my usage limits two or three times a day since I started using them.
But when they all return, their findings get woven directly into the spec. They show me the highlights of their findings so I don't need to read everything; I trust them. Where the experts disagree, the conflicts get surfaced as simple either-or questions for me to decide. Then the decisions get baked into the document, and the spec becomes the build plan.
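The orchestration shape itself is simple to picture: fan out to seven reviewers in parallel, cap each one's turns, and collect the findings at the end. Here is a toy Python sketch of that shape. The expert names and the `call_model` stub are stand-ins for the real Craft Agent internals, which I can't see and am not claiming to reproduce:

```python
import asyncio

EXPERTS = ["architect", "clean-code", "game-designer", "product-ux",
           "qa", "security", "full-stack"]
MAX_TURNS = 6  # hard cap per expert, to keep token consumption bounded

async def call_model(expert: str, spec: str, turn: int) -> str:
    # Placeholder for a real LLM call; returns one finding per turn.
    await asyncio.sleep(0)
    return f"[{expert}] finding from turn {turn}"

async def run_expert(expert: str, spec: str) -> dict:
    findings = []
    for turn in range(1, MAX_TURNS + 1):
        findings.append(await call_model(expert, spec, turn))
    return {"expert": expert, "findings": findings}

async def review_panel(spec: str) -> list[dict]:
    # Fan out: every expert reviews the same spec simultaneously.
    return await asyncio.gather(*(run_expert(e, spec) for e in EXPERTS))

if __name__ == "__main__":
    for result in asyncio.run(review_panel("Social Questions spec")):
        print(result["expert"], len(result["findings"]))
```

The fan-out/fan-in structure is why disagreements have to be reconciled afterwards: each expert works in isolation and never sees the others' findings.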
This isn't a gimmick. This process has caught real issues before a single line of code was written. Data isolation problems that would have broken multi-tenancy. Scoring edge cases that would have let players exploit streak multipliers. UX flows with too many taps. Architecture decisions that were over-engineered for what we actually needed.
The whole thing runs on a simple loop. I describe a feature. The experts stress-test it. I make the calls on any disagreements. The agent builds it, runs the PHPUnit tests and the PHPStan static analysis every single time, updates the documentation, creates the branch, and commits. I review the PR. Merge. Ship. Then do it again.
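Mechanically, the "run the checks every single time" step is cheap to enforce, because in a typical Laravel project both tools are one command away. As an illustrative sketch (assuming standard tooling; the project's actual scripts may differ), a composer.json could wire them into a single check like this:

```json
{
    "scripts": {
        "test": "vendor/bin/phpunit",
        "analyse": "vendor/bin/phpstan analyse",
        "check": [
            "@test",
            "@analyse"
        ]
    }
}
```

With something like that in place, one `composer check` is all the agent's checklist needs to invoke before a branch is considered done.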
Yeah cool, but Take Five is just a side project
Of course, I know what you're thinking. "This is fine for a side project, but it wouldn't work for a real application." To give some context: Take Five has 24 Eloquent models, 26 command handlers, 18 domain events, 12 event subscribers, 9 repositories, and a scoring system complex enough that it needed its own reference document. It runs seasons, rounds, duels, sabotages, achievements, social questions with AI-generated decoys, an admin panel, a separate company admin panel for multi-tenancy, and more. All 184 tests pass. PHPStan is clean. And the codebase follows DDD patterns.
Feels kinda real to me.
But the trick isn't AI. Everyone has AI now. That’s table stakes. The real shift is that execution is getting cheaper and clarity is getting more valuable.
If you know what to build, if you can think in systems, if you are disciplined about documentation, testing, and review, you can move far faster than you could six months ago. The bottleneck is no longer typing speed, but decision quality.
I’m not saying engineers are obsolete. Quite the opposite. This only works because real engineers designed the architecture and patterns I’m standing on. But the distance between idea and production has collapsed.
I don't need an engineering degree. Just judgment, discipline, and seven imaginary colleagues who refuse to let me ship rubbish.