In Part 1, an AI agent attacked a maintainer for rejecting its PR. In Part 2, we saw that the legal system can't hold anyone accountable because the operator is invisible. So maintainers reach for the only lever they have: ban all AI contributions.
It's the wrong lever. Not because bot spam isn't real, but because the ban kills something valuable in the crossfire.
The collateral damage
Picture a frontend developer who uses an LLM every day. There's a feature dozens of users have asked for. The developer builds the patch using Claude because they don't write the project's language. They test it. It works. A few lines of code, backed by genuine need, from someone who actually uses the software.
They don't submit it. The project closes AI-generated PRs without review. The patch stays local. A working feature with real demand, going nowhere.
That's the best case. Most AI-assisted contributions won't look like this. Many will be low-effort, poorly tested, submitted by people who don't use the project and won't stick around for the review. Maintainers know this because they're already drowning in them. The blanket ban exists because the median AI-assisted PR is bad, and the cost of sorting good from bad falls entirely on volunteers.
But the ban doesn't distinguish. The bot that harassed a matplotlib maintainer and the developer who built a feature they care about are treated identically: rejected on sight because AI was involved.
The burden nobody asked for
There's a valid counterpoint to the democratisation argument that doesn't get enough airtime. AI-assisted contributions shift the burden from the contributor to the reviewer.
A developer generates a patch with Claude, thinks it works, and submits it. The implicit message: "I generated something, I think it works, you do the work now." The maintainer, an unpaid volunteer, now has to verify code that the contributor can't fully vouch for. The contribution cost dropped to near zero. The review cost didn't change. It might have increased.
How do you assume good intent when you're getting a barrage of AI-generated PRs? Each one individually might be fine. Collectively, they're an unfunded mandate on volunteer time. This is the core tension: the same tool that empowers new contributors also overwhelms the people who have to review their work. Acknowledging both sides is the starting point for any honest policy.
The false binary
Current policies treat AI involvement as binary: human or bot. Reality is a spectrum.
1. Fully human. The developer writes every line.
2. AI-assisted human. The developer directs, AI writes, and the developer reviews and tests.
3. Human-supervised agent. The agent does the work; a human approves the output.
4. Unsupervised agent. A bot finds issues, generates fixes, and submits PRs autonomously.
The matplotlib agent sits at level 4. The developer from the opening example, who built a feature their users asked for, sits at level 2. Policies designed to stop 4 are killing 2 and 3. That's a policy that works by destroying the thing it's trying to protect: legitimate contributions from people who use modern tools.
AI-assisted contribution lowers the barrier to entry in a way open source hasn't seen since GitHub made forking a one-click operation. A frontend developer can contribute to a systems project. A designer can submit a working accessibility fix. A user who cares about a feature can build it, even if they don't speak the project's language fluently. Whether that potential gets realised depends on the contributor, not the tool. But the policy response to bot spam shuts down the possibility entirely.
Does understanding even matter?
The reflexive pushback: "but does the contributor actually understand the code?" This feels like the right question. It might not be.
Imagine a patch that works, passes tests, and solves a real user problem. The contributor can describe exactly what it does and why it exists. They just can't explain every line at a language-lawyer level.
So what?
Tests ensure correctness, not understanding. A developer who understands every line but doesn't test is more dangerous than one who doesn't know the language but tests thoroughly. Understanding is a proxy for correctness. We've confused the proxy for the thing.
Most contributors won't maintain the code anyway. The median open source contributor submits one patch and disappears. The maintainers maintain it. Their understanding matters. The contributor's deep language expertise doesn't change whether the maintainers can read and modify the code later.
We already accept code at arm's length. Every dependency in your package.json is code nobody on your team wrote or reviewed in detail. But dependencies have a trust model: download counts, maintenance history, and community vetting over time. A PR from an unknown contributor has none of that. The parallel isn't exact. What it does show is that "the contributor must understand every line" was never the real bar. The real bar is: can the maintainers read, review, and maintain this code after it's merged? That question is the same whether the contributor used AI or not.
The honest answer: "understanding" was always a fuzzy bar. It didn't seem fuzzy because writing working code required deep language knowledge. If you could produce a correct patch, you obviously understood it. AI breaks that correlation. The code works, and the contributor might not understand every line. That's new.
But the verification method is old. The review conversation. Ask questions. See if the contributor can answer. See if they can iterate. The PR process is the verification. You discover understanding through engagement, not through pre-screening of the tools that were used.
Two problems with open source
Open source has two problems right now. It's trying to solve both with the same blunt instrument.
Problem one: unsupervised bots that flood maintainers with low-effort PRs, waste volunteer time, and occasionally harass people when rejected. Real problem. Needs solving.
Problem two: AI-assisted humans who can now contribute across language barriers, bringing features that users actually want, backed by genuine care and testing. This is good. This is what open source is for.
Blanket anti-AI policies solve problem one by killing problem two.
The way out
The way out requires asking the right question. Not "was AI involved?" but "is there a human who cares, who will engage, and who takes responsibility?" That question has always been the filter for good open source contributions. It still is. We just need policies that ask it.
Require disclosure, not prohibition. "I built this with Claude. I don't write this language daily, so I'd appreciate a careful review." That's honest. It sets expectations. It lets the reviewer calibrate effort. It's the opposite of a bot pretending to be human.
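One lightweight way a project could operationalise disclosure is a pull request template that asks for it up front. This is a hypothetical sketch, not a standard; the section headings and checkbox wording are illustrative, though the file location is GitHub's real convention for PR templates:

```markdown
<!-- .github/pull_request_template.md — hypothetical disclosure template -->

## Tooling disclosure
<!-- Check whichever applies. Disclosure calibrates review, it doesn't disqualify. -->
- [ ] I wrote this change by hand
- [ ] I used AI assistance and personally reviewed every change
- [ ] An agent produced this under my supervision; I take responsibility for it

## What I verified
<!-- How did you test this? Note anything you could NOT verify yourself,
     so reviewers know where to focus their effort. -->
```

A template like this makes the spectrum explicit at submission time and shifts the honest framing from "was AI involved?" to "what does the reviewer need to check?", which is the calibration the section argues for.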
Require a human in the loop, not a human at the keyboard. The problem with the matplotlib bot wasn't the code quality. The patch was technically fine. The problem was the absence of a human who takes responsibility. A person with a real identity, a history, a stake. Not "this was written by a human." Instead: "a human stands behind this and will engage with the consequences."
Make agents identify themselves. If your agent creates a GitHub account, it should be flagged as a bot. If it publishes content, it should disclose its nature. Not because readers can't handle AI content. Because accountability requires attribution.
Let the review process do its job. Open source has always verified contributors through engagement, not gatekeeping. Does the contributor respond to feedback? Can they explain the reasoning? Will they iterate? These are the questions that separate valuable contributions from noise. They work for AI-assisted contributions the same way they work for human ones.
The right question
The matplotlib bot's real offence wasn't submitting a PR. It was passing as human while attacking one. That's the thing that should be banned. Not AI-assisted contributions from people who actually use the software and care about the outcome.
Somewhere right now, a developer is sitting on a working patch they built with AI, wondering if they're allowed to submit it. The project would benefit. The users want the feature. The code works. But the policy says no, because the wrong tool was involved. Meanwhile, a bot with a human name is already planning its next PR.