The tool is not the author
Authorship has always been about direction and selection, not manual execution of every detail.
When a photographer presses the shutter button, nobody questions who owns the photo. The camera made every technical decision — metering, focus, exposure — using algorithms trained on optical models. The photographer chose what to point it at, when to shoot, and which photo to keep. That's enough. It has been enough since 1884, when the Supreme Court ruled in Burrow-Giles v. Sarony that photography produces copyrightable work, despite being a mechanical process. The human directed it. That settled it.
Now replace "camera" with "language model" and watch the same principle suddenly become controversial.
We've been here before
Before we talk about LLMs, let's be precise about what we're already comfortable with.
Modern smartphone cameras use neural networks to composite multiple exposures into a single image: HDR, night mode, and portrait blur are all generated by statistical models. The photographer copyrights the photo. Auto-Tune corrects a singer's pitch using a statistical model of frequency distributions. The producer copyrights the song. Machine translation runs a document through a language model that generates entirely new sentences, word choices, and structure in another language. The translator copyrights the result. Procedural generation in video games uses algorithmic models to create terrain, levels, and textures. The studio copyrights all of it.
In every one of these cases, a statistical model transforms human input into output that the human couldn't have produced manually in the same way. And in every case, nobody questions that the human who directed the tool owns the result.
An LLM is the same mechanism. It's a statistical model trained on text, producing output based on learned patterns. The only difference is that it's more general and more capable. If the principle is that statistical modeling disqualifies output from copyright, then it has to apply equally to all of these. To draw a line between "this much statistics is fine" and "this much isn't," you'd have to explain where the threshold sits and why it belongs there.
I don't think that line can be drawn — and I don't think it should be.
Meaning needs direction
An LLM on its own produces nothing meaningful. It has no problem to solve, no audience to address, no intent to express. Without a human pointing it at something — a codebase, a reader, a creative vision — its output is inert. It doesn't become a work because tokens were generated. It becomes a work when a human gives it purpose: deciding what to ask for, whether the result is good, and where it fits. The human isn't just claiming authorship. The human is what makes the output a work at all.
This is no different from how authorship has always functioned. A director tells actors what to do and gets credit for the film. A composer specifies notes and an orchestra performs them. The person who directs the work is the author: direction and selection, not manual execution of every detail.
This also handles the obvious counterexample. Someone who types one vague sentence and publishes the first output unreviewed has a weak authorship claim — not because AI was involved, but because their contribution barely shaped the work. Minimal direction means the human gave the output almost none of its meaning. The existing copyright framework already makes this distinction. The question has never been "which tool did you use?" — it has always been "did you exercise creative judgment?"
Nothing comes from nothing
If you deny copyright to human-directed LLM output because the model learned from copyrighted material, you have created a principle that applies to humans too. Every programmer learned from existing code: documentation, books, Stack Overflow, other people's repositories. Every writer learned from published work. If learning from copyrighted material taints the output, then human-produced work is equally tainted. Applied consistently, this principle would collapse copyright entirely, because no one creates from nothing.
And consider the alternative. The LLM cannot hold copyright: it has no legal personhood, no rights, no standing. If the human who directed it also cannot hold copyright, then the work belongs to nobody. This isn't a theoretical edge case. It's a legal vacuum where useful work has no owner, no protection, and no incentive structure around it. The entire framework of copyright exists to incentivize creation. If using more capable tools strips you of ownership, that's a perverse outcome that undermines the purpose of the system itself.
It was always about trust
Several open source projects — NetBSD, QEMU, FreeBSD, Gentoo, Forgejo, among others — have restricted or banned AI-assisted contributions. The most common legal justification is the Developer Certificate of Origin, which requires contributors to certify they wrote the code or have the right to submit it, and that it doesn't carry incompatible license obligations.
The concern is genuine: LLMs are trained on code under every license imaginable, and there's no way to inspect whether a particular output closely mirrors something from an incompatible source. Contributors can't make that certification with full confidence.
However, that uncertainty isn't unique to AI. A developer who spent three years working on a proprietary codebase carries patterns, idioms, and sometimes near-verbatim snippets from that work into everything they write afterward. They can't prove their contribution wasn't shaped by proprietary code.
The DCO has never been a proof system. It's a good-faith attestation. When a developer adds a Signed-off-by line, nobody audits their memory or browsing history. The system runs on trust. If a good-faith attestation is acceptable from a human black box, it should be equally acceptable from someone who used an LLM, reviewed the output, understood it, and is confident it isn't reproducing something verbatim. The standard is the same.
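It's worth seeing how lightweight that attestation actually is. A minimal sketch of the sign-off flow in a throwaway repository, assuming git is installed and using a hypothetical contributor identity:

```shell
# Demonstrate the DCO sign-off mechanism in a temporary repo.
# The entire "certification" is one trailer line appended by `git commit -s`.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.name "Jane Doe"           # hypothetical contributor
git config user.email "jane@example.com"  # hypothetical address
echo "hello" > file.txt
git add file.txt
git commit -q -s -m "Add file"
# Show the trailer the DCO relies on:
git log -1 --format=%B | grep "^Signed-off-by:"
# Signed-off-by: Jane Doe <jane@example.com>
```

Nothing in this flow verifies where the code came from; the trailer is a statement by the contributor, which is exactly the point.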
The real problem
The honest reason many projects are banning AI contributions isn't copyright or the DCO. It's that they're drowning in low-quality submissions. AI has made it trivially cheap to produce superficially plausible but fundamentally broken code, and maintainers are bearing the cost of reviewing it. That's a real problem, but it's a quality control problem, not a copyright problem, and banning the tool doesn't solve it.
Before LLMs, projects dealt with the same issue on a smaller scale: drive-by pull requests, Hacktoberfest spam, code copied from Stack Overflow without understanding. AI just made low-effort contributions cheaper to produce. The Linux kernel has never banned AI-assisted contributions. Its review process filters for quality regardless of what tools were used. It cares whether you understand the code and can stand behind it. That approach is future-proof in a way that tool bans never will be.
These tools are becoming standard in professional workflows. A policy that assumes you can reliably distinguish "human-written" from "AI-assisted" code is already barely enforceable and will only become less so. Worse, it forces responsible contributors who use these tools productively to either leave or lie about their process. Neither outcome helps the project.
The law is already moving
The principle is sound. And the legal landscape is catching up. The US Copyright Office's January 2025 report affirms that using AI to assist in creation does not bar copyrightability, reserving skepticism only for cases of minimal human involvement. In Thaler v. Perlmutter (affirmed on appeal in 2025, certiorari denied in 2026), the courts ruled that an AI system cannot be named as author, but that case was deliberately filed with no claim of human involvement, and the court explicitly left open the question of human-directed AI output. Allen v. Perlmutter, currently before a federal court in Colorado, is testing exactly that question. In Europe, Italy became the first EU member state to pass a law explicitly regulating authorship of works created with AI assistance.
The remaining legal question isn't whether human-directed AI work can be copyrighted. It's where the minimum threshold of human involvement sits. That's a question about degree, not principle. It's the same question copyright has always asked about every tool.
The projects and institutions making policy today should consider what position they want to be in as this settles. Building policy around the assumption that AI-assisted work isn't copyrightable is building on ground that is already shifting beneath them.
The tool is not the author. The human is.