Copyright law governs how it is done now. The most powerful large language models depend on massive data sets that cover essentially the entire public internet, which in turn contains troves of copyrighted material. This raises the question: Does ingesting this material to train AI systems violate the creators’ intellectual property rights? And if not, should it?
Authors and artists argue that the companies behind these models did not ask for permission before ingesting the work of countless people. Without this work, the models would have trouble imitating, say, a romantic story when prompted or portraying the week’s political dramas in manga style. Even worse, many of these models will end up competing against the very people whose work they were trained on. Hollywood screenwriters this summer worried that streaming services would rely on artificial intelligence to churn out rom-coms. The writers won that negotiation by putting their work on hold. But individual creators won’t always be able to win concessions from employers who plan to use AI, much less from makers of large language models with which they have no connection.
The companies argue that copyright law is on their side. Using an artist’s work to create something new without first obtaining permission is not necessarily copyright infringement. This is “fair use”: Reproducing a creative work can be legal as long as the reproduction is also, in its own way, creative. Criticism, commentary, parody and scholarship, for example, have always been viewed favorably under the law. More generally, fair use depends on whether the use of the copyrighted material is transformative: whether, as the U.S. Copyright Office puts it, the adaptation will “add something new.”
Adding something new is at the core of the defense offered by makers of large language models. They claim to absorb the writing, drawing and other work of artists to unleash a wave of innovation. Yes, DALL-E, an AI image generator, can create a faithful riff on Peppa Pig, but it can also create fantastical landscapes that even the surrealist it’s named after never dreamed of. The purpose of these models, their makers insist, is not to rewrite Jonathan Franzen’s novels but, with the help of users, to design something original or accomplish something functional. Every book by every author is just raw material for building that engine.
All of this means creators will likely find it easy to challenge specific AI outputs that are obvious copies of their portfolios or, as lawyers say, “derivative works,” such as the Peppa Pig imitation. But these creators will have a harder time bringing cases against systems that were trained on their work but don’t mimic it closely. A judge’s recent ruling that dismissed part of a copyright lawsuit brought by comedian Sarah Silverman and others against Meta confirms this.
The Copyright Office is conducting a review of artificial intelligence systems that could clarify the rights that creators have. And companies, on their own, are establishing copyright protections. DALL-E 3 does not respond to prompts asking it to copy the style of a living artist by name, and artists can request that their work be excluded from the training data for future models. Perhaps model makers will also voluntarily license at least high-profile work to avoid the possibility of a lawsuit down the road. A promising example: Axel Springer recently signed an agreement with OpenAI under which the model maker will pay to use content from the publishing behemoth’s properties, such as Politico and Business Insider, and will link to those sources when using them to answer a question.
According to the Framers of the Constitution, intellectual property rights exist to “promote the progress of science and the useful arts.” The latest marvels in coding and computing have introduced previously unseen forms of creation to humanity. But by producing work faster and more cheaply, they could reduce the demand for human-painted portraits or human-written newspaper articles, the very fuel for the AI engine. This, in turn, could lead to a decline in artistic progress.
A broader review of copyright, perhaps inspired by what some AI companies are already doing, could ensure that human creators receive some reward when AI consumes their work, processes it and produces new material based on it, in a way that current legislation does not provide. But such a change should not be so punitive that the AI industry has no room for growth. In this way, these tools, in concert with human creators, can push the progress of science and the useful arts far beyond what the Framers could have imagined.