The Case with Generative AI and Copyright: It's Complicated



Companies like Microsoft, Adobe, and GitHub are incorporating artificial intelligence into their products. Startups are raising hundreds of millions of dollars to compete with them. But if you pay attention to any industry discussion about generative AI, you'll hear, in the background, a question murmured by supporters and critics alike in tones of rising concern: Is any of this actually legal?

The issue stems from the way generative AI systems are trained. Like most machine learning software, they operate by recognizing and recreating patterns in data. Simply put, to produce an output such as a written text or a picture, they must first learn from the actual work of actual humans.

Blurry Lines Between AI- and Human-Generated Content

The US Copyright Office has long held the position that works produced by non-humans, including machines, are not protected by copyright. As a result, output generated entirely by a generative AI model is generally not eligible for copyright protection.

Training raises a separate question. If an AI image generator creates artwork that resembles Georgia O'Keeffe's, that suggests Georgia O'Keeffe's genuine artwork was used to train the AI. Similarly, for an AI content generator to write in Toni Morrison's style, Toni Morrison's words must have been used to train it.

How, then, can we reconcile the intricacies of US copyright law with the rapidly growing artificial intelligence industry? The US government, companies, courts, and creators are all working to come up with an answer.

In the age of ChatGPT, the US Copyright Office has commented on the ownership of AI-generated works.

According to updated guidance on AI and copyright law from the federal agency, copyright protection for works containing AI-generated material may be granted on a ‘case-by-case’ basis.


In essence, how someone uses AI to create material will determine whether or not the result can be copyrighted. As we've seen with ChatGPT and Bing Chat, you can instruct a chatbot to produce a song or a poem in the style of William Shakespeare. However, when generative AI produces complex written, visual, or musical works in response to a prompt, the traditional elements of authorship are determined and executed by the technology, so the Office would not recognize the output as copyrightable. It doesn't qualify because the user had no creative control over how the AI interpreted the prompt and expressed the work.

A user, however, can select or arrange AI-generated material in a sufficiently creative way such that it transforms into an original work based on the user's ingenuity, and such a work may be protected by copyright.

This entire situation is vague and perplexing, as one might expect. The rapid rise in popularity of generative AI has forced the Copyright Office to confront a brand-new area of copyright law.

The sophistication of AI chatbots is increasingly blurring the distinction between work produced by humans and work produced by machines. Despite the meandering wording, the Copyright Office's stance is rather straightforward in theory: it won't register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author. In practice, however, it is far less clear how the Office will treat the use of AI to brainstorm concepts or collaborate on a creative piece.

Generative AI Companies Are Already Facing Lawsuits

In a number of ongoing court proceedings, creators and businesses allege that generative AI companies have violated their intellectual property rights, and they are seeking to deny those companies the legal defense of fair use.

One of these plaintiffs is Getty Images, which has filed a lawsuit against Stability AI (the company behind Stable Diffusion), alleging that it copied and processed millions of copyright-protected photos, along with the related metadata, without consent or payment. Separately, voice actor Bev Standing filed a complaint against TikTok, alleging that the firm improperly used her voice for its text-to-speech feature. Standing and TikTok recently reached a settlement.

Meanwhile, Stability AI and Midjourney, both of which use Stable Diffusion to generate images, have been sued in a class-action copyright infringement complaint by artists Sarah Andersen, Kelly McKernan, and Karla Ortiz.

Can Copyright-Protected Data Be Used to Train AI Models?

In the opinion of most experts, the main concerns regarding AI and copyright revolve around the data used to train these models. AI researchers, start-ups, and multibillion-dollar tech corporations all rely on the fair use doctrine, which permits certain uses of copyright-protected works in order to promote freedom of expression, as their justification for using this material (at least in the US).

However, experts say two questions are central to any fair use analysis: what is the purpose or nature of the use, and how will it affect the market for the original work?

In other words, using other people's data to build an AI model may be permissible, but what you then do with that model could still be illegal.

Can the Cases Be Resolved?

Even if training generative AI models turns out to fall under fair use, the issues facing the industry will not be fully resolved. Such a finding won't necessarily extend to other generative AI domains, such as coding and music, and it won't appease artists who are upset that their work has been used to train commercial models. In light of this, the question is: what solutions, technical or otherwise, can be implemented to enable generative AI to grow while acknowledging or compensating the creators whose work made the field possible?

According to experts, the most obvious solution is to license the data and compensate its authors. However, some believe that doing so would cripple the industry.

Others, however, note that we have successfully handled copyright issues of a similar size and complexity in the past and can do so again.

Other approaches are also being tried. For instance, Shutterstock wants to create a fund to pay artists whose work is used by AI firms to train their models, and DeviantArt has developed a metadata tag for images shared online that warns AI researchers against scraping the content.

However the legal questions are eventually settled, the various players in the generative AI space are already preparing for a confrontation. The companies profiting immensely from this technology are solidifying their positions by adamantly claiming that everything they do is legal. On the other side, copyright holders are staking out their own tentative positions without fully committing to action.