Stealing Styles or Creating Futures? The IP Crisis of AI Art
With OpenAI’s GPT-4o, a single-line prompt is now all it takes to conjure up an evocative, dreamlike replica of the work synonymous with Miyazaki’s Studio Ghibli. Over the past few weeks, the internet has been flooded with whimsical, Ghibli-style adaptations of personal portraits. From the White House to OpenAI CEO Sam Altman’s profile picture, these “Ghiblified” images have taken social media by storm.
Since the advent of generative AI, the sociolegal conversations surrounding intellectual property rights have intensified, prompting us to take a step back and assess the legal avenues currently at our disposal. The recent OpenAI–Studio Ghibli IPR fiasco underscores this uncertainty, as generative models continue to venture into the legal and moral grey areas of producing artworks eerily similar to copyrighted works.
To better understand the extent of copyright infringement caused by text-to-image generation tools such as GPT-4o, DALL·E and Stable Diffusion, it is important to understand how these models are trained and how they are able to produce the desired output images from a single text prompt.
Large language models, or LLMs, are trained on huge datasets assembled through a process known as “text and data mining” (TDM), which employs techniques such as website scraping, crawling and indexing to source data from the internet. The data collected can come from both publicly available works and copyrighted content. These models use pattern recognition to process and replicate content. For example, when an image is used to train a model, it is broken down into a mathematical representation and paired with a text description that includes keywords and other relevant attributes. The model, using artificial neural networks, learns to associate those keywords with the corresponding mathematical representations. When an external prompt using the same keywords is entered, the AI is capable of creating an output that is almost indistinguishable from the works in the dataset on which it was originally trained.
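The keyword-to-representation pairing described above can be sketched in a deliberately simplified toy program. This is nothing like a real diffusion model, whose “representations” are learned weights inside a neural network rather than stored vectors; all data, vectors and function names below are invented purely for illustration:

```python
# Toy illustration: images reduced to numeric feature vectors are stored
# alongside their caption keywords; a prompt is "matched" by finding the
# training pairs whose keywords overlap it most.

def train(dataset):
    """dataset: list of (keywords, feature_vector) pairs mined from the web.
    The 'model' here is simply the stored associations."""
    return list(dataset)

def generate(model, prompt_keywords):
    """Blend the feature vectors whose captions best match the prompt."""
    scored = []
    for keywords, vec in model:
        overlap = len(set(prompt_keywords) & set(keywords))
        if overlap:
            scored.append((overlap, vec))
    if not scored:
        return None
    # weight each training vector by how many prompt keywords its caption shares
    total = sum(weight for weight, _ in scored)
    dims = len(scored[0][1])
    return [sum(w * v[i] for w, v in scored) / total for i in range(dims)]

# Hypothetical mined dataset: caption keywords paired with image features
dataset = [
    ({"forest", "spirit", "watercolor"}, [0.9, 0.1, 0.4]),
    ({"castle", "sky", "watercolor"},    [0.2, 0.8, 0.5]),
    ({"robot", "city", "photo"},         [0.1, 0.2, 0.9]),
]
model = train(dataset)
out = generate(model, {"watercolor", "forest"})
```

The point the toy makes is the legal one: the output is a weighted recombination of the training material, which is why the provenance of that material matters.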
The relationship between TDM law and intellectual property law has been a longstanding point of contention. Traditionally, such data has been employed for research, scientific or creative purposes, and the process has been protected by the doctrine of fair use. However, owing to the lack of a standardized and uniform framework at the intersection of AI and IP law, the question of whether using copyrighted material to train an LLM amounts to infringement requires careful consideration.
Title 17 of the United States Code (the Copyright Act) draws the boundaries of copyright protection in the United States. A very common exception to the usual rights and protections given to the creator of a copyrighted work is the doctrine of fair use, codified in Section 107, which permits limited use of copyrighted material without first acquiring permission from the copyright holder. The goal of the fair use doctrine is to balance the rights of creators with the public good by allowing adaptations and renditions of copyrighted works for purposes including, but not limited to, commentary, search engines, criticism, news reporting, research, and scholarship.
A 2005 copyright infringement class action against Google is an apt illustration of the fair use doctrine applied at the overlap of technological innovation, public good and the protection of intellectual property. Google began its book-scanning project in 2002, scanning millions of books to feed its search engine. Initially limiting itself to books in the public domain, Google in 2004 began scanning entire copies of copyrighted books to enable its users to find words and phrases as they would on websites, along with other advanced search features. In its 2015 ruling in Authors Guild v. Google, Inc., the United States Court of Appeals for the Second Circuit ruled in favour of Google, holding that the use of the copyrighted material fell within the doctrine of fair use. In doing so, the court applied the four-factor test rooted in the landmark case of Folsom v. Marsh, reiterating that for a work to be considered a “fair use” of a copyrighted work, the factors to be weighed include the degree of transformativeness and whether the derivative work is of a commercial or non-commercial nature.
While text-to-image models like Stable Diffusion and GPT-4o are free to use, paid premium versions are available. In fact, the ability to create Ghibli-style art that was previously available on the free version of ChatGPT is now exclusive to premium users.
Both these metrics are quite germane to the current generative AI conundrum. Authors Guild v. Google established the degree-of-transformativeness test as a relevant fair use factor in internet and technology contexts. The transformative nature of the later work often lies in its ability to offer the public a previously inaccessible benefit, one that would otherwise remain unavailable. This aspect of transformativeness plays a significant role in a fair use analysis and can potentially exempt from liability what would otherwise be considered a clear copyright infringement.
Back in 2023, the Authors Guild filed a class-action suit against OpenAI in the Southern District of New York, alleging copyright infringement on behalf of a class of fiction writers whose works were used to train GPT.
While a definitive global legal stance on the application of national or international principles concerning TDM laws in relation to generative AI remains elusive, examining Directive (EU) 2019/790 of 17 April 2019 on copyright and related rights in the Digital Single Market (the Copyright Directive) may provide some clarity on the current situation. The Directive added, in Article 4, a “TDM for any purpose” exception alongside the Article 3 exception for scientific research. With copyright holders retaining the option to opt out and prevent their work from being used for data-mining purposes, the EU directive presents itself as a forerunner in navigating the fine line between free and fair use and the protection of the intellectual property rights of creators.
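In practice, one common machine-readable way publishers signal such an opt-out today is through their site’s robots.txt file, disallowing known AI training crawlers (the user-agent names below are real crawlers operated by OpenAI and Common Crawl; note that honouring them is voluntary on the crawler’s part, not an enforcement mechanism created by the Directive):

```
# robots.txt — signalling a data-mining opt-out to AI crawlers
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
```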
Japan, incidentally the country of origin of Studio Ghibli, released draft guidelines back in 2024 discussing the impact of Article 30-4 of Japan’s Copyright Act, which states: “It is permissible to exploit a work, in any way and to the extent considered necessary, in any of the following cases, or in any other case in which it is not a person’s purpose to personally enjoy or cause another person to enjoy the thoughts or sentiments expressed in that work; provided, however, that this does not apply if the action would unreasonably prejudice the interests of the copyright owner in light of the nature or purpose of the work or the circumstances of its exploitation”.
The committee that drafted the guidelines underscored the importance of balancing two competing goals: permitting the free use of copyrighted material to train models, which encourages technological advancement and makes AI data more reliable, diverse and accessible to the general public, against protecting the rights of copyright holders.
These rights are outlined in Article 27 of the Universal Declaration of Human Rights, which provides for the right to benefit from the protection of the moral and material interests resulting from authorship of scientific, literary or artistic productions. It is this fundamental legal dispute that distinguishes all legal precedents on data mining and the use of copyrighted material to create datasets (as was the case with the Google Books lawsuit) from the complexities that arise from the type of output that generative artificial intelligence produces.
Generative AI goes a step further when it produces output that, to the human eye, is an exact replica of the work of living artists. When users can generate a “Ghiblified” version of almost anything around them in a split second, simply by taking a picture and uploading it to ChatGPT, in an art style that has become synonymous with Miyazaki himself, it challenges both legal and ethical standards.
Almost every significant discourse on the protection of intellectual property speaks of protecting both the natural right of a person over their creations and the right to benefit commercially from those creations. There is, however, a well-known principle of copyright law: the “idea–expression distinction”, or idea–expression dichotomy. It holds that only expressions of ideas are copyrightable, not the ideas themselves. A musical composition is copyrightable, but the genre is not. A book is copyrightable; the writing style is not. A film is copyrightable; the filming style is not. Understanding this difference is quite pertinent to discerning whether the “output” generated by generative AI is a copyright violation.
Many text-to-image AI models are trained on a specific style, and they generate new and original content that mimics the “style and idea” of a work, not its “expression”. The legal dilemma arises when the style is what distinguishes the creator and the expression is merely a means to that end. This very hypothesis presents itself in the case of the Polish digital artist Greg Rutkowski, who has made illustrations for games such as Horizon Forbidden West and Dungeons & Dragons. He found that his art style was widely copied through Stable Diffusion, a text-to-image generative AI model developed by Stability AI. When users can simply generate a scene “in the style of the work of Greg Rutkowski” by entering a prompt, does it not hurt the commercial interests of the artist? Why would businesses that would previously have hired digital artists such as Rutkowski to commission a piece of art not now move towards the much quicker and more economical method of art “creation” that generative AI seems to offer?
Such advancements in generative AI have substantially increased the potential for both personal and commercial harm to original creators and artists, as it now allows for the replication of artworks and personal styles that once took significant time, effort and skill to create. The ability to generate an artist’s work with a simple text command raises legal, moral and ethical dilemmas about the fundamental rights to create, propagate and to commercially benefit from one’s creations. With the pace at which the technology surrounding generative AI and machine learning is developing, the legal framework regulating it cannot afford to fall behind. A lot is at stake.
-Kriti Taneja & Sidharat Som Mohanty
-09/04/2025
