
If you were online between March 25 and March 26, your timeline may have been flooded with a barrage of AI-generated images, a large number of which were existing photographs recreated in the style of Studio Ghibli’s anime. On March 25, Sam Altman announced on Twitter that OpenAI was releasing a new image model as an update to its 4o language model; this would replace the DALL-E 3 model for some users, particularly paying customers. The announcement was followed by a modified photo of Altman with two developers in the style of Studio Ghibli, which read “Feel the AGI” (a running motto at the company). A couple of hours later, an engineer called Grant Slatton tweeted “tremendous alpha right now in sending your wife photos of yall converted to studio ghibli anime”, accompanied by the original picture on the right and, on the left, a modified image in the Ghibli style.
All hell broke loose.
As more people started gaining access to the new image model, one of the first tests they performed was to modify pictures in the Ghibli style. What followed was a wave of selfies and family pictures, then memes, film screenshots, characters, and even famous photographs from history. Everything was fair game. For a few hours my timeline was awash with Studio Ghibli images. I even briefly joined in with a couple of pictures of my cat Leonardo. It was all a bit of fun. But then the pictures kept coming. And coming, and coming. By March 26 it felt like the Internet was suffering from a serious Studio Ghibli hangover. And then it all came crashing down on March 27, when the White House Twitter account joined the fray and released a picture in the Ghibli style.
The fun was over. The shark had been truly and thoroughly jumped.
Everything that happens nowadays regarding generative AI eventually devolves into a copyright discussion, so that has been an interesting part of the online debate during and after the meme storm (I propose we call it the “Ghiblicalypse”). My timeline has been flooded with some of the most preposterously bad legal takes; I’ve noticed that the level of misinformation is directly proportional to the amount of public interest. But I don’t think that the legal issues are that complicated, and I don’t think there will be any legal action, as I’ll explain later. I think that this may actually be a more interesting discussion about AI art in general, and also about the culture and effects of generative AI that can replicate various styles.
The cultural fallout
I’d like to preface this section by saying that I’m a genuine Studio Ghibli fan of many years. I own most of the films on DVD and Blu-ray, I’ve visited the Ghibli Museum in Tokyo, and I own an embarrassingly large amount of Ghibli-related merchandise, particularly Totoro plushies and t-shirts. I may be the equivalent of a Disney adult: a Ghibli adult, if you will.
So when I first saw that ChatGPT could easily transform any image into the Ghibli style I was quite amused, and even rather happy. At first I didn’t see it as anything different from the many other filters out there. People have been using LoRAs (Low-Rank Adaptation) for a while; these are fine-tunes of existing models that specialise in a specific style, so you can turn an image into a Tarot card, a Simpsons character, a Wojak, or a Pixar character. So I didn’t think much of it at first, and even laughed at some of the images.
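As an aside, for readers curious about the mechanics, here is a minimal sketch of what applying a style LoRA looks like with the open-source Hugging Face diffusers library. This is purely illustrative: the base model is a real public checkpoint, but the LoRA repository name is a hypothetical placeholder, and this is not a description of how OpenAI’s 4o image model works internally.

```python
# Illustrative sketch only: attaching a style LoRA (low-rank adapter) to an
# open-source image model using the Hugging Face diffusers library.
# The LoRA repo name below is a hypothetical placeholder.
import torch
from diffusers import StableDiffusionXLPipeline

# Load a base text-to-image model (a real public checkpoint)
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Attach a low-rank adapter fine-tuned on a particular visual style
pipe.load_lora_weights("some-user/example-style-lora")  # hypothetical repo

# Generate an image in the adapter's style
image = pipe("a fluffy cat napping on a sunlit windowsill").images[0]
image.save("styled_cat.png")
```

The point is simply that style-specific adapters like this have been cheap to train and widely shared for a couple of years; what changed with the new image model is that the style transfer now happens inside a general-purpose chatbot, with no setup at all.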
But then the stuff kept coming. And coming. And coming. And a sort of dread started to descend upon me. Would the sheer amount of stuff truly affect the Ghibli brand? Could it be that in the future the term “AI slop” would become synonymous with Ghibli? At the moment, most of the stuff branded as “AI slop” comes from older models, particularly DALL-E 3, which produced by default that sickly, shiny 3D quality that many have come to hate.
The answer is that I don’t know, but I really don’t think so. The Ghibli barrage appears to have dwindled, at least for now, and I don’t think that it will be a lasting feature, just another Internet fad. We’ll probably look back at this as we do at many other strange memetic trends that we now find quaint, or cringe, or both. That is, unless the Facebook boomers discover the trend, in which case be prepared for another flood of images from your normie relatives.
I am, however, surprised that the Ghibli style was the one that was overused, when the new ChatGPT image model can easily do style transfers of other iconic ones, such as The Simpsons, The Muppets, Aardman, Family Guy, ligne claire, and many others. I think it’s because Ghibli is so iconic and easily recognisable, but also identified as wholesome, which sort of made some of the juxtapositions amusing.
Copyright issues
As mentioned above, one of the first discussion points surrounding the meme storm was a heated debate regarding copyright, with some of the same lines drawn in what has now become the AI Wars, or even a reversion to the old Copyright Wars lines. OpenAI was stealing Studio Ghibli’s content and misusing it for profit, one side argued, while the other answered that styles aren’t protected by copyright, and that it all falls under fair use anyway.
Perhaps one thing that has bothered me, and that is common in most copyright and AI conversations, is that the discussion immediately defaults to the USA. This reminded me in some ways of all of the arguments surrounding Palworld, where most participants missed the fact that both companies involved were Japanese; there was indeed a lawsuit in Japan, but under patent law. The same would apply here, I think: most of the discussion has been centred on US copyright law, missing the fact that Studio Ghibli is based in Japan. There is a very good chance that, were there to be any legal action, it would take place in Japan, even if OpenAI is a US company. The reason is cost: US litigation tends to be quite expensive, and it generally makes sense for companies to initiate legal action in their own countries. However, I am quite happy to predict that there will not be any legal action forthcoming.
We usually divide the analysis of AI copyright infringement cases into two stages: the input phase and the output phase. With regards to the input phase, much would depend on a variety of factors, including where Studio Ghibli would choose to bring legal action, if it decided to do so. In order to be able to replicate the Ghibli style, it is incontrovertible that OpenAI would have used large amounts of screenshots from Studio Ghibli films, which could open it up to a copyright infringement lawsuit. But this would not be as open-and-shut a case as many on social media are claiming. Assuming that Studio Ghibli were to sue in Japan, the first hurdle would be that Japanese copyright law has a broad text and data mining exception that would appear to allow this type of training for commercial purposes, although the extent of this exception is still open to interpretation (here’s an excellent article on the subject). But a Japanese court may not even apply this exception if it is asked to apply the law of the country where the training took place, which is likely to be the United States. So we could end up with a Japanese court interpreting US law, and that could get tricky, as we still do not have an answer there because the US cases are ongoing. I’ve often mentioned this, but jurisdiction in copyright law is hard, and courts may not always do what you think they might.
So even assuming that US law applies, we have to wait for a firm resolution to the ongoing infringement cases (40 and counting), and that is not straightforward either. The cases exploring fair use in the input phase are all over the place at the moment, and anyone who tells you that they know what the result is going to be is lying to you. My assumption is still that we will get various results, then appeals, until there’s a big showdown in the US Supreme Court. How do we know that this is an open legal issue? A court just told us. In denying an injunction in the case of Concord Music and others v Anthropic (regarding the use of lyrics to train an AI), Judge Lee reminded the claimants that this is far from a settled legal issue:
“By seeking a preliminary injunction, Publishers are essentially asking the Court to define the contours of a licensing market for AI training where the threshold question of fair use remains unsettled. The Court declines to award Publishers the extraordinary relief of a preliminary injunction based on legal rights […] that have not yet been established.”
Furthermore, I think that there is a very real practical problem with filing an infringement lawsuit of this nature, in what I have called the “Pikachu Paradox”. I believe that the more prevalent a work is in the training data, the less likely the rightsholder is to enforce it; the reason is that the very prevalence in the training data would act as a counterfactual to any damage that the copyright holder could claim, as that prevalence exists precisely because the property can be found all over the Internet. There is a reason why Studio Ghibli content is so easy to reproduce, and it is that it can be found all over the Web: screenshot after screenshot in forums, GIFs on social media, and so on. We are the infringers. We are the ones providing the inputs. There are three decades of Ghibli property to be found online.
With regards to outputs, we are also unlikely to witness legal action, for similar reasons. Most of the images we are seeing would be classified as potentially infringing outputs, and, leaving aside the input question, the person potentially infringing copyright would be the user. So if I upload a picture of the famous “distracted boyfriend” meme and ask ChatGPT to generate an infringing image, I would potentially be the one infringing copyright, that is, if we consider that this action infringes copyright in the first place. The main problem here is that, generally speaking, styles aren’t protected by copyright law, which protects expressions of ideas, not the ideas themselves, and styles are closer to ideas. Moreover, many of the outputs could easily fall under existing exceptions, such as transformative use under US copyright law, or parody and pastiche in other jurisdictions. Interestingly though, Japan does not have a parody exception, so it may actually be easier to sue over outputs there.
Ironic.
However, outputs aren’t fully in the clear. I actually think that there could be legal action forthcoming, not from Studio Ghibli, but from the owners of some of the images being fed in to produce the Ghiblified versions. Take the hypothetical mentioned above, the “distracted boyfriend” meme: that is a photograph protected by copyright, and the photographer could sue me for copyright infringement, as the reproduction could be found to be similar to the original, even stylised. So we could have the interesting situation in which we do get copyright infringement lawsuits over outputs, not from Ghibli, since styles aren’t protected, but from the owners of the photographs being transformed.
Isn’t it ironic?
Moreover, those stylised outputs may still fall under the exceptions mentioned, namely fair use, parody, or pastiche. Except in Japan.
Isn’t it ironic? Don’t you think?
Concluding
This has been an interesting addition to the Copyright Wars, and while I suspect that we will soon forget about it, there’s still potential for legal trouble, although I doubt that it will materialise. Copyright lawsuits are lengthy and expensive, and Studio Ghibli may look at the legal uncertainty and simply hope that this kerfuffle goes away on its own.
As a Studio Ghibli fan, I’ve found the barrage of images annoying, even if I found it funny at first. But I strongly believe that having style transfer available to the masses may have much wider implications than we have even started to fathom. This is also another opportunity to explore core concepts about art, simulation, and memes. Some would say that the memes must flow, while others see what has happened as the harbinger of artistic collapse.
Personally, I keep being reminded of Baudrillard, and the excellent “Simulacra and Simulation”, which explores how reality is replaced by symbols and representations, creating a world where simulations become more real than reality itself.
We’re living in Baudrillard’s world now. Welcome to the Matrix.