- cross-posted to:
- [email protected]
- [email protected]
I pay OpenAI for a chat and image generation service. Whether I make Mario or something random, I pay them the same amount. If I go sell the pictures of Mario that I made with the service, then I am liable for infringement, not OpenAI. OpenAI doesn't charge me more for making Mario than for anything else.
Same as if I draw Mario to keep privately or draw him and sell the images. Adobe is never mentioned as liable, even though I used their software to infringe and paid Adobe for the ability to do so.
Please tell me how it's different. Don't tell me scale, because they don't care whether it's one Mario or a million. If someone was making money on a million Marios, they would be sued independently, whether or not they used AI.
The short version is that it’s a licensing issue. All art is free to view, but the moment you try to integrate it into a commercial product/service you’ll owe someone money unless the artist is given fair compensation in some other form.
For example, artists agree to provide a usage license to popular art sites to host and display their works. That license does not transfer to the guy or company scraping portfolios to fuel their AI. Unfortunately, as we can see from the article, AI may be able to generate, but it still lacks imagination and inspiration, traits fundamental to creating truly transformative works. When money changes hands, the artist is denied compensation, because the work was never licensed and they are excluded from their portion of the sale.
Another example: I am a photographer uploading my images to a stock image site. As part of the ToS I agree to provide a license to host, display, and relicense to buyers on my behalf. The stock site now offers an AI that creates new images based on its portfolio. The catch is that all attributed works result in a monetary payment to the artists. When buyers license AI-generated works based on my images, I get a percentage of the sale. The stock site is legally compliant because it has a license to use my work, and I receive fair compensation when the images are used. The cycle is complete.
It gets trickier in practice, but licensing and compensation is the crux of the matter.
deleted by creator
That’s fine, but not the primary issue.
At some point these companies will need to get licenses for any copyrighted work that was part of the training data, or start over with public domain works only. The art may be data, but that data has legal owners whose rights grant control over its use.
Another way to think about it is proprietary code. You can see it and learn from it at your leisure. But to use it commercially requires a license, one that clearly defines what can and cannot be done with it, as well as fair compensation.
But it’s not reposting copyrighted images. It is analyzing them, possibly a long time ago, then using complex math and statistics to learn how to make new images when requested, on the fly. It’s an automated version of the way humans learn how to make art or take pictures. If it happens to produce Mario very closely it is because it learned very well.
That is why this isn't cut and dried. And why it might be good to think of these as derivative works. I don't think you will be able to nail down this idea of imagination and inspiration. It's just not that straightforward.
Edit: Also, the generator is not pumping out copyrighted images intentionally. It is waiting for a prompt from a user. Who will then go and post it somewhere. If it is too close to Mario, it is that human user who has violated copyrights. They only used the generator as a tool. I feel like that is very relevant.
That’s the ideal and how it’s advertised, but in reality they retain a lot more of the original copies than most people think, in addition to the fact that the output is often not sufficiently “transformative” in copyright terms to avoid being considered a derivative work still needing a copyright license.
In your Mario example, the character as such is unique enough that it has its own copyright, and you can't use images of that character commercially without a license, regardless of how the image was created. If it's recognizable as Mario, then you copied the design as far as a judge is concerned. If you asked a human to draw it, it would be equally infringing.
A human doesn’t even need to ask for a copyrighted work for it to generate infringing outputs.
In the case of Mario, it’s not literally copyright for the most part, but other IP protections such as trademarks, logos, etc.
I don't think intention or prompting matters much. If you type "Mario movie" into the YouTube search box and it shows you the Mario movie, YouTube needs to license that material, even if you explicitly asked for it and even if you don't redistribute it. An AI tool is in a similar situation: you still need to license content if you're making a tool.
deleted by creator
Very possible people don’t realize. And you know what? We shouldn’t care. But if someone generates a Mario and puts in on their website or makes a fanfic comic, doesn’t matter how they made it… go after that person. Just like you always have, Nintendo…
But I worry for the future of any tool if they win this. Add a feature to a computer art tool that feels too "generatey", and you'd better watch out… I worry about human artists having to prove the sources they learned from were not protected by copyright when they lean into a style that feels like Nintendo's…
I'm not an infringement lawyer… but Disney and Nintendo, and the NY Times, and a whole lot of artists seem to think they have a case. I suspect it is the same as using a sample from a Beyoncé song in something you are selling; you may have a problem with Beyoncé's lawyers.
I’m not a lawyer either, but I’m fairly sure that every plaintiff thinks they have a case.
oh great, we can't use tools to draw pop culture characters
copyright sounds stupider every time i hear about it
Does it matter? I can infringe on copyright without AI. I’m infringing copyright right now in my imagination. I’m hoping this can set boundaries on copyright enforcement or begin neutering it altogether.
Your imagination is not making money off of it. The idea that OpenAI is making money because they use the work of others is unsettling. What you wish for is an OpenAI monopoly the size of Google with control over creativity… sounds like a dystopia.
I do make money off of it. I just understand that I need to blend enough of other people’s ideas and change names until it’s considered a “unique idea” which, of course, no idea ever is.
It's a strange thought really. Deep down we need to admit that there are few truly new ideas this late in society; we document so much. Almost all ideas were formed from old ideas. Are they copyright infringement? Make up a new TV show, or invent a new story for a book, and chances are there is already a show or a book that is really similar. One that you may have read or seen and forgotten about, until it's pointed out that your story is the same as the one you read.
A classic
I don’t even think saying “this late in society” is doing it justice. The human mind has always grown in proportion to its input. Cavemen weren’t any dumber than us, they just didn’t have as many ideas to build off of. As availability of information has increased so has our ability to progress rapidly. Just like an LLM getting a larger data set.
Why is that unsettling? People make money off of other people’s ideas all the time. The boundary of when this is allowed and when it isn’t is pretty arbitrary.
deleted by creator
yeah , hopefully it will be fun… let’s hope some of those mega corporation die in the process
Copying characters and styles is not copyright infringement. This is basic stuff.
The only reason the AIs knows what SpongeBob looks like is because they are using tons of copyrighted images in their database as part of their commercial product.
The problem with copyright law is you need, well, copies. AI systems don't have a database of images that they reference. They learn like we do. When you picture SpongeBob in your mind, you're not pulling up a reference image from a database. You just "learned" what he looks like.

That's how AI models work. They are like giant strings of math that replicate the human brain in structure. You train them by showing them a bunch of images: this is SpongeBob, this is a horse, this is a cowboy hat. The model learns what these things are, but doesn't literally copy the images. Then when you ask for "SpongeBob on a horse wearing a cowboy hat", the model uses the patterns it learned to produce the image you asked for. When you're doing the training, presumably you made copies of images for that (which is arguably fair use), but the model itself has no copies.

I don't know how all of this shakes out, I'm not an expert in copyright law, but I do know an essential element is the existence of copies, which AI models do not contain. That's why these lawsuits haven't gone anywhere yet, and why AI companies and their lawyers were comfortable enough to invest billions doing this in the first place. I mostly just want to clear up the "database" misconception since it's pretty common.
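To make the "parameters, not copies" point concrete, here's a toy sketch (a hypothetical illustration, vastly simpler than any real image model): we "train" on a dataset, and the resulting model is just two learned numbers. The training data can be thrown away, yet the model still generalizes, including to inputs it never saw.

```python
# Toy illustration: after training, the "model" is only learned parameters
# (here, two floats), not a stored copy of the training data.

def train_linear(points, lr=0.01, epochs=2000):
    """Fit y = w*x + b to (x, y) pairs by plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b  # the entire "model" is just these two numbers

# Training data: samples of the pattern y = 3x + 1
data = [(x, 3 * x + 1) for x in range(10)]
w, b = train_linear(data)

# The parameters encode the pattern (w ~ 3, b ~ 1), and the model can
# predict for an x it never saw, without retaining any training example.
print(round(w, 2), round(b, 2))
print(round(w * 100 + b, 1))  # prediction for unseen x = 100
```

An image model is the same idea scaled up enormously: billions of parameters instead of two, but still parameters rather than a library of source images (whether the training-time copies were lawful is a separate question, as the comment notes).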
It doesn't matter if actual copies of the original images exist; reproducing copyrighted concepts like characters is still infringement. It's the same whether it's done by a human or a machine. The real question is who is liable. Obviously whoever distributes the images is liable (except for those exempted by Section 230), but there's no precedent for the trainers.
I think there are arguments to be made for fair use with open models run by end users where there is no profit motive, but for closed models where use is paid the trainers are potentially profiting directly from the creation of infringing work.
You can't sue a paint and brush manufacturer because they made John Doe's replicas or style possible. Generative AI and LLMs are a tool, not a product. It is the exploitation of them as a product that is the problem. I can make the same stuff in Blender or some shit program from Autodesk, and you won't sue them. No one tells people what to prompt.

Current AI is like the internet: it is access to enormous amounts of human knowledge, but with more practical utility and a few caveats. AI just happened a lot quicker, and the Luddites and lawyers are going to whine as much as the criminal billionaire class is going to exploit those that fail to understand the technology.

Almost all of these situations/articles/lawsuits/politics are about trying to create a monopoly for Altman and prevent public adoption of open source offline AI as much as possible. If everyone had access to a 70B or larger LLM right now, all of the internet would change overnight. You wouldn't need search engines or much of the internet's structure of exploitation any more. If a company can control this technology, with majority adoption, that company will be the successor of Microsoft and then Google. All the peripheral nonsense is about controlling the market by any means necessary, preventing open minded people from looking into the utility, and playing gatekeeper to the new world paradigm of the next decade or two.
Paint making companies typically don't have massive databases of illegally obtained copies of other people's copyrighted images. Nor does paint fundamentally require the existence of such a database for the manufacture of paint itself. That's where the "it's just a tool" argument falls apart.
I love your enthusiasm though, to think that giving access to a massive llm for everyone would rid the internet of exploitation is extremely naïve and hopelessly optimistic.
deleted by creator
How do you show a computer something? Do you perhaps add pictures to a database that the program then processes? I understand it's not a folder called SpongeBob, but at some point somebody fed it pictures of SpongeBob, and now those pictures exist in a database.
The reason the legal system is slow is because it's complicated, and everyone turns into a philosophy major when discussing things like what a database is, but somehow uses words like "show" without any explanation.
The reason investors are comfortable pouring billions into AI is because investors either think they are going to make the money back before regulation catches up, or they are just coked-up maniacs investing in anything that sounds shiny.
Saving pictures of SpongeBob in a file system to use as a drawing reference also doesn't equal copyright infringement.
And?
And that’s legal. There’s no law against this
Also, what about when it is exactly the same, you make money off of it, and you don't divulge your source material?
Sorry I don’t understand this question.
Copyright does protect fictional characters, however usually these characters are also registered trademarks, in which case it’s a very obvious violation to reproduce the likeness of a character.
It’s not a trademark infringement to create a tool which has the potential to create trademarked characters.
I’m not a lawyer, but this doesn’t seem obvious to me.
Is there anything more relaxing than watching multinational corporations get ready for a slap fight?
Edit: “relaxing” isn’t quite the word I’m looking for. I’m trying to express how satisfying it is to see corporations suffer the consequences of their own legal shenanigans. It’s also relieving to know that I have zero stake in this situation, and won’t be affected by the outcome in a meaningful way. I don’t have to care, or feel guilty for not caring.
I don’t know about you, but I will be affected if OpenAI is forced to close shop or remove ChatGPT.
Yeah, that’s why I chose the words “in a meaningful way”. It’s relatively new technology, so you got along without it before. You can do it again.
I don’t think that’ll happen, though. There’s too much interest, potential, and money in the concept to kill it completely. Plus, we’re all acting as free beta testers, which is incredibly valuable. There’ll be a lot of motivation to find a compromise and keep it going.
Downvote cuz substack.
Corp fight corp fight corp fight ^.^
These companies will take FOSS AI models from the cold, dead, torrenting hands of the free internet :p
fr though, both of these corp groups push against FOSS AI - media corps because of "intellectual property" and closed AI companies for ~~monopoly and control~~ "safety". But the resilience of distributed coordination and hosting makes it basically impossible to kill, just like how the war on piracy is nearly futile. I'll just carry on using Llama and Stable Diffusion uncensored on my own machine.
I like to pretend Mario doesn’t exist and point at the AI model and go BAD INFRINGER BAD!
So now the Mouse and Nintendo both get to sue ChatGPT, DallE and OpenAI?
They are. I suspect that is the reason behind the Nasdaq's bad day.
Hey I used to work for Gary! Haven’t heard about him for a while.
Not for his lack of trying.
LLMs reproducing source text is a failure called overtraining and it’s something those companies want to avoid regardless.
Image AI knowing what popular characters look like is a non-issue.
What are we even talking about, here? These aren't exact copies of existing images. You expect the draw-anything robot to be incapable of drawing Superman? Or a cartoon mouse? Even when you explicitly describe the color of a costume and the letter that goes on a hat?
Some guy on Twitter asks "how can liability be pushed to the user?", as if liability exists when you just draw a thing. If someone goes straight from Bing image generator (or Google image search) to the logo for their new company, yeah, no shit that's a legal issue, and that person is being an idiot. But you can draw SpongeBob pornography. Like, right now. Pick up a pen and go. You can even take suggestions, or ask someone else to do it for you. It's not about to confuse customers or even involve customers. That act alone is not what copyright and trademark laws cover. And if they did, the moral answer would be to walk them way the hell back.
Generative AI systems like DALL-E and ChatGPT have been trained on copyrighted materials
No shit. But so were you.
Generative AI systems are fully capable of producing materials that infringe on copyright
Literally no-one has ever promised otherwise. ChatGPT’s earliest examples namedropped Tolkien characters.
The slapdash filters are just another extra-legal effort to fend off copyright cartels’ flesh-eating lawyers. Those bastards get mad that paper can contain scribbles they maybe sorta kinda theoretically might make one penny from, and one more company goes ‘fine, here, fuck off.’