AI companies have all kinds of arguments against paying for copyrighted content

Lee Duna@lemmy.nz · 1 year ago

AI companies have all kinds of arguments against paying for copyrighted content

grue@lemmy.world · 1 year ago

Simple solution: all AI output is copyleft.

Hildegarde@lemmy.world · 1 year ago

That’s already the case. Copyright is only possible for creative works of human authorship. By definition AI generation is uncopyrightable.

grue@lemmy.world · edit-2 1 year ago

First of all, copyleft and uncopyrightable are entirely different things.

Second, if something is a derived work of a copyleft work, then either it must also be copyleft, or it’s simply infringement and entirely unusable. You’re suggesting that AI remixing can effectively “remove” the copyleft, but it would be entirely unjust (and more to the point, contrary to established legal precedent) for it to work that way.

nevemsenki@lemmy.world · 1 year ago

At first glance, if AI art is copyleft, there’s no reason to buy/license the original from anyone; just include their stuff in the model and tweak the prompts until it’s close enough. Voila, free art! As long as tweaking the model is cheaper than buying art, the AI industry wins.

grue@lemmy.world · edit-2 1 year ago

It’s not that there’d be no reason to buy/license it for commercial use, it’s that it would be impossible to do so. Downstream users simply couldn’t legally use it at all – no matter how much or little they wanted to pay – unless they were willing to release their work as copyleft, too.

In other words, making* AI output copyleft maximizes freedom, but it’s hardly “free.” And that impossibly-high cost to those who would leech is why it would be a good thing.

(* Or rather, affirming it as such in court, since it’s already rightfully copyleft by virtue of having already used copyleft input. It wouldn’t be a change in status, but rather a recognition of what the status always was.)

nevemsenki@lemmy.world · 1 year ago

I feel this assumes two things.

AI art would be used in products that can be copyrighted in the first place, and not things like marketing/political campaigns or decor.
Depending on the exact license agreement, you could use copylefted things in commercial products. The actual art can be free to reuse/share, but the rest of the product may not be; illustrations in a book, say (an analogy I drew up based on how Android works: commercial products based on a copylefted component).

RememberTheApollo_@lemmy.world · 1 year ago

Typical corporate attitudes.

You take from us, you have to pay us or we will come for you.

We take from you…well, too bad.

fubo@lemmy.world · 1 year ago

The question should be pretty simple:

Does the AI product output copyrighted material?

If I ask it for the text of Harry Potter, will it give it to me? If I ask it for a copy of a Keith Haring painting, will it give me one? If I ask it to perform Williams’s Jurassic Park theme, will it do so?

If it does, it’s infringing copyright.

If it does not, it is not.

If it just reads the web and learns from copyrighted material, but carefully refuses to republish or perform that material, it should not be considered to infringe, for the same reasons a human student is not. Artistic styles and literary skills are not copyrightable.

e e cummings doesn’t get to forbid everyone else from writing in lowercase.

(Some generations of ChatGPT won’t even recite Shakespeare, due to overzealous copyright filters that fail to correctly count it as public domain. The AI folks are trying!)

ytorf@lemmy.world · edit-2 1 year ago

What’ll be interesting is when people start asking, “write a song in the style of Marvin Gaye” given the ruling against Robin Thicke a few years back, since that was about the style of the song hewing too closely to Gaye’s output (edit for clarity)

Touching_Grass@lemmy.world · 1 year ago

That’s what got me into using ChatGPT. I’d ask it a question like “how can I troubleshoot this issue I’m having”

It would give me this big answer then I’d ask it to give me the answer back as an Elton John song. So much fun. Can’t have nice things though anymore

Pennomi@lemmy.world · 1 year ago

The issue is still the output, not the model itself.

just another dev@lemmy.my-box.dev · 1 year ago

Just like a gun.

fubo@lemmy.world · edit-2 1 year ago

Perhaps every federal judge should receive a copy of Spider Robinson’s “Melancholy Elephants”. (It’s under a Creative Commons license.)

http://www.spiderrobinson.com/melancholyelephants.html

AlexanderESmith@kbin.social · 1 year ago

@fubo

It’s been suggested that AI art created without human input cannot receive copyrights;

https://www.reuters.com/legal/ai-generated-art-cannot-receive-copyrights-us-court-says-2023-08-21/

FaceDeer@kbin.social · 1 year ago

That’s the Thaler v. Perlmutter case, which has been widely misreported and misunderstood.

Thaler generated some art using an AI art generator and then tried to claim that the AI should hold the copyright. The copyright office said “no, you imbecile, copyright can only be held by people.” Thaler tried to sue and the judge just confirmed what the copyright office said.

But note that Thaler never tried to claim the copyright himself. If he’d said “I made this art using AI as a tool” that would have been completely different. That’s not what the court case was about.

umami_wasabi@lemmy.ml · 1 year ago

This might upset you, but some uncensored models that have had alignment removed will output such content. Is the content true? Don’t know, cuz I haven’t read Harry Potter.

fubo@lemmy.world · edit-2 1 year ago

Sure, then whoever uses it to extract that text is infringing. If I memorize a copyrighted text, my brain is not an infringement; but if I publicly perform a recitation of that text, that act is infringing.

Really the precedent of search engines (and card catalogs and concordances before them) should yield the same result. Building an index of information about copyrighted works is librarianship; running off new copies of those works is infringement.

On the other hand, AI transparency is also an interesting problem. It may be that one day we can look at a set of neural network weights – or a human brain! – and say “these patterns here are where this system memorized Ginsberg’s ‘Kaddish’.” I hope we will not conclude that brains must be lobotomized to remove copyrighted memorized texts.

umami_wasabi@lemmy.ml · edit-2 1 year ago

If we treat the model like a brain that “memorizes” copyrighted text and generates new text based on that, your statement is valid. However, this would also prohibit any copyright claims on the model’s output, as the act of memorization isn’t a work. Only a work can infringe on other works, and whether the output of models should be defined as a “work” is still under heavy debate. Even if it is defined as a work, can a model gain copyright while not being a legal person? Who should bear the liability then? What if the output is modified by an editor? This rabbit hole goes deep.

Sir_Kevin@lemmy.dbzer0.com · 1 year ago

I think that actually was ruled on a few months ago. No, the model cannot hold copyright. Nor can the person that commissioned the model to create the work. I think where things are still a bit grey (someone correct me if I’m wrong) is when a person creates a work with the assistance of AI, where it’s a mix of human and AI generated content.

AbouBenAdhem@lemmy.world · edit-2 1 year ago

The model doesn’t contain the training data—it can’t reproduce the original work even if it were instructed to, except by accident. And it wouldn’t know it had done so unless it were checked by some external process that had access to the original.
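For anyone curious what that kind of external check could look like, here is a minimal sketch in Python. It assumes the checker holds the full original text and simply looks for the longest run of consecutive words shared with the output. The function names and the 8-word threshold are made up for illustration and have no legal meaning; real memorization audits are more involved than this.

```python
from difflib import SequenceMatcher


def longest_shared_run(output: str, original: str) -> str:
    """Return the longest run of words that appears verbatim in both texts."""
    out_words = output.split()
    orig_words = original.split()
    # autojunk=False so common words are not silently ignored on long texts
    matcher = SequenceMatcher(None, out_words, orig_words, autojunk=False)
    match = matcher.find_longest_match(0, len(out_words), 0, len(orig_words))
    return " ".join(out_words[match.a:match.a + match.size])


def looks_like_a_copy(output: str, original: str, min_words: int = 8) -> bool:
    """Flag output sharing at least `min_words` consecutive words with the original.

    The threshold is an arbitrary, hypothetical cutoff chosen for this sketch.
    """
    return len(longest_shared_run(output, original).split()) >= min_words
```

The point of the sketch is only that the comparison needs the original on hand: the model's output alone carries no marker saying whether it happens to match something from the training set.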

barsoap@lemm.ee · 1 year ago

In case anyone wants to try this out: Get ComfyUI and this plugin to get access to unsampling. Unsample to the full number of steps you’re using, and use a cfg=1 for both sampler and unsampler. Use the same positive and negative prompt for both sampler and unsampler (empty works fine, or maybe throw BLIP at it). For A1111: alternative img2img, only heard of it never used it.

What unsampling is doing is finding the noise that will generate a specific image, and it will find noises that you can’t even get through the usual interface (because there’s more possible latent images than noise seeds). Cfg=1 gives the best reproduction possible. In short: The whole thing shows how well a model can generate a replica of something by making sure it gets maximally lucky.

This will work very well if the image you’re unsampling was generated by the model you’re using to unsample and regenerate it. It will work quite well with related models, which impart their own biases on it, and it’s way worse for anything else. If you ask it to re-create some random photograph, it’s going to put its own spin on it, changing up pretty much all of the details. If you try to do something like re-creating a page of text, it’s going to fail miserably, as Stable Diffusion just can’t hack glyphs.

Grimy@lemmy.world · 1 year ago

Having to pay for training data would rapidly skyrocket costs, making it impossible for open source projects or even smaller for-profit companies to survive. We are rapidly going to find ourselves in an AI driven economy and this would cement Google and Microsoft owning it.

Not to mention that not a dime would go to individuals. Companies like Reddit, Getty, Adobe and Penguin have all the data, we already gave it to them a long time ago.

They write strongly worded letters so we play right into their hands but the big AI companies are drooling at the thought of it. It would fuck us hard.

just_another_person@lemmy.world · edit-2 1 year ago

Gonna be real with you because I work directly with these companies in helping do this type of thing exactly: it’s astronomically high already, and nowhere near profitable. It’s all startup grift.

FaceDeer@kbin.social · 1 year ago

“It’s already bad so we might as well make it incredibly worse” is not a very compelling argument IMO.

just_another_person@lemmy.world · 1 year ago

Are you suggesting everyone just go along with it?

If anyone was looking for the plant, here you go^^^

FaceDeer@kbin.social · 1 year ago

I’m suggesting that we should not do the thing that will make everything incredibly worse.

Who do you think I’m a “plant” for? I’m advising the course of action that would be less advantageous for big corporations, so what, I’m a plant for the little guy?

PorkSoda@lemmy.world · 1 year ago

Not to mention that not a dime would go to individuals. Companies like Reddit, Getty, Adobe and Penguin have all the data, we already gave it to them a long time ago.

This is one of the big reasons I never did 23andMe. Don’t get me wrong, I’m super curious about what it has to tell me, but giving (paying to give it actually) my DNA to a private company that’s amassing a huge repository of human DNA is a terrible idea.

alienanimals@lemmy.world · 1 year ago

Regardless of the AI companies, copyright has been broken for decades thanks to Disney. The system has been redesigned to benefit and serve large corporations with more money than anyone else.

magnetosphere@kbin.social · edit-2 1 year ago

It’s an interesting question. To me, it only makes sense that AI companies should respect artist copyrights - especially since AI’s purpose is to replace and minimize/eliminate the need for artists. On the other hand, licensing fees would quickly add up and be absolutely enormous. Only the biggest, wealthiest corporations (the ones we love to hate) could afford to invest in AI. Small, new companies won’t be able to afford it.

(Sorry if this is covered in the article. I haven’t read it yet. It’s late and I’m falling asleep!)

Edit: okay, now I’ve read it, and the situation is about as bad as I was expecting.

Meta’s argument that copyright holders wouldn’t get much money anyway makes me want to punch someone. It’s about respecting creators, not just money, you dipshits! Congrats on missing the point!

To balance things out, we’ve got Andreessen Horowitz crying “won’t someone please think about the billionaires?!?” That one made me laugh.

Adobe gets points for actually citing case law. I still don’t agree with their reasoning, but I appreciate the effort to keep the discussion professional.

Mahlzeit@feddit.de · 1 year ago

It’s about respecting creators

Is it, though? Copyright holders and creators are completely different things.

Before you can pay those copyright holders their capital income, you have to know who they are. Which means you can’t just download random pictures off the internet. You need pictures with a known provenance. Well, it turns out that there are corporations dedicated to providing just such pictures. How lucky for them if society would choose to “respect creators” in this way. The payment to even a prolific stock photographer may be tiny, but they’d get a cut from each one.

It may not be about money for you, but the people who pay to push that talking point may have a different attitude.

magnetosphere@kbin.social · 1 year ago

That’s a good point. You’re entirely correct. I had a much simpler idea in mind - I was only thinking of small, independent artists who posted their images online and were the copyright holders of their own work.

burliman@lemm.ee · edit-2 1 year ago

And I agree with them. When I learn to paint or take a cool picture, I may learn and be inspired from copyright materials. No one asks successful artists to audit the training materials that inspired them. But start telling AI companies they must do that, and I guarantee the precedent will be set to go after a human for learning from them. Don’t you dare tell people who you were inspired from when you make it big in your craft.

When I pay AI companies for anything, it’s not a proxy for copyright material, it’s for a service they provide serving, processing, or training the model. We will still require artists and creative people, even if all they do is skillfully prompt an AI tool to render art. But doing only that will be banal and not the pinnacle of what can be achieved with AI-assisted art creation. Art will still require the toil and circumstance that it always has.

Restricting AI from training on copyright materials is a vain and pointless exercise, but one of many that are meant to bring us to fear and loathe AI. It is one of many fears that the powerful want us to adopt… This is a technology that can and will lift us all if we can stop fearing it. But if we can’t do that, it won’t simply go away… It will only be driven into the bowels of the rich and powerful, so that they alone will benefit from it.

All the shovel journalism out there has a very strong purpose… to scare us, so this great equalizer will not be open and free and accessible. Don’t let them do this.

jet@hackertalks.com · edit-2 3 days ago

I was going to make this exact point. Very well said.

If we start saying intelligence owes fees to its training data, we’re basically saying humans are restricted by the licenses of the things they’ve been exposed to.

It’s only a matter of time until artificial intelligence matches biological intelligence, and if the precedent is set now, it’s going to make the future very sticky.

just another dev@lemmy.my-box.dev · 1 year ago

So tired of this bs argument.

When I learn to paint

… you will never be able to generate millions of paintings per day, so why even pretend it’s relevant here?

burliman@lemm.ee · 1 year ago

Your argument is tired. Have you ever simply prompted a generative txt2img and told it to make 100 or even 200 in the batch? You might have 1 or 2 that shine and are interesting without any touch up. But almost every one will require inpainting, photoshop work, or other creative modifications to be worth a damn. And even then some won’t be.

Like I said in my comment. It will be banal without real creativity. It doesn’t even take millions of “paintings” to get there. No one will care about cheaply manufactured junk after the novelty wears off. We will demand more than that.

Ultimately it will be a tool that extends all our creativity. It already is. But if we fear it because of arguments like yours then laws will be made to keep it out of the hands of the common plebe. But it won’t disappear. You can bet your ass it won’t. It will just be used in dark places by powerful people, and not just for banal image prompting. And then you can fear it rightfully.

just another dev@lemmy.my-box.dev · 1 year ago

You’re missing (or ignoring) the point of my argument. A human who learns from other work can only apply that skill in a limited amount. Even if a human learns to copy Van Gogh’s style and continually churns out minor variations of his work, they cannot produce dozens per minute. Let alone learn to do that equally well from several hundreds (thousands?) of other artists. There’s a scale difference in human learning versus machine learning that is astronomical.

I’m not sure what you’re going on about with “fear”. But I think that training a model on non-public domain content, without the permission of (or even crediting) the creator should be illegal.

AlexanderESmith@kbin.social · 1 year ago

@admin

It was a hypothetical, I was just using myself as an example. Here’s one that’s not hypothetical:

I’m already practiced in 3D modelling, UV unwrapping, texturing, lighting, rendering, compositing, etc. I could recreate a painting, pixel for pixel, in 3D space.

If I just hit render, is that my art now? It took a lot of research to learn how to do this, I should be able to make money on that effort, right?

I can do that millions of times and get the same result. I can set it on a loop and get as many as I want. It’s the same as copying the first render’s file, it just takes longer.

Now I decide to change the camera angle. Almost the entire image is technically different now, but the composition is the same. The colors, the subjects, relative placement in the scene, all the same, but it’s not really the same image anymore. Is it mine yet?

I can set the camera to a random X,Y,Z position, and have it point at a random object in the scene (so it never points off into blank space). Are those images mine? It’s never the same twice, but it still has the original artist’s style of subjects and lighting. I can even randomize each subject’s position, size, hue, direction, add a modifier that distorts them to be wobbly or cubic… I can start generating random objects and throwing them in too, let’s call those “hallucinations”, that’s a fun word…

At what specific point in this madness does the imagery go from someone else’s work to mine?

I absolutely can generate millions of unique images all day. Without using machine learning, based on work I recreated with my own human hands, and code I write uniquely from my experience and abilities. None of the work - artistically - is mine. I made no decisions on composition, style, meaning, mood, color theory, etc.

You may want to try to write these questions off, but I can tell you with certainty that other artists won’t.

AlexanderESmith@kbin.social · edit-2 1 year ago

@burliman

You can prompt an image generator to just spit out the original art it trained on.

Imagine I had been classically trained as a painter. I study works from various artists. I become so familiar with those works - and skilled as a renderer of art in my own right - that I can reproduce, say, the Mona Lisa from memory with exacting accuracy. Should I be allowed to claim it as my art? Sign my name to it? Sell it as my own?

Now let’s say we compare the original and my work at the micron level. I’m human, there’s no way I can match the original stroke for stroke, bristle to bristle. However small, there are differences. When does the work become transformative?

Let’s switch to an image generator. I ask for a picture of a smiling woman, renaissance style. The model happens to be biased to da Vinci, and it spits out almost exactly the same work as the Mona Lisa. Let’s say as a prompt engineer, I’ve never heard of or seen the Mona Lisa. I take the image, decide “meh, good enough for what I need right now”, and use it in some commercial product (say, a t-shirt). Should I be able to do that? What if it’s not the Mona Lisa, it’s a work from a living artist?

What if it’s not an image? Say I tell some model to make a song and it accidentally produces Green Day’s Basket Case (which itself is basically just a modified Pachelbel’s Canon), can I put that on a record and sell it? Whose responsibility is it to make sure that a model’s output is unique or transformative? Shit, look at all the legal cases where musicians are suing other musicians because the chord progression is similar in two songs; what happens when it’s exactly the same because the prompt engineer for a music generation model isn’t paying attention?

You might have noticed that I haven’t referred to this technology as AI. That’s because it’s not. It’s Machine Learning. It has no intelligence. It neither seeks to create beautiful, original art, nor does it intend to rip someone off. It has no plans, no aspirations, no context, no whims. It’s a parrot, spitting out copies of things we ask it for. In general, these outputs are mixtures of various things, but sometimes they aren’t. They just output some of the training data, because that’s the output that - statistically - was the best match for the prompt.

As an artist myself, I don’t fear machine learned models. I fear that these greedy fuckin’ companies will warehouse any and every bit of data they can get their hands on, train their models on other people’s work, never pay them a dime, and rip off the essence of their art without any regard for what will happen to the original artists after some jackass execs tell all their advertising/webdesign/programming/scriptwriting/etc departments to just ask the “AI” to “design” everything.

You can already see this happening with game studios. Writers went on strike over it.

FaceDeer@kbin.social · 1 year ago

You can prompt an image generator to just spit out the original art it trained on.

This is a common misconception. It’s not true, except in the extremely rare case of “overfitting” that all AI trainers work very hard to avoid. It’s considered a bug, because why would anyone spend millions of dollars and vast computer resources poorly replicating what can be accomplished with a simple copy/paste operation? That completely misses the point of all this.

If an AI art generator spits out copies of its training data it’s a failure of an AI art generator.

Cringe2793@lemmy.world · edit-2 1 year ago

You can prompt an image generator to just spit out the original art it trained on.

This is incorrect. Have you tried doing it?

That’s not how AI works. It’s not magic, nor does it create “copies”. It creates entirely original works, with influences from other works (similar to what other humans do).

AutoTL;DR@lemmings.world · 1 year ago

This is the best summary I could come up with:

The US Copyright Office is taking public comment on potential new rules around generative AI’s use of copyrighted materials, and the biggest AI companies in the world had plenty to say.

We’ve collected the arguments from Meta, Google, Microsoft, Adobe, Hugging Face, StabilityAI, and Anthropic below, as well as a response from Apple that focused on copyrighting AI-written code.

There are some differences in their approaches, but the overall message for most is the same: They don’t think they should have to pay to train AI models on copyrighted work.

The Copyright Office opened the comment period on August 30th, with an October 18th due date for written comments regarding changes it was considering around the use of copyrighted data for AI model training, whether AI-generated material can be copyrighted without human involvement, and AI copyright liability.

There’s been no shortage of copyright lawsuits in the last year, with artists, authors, developers, and companies alike alleging violations in different cases.

Here are some snippets from each company’s response.

The original article contains 168 words, the summary contains 168 words. Saved 0%. I’m a bot and I’m open source!

mindbleach · 1 year ago

I don’t care if the robot learned English by reading all the books in the library.

I don’t care if it had to see every image on the internet to figure out what dogs are.

Nothing as overblown as copyright deserves to stop this technology from existing.

AlexanderESmith@kbin.social · 1 year ago

@mindbleach

That’s a terrible take, my dude.

Also, no one is arguing against the technology existing, they’re upset about how it’s being trained. Two different things.

KTVX94@lemmy.myserv.one · 1 year ago

Steal the content then undercut the people that produced it.