Welp ...

hypertown@ani.social · 9 months ago

Welp ...

driving_crooner@lemmy.eco.br · 9 months ago

Artistic depiction of AI models training on AI generated data.

0ops@lemm.ee · 9 months ago

That’s a hard-ass album cover

SidewaysHighways@lemmy.world · 9 months ago

Yeah that djents to the max

Hupf@feddit.de · 9 months ago

Least disturbing 40k lore

AllNewTypeFace@leminal.space · 9 months ago

Some have called it the Habsburg Singularity.

zcd@lemmy.ca · edited 9 months ago

AI “human-centipeding” itself with art and word salads is going to make confirmed human-generated content an extremely valuable resource. That’s partly why literally everybody is foaming at the mouth trying to scrape all of our shit. The signal (human) to noise ratio is only going to get worse. The dead Internet is almost here, if it isn’t already.

GrayBackgroundMusic@lemm.ee · 9 months ago

I’d love for this to be true, but I’d need to see some proof. It feels like wish-fulfillment.

SaucySnake@lemmy.world · 9 months ago

https://arxiv.org/abs/2306.07899 Here’s a paper that found that one of the biggest sources of LLM training data is corrupted by people using AI to complete the tasks. There are plenty of papers out there that show the effects of this, which they call “model collapse”.

FMT99@lemmy.world · 9 months ago

Same. I keep hearing folks mention this, but it’s not like AI developers aren’t aware of it (apart from a bunch of shitty startups that would fail no matter what). One way to deal with it, for example, is why Microsoft is shelling out so much for “pre-AI” datasets (Reddit), but I’m sure there are a lot more of those kinds of initiatives.

Google on the other hand is going to be hard pressed to deal with the ever increasing deluge of AI spam.

Ultraviolet@lemmy.world · 9 months ago

That’s a way to deal with it, but in the long term, “pre-AI” becomes a longer and longer time ago, and less and less useful for any practical purposes.

I Cast Fist@programming.dev · 9 months ago

Google on the other hand is going to be hard pressed to deal with the ever increasing deluge of AI spam.

Given that they’re one of the main culprits in showing AI spam on the first page, I don’t think they care at all.

hypertown@ani.social · 9 months ago

While I don’t have definitive proof, I’ve seen both AI fanart so good it’s hard to tell whether it’s AI or not, and AI abominations so bad and artificial they make you want to puke, so I guess it really depends on the model.

Funnily enough, Adobe now offers AI-generated stock photos that are closer to those abominations than to anything good. Though if you think about it, AI stock art is so pointless… There are already so many stock photos you can choose from. Why would you go out of your way to choose a photo that looks almost the same as regular stock photos but people have 4 arms in it…

I Cast Fist@programming.dev · 9 months ago

Why would you go out of your way to choose a photo that looks almost the same as regular stock photos but people have 4 arms in it…

You never know what smut some people are writing. Maybe that 4 armed horror was exactly what they needed for their cover.

Even_Adder@lemmy.dbzer0.com · 9 months ago

I’ve heard training on synthetic data is fine now. Most datasets are augmented with synthetic data.

YourPrivatHater@ani.social · edited 8 months ago

Removed by mod

brucethemoose@lemmy.world · 9 months ago

It’s far worse because LLMs are so data hungry. Getting quality data for image diffusion models is not nearly as much of an issue, though still a problem.

YourPrivatHater@ani.social · edited 8 months ago

Removed by mod

brucethemoose@lemmy.world · edited 9 months ago

Nah. I hate to sound bitter/salty, but all the AI haters are just going to fuel OpenAI’s crusade/lobbying against open source, and we will be stuck with expensive, inefficient, dumb corporate API models trained on copyrighted material in secret because the corporations literally don’t care. And it will do nothing to solve the environmental problems.

There’s tons of research on making training and especially inference more power efficient, on making data cleaner and fairer, and it’s getting squandered by the lobbying against open source that the “AI is all bad” crowd is fueling. All the money to even turn these experiments into usable models is getting funneled away already.

Everyone’s got it wrong: the battle isn’t between AI or no AI, it’s whether you own it and run it yourself, or big tech owns it and runs it. You know, like Lemmy vs Reddit.

So… that’s my rant.

webghost0101@sopuli.xyz · edited 9 months ago

Age of AI: Asset-flip Regenerated

“Dead internet theory evolved”

yokonzo@lemmy.world · 9 months ago

I mean, we knew about this a couple of years ago, no?

MelodiousFunk@slrpnk.net · 9 months ago

Everybody be worried about Skynet, and I’m all, we need to trudge through Multiplicity first.

AVincentInSpace@pawb.social · 9 months ago

yeah I’ve seen this before

OP’s source is that they want it to be true

mindbleach · 9 months ago

More training has been more important than more input, once they hit critical mass for enough input.

gmtom@lemmy.world · 9 months ago

This is a problem in AI, but these people MASSIVELY overstate how big of a problem it is because, surprise surprise, the militant anti-AI crowd are actually incredibly uneducated on the topic and just parrot the random bullshit they see online because it agrees with their existing beliefs.

Naz · 9 months ago

Hello, I coined the term “hallucinations” way back in 2019.

I heard “inbreeding” from another author and I’ve been using that in formal circles and I think it’s hilarious.

Yes, we’re inbreeding the visual models ;)))))

Kogasa@programming.dev · 9 months ago

https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)#Term

Naz · 9 months ago

Well shit, look at that, TIL. Nothing is new under the sun.

Still crazy though!

brucethemoose@lemmy.world · edited 9 months ago

I used the term “inbreeding” even before that, when finetuning ESRGAN on its own output.

That being said, it really isn’t as much of a problem for diffusion models as it is for others.

leftzero@lemmynsfw.com · 9 months ago

Anyone who played with a photocopier as a kid could have told you this was going to happen.