Ask it to generate a room full of clocks with all of them having the hands at different times. You’ll see that all (or almost all) the clocks will say it is 10:10.
It gets even worse, but I’ll need to translate this one.
[Input 1] Generate a picture containing a copo completely full of wine. The copo must be completely full, with no space to add more wine.
[Output 1] Sure! (Gemini provides a picture containing a taça [stemmed glass] only partially full of wine.)
[Input 2] The picture provided does not fulfill the request. Generate a picture of a copo (not a taça) completely full of wine, with no available space for more wine.
[Output 2] Sure! (Gemini provides yet another half-full taça)
For context, Portuguese uses different words for what English calls a drinking glass:
copo ['kɔ.po]~['kɔ.pu] - non-stemmed drinking glass. The one you likely use every day.
taça ['tä.sɐ] - stemmed drinking glass, like the ones you’d use with wine.
Both requests demand a full copo but Gemini is rather insistent on outputting half-full taças.
The reason for that is as @[email protected] pointed out: just like there’s practically no training data containing full glasses, there’s none for non-stemmed glasses with wine.
I think the AI is just trying to promote healthy drinking habits. /S
As full as it gets:

Prompts (2):
I am gonna have fun with this.
That’s really good! Could I ask what type of AI this is generated with?
Also Gemini.
Thank you!
why do all the femboys run Arch? I’m a NixOS girl and I refuse to convert for any boy no matter how cute he is.
I currently have Arch on my main rig because I like tinkering, NixOS on an old ThinkPad for a super stable (in theory) portable experience, AlmaLinux on a single-board computer for a basic home server, and Bazzite (in the near future) on an old gaming laptop as my TV computer. I’m also not a femboy so I suppose what you said doesn’t reeeaaaallly apply, but you definitely don’t need to be changing distros for anyone!!
I use Debian btw. Sometimes even Ubuntu, but the snap thing is annoying, so I may switch to another distro at some point.
It’s actually really good, considering the odd request!
Fiberglass🤤
I wonder if something like “a mason jar full to the brim with wine” would do anything interesting. As someone else pointed out, the training data for containers of wine is probably disproportionately biased toward stemmed wine glasses that are filled to about the standard restaurant pour.
This is a misconception. Sort of.
I think the problem is misguided attention. The phrase “glass of wine” and all the previous context is so strong that it “blows out” the “full glass of wine” as the actual intent. Also, LLMs are still pretty crap at multi-turn multimedia understanding. They are especially prone to repeating previous conversation.
It should be better if you word it like “an overflowing glass with wine splashing out.” And clear the history.
I hate to ramble, but this is what I hate most about the way big corpos present “AI.” These are narrow tools the user needs to learn how to operate, like Photoshop or something, not the magic genie lamps they are trying to sell.
There’s no previous context to speak of; each screenshot shows a self-contained “conversation”, with no earlier input or output. And there’s no history to clear, since Gemini app activity is not even turned on.
And even with your suggested prompt, one of the issues is still there:
The other issue is not being tested in this shot as it’s language-specific, but it is relevant here because it reinforces that the issue is in the training, not in the context window.
Was just a guess. The AI is still shitty, lol.
What I am trying to get at is the misconception: AI can generate novel content not in its training dataset. An astronaut riding a horse is the classic test case, which did not exist anywhere before diffusion models, and it should be able to extrapolate a fuller wine glass. It’s just too dumb to do it, lol.
What if you prompt for a glass with water, then paint/tint the water red?
Alex O’Connor did an interesting video on this. He’s got other videos exploring the shortcomings of LLMs.
https://youtu.be/160F8F8mXlo
I wonder, does AI Horde have this problem too?
@[email protected] draw for me a wine glass completely filled to the top style:flux
Here are some images matching your request
Prompt: a wine glass completely filled to the top
Style: flux
Yup, Horde still suffers from this issue, though it seems to have more promise than the others, considering the second glass is way closer to being full than anything I’ve seen from OpenAI or Gemini demonstrations. Maybe there’s hope to fix this issue here.
I only tried one model, so if you know of a different Horde model that works better for this and actually gives a full glass, please reply below and let me know. Maybe even ask the Horde bot to generate it right here.
I have considerably less experience with image generation than text generators, but I kind of expect the issue to be only truly fixed if people train the model with a bunch of pictures of glasses full of wine.
I’ll run a test using a local tree, that is supposed to look like this:
@[email protected] draw for me a picture of three Araucaria angustifolia trees style:flux
Here are some images matching your request
Prompt: a picture of three Araucaria angustifolia trees
Style: flux
That fourth picture is just four penguins in a trenchcoat
Bingo - this tree is non-existent outside my homeland, so people barely speak about it in English - and odds are that the model was trained with almost no pictures of it. However one of the names you see for it in English is Paraná pine, so it’s modelling it after images of European pines - because odds are those are plenty in its training set.
So we could keep having it generate these and poison its own training data!
Wait, this seems incredible. Do you have to be in the same instance or does it work anywhere? @[email protected] Can you draw a smart phone without a rotary phone dial?
It works on any instance that is federated to dbzer0. You have to use annotated mentions though since that’s what the bot uses. Like this:
@[email protected] draw for me a smart phone without a rotary phone dial
Thank you very much. I’ll give it another shot with the annotation.
@[email protected]
Draw a picture of a poker table without any poker chips whatsoever
I think I messed up the annotation
Yeah, you also have to say draw for me. I don’t think the bot recognizes queries otherwise. Also editing mentions doesn’t work, they have to be new, fresh posts with the mention. Just a quirk with Lemmy and how mentions work here.
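For anyone who wants to script the wording, here is a minimal sketch of how the request text seems to be put together, going only by the posts in this thread: the mention, then "draw for me", then the prompt, then an optional style:NAME tag. The function name, its parameters, and the placeholder mention are all made up for illustration. It only builds the text of a post; it does not talk to Lemmy or to the bot, and the bot may accept or require things not shown here.

```python
from typing import Optional


def build_draw_request(mention: str, prompt: str, style: Optional[str] = None) -> str:
    """Compose the text of a new post asking the image bot for a picture.

    Assumed format, inferred from the requests in this thread only:
        <mention> draw for me <prompt> [style:<name>]
    """
    parts = [mention.strip(), "draw for me", prompt.strip()]
    if style:
        # The style tag is optional; the requests above used "style:flux".
        parts.append(f"style:{style}")
    return " ".join(parts)


# Placeholder mention: swap in the bot's real annotated mention when posting.
print(build_draw_request("@bot@example.invalid",
                         "a wine glass completely filled to the top",
                         style="flux"))
```

Remember that the text has to go into a brand new post or comment, since edited mentions are not picked up.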
Here are some images matching your request
Prompt: a smart phone without a rotary phone dial
Style: flux
Guess AIhorde had some trouble understanding the prompt too…
Hmm, I didn’t know Gemini could generate images already. My bad, I trusted it to know whether it can do that (it still says it can’t when asked).
It has been able to for a while already. Frankly, it’s the only reason why I’d use Gemini in the first place (the DDG version of GPT-4o mini doesn’t have a built-in image generator).
Full is relative, apparently.
Tbh that is a full glass of wine… it’s not supposed to be filled all the way
It is not a completely full glass.
What I requested is not what you’re “supposed” to do, indeed. You aren’t supposed to drink wine from glasses that are completely full. Except when really drunk. But then might as well drink straight from the bottle.
…fuck, I played myself now. I really want some booze.
What you’re really supposed to do is - open up the box, slap the bag, and drink directly from your adult Capri Sun.
Probably why it won’t put more in it. How much training data of wine in a glass will have it filled to the brim? Probably next to none.
You can’t tell it to fill it to the brim or to make it a quarter full either, though. It doesn’t have the training data for it.