At least it doesn’t seem to be an overhyped LLM doing the analysis, considering the input is photo data.
Indeed, I don’t think so here, but other vision models, e.g. https://github.com/vikhyat/moondream, do rely on an LLM to generate the resulting description.
My gosh, what is it with people’s reliance on a single thing?
Well, to be fair, and even though I did spend a bit of time writing about the broader AI hype BS cycle (https://fabien.benetou.fr/Analysis/AgainstPoorArtificialIntelligencePractices), LLMs are not “bad” in themselves. It’s an interesting idea to rely on our ability to produce and use language to describe a lot of useful things around us. So using statistics on it to try to find matches is actually pretty smart. Now… so many things have gone badly over the last few years that I won’t even start (cf. the link), but the concept per se makes sense to rely on sometimes.