They support Claude, ChatGPT, Gemini, HuggingChat, and Mistral.
And I still can’t convince it to stop caching the images because it does not follow the RFC.
Luckily, it seems to be disabled by default. At the moment.
Sigh. I’m glad to have switched to LibreWolf.
I switched a while back, before all the AI and “privacy preserving” telemetry stuff.
Every update note I see for Firefox now just reinforces my decision.
Wasn’t this there for a while, or is it just me?
It’s been there since version 128, I think.
I think 130
I wonder if this can be removed at compile time, like Pocket.
Unpopular opinion: I think they’re doing it right, or at least as well as it can be done. It’s completely optional and doesn’t seem to be intrusive.
I agree
yeah it’s not google chrome level, which i’m thankful for.
I’m way more pissed about restarting my PC after an update and having Copilot installed without my permission.
Didn’t want it in Opera, don’t want it in Firefox. I mean they can keep trying and I’ll just keep on ignoring this shit :/
hopefully, it’ll be possible to opt out somehow.
as the screenshot shows, it is opt-in
Wow, great job Firefox. Thanks.
If I wanted unreliable bullshit like AI, I’d use Chrome.
I don’t understand the hate. It’s just a sidebar for the supported LLMs. Maybe I’m misunderstanding?
Yes, I would prefer Mozilla focus on the browser, but to me, this seems like it was done in an afternoon.
It seems like common cynicism. Mozilla added this feature so as not to yield major features to other browsers. Mozilla’s version lets you natively pick from lots of different AI solutions.
Not every feature is for everyone. Not every feature is done being improved on at release.
And in spite of popular opinions, organizations don’t do just one thing and then do just the next thing and the thing after that. Organizations can and do focus on and prioritize many things at the same time.
And for people who are naysaying AI at every mention, it has a lot of great and fascinating uses, and if you think otherwise, you really should try them more. I’ve used it plenty for work and life. It’s not going away, might as well do some nice things with it.
I want my browser to be a browser. I don’t want Pocket, I don’t want AI, I don’t want bullshit. There are plugins for that.
that’s the great thing: you don’t have to use it
This happened ages ago, didn’t it? Am I missing something new?
Yeah, it did. That feature has been there at least since when Mozilla enabled “Firefox labs” section in settings by default a few months ago, and maybe even earlier than that
TIL a month is an age.
Well, this month in particular…
True. ❤️
I only saw it now, maybe it happened before on a different version.
If they do it in a privacy-preserving way, this could help them get back market share, which will generally benefit an open internet.
But it’s gonna be very difficult when you’ve got Google and OpenAI up there.
It’s an open source project, you can keep it in a box and people are able to check it.
I really wish there was another way.
Why would anybody want to have AI in their browser? It’s a fucking browser.
Because browsers are the most useful tool on most computers. Ordinary people go on Google or ask ChatGPT for mundane questions. If their browser can do that, they need one app less and it will be more convenient, which is what non-tech-savvy people especially care about.
I will say, the Le Chat provider is pretty decent. You really can use more natural language with it. “Rewrite it with a better rhyme scheme” “remove the last line” and it just got it.
Why no local option though? Why no anonymising option?
Edit: There is a right click option which does make this officially actually useful for me now (summarize this!).
Other models do have RAG options, and Mistral supports making agents with specified documentation too, to at least fine-tune (not as good as full grounding though IMHO)
Thing is, for your average user with no GPU and who never thinks about RAM, running a local LLM is intimidating. But it shouldn’t be. Any system with an integrated GPU, and the more RAM the better, can run simple models locally.
The not-so-dirty secret is that ChatGPT 3 vs 4 isn’t that big a difference, and neither is leaps and bounds ahead of the publicly available models for about 99% of tasks. For that 1% people will ooh and aah over it, but 99% of use cases are only seeing marginal gains on 4o.
And the simplified models that run “only” 95% as well? They can use 90% fewer resources and give pretty much identical answers outside of hyperspecific use cases.
Running a “smol” model, as some are called, gets you all the bang for none of the buck, and your data stays on your system and never leaves.
I’ve been yelling from the rooftops to some stupid corporate types that once the model is trained, it’s trained. Unless you are training models yourself, there is no need for the massive AI clusters, just for the model. Run it locally on your hardware at a fraction of the cost.
Idk I noticed pretty significant differences between models of various sizes. I mean there are lots of metrics on this
Can you point me to some resources for running a smol LLM?
My use case is prob just help with “typing” up miscellaneous ideas I have, or checking my English for grammatical errors.
Thanks, in advance.
Here you go: Review of SmolVLM https://www.marktechpost.com/2024/11/26/hugging-face-releases-smolvlm-a-2b-parameter-vision-language-model-for-on-device-inference/
Model itself: https://huggingface.co/spaces/HuggingFaceTB/SmolVLM
And you can use Ollama to run it locally, and Open WebUI to access it in browser.
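For the grammar-check use case, here is a rough sketch of talking to a locally running Ollama server from Python, with no data leaving your machine. It assumes Ollama is listening on its default port (11434) and that you have already pulled a model; the `llama3.2` tag and the prompt wording are just placeholders, and `build_payload` / `check_grammar` are made-up helper names, so swap in whatever you actually use.

```python
import json
import urllib.request

# Ollama's default local endpoint (assumes a stock install on this machine).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(text: str, model: str = "llama3.2") -> dict:
    """Build a non-streaming generate request asking for a grammar check.

    The model tag is an assumption; use any model you have pulled.
    """
    prompt = (
        "Correct the grammar and spelling of the following English text. "
        "Reply with the corrected text only.\n\n" + text
    )
    return {"model": model, "prompt": prompt, "stream": False}


def check_grammar(text: str, model: str = "llama3.2") -> str:
    """Send the text to the local Ollama server and return the model's reply."""
    data = json.dumps(build_payload(text, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage (needs the Ollama server running):
# print(check_grammar("i has went to the store yesterday"))
```

Open WebUI gives you the same thing with a chat interface, so this is only worth it if you want to script it.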
Last time I tried using a local llm (about a year ago) it generated only a couple words per second and the answers were barely relevant. Also I don’t see how a local llm can fulfill the glorified search engine role that people use llms for.
Try again. Simplified models take the large ones and pare them down in terms of memory requirements, and can be run off the CPU even. The “smol” model I mentioned is real, and hyperfast.
Llama 3.2 is pretty solid as well.
These are the answers they gave the first time.
Qwencoder is persistent after 6 rerolls.
Anyways, how do I make these use my GPU? The ollama logs say the model will fit into VRAM and that it is offloading all layers, but GPU usage doesn’t change and the CPU gets the load. And regardless of the model size, VRAM usage never changes and RAM only goes up by a couple hundred megabytes. Any advice? (Linux / Nvidia) Edit: it didn’t have CUDA enabled apparently, fixed now
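For anyone hitting the same thing, one way to check whether a loaded model actually landed on the GPU is to ask the Ollama server what it has loaded. A minimal sketch, assuming the default port and that the `/api/ps` response lists each loaded model with `size` and `size_vram` fields (as I understand the Ollama API docs); `gpu_fraction` and `loaded_models` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed default address of the local Ollama server.
OLLAMA_PS_URL = "http://localhost:11434/api/ps"


def gpu_fraction(model_info: dict) -> float:
    """Return the share of a loaded model that sits in VRAM (0.0 to 1.0).

    Assumes `size` is the total bytes loaded and `size_vram` the bytes on the GPU.
    """
    size = model_info.get("size", 0)
    vram = model_info.get("size_vram", 0)
    return vram / size if size else 0.0


def loaded_models() -> list:
    """Fetch the list of currently loaded models from the local server."""
    with urllib.request.urlopen(OLLAMA_PS_URL) as resp:
        return json.loads(resp.read()).get("models", [])


# Usage (needs the Ollama server running with a model loaded):
# for m in loaded_models():
#     print(m.get("name"), f"{gpu_fraction(m):.0%} on GPU")
```

If it reports 0% on GPU even though the logs claim the model fits, the runtime probably isn’t seeing the card at all, which matches the missing CUDA problem above.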
Nice.
Yea I don’t trust any AI models for facts, period. They all just lie. Confidently. The smol model there at least tried and got it right at first… Before confusing the sentence context.
Qwen is a good model too. But if you wanted something to run home automation or do text summaries, smol is solid enough. I’m using CPU so it’s good enough.
They’re fast and high quality now. ChatGPT is the best, but local llms are great, even with 10gb of vram.
I mean, if you’re going to do it, where’s the Ollama love?
I was disappointed there was no local option…
I don’t get it, ollama is a provider no?
A provider that can be run locally.
I think the point is it’s open source
and so is firefox, so why use another model provider