Direct link to the GitHub repo:
https://github.com/nickbild/local_llm_assistant?tab=readme-ov-file
It’s a small model by comparison. If you want something that runs offline and actually comes close to ChatGPT 3.5, you’ll want the Mixtral 8x7B model instead (running on a beefy machine):
https://mistral.ai/news/mixtral-of-experts/
Sick, I only need 90 GB of VRAM!
I’ve got it running with a 3090 and 32GB of RAM.
There are some models that let you run with hybrid system RAM and VRAM (it will just be slower than running it exclusively with VRAM).
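For a rough sense of where a figure like 90 GB comes from, and why quantized builds fit on much smaller hardware, here is a back-of-the-envelope sketch. It assumes Mixtral 8x7B has about 46.7B total parameters (the figure Mistral gives in its announcement) and a 24 GB card such as the 3090. It only counts the weights; the KV cache and runtime overhead are ignored, and real quantization formats use slightly more bits per weight than the nominal value, so treat the numbers as lower bounds.

```python
# Rough estimate of the memory needed just to hold model weights.
# Assumptions (not from the thread): 46.7e9 total parameters for Mixtral 8x7B,
# 24 GB of VRAM on the GPU, and no allowance for KV cache or overhead.

MIXTRAL_PARAMS = 46.7e9   # total parameters, assumed
GPU_VRAM_GB = 24.0        # e.g. an RTX 3090, assumed


def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def spill_to_system_ram_gb(total_gb: float, vram_gb: float) -> float:
    """How much would have to live in system RAM if VRAM is filled first."""
    return max(0.0, total_gb - vram_gb)


for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    total = weight_memory_gb(MIXTRAL_PARAMS, bits)
    spill = spill_to_system_ram_gb(total, GPU_VRAM_GB)
    print(f"{label:>5}: {total:6.1f} GB weights, {spill:6.1f} GB spills to system RAM")
```

Under these assumptions the fp16 weights alone land in the 90+ GB range, while a 4-bit build sits right around the size of a single 24 GB card, which is why hybrid RAM/VRAM setups come up so often.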
Yeah but damn does it get slow.
I always find it interesting how text is so much slower than image generation. I can do a 1024x1024 in probably 20s, but I get like 1 word a second with text.
Languages are complex and, more importantly, much less forgiving of errors.
Removed by mod
Graphics cards without video outputs have existed for a while.
I’d love to see some consumer-level AI stuff. Sadly, it all seems to be designed for server farms, and by the time it ages out into consumer prices, it’s so obsolete there’s no point in getting it.
Do they want consumer ai cards to exist though?
Think about the data!
Card makers? They only want money. If there’s enough consumer-level demand, they will make them.
I guess you’re right.
Removed by mod
Nice! That’s a cool project, I’ll have to give it a try. I love the idea of self-hosting local LLMs. I’ve been playing around with https://lmstudio.ai/ and it downloads models directly from Hugging Face.
There’s also ollama, which seems to be similar. Not sure if LM Studio is open source, but ollama is.
Removed by mod
How fast are they with a good GPU?
Removed by mod
Sorry, I’m just curious in general how fast these local LLMs are. Maybe someone else can give some rough info.
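One way to get a number for your own hardware, assuming you have ollama running locally on its default port (11434) and have already pulled a model: its `/api/generate` endpoint returns `eval_count` (tokens generated) and `eval_duration` (nanoseconds spent generating), so tokens per second falls out directly. The model name and prompt below are placeholders, not recommendations, and this is only a minimal sketch.

```python
import json
import urllib.request

# Assumed default address of a locally running ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(response: dict) -> float:
    """Generation speed from an ollama /api/generate response.

    eval_count is the number of generated tokens,
    eval_duration is the generation time in nanoseconds.
    """
    return response["eval_count"] / response["eval_duration"] * 1e9


def generate(model: str, prompt: str) -> dict:
    """Send one non-streaming request and return the parsed JSON reply."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())


# Example usage (needs a running ollama server; "mixtral" is a placeholder
# for whatever model you have pulled):
# result = generate("mixtral", "Explain VRAM in one paragraph.")
# print(f"{tokens_per_second(result):.1f} tokens/s")
```

The result depends heavily on the model size, the quantization, and how much of the model fits in VRAM, so comparing numbers only makes sense for the same model and settings.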