Everybody’s talking about Mistral, an upstart French challenger to OpenAI

daredevil@kbin.social · 1 year ago

Everybody’s talking about Mistral, an upstart French challenger to OpenAI

Ashyr · 1 year ago

It’s neat, but I hear you need a really beefy system to make it work.

It may be an insurmountable hurdle to bring such capabilities to lesser systems, so I’m not necessarily complaining, I just wish it was more accessible.

joneskind@lemmy.world · edit-2 1 year ago

I run it fine on a base model MacBook Air with 8Gb of RAM and absolutely crazy on a 30 GPU cores M2 Max. Didn’t try on my company’s M1 Pro but I will tomorrow.

I use the LMStudio app and download Mistral from there. The heavier model for my beefy Mac and a 3Gb one for the Air. GPU acceleration with Metal enabled.

I tried a lot of models for development purposes and this one blew my mind.

cheese_greater@lemmy.world · edit-2 1 year ago

Seriously? Might have to try it

Can you, like, “have” or keep it?

joneskind@lemmy.world · 1 year ago

You download the model and it’s on your computer for as long as you want. The whole point is to be able to use it locally.

cheese_greater@lemmy.world · edit-2 1 year ago

So it is entirely local? Schweet! How large is it (3GB for Air or something?)

joneskind@lemmy.world · 1 year ago

So it is entirely local? Absolutely

How large is it? 12 models of quantization, from 3.08GB to 7.70GB

I use mistral-7b-instruct-v0.1.Q3_K_L.gguf 3.82GB on the MBA

Note that it might crash sometimes during computation. Just push the button “reload” then “continue” and the model finish its sentence as if nothing happened. I don’t know if its related to MLStudio (the app using the model) or the model itself though.

bioemerl@kbin.social · 1 year ago

Mixtral GPTQ can run on a 3090

Mistral 7b can run on most modern gpus

joneskind@lemmy.world · edit-2 1 year ago

Oh boy, I missed Mixtral GPTQ and only tried Mistral 7b

Currently downloading mixtral-8x7b-v0.1.Q4_K_M.gguf

Thank you!

EDIT: mixtral-8x7b-v0.1.Q4_K_M.gguf was to heavy for my Mac but mixtral-8x7b-v0.1.Q3_K_M.gguf runs fine AF

bioemerl@kbin.social · 1 year ago

Be warned, prompt processing is slow

joneskind@lemmy.world · 1 year ago

It is indeed. I’m switching to the instruct model to see if I can get better results for code and documentation.

daredevil@kbin.social · 1 year ago

I’m looking forward to the day where these tools will be more accessible, too. I’ve tried playing with some of these models in the past, but my setup can’t handle them yet.

joneskind@lemmy.world · 1 year ago

You should definitely try Mistral. It runs on a potato

daredevil@kbin.social · edit-2 1 year ago

I’ll give it a shot later today, thanks

edit: Tried out mistral-7b-instruct-v0.1.Q4_K_M.ggufvia the LM Studio app. it runs smoother than I expected – I get about 7-8 tokens/sec. I’ll definitely be playing around with this some more later.

GBU_28@lemm.ee · 1 year ago

Are you running llama.cpp and a gguf format of the model?

daredevil@kbin.social · 1 year ago

I believe I was when I tried it before, but it’s possible I may have misconfigured things

GBU_28@lemm.ee · 1 year ago

Have you checked out llama-cpp-python? The API is very simple, from the readme

daredevil@kbin.social · 1 year ago

I haven’t, but I’ll keep this in mind for the future – thanks.

iopq@lemmy.world · 1 year ago

For this one, you should be able to run it on anything with 8GB of VRAM. That said, it may not be fast. You will probably want a Turing or newer card with as much VRAM bandwidth as possible.

daredevil@kbin.social · 1 year ago

That’s good to know. I do have 8GB VRAM, so maybe I’ll look into it eventually.

RmDebArc_5@lemmy.ml · 1 year ago

deleted by creator

ichbinjasokreativ@lemmy.world · 11 months ago

Something like mistral-dolphin (4GB) and mixtral-dolphin (26GB) are running very smoothly on my 6900xt on rocm 6

Everybody’s talking about Mistral, an upstart French challenger to OpenAI

Everybody’s talking about Mistral, an upstart French challenger to OpenAI

Mixture of experts