LLM for GTX1080 (8GB) for local use.

pythia@lemmy.dbzer0.com · 1 year ago

LLM for GTX1080 (8GB) for local use.

SkySyrup · 1 year ago

Try openorca-mistral-7b, it should fit in your GPU. Try using ExLlamaV2 to speed up inference.
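
A rough back-of-the-envelope check of why a 4-bit 7B model should fit in 8 GB. This is only a sketch: the parameter count is a round 7 billion (an assumption, not the exact figure for this model), the function name is made up for illustration, and it counts weights only. Quantization metadata, the KV cache, and activations add more on top, so treat the result as a lower bound.

```python
def estimate_weight_vram_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 1024**3


# Assumption: a "7B" model has roughly 7e9 parameters.
params_7b = 7.0e9

fp16_gib = estimate_weight_vram_gib(params_7b, 16)  # unquantized half precision
gptq4_gib = estimate_weight_vram_gib(params_7b, 4)  # 4-bit quantized (e.g. GPTQ)

print(f"fp16 weights:  ~{fp16_gib:.1f} GiB")
print(f"4-bit weights: ~{gptq4_gib:.1f} GiB")
print("4-bit fits in 8 GiB:", gptq4_gib < 8)
```

Under these assumptions the fp16 weights alone would not fit on an 8 GB card, while the 4-bit version leaves a few GiB of headroom for context.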

pythia@lemmy.dbzer0.com · 1 year ago

thx, this one? https://huggingface.co/TheBloke/Mistral-7B-OpenOrca-GPTQ

SkySyrup · 1 year ago

yeah that should work!

pythia@lemmy.dbzer0.com · 1 year ago

Yes, it does, and it fits the GPU just fine. It didn’t hallucinate, but it was slow: the first run took 60s+. Still, it did its job. Thanks.

SkySyrup · 1 year ago

Good to hear it worked, though it’s weird that it’s so slow. I’m lucky to have access to a 3060, which isn’t that far off from a 1080, and I get at least 40 t/s on it. Are you running on CPU, or are you using exllama?
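
To compare numbers like "60s+" and "40 t/s" fairly, it may help to measure tokens per second on a second run, so that one-time costs such as model loading are not counted. A minimal sketch, assuming you can wrap your backend in a zero-argument callable that returns the generated tokens (`generate` here is a hypothetical placeholder, not an API of any particular library):

```python
import time
from typing import Callable, Sequence


def measure_tokens_per_second(
    generate: Callable[[], Sequence], warmup: bool = True
) -> float:
    """Time one call to `generate` and return generated tokens per second.

    `generate` is a hypothetical callable wrapping whatever backend you use;
    it must return a sequence with one element per generated token.
    """
    if warmup:
        generate()  # first run may include model loading and kernel setup
    start = time.perf_counter()
    tokens = generate()
    elapsed = time.perf_counter() - start
    return len(tokens) / elapsed


if __name__ == "__main__":
    # Stand-in for a real model call: pretends to emit 20 tokens in ~0.1 s.
    def fake_generate() -> list:
        time.sleep(0.1)
        return list(range(20))

    print(f"{measure_tokens_per_second(fake_generate):.1f} t/s")
```

If the warmed-up rate is still far below what similar cards reach, the slowdown is probably in generation itself rather than in loading.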

pythia@lemmy.dbzer0.com · 1 year ago

It’s running on the GPU: Task Manager shows 92% GPU utilization, and I chose ExLlamaV2.

SkySyrup · 1 year ago

That’s really weird. I’m not sure how to help you there, unfortunately :(