Beginner questions thread

noneabove1182 · 1 year ago


doodlebob@lemmy.world · 1 year ago

I have two 3090 Turbo GPUs and it seems like oobabooga doesn’t split the load between the two cards when I try to run TheBloke/dolphin-2.7-mixtral-8x7b-AWQ.

Does anyone know how to make text-generation-webui use both cards? Do I need an NVLink bridge between the two cards?

noneabove1182 · 1 year ago

You shouldn’t need NVLink. I’m wondering if it’s something to do with AWQ, since I know that exllamav2 and llama.cpp both support splitting across GPUs in oobabooga.
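If it helps to confirm what is actually happening, here is a rough sketch (not anything from oobabooga itself) that reads per-GPU memory use from `nvidia-smi` and reports which cards look idle. The function names and the 1024 MiB threshold are made up for illustration; the only real interface assumed is `nvidia-smi --query-gpu=... --format=csv,noheader,nounits`.

```python
import subprocess

# nvidia-smi query that prints one CSV row per GPU, memory values in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Turn nvidia-smi CSV rows into a list of dicts, one per GPU."""
    gpus = []
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        gpus.append({
            "index": int(parts[0]),
            # Re-join the middle fields in case a device name contains a comma.
            "name": ", ".join(parts[1:-2]),
            "used_mib": int(parts[-2]),
            "total_mib": int(parts[-1]),
        })
    return gpus


def idle_gpus(gpus, threshold_mib=1024):
    """Return indices of GPUs using less memory than the (arbitrary) threshold."""
    return [g["index"] for g in gpus if g["used_mib"] < threshold_mib]


def read_gpu_memory():
    """Run nvidia-smi and parse its output. Needs the NVIDIA driver installed."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

Run `idle_gpus(read_gpu_memory())` while the model is loaded. If it returns one of the card indices, the loader has put everything on the other card, which would match the behaviour described above.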

doodlebob@lemmy.world · 1 year ago

I think you’re right. Saw a post on Reddit basically mentioning the same things I’m seeing.

It looks like AutoAWQ supports multi-GPU splitting, but it might be an issue with how oobabooga implements it or something…