Hello y’all, I was using this guide to try and set up llama again on my machine. I was sure that I was following the instructions to the letter, but when I get to the part where I need to run setup_cuda.py install I get this error:
File "C:\Users\Mike\miniconda3\Lib\site-packages\torch\utils\cpp_extension.py", line 2419, in _join_cuda_home
    raise OSError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
(base) PS C:\Users\Mike\text-generation-webui\repositories\GPTQ-for-LLaMa>
I’m not a huge coder yet, so I tried to use setx to set CUDA_HOME to a few different places, but each time echo %CUDA_HOME% doesn’t come up with the address, so I assume it failed, and I still can’t run setup_cuda.py.
Anyone have any idea what I’m doing wrong?
Since you are using Windows, you can try setting CUDA_HOME to point to your CUDA installation folder through the “Edit Environment Variables” window. However, this guide seems pretty convoluted. I would recommend using one of the many Llama models people have already compiled and shared on Hugging Face.
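A couple of things that may explain the echo result: setx only stores the value for terminals opened afterwards, so it will not show up in the window where you ran it, and the PS prompt in the error suggests PowerShell, where %CUDA_HOME% is not expanded ($env:CUDA_HOME is the PowerShell spelling). If you would rather not touch the system settings at all, here is a minimal sketch that sets CUDA_HOME only for the build process. The helper names and the v12.1 folder in the comment are placeholders of mine, not something from the guide, so swap in wherever your toolkit actually lives.

```python
import os
import subprocess
import sys


def env_with_cuda_home(cuda_home, base_env=None):
    """Return a copy of the environment with CUDA_HOME set to cuda_home."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_HOME"] = cuda_home
    return env


def run_setup(cuda_home, script="setup_cuda.py"):
    """Run `python setup_cuda.py install` with CUDA_HOME set for that process only."""
    return subprocess.run(
        [sys.executable, script, "install"],
        env=env_with_cuda_home(cuda_home),
        check=False,
    )


# Example call (placeholder path; replace with your actual CUDA install root):
# run_setup(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1")
```

Run it from the GPTQ-for-LLaMa folder so the script path resolves. Nothing is changed permanently, which makes it easy to try a few candidate folders.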
I think I have the one I downloaded back when you needed to get approved by Meta to download it. However, I was just looking for a guide to actually start the thing. Since I’m so used to using a GUI, I guess I didn’t realize I was actually building the damn thing lol
Agree with others, this guide is a bit more work than you probably need. I don’t really run Windows much anymore, but I did have an easier time with WSL like the other poster mentioned.
And just to check, are you planning on fine-tuning a model? If so then the whole anaconda / miniconda, pytorch, etc… path makes sense.
But if you’re not fine-tuning and you just want to run a model locally, I’d suggest ollama. If you want a UI on top of it, open-webui is great.
Nah, I’m just wanting to run it for now, maybe more if I get more interested down the line, but I will check those out.
I had much better success using WSL, but I haven’t used it or even updated it in a long while. (I have been meaning to see how AMD GPU support has evolved over the last few months. Back in January-ish, AMD support was still bad.)
Anything that is even remotely Linux-related is much easier to get working with WSL, btw. Almost all of my personal Python stuff is running under it, and it works great with VS Code.
I mean, Linux is an option, but haven’t people been saying Nvidia drivers are a huge hassle to use on Linux?
They can be, I suppose. However, the AI libraries that I was tinkering with seemed to all be based around Ubuntu and Nvidia. With Docker, GPU passthrough is much better under Linux and Nvidia.
WSL improved things a bit after I got an older GTX 1650. For my AMD GPU, ROCm support is (was?) garbage under Windows using either Docker or WSL. I don’t remember having much difficulty with Nvidia drivers, though… I think there might have been some strange dependency problems that I was able to work through.
AMD GPU passthrough on Windows to Docker containers was a no-go. I remember that fairly clearly, though.
My apologies. It has been a few months since I messed with this stuff.
Nah. There are some Nvidia issues with Wayland (that are starting to get cleared up), and Nvidia’s drivers not being open-source rubs some people the wrong way, but getting Nvidia and CUDA up and running on Linux is pretty easy/reliable in my experience.
WSL is a bit different but there are steps to get that up and running too.
What’s the Python traceback? Can you add import os; print(os.getenv("CUDA_HOME")) into the Python script just to verify you’re setting it correctly?
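If a bare print is not enough, a slightly longer check along these lines might help narrow it down. This is only a sketch: check_cuda_home is a made-up helper name, and it assumes the toolkit keeps its nvcc compiler in a bin folder under the install root, which is the usual layout but worth confirming on your machine.

```python
import os
from pathlib import Path


def check_cuda_home(env=None):
    """Report whether CUDA_HOME is set and whether it looks like a CUDA toolkit root."""
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        return "CUDA_HOME is not set in this process"
    root = Path(cuda_home)
    if not root.is_dir():
        return f"CUDA_HOME points to a missing folder: {cuda_home}"
    # Assumption: the toolkit ships nvcc in its bin folder (nvcc.exe on Windows).
    has_nvcc = any((root / "bin" / name).is_file() for name in ("nvcc", "nvcc.exe"))
    if not has_nvcc:
        return f"No nvcc found under {root / 'bin'}; is the CUDA toolkit installed there?"
    return f"CUDA_HOME looks usable: {cuda_home}"


print(check_cuda_home())
```

Run it from the same terminal and conda environment you use for setup_cuda.py, since a different window may see a different set of variables. The "No nvcc found" message usually means the folder is not the toolkit root, or that only the GPU driver is installed and not the CUDA toolkit itself.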
Have you looked at LocalAI? It seems pretty useful. If I were setting up again, I’d go containerized…