AMD announces unified UDNA GPU architecture — bringing RDNA and CDNA together to take on Nvidia's CUDA ecosystem

vegeta@lemmy.world · 8 months ago

AMD announces unified UDNA GPU architecture — bringing RDNA and CDNA together to take on Nvidia's CUDA ecosystem

Russ@bitforged.space · 8 months ago

Ah, strange. I don’t suppose you specifically need a Fedora container? If not, I’ve been using this Ubuntu based distrobox container recipe for anything that requires ROCM and it has worked flawless for me.

If that still doesn’t work (I haven’t actually tried out kobolcpp yet), and you’re willing to try something other than kobolcpp, then I’d recommend the text-generation-webui project which supports a wide array of model types, including the GGUF types that Kobolcpp utilizes. Then if you really want to get deep into it, you can even pair it with SillyTavern (it is purely a frontend for a bunch of different LLM backends, text-generation-webui is one of the supported ones)!

DarkThoughts@fedia.io · edit-2 4 months ago

Removed by mod

Russ@bitforged.space · 8 months ago

Hmm, gotcha. I just tried out a fresh copy of text-gen-webui and it seems like the latest version is borked with ROCM (I get the CUDA error: invalid device function error).

My next recommendation then would be LM Studio which to my knowledge can still output an OpenAI compatible API endpoint to be used in SillyTavern - I’ve used it in the past before and I didn’t even need to run it within Distrobox (I have all of the ROCM stuff installed locally, but I generally run most of the AI stuff in distrobox since it tends to require an older version of Python than Arch is currently using) - it seems they’ve recently started supporting running GGUF models via Vulkan, which I assume probably doesn’t require the ROCM stuff to be installed perhaps?

Might be worth a shot, I just downloaded the latest version (the UI has definitely changed a bit since I last used it) and just grabbed a copy of the Gemma model and ran it, and it seemed to work without an issue for me directly on the host.

The advanced configuration settings no longer seem to directly mention GPU acceleration like it used to, however I can see it utilizing GPU resources in nvtop currently, and the speed it was generating at (the one in my screenshot was 83 tokens a second) couldn’t have possibly been done on the CPU so it seems to be fine on my side.

DarkThoughts@fedia.io · edit-2 4 months ago

Removed by mod