I don’t consider myself very technical. I’ve never taken a computer science course and don’t know Python. I’ve learned some things like Linux, the command line, Docker, and networking/pfSense because I value my privacy. My point is that anyone can do this, even if you aren’t technical.

I tried both LM Studio and Ollama, and I prefer Ollama. Once it’s installed, you download models and use them to have your own private, personal GPT. I access it on my local machine through the command line, and I also installed Open WebUI in a Docker container so I can reach it from any device on my local network (I don’t expose services to the internet).
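For anyone who wants to script against it instead of typing in the terminal, here is a minimal sketch of talking to a local Ollama server from Python. It assumes Ollama is running on its default port (11434) and that you have already pulled a model; the model name `llama3` and the helper names `build_request` and `ask` are just placeholders for illustration, not anything from my setup.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the reply text."""
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the whole reply arrives in one JSON object.
    return body["response"]
```

Calling `ask("llama3", "Explain what a firewall does in one sentence.")` should return the model's answer as a string, provided the server is up and that model name matches something shown by `ollama list`. Nothing leaves your machine, since the request goes to localhost.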

Having a private AI/GPT is pretty cool. You can download and test new models, and everything stays private. Yes, there are ethical concerns about how the models got their training data. I’m not minimizing those concerns. But if you want your own AI/GPT assistant, give it a try. I set it up in a couple of hours, and as I said… I’m not even that technical.

  • coffee_with_cream
    4 months ago

    You probably want 48 GB of VRAM or more to run the good stuff. I recommend renting GPU time instead of using your own hardware, via AWS or other vendors; runpod.io is pretty good.

    • NotMyOldRedditName@lemmy.world
      4 months ago

      Kinda defeats the purpose of doing it private and local.

      I wouldn’t trust any claims a 3rd party service makes about being private.

    • Terrasque@infosec.pub
      4 months ago

      Llama 3 8B can run in 6 GB of VRAM, and it’s fairly competent. Gemma has a 9B I think, which would also be worth looking into.
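A rough way to sanity-check numbers like that: the weights alone take about (parameter count × bits per weight ÷ 8) bytes. The sketch below only covers weights; it ignores context/KV cache and runtime overhead, so real usage will be somewhat higher. The function name is made up for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights, each bits_per_weight / 8 bytes, divided by 1e9.
    return params_billion * bits_per_weight / 8


for bits in (16, 8, 4):
    size = weight_memory_gb(8, bits)
    print(f"8B model at {bits}-bit: ~{size:.0f} GB for weights")
```

So an 8B model fitting in 6 GB implies a quantized version (around 4 bits per weight), not the full 16-bit weights.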

    • 31337
      4 months ago

      IDK, looks like 48 GB cloud pricing would be $0.35/hr => about $255/month if you run it around the clock. Used 3090s go for $700. Two 3090s would give you 48 GB of VRAM and cost $1400 (I’m assuming you can do “model-parallel” with Llama; never tried running an LLM, but it should be possible and work well). So the break-even point would be <6 months. Hmm, but if Serverless works well, that could be pretty cheap. Would probably take a few minutes to process and load a ~48 GB model every cold start though?
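The arithmetic above, as a small sketch you can rerun with your own numbers. It uses the $0.35/hr rate and $700 per used 3090 from the comment, assumes an average month of 730 hours, and ignores electricity and extras like a PSU upgrade. The function names are illustrative only.

```python
HOURS_PER_MONTH = 730  # average month: 365 * 24 / 12


def monthly_cloud_cost(rate_per_hour: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of renting a GPU instance for the given hours in a month."""
    return rate_per_hour * hours


def break_even_months(hardware_cost: float, monthly_cost: float) -> float:
    """Months of renting that add up to the price of buying the hardware."""
    return hardware_cost / monthly_cost


always_on = monthly_cloud_cost(0.35)   # rented 24/7
hardware = 2 * 700                     # two used 3090s
print(f"Cloud, always on: ${always_on:.2f}/month")
print(f"Break-even: {break_even_months(hardware, always_on):.1f} months")
```

The break-even under 6 months only holds if the rented instance runs all the time. Pass a smaller `hours` value (say, a few hours a day) and renting stays cheaper for much longer.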

      • ffhein@lemmy.world
        4 months ago

        Assuming they already own a PC, anyone who buys two 3090s for it will probably also have to upgrade their PSU, so that might be worth including in the budget. But it’s definitely a relatively low-cost way to get more VRAM; there are people who run 3 or 4 RTX 3090s too.