Took me some time to figure this one out, and unfortunately it requires a significantly larger image (it needs so much more of Nvidia's toolkit D: couldn't figure out a way to get around it…)

If people prefer a smaller image, I can start maintaining two: one with exllama and one without. For now, 1.0 is identical minus exllama support (and I guess it's also from an older commit), so you can use that one until there's actual new functionality :)

  • noneabove1182OPM · 2 years ago

    Also, do note that the model needs to be made with GPTQ-for-LLaMa, not AutoGPTQ

  • blunttastic@lemmy.fmhy.ml · 1 year ago

    Hi there. I'd love to try your Docker images. I just pulled your latest image on my M2 MacBook Pro, but I don't know what to do next. I created a folder and typed the following in the terminal: docker pull noneabove1182/text-gen-ui-gpu:latest

    But I don’t know what to do next to launch it. Sorry for the basic question, but how do I run it? Thanks!

    • noneabove1182OPM · 1 year ago

      Yeah, no problem! The first issue, however, is that Apple silicon plays kinda funny with this kind of setup, so I may need to make you a custom image. Otherwise, you should have no problem running the -cpu build.

      As for running the image itself, you can either run it from the command line with docker run, or you can make yourself a docker-compose file.

      I personally tend to go with the latter, and for that you can copy my docker-compose.yml file from here: https://hub.docker.com/r/noneabove1182/text-gen-ui-cpu
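
      For anyone unfamiliar with compose, here's a rough sketch of what such a file could look like. This is an assumption-laden example, not the actual file: the port and volume paths are guesses based on text-generation-webui defaults, so copy the real docker-compose.yml from the Docker Hub page rather than this one.

      ```
      version: "3"
      services:
        text-gen-ui:
          image: noneabove1182/text-gen-ui-cpu:latest
          ports:
            - "7860:7860"          # assumed: text-generation-webui's default web port
          volumes:
            - ./models:/app/models # assumed container path for downloaded models
      ```

      With that saved as docker-compose.yml in your folder, `docker compose up -d` starts the container, and the UI should then be reachable at http://localhost:7860 in your browser.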

      I’ll work on making a mac-specific image and you can test it for me ;)