Promising stuff from their repo, claiming “exceptional performance, achieving a [HumanEval] pass@1 score of 57.3, surpassing the open-source SOTA by approximately 20 points.”

https://github.com/nlpxucan/WizardLM
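
For anyone unsure what the pass@1 number means: it is the share of HumanEval problems where a generated solution passes the unit tests. Below is a minimal sketch of the unbiased pass@k estimator described in the original HumanEval paper, where n samples are drawn per problem and c of them pass. The function names (`pass_at_k`, `benchmark_pass_at_k`) are my own, and nothing here says how the WizardCoder team actually sampled or decoded to get 57.3, so treat it as an illustration of the metric only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: completions sampled for the problem
    c: how many of those completions passed the unit tests
    k: the k in pass@k (k <= n)
    """
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a pass.
        return 1.0
    # 1 minus the probability that a random size-k subset has no passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k: int) -> float:
    """Average pass@k over a list of (n, c) pairs, one pair per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k = 1 this reduces to c / n per problem, so a benchmark score of 57.3 means that a single sampled completion solves the problem about 57% of the time on average.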

  • notfromhere@lemmy.one
    2 years ago

    From the Twitter post

    New StarCoder coding model from @WizardLM_AI

    “WizardCoder-15B-v1.0 model achieves 57.3 pass@1 on the HumanEval Benchmarks … 22.3 points higher than the SOTA open-source Code LLMs.”

    My quants:
    https://huggingface.co/TheBloke/WizardCoder-15B-1.0-GGML
    https://huggingface.co/TheBloke/WizardCoder-15B-1.0-GPTQ

    Original: https://huggingface.co/WizardLM/WizardCoder-15B-V1.0

    11:21 AM · Jun 14, 2023

    • LinuxFanatic@lemmy.world
      2 years ago

      TheBloke’s Hugging Face repo says the GGML quants are not compatible with llama.cpp. Anyone know why?

      • Kerfuffle
        2 years ago

        It’s a different type of model. llama.cpp only supports LLaMA models, while GGML (the machine learning library llama.cpp is based on) has examples for various models with different architectures: WizardCoder, MPT, BLOOM, and probably very soon Falcon. Some separate projects also use GGML to support other models (including some of the ones I listed). For example, the Rust “llm” project supports LLaMA, MPT, and BLOOM models.

        • noneabove1182 (OP, mod)
          2 years ago

          Looks like gpt4all supports it. I thought it was based on llama for some reason. Going to have to give it a try.

          • Kerfuffle
            2 years ago

            gpt4all looks like a frontend that just bundles a bunch of stuff together. Oobabooga’s webui is similar: you can run stuff with llama.cpp, GPTQ, etc. Which models and features are supported depends on how the frontend manages its backends. There are also forks of llama.cpp like koboldcpp which may support different models/features/formats (I know koboldcpp supports some older GGML file formats that llama.cpp broke compatibility with).

            • noneabove1182 (OP, mod)
              2 years ago

              Oh wait, does ooba support this? Nvm then. I’m enjoying using that, I’m just a little lost sometimes haha

              • Kerfuffle
                2 years ago

                I don’t know whether it does. I was just saying those two projects seemed similar: both present a frontend for running inference on models, so the user doesn’t necessarily have to know or care which backend is used.

                • noneabove1182 (OP, mod)
                  2 years ago

                  Gotcha. koboldcpp seems to be able to run it. All of it is only a tiny bit confusing :D