• thatKamGuy
    2 days ago

    DeepSeek’s model is open-sourced and can be run locally, though I think some bits related to its training data have been kept obscured (if I remember correctly) - likely due to the dubious nature of how it was acquired.

    • rImITywR@lemmy.world
      2 days ago

      Unless training data is made available, a model is not open source. DeepSeek is better described as “open weight”.

    • brucethemoose@lemmy.world
      2 days ago

      some bits related to its training data

      AKA ANY details about its training data, its training hyperparameters, and literally any other details about its training. An ‘open’ secret among LLM tinkerers is that the Chinese companies seem to have particularly strong English/Chinese training data (not so much other languages though), and I’ll give you one guess how.

      Deepseek is unusual in that they are open sourcing the general techniques they used and even some (not all) of the software frameworks they use.

      Don’t get me wrong, I think any level of openness should be encouraged (unlike OpenAI, which is as closed as physically possible), but they are still very closed. Unlike, say, IBM’s Granite models, which should be reproducible.