Which AI is the ethically sourced one?
JetBrains’ AI code suggestions were trained only on code whose authors gave explicit permission, but that’s the only one I can think of off the top of my head. Most chat-oriented LLMs (ChatGPT, Claude, Gemini…) were almost certainly trained using corporate piracy.
There are a number of open-weight models out there whose training data is publicly documented, although much of it is web-crawled rather than strictly public domain. Look up BLOOM and Falcon. There are others.