As a reference for those looking to train an LLM locally:
It took me hours to fine-tune a small (by today's standards) BERT model on an RTX 4090; I can't imagine doing anything on chips like those referenced in the article, even inference.
I wouldn't attempt any training on anything less than a 7800/7900 XTX, and that's assuming you can get them to work at all.