Retentive Network: A Successor to Transformer for Large Language Models

noneabove1182 · 2 years ago

kakes · 2 years ago

They mention the possibility of parallelization in training. Is this something that could allow (or lead to allowing) distributed training? Something like Folding@Home for LLMs?

If so, I’m beyond excited. I honestly think that’ll be a major step forward in the democratization of AI, if we can crowdsource training.

noneabove1182 · 2 years ago

That’s definitely a nifty idea. People are already running distributed inference, so I can’t see why we couldn’t do something similar for training, especially if we learn better ways to combine training samples.