Yahmato to LocalLLaMA • Retentive Network: A Successor to Transformer for Large Language Models • 1 year ago

It's about time we start looking into alternatives to the transformer model. We are approaching the limits of what is possible on local hardware, and the transformer architecture has obvious missing components.