1-bit LLM performs similarly to full-precision Transformer LLMs with the same model size and training tokens but is much more efficient in terms of latency, memory, throughput, and energy consumption. (arxiv.org)
Posted by ☆ Yσɠƚԋσʂ ☆ to [email protected] • English • 7 months ago • 4 comments • cross-posted to: [email protected]
Why use lot bit when one bit do trick?
Bits together weak