• Finadil@lemmy.world
    10 months ago

    Was that with an fp16 model? Don’t be scared to try even a 4-bit quantization; you’d be surprised at how little is lost and how much quicker it is.
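
    For example, if you’re on the Hugging Face transformers + bitsandbytes stack (just one possible setup; the model name below is a placeholder), loading in 4-bit looks roughly like this:

    ```python
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # NF4 4-bit quantization; compute still runs in fp16
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )

    model_id = "your/model-here"  # placeholder, swap in whatever you're running
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",
    )

    prompt = "Hello"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
    ```

    If you’re on llama.cpp/Ollama instead, just grab a Q4 GGUF of the same model and compare outputs yourself.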