GPTQ & GGML allow PostgresML to fit larger models in less RAM, and they perform inference significantly faster on NVIDIA, Apple and Intel hardware. Half-precision floating point and quantization optimizations are now available for your favorite LLMs downloaded from Hugging Face.
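
As a minimal sketch of what this looks like in practice: half precision can be requested when a Hugging Face model is loaded through `pgml.transform()`. The loader options shown in the task JSON (such as `torch_dtype`) and the example model name are illustrative assumptions, not a definitive reference for the API.

```sql
-- Sketch: run text generation against a Hugging Face model with
-- half-precision weights. Parameter names in the task JSON are assumed
-- to be forwarded to the underlying model loader.
SELECT pgml.transform(
    task   => '{
        "task": "text-generation",
        "model": "tiiuae/falcon-7b-instruct",
        "device_map": "auto",
        "torch_dtype": "bfloat16"
    }'::JSONB,
    inputs => ARRAY['Once upon a time,']
);
```

Running a model in `bfloat16` roughly halves its memory footprint compared to full 32-bit precision, which is what lets larger models fit on the same hardware.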