LocalLLaMA · English · 2 years ago

Any way to prune LLMs?


Hey, I’m working on some local LLM applications, and my goal is to run the smallest model possible without crippling performance. I’m already using 4-bit GPTQ, but I want something smaller. These models were trained on a massive amount of data, but my specific use case only touches a very small fraction of it, so I’d imagine it’s possible to cut away large chunks of the model that I don’t care about. I’m wondering if there has been any work on runtime pruning of LLMs based on “real world” data, as opposed to static pruning based only on the model weights. Something like: you run the model a bunch of times on your actual data and monitor the neuron activations to inform some kind of pruning process. Does anyone here know of something like that?
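
To make the idea concrete, here is a rough sketch of what I mean, on a toy one-hidden-layer ReLU network in plain Python rather than a real LLM. Everything in it (the function names, the tiny weights, the "mean activation" importance score) is made up for illustration; it is not taken from any existing pruning library, and a real transformer would need a more careful criterion.

```python
import random

def relu(x):
    return x if x > 0.0 else 0.0

def hidden_activations(w1, b1, x):
    """Hidden-layer outputs of a one-hidden-layer ReLU MLP for one input."""
    return [relu(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w1, b1)]

def forward(w1, b1, w2, x):
    """Network output: a linear readout of the hidden activations."""
    h = hidden_activations(w1, b1, x)
    return [sum(w * hi for w, hi in zip(row, h)) for row in w2]

def neuron_importance(w1, b1, calibration):
    """Mean activation of each hidden neuron over the calibration inputs."""
    totals = [0.0] * len(w1)
    for x in calibration:
        for j, a in enumerate(hidden_activations(w1, b1, x)):
            totals[j] += a
    return [t / len(calibration) for t in totals]

def prune(w1, b1, w2, importance, keep):
    """Keep the `keep` most active hidden neurons and drop the rest."""
    ranked = sorted(range(len(importance)),
                    key=lambda j: importance[j], reverse=True)
    kept = sorted(ranked[:keep])
    new_w1 = [w1[j] for j in kept]
    new_b1 = [b1[j] for j in kept]
    new_w2 = [[row[j] for j in kept] for row in w2]
    return new_w1, new_b1, new_w2, kept

# Hypothetical toy weights: neurons 1 and 3 have a large negative bias,
# so they never fire on inputs drawn from [0, 1).
w1 = [[1.0, 0.5], [0.3, 0.2], [0.2, 1.0], [0.7, 0.1]]
b1 = [0.1, -100.0, 0.2, -100.0]
w2 = [[1.0, 2.0, -1.0, 3.0]]

rng = random.Random(0)
calibration = [[rng.random(), rng.random()] for _ in range(50)]

importance = neuron_importance(w1, b1, calibration)
p_w1, p_b1, p_w2, kept = prune(w1, b1, w2, importance, keep=2)
print(kept)  # prints [0, 2]
```

In this toy case the pruned network gives the same outputs as the original on the calibration inputs, because the dropped neurons never activated on them. On inputs outside that distribution the two could diverge, which is exactly the trade-off I’m asking about.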


Zeth0s@lemmy.world · 2 years ago
The closest thing I know of is distillation. You can search for it to find some resources (e.g. https://huggingface.co/papers/2306.08543). I don’t know if that’s what you’re looking for.
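
For anyone unfamiliar with the term, distillation trains a small "student" model to match the output distribution of a large "teacher". A minimal sketch of the classic soft-target loss (KL divergence between temperature-softened distributions) might look like the following. This is the textbook formulation, not necessarily the objective used in the linked paper, and the function names and example logits are made up for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2

# Hypothetical logits for a single token position over a 4-word vocabulary.
teacher = [4.0, 1.0, 0.5, -2.0]
student = [2.5, 1.5, 0.0, -1.0]

loss = distillation_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's distribution and grows as they diverge. If you only compute it on data from your own use case, the student only has to imitate the teacher where you actually need it.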

minipasila@lemmy.fmhy.ml · 2 years ago
I don’t know about that, but you could try GGML (llama.cpp). It supports quantization down to 2 bits, so that might be small enough.
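
To give a feel for what 2-bit quantization means, here is a rough sketch of block-wise round-to-nearest quantization in plain Python. It is only an illustration of the general idea: this is not llama.cpp's actual on-disk format or algorithm, and the function names and example weights are made up.

```python
def quantize_block(values, bits=2):
    """Map a block of floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    # Guard against a constant block, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo

def dequantize_block(codes, scale, lo):
    """Reconstruct approximate floats from codes plus per-block scale/offset."""
    return [lo + c * scale for c in codes]

# Hypothetical block of 8 weights.
weights = [-0.8, -0.1, 0.05, 0.3, 0.9, -0.45, 0.6, 0.0]

codes, scale, lo = quantize_block(weights, bits=2)
restored = dequantize_block(codes, scale, lo)
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored as one of only four codes, plus one scale and one offset per block, so the reconstruction error can be as large as half a quantization step. That is why very low bit widths tend to cost noticeably more quality than 4-bit.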
