I’ve been using TheBloke’s Q8 of https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B, but I think this one (https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-7b-v3-1-7B) is killing it now. Has anyone else tested it?
Hmm, I’ve had interesting results from both of those base models but haven’t tried the combo yet. I’ll start some exllamav2 quants to test.
What’s it doing well at?
Quant link for anyone who wants it: https://huggingface.co/bartowski/OpenHermes-2.5-neural-chat-7b-v3-1-7B-exl2
I haven’t tried neural-chat, but the combined model seems to be better (anecdotally) than OH2.5/Mistral at following instructions and reasoning. Some of the overall quirks with llama.cpp seem to be ironed out with it too.
What hardware are you running this off of?
Seems quite intelligent. But I’m currently trying storywriting, and that seems to be the one thing it doesn’t excel at. It mostly stays very abstract and summarizes the plot instead of playing it out.
Yeah, this seems less focused on creativity. There are a lot of really good models out there tuned for storytelling that will far exceed generalized SoTA models at it.
How do I download it for MLCChat (Android)?