cross-posted from: https://lemmit.online/post/225981
This is an automated archive made by the Lemmit Bot.
The original was posted on /r/homeassistant by /u/janostrowka on 2023-07-19 12:49:02.
Hopefully this will come in handy for our Year of the Voice.
TL;DR: Justin Alvey replaces Google Nest Mini PCB with ESP32 custom PCB which he’s open-sourcing. Shows demo of running LLM voice assistant paired with Beeper to send and receive messages.
Tweet text thread (I would also highly recommend checking out the video demos on Twitter):
I “jailbroke” a Google Nest Mini so that you can run your own LLM’s, agents and voice models. Here’s a demo using it to manage all my messages (with help from @onbeeper) 📷 on, and wait for surprise guest! I thought hard about how to best tackle this and why
After looking into jailbreaking options, I opted to completely replace the PCB. This let’s you use a cheap ($2) but powerful & developer friendly WiFi chip with a highly capable audio framework. This allows a paradigm of multiple cheap edge devices for audio & voice detection…
& offloading large models to a more powerful local device (whether your M2 Mac, PC server w/ GPU or even “tinybox”!) In most cases this device is already trusted with your credentials and data so you don’t have to hand these off to some cloud & data need never leave your home
The custom PCB uses @EspressifSystem’s ESP32-S3 I went through 2 revisions from a module to a SoC package with extra flash, simplifying to single-sided SMT (< $10 BOM) All features such as LED’s, capacitive touch, mute switch are working, & even programmable from Arduino (/IDF)
For this demo I used a custom “Maubot” with my @onbeeper credentials (a messaging app which securely bridges your messaging clients using the Matrix protocol & e2e encryption) which runs locally serving an API
I’m then using GPT3.5 (for speed) with function calling to query this
Fro the prompt I added details such as family & friends, current date, notification preferences & a list additional character voices that GPT can respond in. The response is then parsed and sent to @elevenlabsio
I’ve been experimenting with multiple of these, announcing important messages as they come in, morning briefings, noting down ideas and memos, and browsing agents. I couldn’t resist - here’s a playful (unscripted!) video of two talking to each other prompted to be AI’s from "Her
I’m working on open sourcing the PCB design, build instructions, firmware, bot & server code - expect something in the next week or so. If you don’t want to source Nest Mini’s (or shells from AliExpress) it’s still a great dev platform for developing an assistant! Stay tuned!
When I first bought the speakers/displays they seemed to be releasing software updates at a quick pace, but it seems like they have slowed to a crawl. Classic Google to abandon products to move onto the next shiny thing.
Over the last year or so there was big cuts to the Google Assistant teams, and then they merged the Assistant and Bard teams together.
Personally, I’ve recently had issues with one of mine not turning off the display at night. I was also really disappointed when they stopped integrating with third party shopping list apps. My devices are definitely less useful now.