Until now, speech recognition has relied upon a device being connected to the internet. The algorithms typically used for the task require significant amounts of working memory (RAM), which is usually provided by powerful data-center servers. Indeed, try switching your smartphone to airplane mode and see how far your voice commands get you. But change is in the air.
A new algorithm developed by Professor Panagiotis Karras from the University of Copenhagen’s Department of Computer Science, together with linguist Nassos Katsamanis of the Athena Research Center in Greece, and researchers from Aalto University in Finland and KTH in Sweden, allows even smaller devices like smartphones to decode speech without needing substantial memory—or internet access.
The code, recently presented in a scientific article, employs a clever strategy: it “forgets” what it doesn’t need in real time.
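The article doesn’t spell out how the algorithm works, but the “forget what you don’t need” idea can be illustrated with a toy beam-pruned streaming decoder: at each audio frame, only a small fixed number of candidate transcriptions is kept and everything else is discarded immediately, so memory stays bounded no matter how long the utterance runs. The sketch below is a minimal illustration of that general principle, not the published method; the names (`BEAM_WIDTH`, `step`, `decode`) and the per-frame token scores are made up.

```python
import heapq

BEAM_WIDTH = 8  # assumed budget: max hypotheses kept per frame

def step(hypotheses, frame_scores):
    """Extend each kept hypothesis with this frame's candidate tokens,
    then prune back to the beam width so memory use stays constant."""
    expanded = []
    for tokens, logp in hypotheses:
        for token, token_logp in frame_scores.items():
            expanded.append((tokens + [token], logp + token_logp))
    # Keep only the best few; everything else is "forgotten" right away.
    return heapq.nlargest(BEAM_WIDTH, expanded, key=lambda h: h[1])

def decode(frames):
    """frames: a stream of dicts mapping candidate token -> log-probability."""
    hypotheses = [([], 0.0)]  # (token sequence, accumulated log-probability)
    for frame_scores in frames:
        hypotheses = step(hypotheses, frame_scores)
    return max(hypotheses, key=lambda h: h[1])[0]

if __name__ == "__main__":
    # Tiny made-up acoustic scores for three frames.
    frames = [
        {"I": -0.1, "eye": -2.3},
        {"helped": -0.2, "held": -1.6},
        {"Apple": -0.3, "a": -1.9},
    ]
    print(decode(frames))  # ['I', 'helped', 'Apple']
```

Because the pruning happens as the audio streams in, the decoder never holds the full search lattice in RAM, which is the kind of trade-off that makes offline decoding plausible on a phone.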
Things like Whisper have already solved this for the most part. I guess the issue is that eating up a GB or more of memory isn’t ideal for a single feature.
> Until now, speech recognition has relied upon a device being connected to the internet.
My family’s Gateway 2000 had local speech recognition… in 1998. That machine had sixteen megs of memory and a 200 MHz P2.
The fucking Macintosh Classic had local speech recognition. Yeah, it gave us “I helped Apple wreck a nice beach,” but things have improved since then.
One day your phone might be as functional as Windows 95.