25th International Conference on Text, Speech and Dialogue
TSD 2022, Brno, Czech Republic, September 6–9 2022
TSD 2022 Tutorial: Speech recognition on the edge
The TSD 2022 conference will be accompanied by a hands-on tutorial
Speech recognition on the edge
Prof. Daniel Hromada; Hyungjoong Kim
DigiEduBerlin team :: University of the Arts / Einstein Center Digital Future

Keywords: Speech-to-text; DeepSpeech; Raspberry Pi; NVIDIA Jetson; Python; Linux; WebSockets
In this tutorial, participants will be introduced to various ways in which speech-to-text (STT) inference can be performed on non-cloud, local (i.e. edge-computing) architectures. Participants will gain knowledge and practical competence in running two different types of ASR systems (DeepSpeech and Random Forests) on three different hardware architectures: Raspberry Pi Zero (armv6), Raspberry Pi 4 (armv7, without CUDA) and NVIDIA Jetson Xavier (armv8 / aarch64, with CUDA). Over the 90 minutes of the hands-on tutorial, participants will see each of the three hardware platforms turned into a low-cost local STT inference engine.
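As a small illustration of why the three platforms need to be treated separately, the sketch below reports which of the architecture classes named above the current machine belongs to. It is not part of the tutorial material: the function name `classify_arch`, the `ARCH_CLASSES` table and its labels are hypothetical, and the mapping assumes that Linux reports these boards as `armv6l`, `armv7l` and `aarch64`, which depends on the installed operating system.

```python
import platform

# Assumed mapping from the strings returned by platform.machine() to the
# three architecture classes named in the tutorial description.
# The labels are illustrative and not taken from the tutorial material.
ARCH_CLASSES = {
    "armv6l": "armv6 (e.g. Raspberry Pi Zero)",
    "armv7l": "armv7 (e.g. Raspberry Pi 4 with a 32-bit OS)",
    "aarch64": "armv8 / aarch64 (e.g. NVIDIA Jetson Xavier)",
    "arm64": "armv8 / aarch64 (e.g. NVIDIA Jetson Xavier)",
}


def classify_arch(machine=None):
    """Return a label for the given machine string (default: this host)."""
    if machine is None:
        machine = platform.machine()
    key = machine.lower()
    # Anything outside the table is reported as-is under "other".
    return ARCH_CLASSES.get(key, "other ({})".format(machine))


if __name__ == "__main__":
    print(classify_arch())
```

A check of this kind could help select a matching pre-built inference package or model format for each board. Whether CUDA is actually usable has to be verified separately, since the machine string alone does not reveal it.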
