Introduction
In recent years, voice interfaces have become increasingly popular, allowing users to interact with devices using natural language. One essential component of voice interfaces is speech-to-text conversion, which translates spoken words into written text. In this blog post, we’ll explore how to implement a speech-to-text converter on a Brainy Pi using Python and a microphone. This guide is for developers with a basic understanding of Python who want to implement speech recognition on Brainy Pi.
Prerequisites
Before we begin, ensure you have the following:
A Brainy Pi Dev Board
A USB microphone
An Internet connection
Basic knowledge of Python and Brainy Pi
Setting Up the Environment
To start, we need to set up our Brainy Pi environment. Follow these steps:
Connect a compatible USB microphone to your Brainy Pi.
Open a terminal or SSH into your Brainy Pi.
Now, update the package list by running the following command:
sudo apt-get update
Install the necessary dependencies by executing the following commands:
sudo apt-get install portaudio19-dev python3-dev flac
pip install SpeechRecognition
pip install PyAudio
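Before moving on, it can help to confirm that both Python packages are importable. The snippet below is a minimal sketch and not part of the original steps: missing_modules is a hypothetical helper name, and it only checks that the modules can be found, not that the microphone works. Note that the import names differ from the pip package names.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # pip package -> import name:
    # SpeechRecognition -> speech_recognition, PyAudio -> pyaudio
    missing = missing_modules(["speech_recognition", "pyaudio"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If a module is reported missing, re-running the matching pip install command is a reasonable first step.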
Implementation Steps
Now that our environment is ready, let’s implement the speech-to-text converter on the Brainy Pi using Python:
Open your favorite text editor and create a new Python script. Let’s call it speech_to_text.py.
First, import the SpeechRecognition library, which provides speech recognition in Python.
import speech_recognition as sr
Create a recognizer object to recognize speech.
r = sr.Recognizer()
Once the recognizer is initialized, get the list of available microphones on the system using the list_microphone_names() method.
mic_list = sr.Microphone.list_microphone_names()
Iterate through the list of microphones and look for the USB mic. Set the device_id variable to its index in mic_list.
mic_name = "USB PnP Sound Device: Audio (hw:2,0)"
device_id = None  # stays None (the default microphone) if no name matches
for i, microphone_name in enumerate(mic_list):
    if microphone_name == mic_name:
        device_id = i
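The exact-name comparison only succeeds if the device string matches character for character, and the trailing ALSA part such as (hw:2,0) may differ on your board. As a hedged alternative, the sketch below matches on a keyword instead. find_microphone_index is a hypothetical helper, not part of the SpeechRecognition library, and the example names are illustrative only.

```python
def find_microphone_index(mic_names, keyword):
    """Return the index of the first name containing keyword, or None.

    The comparison is case-insensitive; empty or None names are skipped.
    """
    keyword = keyword.lower()
    for index, name in enumerate(mic_names):
        if name and keyword in name.lower():
            return index
    return None


if __name__ == "__main__":
    # Illustrative names; on the board, pass
    # sr.Microphone.list_microphone_names() instead.
    example_names = [
        "Built-in Audio (hw:0,0)",
        "USB PnP Sound Device: Audio (hw:2,0)",
    ]
    print("Matched index:", find_microphone_index(example_names, "usb pnp"))
```

In the script, this could be used as device_id = find_microphone_index(mic_list, "USB PnP"), followed by a check for None before opening the microphone.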
Now let’s capture audio input from the selected microphone and store it in the audio variable.
with sr.Microphone(device_index=device_id) as source:
    print("Say something...")
    r.adjust_for_ambient_noise(source, duration=0.2)
    audio = r.listen(source)
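Called without arguments, listen() waits until it detects speech, so the script can appear to hang in a quiet room. The sketch below shows one way to bound the wait, using the timeout and phrase_time_limit parameters of listen() and the library's WaitTimeoutError. capture_phrase and its default values are assumptions chosen for illustration, and the import sits inside the function only so the file can be loaded on a machine without the library.

```python
def capture_phrase(device_index=None, wait_seconds=5, max_phrase_seconds=10):
    """Record one phrase and return it, or None if no speech started in time."""
    # Imported here so this module loads even where the library is absent.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone(device_index=device_index) as source:
        # Sample background noise briefly to calibrate the energy threshold.
        recognizer.adjust_for_ambient_noise(source, duration=0.2)
        try:
            # timeout: seconds to wait for speech to begin.
            # phrase_time_limit: maximum seconds to record once it has begun.
            return recognizer.listen(
                source,
                timeout=wait_seconds,
                phrase_time_limit=max_phrase_seconds,
            )
        except sr.WaitTimeoutError:
            return None
```

A call such as audio = capture_phrase(device_index=device_id) would then return None instead of blocking indefinitely when nobody speaks.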
Convert the captured audio to text using the Google Speech Recognition API.
try:
    text = r.recognize_google(audio)
    print("You said:", text)
except sr.UnknownValueError:
    print("Speech recognition could not understand audio.")
except sr.RequestError as e:
    print("Could not request results from Speech Recognition service; {0}".format(e))
To handle errors, we catch two exceptions: sr.UnknownValueError, raised when the speech is unintelligible, and sr.RequestError, raised when the API cannot be reached or the request fails. This keeps the program from crashing in those two cases.
If recognition succeeds, we print the text returned by the Google Speech Recognition API.
Then save the script and exit the text editor.
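Because recognition depends on a network request, a short connection drop surfaces as a request error. If you would rather retry than give up on the first failure, a small generic wrapper is one option. The sketch below is an assumption-labeled illustration: call_with_retries is a hypothetical helper, and the attempt count and delay are arbitrary defaults.

```python
import time


def call_with_retries(func, retry_on, attempts=3, delay_seconds=1.0):
    """Call func() and return its result, retrying on the given exception(s).

    retry_on is an exception class or a tuple of classes. Any other
    exception propagates immediately. After the final failed attempt,
    the last exception is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)
```

With the names from this post, usage could look like text = call_with_retries(lambda: r.recognize_google(audio), retry_on=sr.RequestError). An sr.UnknownValueError is deliberately not retried here, since sending the same unintelligible audio again is unlikely to help.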
Final Code for Speech Recognition on Brainy Pi
Here is the complete code for speech_to_text.py:
import speech_recognition as sr

# Create a recognizer object
r = sr.Recognizer()
mic_list = sr.Microphone.list_microphone_names()

# Select the USB microphone by name
mic_name = "USB PnP Sound Device: Audio (hw:2,0)"
device_id = next((i for i, microphone_name in enumerate(mic_list) if microphone_name == mic_name), None)

# Check if the selected microphone was found
if device_id is None:
    print("Could not find the selected microphone. Available microphones:")
    print(mic_list)
    exit()

# Start listening to the microphone
with sr.Microphone(device_index=device_id) as source:
    print("Say something...")
    # Adjust the recognizer's sensitivity to ambient noise
    r.adjust_for_ambient_noise(source, duration=0.2)
    audio = r.listen(source)

# Convert speech to text
try:
    text = r.recognize_google(audio)
    print("You said:", text)
except sr.UnknownValueError:
    print("Could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Speech Recognition service; {0}".format(e))
Running the code
python3 speech_to_text.py