Speech recognition, as the name suggests, refers to the automatic recognition of human speech. Speech recognition is one of the most important tasks in the domain of human-computer interaction. If you have ever interacted with Alexa or have ever ordered Siri to complete a task, you have already experienced the power of speech recognition.

Speech recognition has various applications, ranging from automatic transcription of speech data (like voice-mails) to interacting with robots via speech. In this tutorial, you will see how we can develop a very simple speech recognition application that is capable of recognizing speech from audio files, as well as live from a microphone.

Several speech recognition libraries have been developed in Python. However, we will be using the SpeechRecognition library, which is the simplest of all the libraries.

Installing SpeechRecognition Library

Execute the following command to install the library:

    $ pip install SpeechRecognition

Speech Recognition from Audio Files

In this section, you will see how we can convert speech from an audio file to text. The audio file that we will be using as input can be downloaded from this link. Download the file to your local file system.

The first step, as always, is to import the required libraries. In this case, we only need to import the speech_recognition library that we just installed.

    import speech_recognition as speech_recog

To convert speech to text, the one and only class we need is the Recognizer class from the speech_recognition module. Depending upon the underlying API used to convert speech to text, the Recognizer class has the following methods:

recognize_bing(): Uses Microsoft Bing Speech API.
recognize_google(): Uses Google Speech API.
recognize_google_cloud(): Uses Google Cloud Speech API.
recognize_houndify(): Uses Houndify API by SoundHound.
recognize_ibm(): Uses IBM Speech to Text API.
recognize_sphinx(): Uses PocketSphinx API.

Among all of the above methods, only the recognize_sphinx() method can be used offline to convert speech to text.

To recognize speech from an audio file, we have to create an object of the AudioFile class of the speech_recognition module. The path of the audio file that you want to convert to text is passed to the constructor of the AudioFile class. Execute the following script:

    sample_audio = speech_recog.AudioFile('E:/Datasets/my_audio.wav')

In the above code, update the path to the audio file that you want to transcribe.

We will be using the recognize_google() method to transcribe our audio files. However, the recognize_google() method requires an AudioData object of the speech_recognition module as a parameter. To convert our audio file to an AudioData object, we can use the record() method of the Recognizer class. We need to create a Recognizer object and pass the AudioFile object to its record() method, as shown below:

    recog = speech_recog.Recognizer()
    with sample_audio as audio_file:
        audio_content = recog.record(audio_file)

Now if you check the type of the audio_content variable, you will see that it has the type speech_recognition.AudioData.

Now we can simply pass the audio_content object to the recognize_google() method of the Recognizer object and the audio file will be converted to text. Execute the following script:

    recog.recognize_google(audio_content)

Output:

    'Bristol O2 left shoulder take the winding path to reach the lake no closely the size of the gas tank degrees office 30 face before you go out the race was badly strained and hung them the stray cat gave birth to kittens the young girl gave no clear response the meal was called before the bells ring what weather is in living'

The above output shows the text of the audio file. You can see that the file has not been 100% correctly transcribed, yet the accuracy is pretty reasonable.

Instead of transcribing the complete speech, you can also transcribe a particular segment of the audio file.
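To transcribe only part of an audio file with the SpeechRecognition library, the record() method of the Recognizer class accepts offset and duration arguments, both in seconds. The following is a minimal sketch rather than a definitive implementation. The function names transcribe_segment and wav_duration_seconds are hypothetical helpers introduced here for illustration, and the sketch assumes a WAV input file and an internet connection, since recognize_google() calls a web API.

```python
import wave


def wav_duration_seconds(path):
    """Return the length of a WAV file in seconds (standard library only).

    Hypothetical helper: useful for picking a sensible offset and duration.
    """
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def transcribe_segment(path, offset=0.0, duration=None):
    """Transcribe only part of an audio file (hypothetical helper).

    offset: seconds to skip from the start of the file.
    duration: seconds to read after the offset (None reads to the end).
    """
    # Imported here so the duration helper above works even when the
    # SpeechRecognition library is not installed.
    import speech_recognition as speech_recog

    recog = speech_recog.Recognizer()
    with speech_recog.AudioFile(path) as audio_file:
        # record() returns an AudioData object holding only the chosen segment.
        audio_content = recog.record(audio_file, offset=offset, duration=duration)

    # Sends the segment to the Google Speech API and returns the text.
    return recog.recognize_google(audio_content)
```

For example, calling transcribe_segment('E:/Datasets/my_audio.wav', offset=2, duration=5) would skip the first two seconds and transcribe the next five, assuming the file exists at that path. Note that cutting a segment in the middle of a word can reduce the accuracy of the transcription near the segment boundaries.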