Discover the Power of OpenAI Whisper for VitalPBX Voicemail Transcription

Introduction

Voicemail transcription with OpenAI Whisper converts a voicemail message left by a caller into text. Because the text version can be read in the user’s email or messaging app, the message is easier to understand without listening to the entire recording.

Voice-to-text transcription is a powerful tool for communication and productivity in both personal and professional settings. By letting users read voice messages quickly, it also helps them stay organized and prioritize their work.

Voicemail transcription can be a good option for several reasons:

  • Saves time: Voice-to-text transcription lets you quickly get the gist of the caller’s message by skimming the text instead of listening to all of it. This is helpful when you are busy and cannot take the time to listen to a voicemail in full.
  • Improves accessibility: Voicemail transcription makes messages available to people who are deaf or hard of hearing. It also helps people who are in a loud environment and cannot listen to the audio.
  • Increases accuracy: Voicemails can sometimes sound garbled or be hard to understand, particularly if the caller speaks unclearly or has a strong accent. Voicemail transcription can help alleviate these issues and provide a clear rendition of the message.
  • Convenience: With voicemail transcription, you can receive your voicemails by email, text message, or by logging into a secure website. This is easier than having to dial your voicemail’s number.

Advantages of OpenAI Whisper over Google Cloud (Speech-to-Text V2)

OpenAI Whisper has a number of advantages; here are the top eight.

  1. Cost: Google Cloud (Speech-to-Text V2) gives us the first 60 minutes free every month, but after those 60 minutes it charges US$0.024 per minute. OpenAI Whisper charges US$0.006 per minute, a quarter of that price. With these prices the two services cost the same at 80 minutes per month (US$0.48); if we are going to use more than that, Whisper from OpenAI is the cheaper option.
  2. Installation: To use OpenAI Whisper, it is not mandatory to install anything, while for Google Cloud (Speech-to-Text V2) we need to install a program on our server.
  3. Obtaining the API key: In Google Cloud (Speech-to-Text V2) this process is quite complex, and we must authenticate the server with a URL generated by a script. In OpenAI Whisper it takes only a few seconds to get the API key and copy it into the file that handles the transcription.
  4. Language detection: OpenAI Whisper can detect the language of the audio, which makes transcription in multiple languages automatic. In Google Cloud (Speech-to-Text V2), the transcription request must be sent with the list of possible languages of the audio, and Google Cloud (Speech-to-Text V2) selects the one that best suits the audio sent.
  5. Greater accuracy: Whisper is designed to produce more accurate speech recognition results than Google Speech-to-Text, because it uses more advanced language models that can detect subtle nuances and ambiguities in speech.
  6. Better context understanding: Whisper is also better able to understand the context in which speech is used, which can improve recognition accuracy. Because it analyzes the meaning of the words being spoken, it copes better with different scenarios and environments.
  7. Greater flexibility and customization: Whisper offers greater flexibility and customization than Google Cloud Speech-to-Text, as users can adapt it to their needs. In addition, Whisper integrates with other services and applications, which improves interoperability.
  8. Support for multiple languages: Whisper can also transcribe and interpret foreign-language speech, which is useful for people who work with speakers of different languages or in multilingual settings.
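
The cost comparison in item 1 can be checked with a little arithmetic. The sketch below uses only the prices quoted there and works in thousandths of a dollar to avoid floating-point rounding; the function names are illustrative and not part of any tool.

```shell
#!/bin/sh
# Prices from item 1, in thousandths of a US dollar per minute.
GOOGLE_RATE=24        # US$0.024 per minute
GOOGLE_FREE_MIN=60    # first 60 minutes per month are free
WHISPER_RATE=6        # US$0.006 per minute

# Format an amount given in thousandths of a dollar, e.g. 480 -> 0.480
fmt_usd() {
    printf '%d.%03d\n' $(( $1 / 1000 )) $(( $1 % 1000 ))
}

# Monthly Google cost for $1 minutes: only minutes beyond the free tier are billed.
google_cost() {
    billable=$(( $1 - GOOGLE_FREE_MIN ))
    if [ "$billable" -lt 0 ]; then
        billable=0
    fi
    fmt_usd $(( billable * GOOGLE_RATE ))
}

# Monthly Whisper cost for $1 minutes: every minute is billed.
whisper_cost() {
    fmt_usd $(( $1 * WHISPER_RATE ))
}

for minutes in 60 80 200; do
    echo "$minutes min/month: Google USD $(google_cost "$minutes"), Whisper USD $(whisper_cost "$minutes")"
done
```

At 60 minutes Google is still free while Whisper costs 0.360; at 80 minutes both come to 0.480; at 200 minutes Google costs 3.360 against 1.200 for Whisper.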


OpenAI Whisper offers substantial benefits over Google Cloud Speech to Text in terms of accuracy, context understanding, flexibility, and support for multiple languages.

However, the selection of either service will depend on the needs of the user and the context in which they will be used.

Preparing VitalPBX to be able to send email

Email Settings

To send voicemail by email from VitalPBX, we need an email account configured in ADMIN/System Settings/Email Settings.

Fill in all the fields and run the email test.

Extension

Now we go to each extension that should send its voicemail by email when a message arrives.

We configure the Email Address in the GENERAL tab.

In the VOICEMAIL tab we enable Voicemail.

Finally, we disable MP3 Attachments in SETTINGS/Voicemail Settings/Voicemail Settings.

Obtaining API Key from OpenAI

Let’s assume that you already have an OpenAI account. The first step is to go to:

https://platform.openai.com/account/api-keys


Press “Create new secret key”

Create the Key and copy it, since it will not be shown again.

Now we return to the console of our VitalPBX server and do the following steps

Install

Install the required dependencies.

    apt install curl apt-transport-https gnupg jq sox flac dos2unix
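
Before continuing, it can help to confirm that the tools are actually on the PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper name, and `gpg` is assumed to be the command provided by the gnupg package.

```shell
#!/bin/sh
# Print the name of every listed command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# apt-transport-https installs no command of its own, so it is not checked here.
missing=$(missing_tools curl jq sox flac dos2unix gpg)
if [ -n "$missing" ]; then
    echo "Still missing:" $missing
else
    echo "All required tools are installed."
fi
```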
				
			

Download the sendmail-openai file and copy it to /usr/sbin/

    wget -P /usr/sbin/ https://raw.githubusercontent.com/VitalPBX/vitalpbx-voicemail-transcription-openai/main/sendmail-openai
				
			

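We edit the /usr/sbin/sendmail-openai file and add the API Key that we copied earlier.

    nano /usr/sbin/sendmail-openai

Inside the file, paste the key between the quotes of API_KEY. Use plain straight quotes, not typographic ones:

    API_KEY=""
    API_URL="https://api.openai.com/v1/audio/transcriptions"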
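
To check that the key works before relying on voicemail delivery, one audio file can be sent to the same endpoint by hand. This is a standalone sketch and not the contents of sendmail-openai: the `transcribe` function name is hypothetical, and it assumes the hosted `whisper-1` model and a JSON response carrying the transcript in a `text` field.

```shell
#!/bin/sh
API_URL="https://api.openai.com/v1/audio/transcriptions"

# transcribe FILE: upload FILE and print the transcribed text.
# Expects the key in the API_KEY environment variable.
transcribe() {
    if [ -z "${API_KEY:-}" ]; then
        echo "API_KEY is not set" >&2
        return 1
    fi
    if [ ! -r "$1" ]; then
        echo "cannot read audio file: $1" >&2
        return 1
    fi
    # -F sends a multipart/form-data POST; jq extracts the "text" field.
    curl -sS "$API_URL" \
        -H "Authorization: Bearer $API_KEY" \
        -F "file=@$1" \
        -F "model=whisper-1" | jq -r '.text'
}

# Example, after exporting API_KEY in the current shell:
# transcribe /path/to/some-recording.wav
```

If the key is rejected, the response carries an error object instead of a transcript, so the command prints `null`; running the curl call without the jq filter shows the full error message.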
				
			

Next, we create the file voicemail__60-general.conf in /etc/asterisk/vitalpbx/

    nano /etc/asterisk/vitalpbx/voicemail__60-general.conf
				
			

And we add the following content.

    [general](+)
    ;You override the default program to send e-mail to use the script
    mailcmd=/usr/sbin/sendmail-openai
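
For scripted installs, the same file can be written without an editor. The sketch below wraps the content above in a here-document; `write_voicemail_override` is a hypothetical helper, and it takes the target path as an argument so it can be tried on a scratch file first.

```shell
#!/bin/sh
# write_voicemail_override PATH: write the mailcmd override to PATH.
write_voicemail_override() {
    cat > "$1" <<'EOF'
[general](+)
;You override the default program to send e-mail to use the script
mailcmd=/usr/sbin/sendmail-openai
EOF
}

# Example (as root on the PBX):
# write_voicemail_override /etc/asterisk/vitalpbx/voicemail__60-general.conf
```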
				
			

Now we assign the corresponding permissions.

    cd /usr/sbin/
    chown asterisk:asterisk sendmail-openai
    chmod +x sendmail-openai
    chmod +x /usr/bin/dos2unix

Finally, we reload the voicemail module.

    asterisk -rx "module reload app_voicemail.so"
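
If no transcribed email arrives, a quick first check is whether the script exists and is executable. A minimal sketch, with `check_mail_script` as a hypothetical helper name:

```shell
#!/bin/sh
# check_mail_script PATH: succeed only if PATH is a regular, executable file.
check_mail_script() {
    if [ ! -f "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    if [ ! -x "$1" ]; then
        echo "not executable: $1"
        return 1
    fi
    echo "ok: $1"
}

check_mail_script /usr/sbin/sendmail-openai \
    || echo "Fix the problem above, then reload app_voicemail.so again."
```

Running `ls -l /usr/sbin/sendmail-openai` afterwards shows whether the owner and group are both asterisk.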
				
			

Conclusions

Overall, voicemail transcription can be a useful tool for anyone who receives voicemails regularly, and it can provide several benefits, including time savings, improved accessibility, increased accuracy, and convenience.
