Game Changer: VitalPBX’s AI Operators Reshaping Call Management

Introduction to AI Operators

AI operators are transforming how businesses handle calls, offering innovative solutions that enhance both customer experience and operational efficiency. In the digital era, efficiency and effectiveness in communication management are crucial for business success. 

Increased Efficiency in Call Management:

AI operators can handle a significantly higher volume of calls simultaneously compared to traditional systems. This results in shorter waiting times for customers and quick resolution of their concerns. Additionally, AI can more accurately direct calls to the appropriate department or individual, reducing transfer errors and increasing overall efficiency.

Improving Customer Experience:

AI offers quick and accurate responses, tailored to the specific needs of each customer. This personalization significantly enhances customer satisfaction as clients feel heard and well-serviced. AI’s ability to learn from previous interactions allows for more intuitive and effective service.

Reducing Operational Costs:

Implementing AI operators in call management can lead to a notable reduction in labor and training costs. By delegating repetitive, high-volume tasks to AI, businesses can optimize their workforce and focus on areas that require more direct human touch.

24/7 Availability:

Unlike human employees, AI operators can function 24 hours a day, 7 days a week. This means businesses can offer constant assistance to their customers, regardless of time zone or working hours, thereby improving accessibility and customer satisfaction.

Necessary Resources

  1. OpenAI Account API Key. Check our blog post on creating an OpenAI API key.
  2. Microsoft Azure TTS API Key. Check our blog post on creating a Microsoft Azure API key.
  3. VitalPBX 4
  4. Python and some dependencies

1.- Create Asterisk AGI and Dial-Plan

Now we are going to show you how to create the AGI that will do all the magic.

1.1.- Installing dependencies

We install the required dependencies. Note that pydub, used by the prompt-recording script in section 1.7, relies on ffmpeg for MP3 handling, so we include both here.

apt update
apt install python3 python3-pip ffmpeg
pip install azure-cognitiveservices-speech
pip install pyst2
pip install python-dotenv
pip install openai
pip install pydub

If we are going to use the Embedded model, we must install the following dependencies.

pip install langchain==0.0.331rc2
pip install pypdf==3.8.1
pip install docx2txt==0.8
pip install chromadb==0.3.22
pip install tiktoken==0.4.0

Now we must go to the /var/lib/asterisk/agi-bin folder on our VitalPBX server and create the .env file.

cd /var/lib/asterisk/agi-bin
nano .env

1.2.- Creating .env file to save global variables

Copy and paste the following content into your file. Replace the OPENAI_API_KEY, AZURE_SPEECH_KEY, AZURE_SERVICE_REGION, OPENAI_ASSISTANT_ID, OPENAI_AIOPERATOR_INSTRUCTIONS, PATH_TO_DOCUMENTS, and PATH_TO_DATABASE values with your own.

OPENAI_API_KEY = "sk-"
AZURE_SPEECH_KEY = ""
AZURE_SERVICE_REGION = "eastus"
OPENAI_ASSISTANT_ID = "asst_"
OPENAI_AIOPERATOR_INSTRUCTIONS = ""
PATH_TO_DOCUMENTS = ""
PATH_TO_DATABASE = ""
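If you want to confirm these variables are picked up correctly before going further, here is a quick sanity check (a hypothetical helper script, assuming the python-dotenv dependency installed above):

#!/usr/bin/env python3
# check-env.py (hypothetical helper): verify the .env file loads correctly
import os
from dotenv import load_dotenv

load_dotenv("/var/lib/asterisk/agi-bin/.env")
for var in ("OPENAI_API_KEY", "AZURE_SPEECH_KEY", "AZURE_SERVICE_REGION",
            "OPENAI_ASSISTANT_ID", "OPENAI_AIOPERATOR_INSTRUCTIONS"):
    print(var, "OK" if os.environ.get(var) else "MISSING")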

1.3.- Using Script

Next, we use a script that creates all the files for us, which lets us skip steps 1.4, 1.5, 1.6, and 1.7 below.

wget https://raw.githubusercontent.com/VitalPBX/vitalpbx_operator_ai_chatgpt/main/install.sh
chmod +x install.sh
./install.sh

Now we will proceed to create the Voice Guides.

cd /var/lib/asterisk/agi-bin

The format to record a prompt is as follows: ./record-prompt.py file-name "Text to record" language

  • file-name –> the file name without the .mp3 extension; remember that in the AI Operator script, the welcome audio is welcome-en (English) / welcome-es (Spanish), and the wait audio is wait-en (English) / wait-es (Spanish).
  • language –> can be "en-US" or "es-ES".

If you want to add more languages, you must modify the scripts.

Below we show an example of how you should use the script to record the prompt.

./record-prompt.py op_ai_welcome-en "I am your AI Operator, after hearing the tone, could you please tell me the name of the person or the area you wish to communicate with?" "en-US"
./record-prompt.py op_ai_wait-en "Wait a moment please." "en-US"
./record-prompt.py op_ai_transfer-en "Transferring your call, please hold." "en-US"
./record-prompt.py op_ai_short-message-en "Your message is too short, please try again." "en-US"
./record-prompt.py op_ai_user_not_found-en "I'm sorry, we were unable to find the information you requested. Please try again." "en-US"
./record-prompt.py op_ai_welcome-es "Soy su Operador de IA, después de escuchar el tono, ¿podría decirme el nombre de la persona o el área con la que desea comunicarse?" "es-ES"
./record-prompt.py op_ai_wait-es "Espere un momento por favor." "es-ES"
./record-prompt.py op_ai_transfer-es "Transfiriendo su llamada, por favor espere." "es-ES"
./record-prompt.py op_ai_short-message-es "Tu mensaje es demasiado corto, inténtalo de nuevo." "es-ES"
./record-prompt.py op_ai_user_not_found-es "Lo sentimos, no pudimos encontrar la información que solicitaste. Inténtalo de nuevo." "es-ES"

1.4.- Creating the AI Operator with the OpenAI Assistants API

Now we must go to the /var/lib/asterisk/agi-bin folder on our VitalPBX server.

cd /var/lib/asterisk/agi-bin
nano operator-ai.py

Copy and paste the following content into your file.

#!/usr/bin/env python3
import time
import os
import sys
from dotenv import load_dotenv
import openai
from openai import OpenAI
import re
# For Asterisk AGI
from asterisk.agi import *

# Load environment variables from a .env file
load_dotenv("/var/lib/asterisk/agi-bin/.env")
OPENAI_API_KEY = os.environ.get('OPENAI_API_KEY')
OPENAI_ASSISTANT_ID = os.environ.get('OPENAI_ASSISTANT_ID')
OPENAI_AIOPERATOR_INSTRUCTIONS = os.environ.get('OPENAI_AIOPERATOR_INSTRUCTIONS')

client = OpenAI()
thread = client.beta.threads.create()

agi = AGI()

# Read the AGI arguments passed from the dialplan
uniquedid = sys.argv[1] if len(sys.argv) > 1 else None
context = sys.argv[2] if len(sys.argv) > 2 else None
language = sys.argv[3] if len(sys.argv) > 3 else None
tts_engine = sys.argv[4] if len(sys.argv) > 4 else None
free_dial = sys.argv[5] if len(sys.argv) > 5 else None

if uniquedid is None:
    print("No filename provided for the recording.")
    sys.exit(1)

# Build the path for the temporary recording
recording_path = f"/tmp/rec{uniquedid}"

if language == "es":
    azure_language = "es-ES" 
    azure_voice_name = "es-ES-ElviraNeural"
    wait_message = "/var/lib/asterisk/sounds/op_ai_wait-es.mp3"
    transfer_message = "/var/lib/asterisk/sounds/op_ai_transfer-es.mp3"
    short_message = "/var/lib/asterisk/sounds/op_ai_short-message-es.mp3"
    user_not_found = "/var/lib/asterisk/sounds/op_ai_user_not_found-es.mp3"
else:
    azure_language = "en-US" 
    azure_voice_name = "en-US-JennyNeural"
    wait_message = "/var/lib/asterisk/sounds/op_ai_wait-en.mp3"
    transfer_message = "/var/lib/asterisk/sounds/op_ai_transfer-en.mp3"
    short_message = "/var/lib/asterisk/sounds/op_ai_short-message-en.mp3"
    user_not_found = "/var/lib/asterisk/sounds/op_ai_user_not_found-en.mp3"

# Files can also be added to a Message in a Thread. These files are only accessible within this specific thread.
# After having uploaded a file, you can pass the ID of this File when creating the Message.

def main():
    try:
        # Record the caller: 3 seconds of silence ends the recording, 30 seconds max duration, 'y' stops on any DTMF digit
        sys.stdout.write('EXEC Record ' + recording_path + '.wav,3,30,y\n')
        sys.stdout.flush()
        # We await Asterisk's response
        result = sys.stdin.readline().strip()

        if result.startswith("200 result="):

            # Please wait while I search for the extension number.
            agi.appexec('MP3Player', wait_message)
           
            #DEBUG
            agi.verbose("Successful Recording",2)

            # Once everything is fine, we send the audio to OpenAI Whisper to convert it to Text
            openai.api_key = OPENAI_API_KEY
            audio_file = open(recording_path + ".wav", "rb")
            transcript = client.audio.transcriptions.create(
                model="whisper-1", 
                file=audio_file
            )
            chatgpt_question = transcript.text
            chatgpt_question_agi = chatgpt_question.replace('\n', ' ') 

            # If nothing is recorded, Whisper returns "you", so we have to ask again.
            if chatgpt_question == "you":
                # Your message is too short, please try again.
                agi.appexec('MP3Player', short_message)
                agi.verbose("Message too short",2)
                sys.exit(1)

            #DEBUG
            agi.verbose("AUDIO TRANSCRIPT: " + chatgpt_question_agi,2)

            if free_dial == "1":
                # Remove Space and point
                chatgpt_question_remove = ''.join(['' if c in [' ', '.'] else c for c in chatgpt_question_agi])
                chatgpt_question_get_number = re.findall(r'\d+', chatgpt_question_remove)
                # If the user mentions a number in the question, it transfers them to that number immediately.
                if len(chatgpt_question_get_number) >= 1:
                    extension_number = chatgpt_question_get_number[0]
                    agi.verbose("EXTENSION NUMBER: " + extension_number,2)
                    # Transferring your call, please hold.
                    agi.appexec('MP3Player', transfer_message)
                    # Priority to use
                    priority = "1"
                    # Make the transfer
                    agi.set_context(context)
                    agi.set_extension(extension_number)
                    agi.set_priority(priority)
                    sys.exit(1)

            message = client.beta.threads.messages.create(
                thread_id=thread.id,
                role="user",
                content=chatgpt_question
            )

            run = client.beta.threads.runs.create(
                thread_id=thread.id,
                assistant_id=OPENAI_ASSISTANT_ID,
                instructions=OPENAI_AIOPERATOR_INSTRUCTIONS
            )

            runStatus = client.beta.threads.runs.retrieve(
                thread_id=thread.id,
                run_id=run.id
            )

            # Wait for the Assistant run to finish
            while runStatus.status != "completed":
                time.sleep(1)
                runStatus = client.beta.threads.runs.retrieve(
                    thread_id=thread.id,
                    run_id=run.id
                )

            # Get the last message
            messages = client.beta.threads.messages.list(
                thread_id=thread.id
            )

            response = messages.data[0].content[0].text.value

            #DEBUG
            response_agi = response.replace('\n', ' ')
            agi.verbose("OPERATOR AI RESPONSE: " + response_agi,2)

            extensions = re.findall(r'\d+', response_agi)
            if len(extensions) >= 1:
                extension_number = extensions[0]
                agi.verbose("EXTENSION NUMBER: " + extension_number,2)
                # Transferring your call, please hold.
                agi.appexec('MP3Player', transfer_message)
                # Priority to use
                priority = "1"
                # Make the transfer
                agi.set_context(context)
                agi.set_extension(extension_number)
                agi.set_priority(priority)
                sys.exit(1)
            else: # Ask Again
                agi.verbose("Extension number not found.")
                # I'm sorry, we were unable to find the information you requested. Please try again.
                agi.appexec('MP3Player', user_not_found)
                sys.exit(1)
        else:
            agi.verbose("Error while recording: %s" % result)

    except Exception as e:
        agi.verbose("ERROR:" + str(e))

if __name__ == "__main__":
    main()


Once the operator-ai.py file has been saved, we must give it execution permissions.

chmod +x operator-ai.py
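Before placing a call, you can verify the transcription step by itself. This minimal sketch transcribes a recording the same way the AGI does (the WAV path is just an example):

#!/usr/bin/env python3
# Offline test of the Whisper transcription step, outside of any call
import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv("/var/lib/asterisk/agi-bin/.env")
client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Any short WAV recording works here; the path is illustrative
with open("/tmp/test.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
print(transcript.text)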

We can also tune the recording step itself. The Record() application used by the AGI has the following syntax:

Record(filename.format,[silence,[maxduration,[options]]])

  • filename – The name of the file to record.
  • format – The format of the file to be recorded (wav, gsm, etc.).
  • silence – The number of seconds of silence to allow before returning.
  • maxduration – The maximum recording duration in seconds. If missing or 0, there is no maximum.
  • options:
  • a – Append to existing recording rather than replacing.
  • n – Do not answer, but record anyway if line not yet answered.
  • o – Exit when 0 is pressed, setting the variable RECORD_STATUS to 'OPERATOR' instead of 'DTMF'.
  • q – Quiet (do not play a beep tone).
  • s – Skip recording if the line is not yet answered.
  • t – Use alternate '*' terminator key (DTMF) instead of default '#'.
  • u – Don't truncate recorded silence.
  • x – Ignore all terminator keys (DTMF) and keep recording until hangup.
  • k – Keep recorded file upon hangup.
  • y – Terminate recording if any DTMF digit is received.
 
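To map these onto the scripts above: the AGI sends the Record() arguments as a single application string. A small sketch of how that line is built (the values shown are the ones the scripts use; adjust the options string to taste):

#!/usr/bin/env python3
# Illustrative: how the AGI scripts assemble the Record() line they send to Asterisk
import sys

recording_path = "/tmp/rec12345"  # hypothetical unique recording path
silence = "3"        # seconds of trailing silence that end the recording
maxduration = "30"   # hard cap in seconds
options = "y"        # stop on any DTMF digit; add more letters here, e.g. 'k' to keep the file on hangup

sys.stdout.write(f"EXEC Record {recording_path}.wav,{silence},{maxduration},{options}\n")
sys.stdout.flush()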

The "q" option deserves attention: if we add it, no beep is played and the caller will not know when to start speaking.

The "silence" parameter is also important, since it is the pause that marks the end of the caller's question; it should be kept between 2 and 3 seconds.

Since OpenAI Whisper automatically detects the language being spoken, it is not necessary to define it. However, for Azure TTS we must define the language and voice to use; you can pick whichever voice you like best by changing the azure_language and azure_voice_name values in the script.

In this example, two Azure voices are defined, one for Spanish and one for English.
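For instance, to try a different English voice, only the two values in the language block need to change; en-US-GuyNeural below is just one example from the Azure voice catalog:

# Illustrative voice swap for the English branch of the language block
azure_language = "en-US"
azure_voice_name = "en-US-GuyNeural"  # any Azure neural voice can go here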

1.5.- Creating the Embedded Model

We can also use the Embedded model, which is cheaper because the query runs against a local Chroma database on the server. Below are the steps to follow.
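One note before the steps: the embedded script expects a Chroma database already populated at PATH_TO_DATABASE (the install script from section 1.3 handles this). If you need to build it by hand, the sketch below shows roughly how, using the langchain version installed earlier; the loader choice and chunk sizes are assumptions, so adapt them to your documents:

#!/usr/bin/env python3
# Sketch: build the local Chroma database that operator-ai-embedded.py queries
import os
from dotenv import load_dotenv
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

load_dotenv("/var/lib/asterisk/agi-bin/.env")

# Load every PDF found in PATH_TO_DOCUMENTS (docx2txt could be used for .docx files)
documents = []
docs_path = os.environ.get('PATH_TO_DOCUMENTS')
for name in os.listdir(docs_path):
    if name.endswith(".pdf"):
        documents.extend(PyPDFLoader(os.path.join(docs_path, name)).load())

# Split into chunks small enough for retrieval
chunks = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(documents)

# Embed the chunks and persist them where the operator script will look for them
vectordb = Chroma.from_documents(chunks, OpenAIEmbeddings(),
                                 persist_directory=os.environ.get('PATH_TO_DATABASE'))
vectordb.persist()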

cd /var/lib/asterisk/agi-bin
nano operator-ai-embedded.py

Copy and paste the following content into your file.

#!/usr/bin/env python3
import time
import os
import sys
from dotenv import load_dotenv
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
import openai
from openai import OpenAI
import re
# For Asterisk AGI
from asterisk.agi import *

# Load environment variables from a .env file
load_dotenv("/var/lib/asterisk/agi-bin/.env")
OPENAI_API_KEY = os.environ.get('OPENAI_API_KEY')
OPENAI_ASSISTANT_ID = os.environ.get('OPENAI_ASSISTANT_ID')
PATH_TO_DATABASE = os.environ.get('PATH_TO_DATABASE')
OPENAI_AIOPERATOR_INSTRUCTIONS = os.environ.get('OPENAI_AIOPERATOR_INSTRUCTIONS')

embeddings = OpenAIEmbeddings()
vectordb = Chroma(persist_directory=PATH_TO_DATABASE, embedding_function=embeddings)

client = OpenAI()
agi = AGI()

# Read the AGI arguments passed from the dialplan
uniquedid = sys.argv[1] if len(sys.argv) > 1 else None
context = sys.argv[2] if len(sys.argv) > 2 else None
language = sys.argv[3] if len(sys.argv) > 3 else None
tts_engine = sys.argv[4] if len(sys.argv) > 4 else None
free_dial = sys.argv[5] if len(sys.argv) > 5 else None

if uniquedid is None:
    print("No filename provided for the recording.")
    sys.exit(1)

# Build the path for the temporary recording
recording_path = f"/tmp/rec{uniquedid}"

if language == "es":
    azure_language = "es-ES" 
    azure_voice_name = "es-ES-ElviraNeural"
    wait_message = "/var/lib/asterisk/sounds/op_ai_wait-es.mp3"
    transfer_message = "/var/lib/asterisk/sounds/op_ai_transfer-es.mp3"
    short_message = "/var/lib/asterisk/sounds/op_ai_short-message-es.mp3"
    user_not_found = "/var/lib/asterisk/sounds/op_ai_user_not_found-es.mp3"
else:
    azure_language = "en-US" 
    azure_voice_name = "en-US-JennyNeural"
    wait_message = "/var/lib/asterisk/sounds/op_ai_wait-en.mp3"
    transfer_message = "/var/lib/asterisk/sounds/op_ai_transfer-en.mp3"
    short_message = "/var/lib/asterisk/sounds/op_ai_short-message-en.mp3"
    user_not_found = "/var/lib/asterisk/sounds/op_ai_user_not_found-en.mp3"

def main():
    try:
        # Record the caller: 3 seconds of silence ends the recording, 30 seconds max duration, 'y' stops on any DTMF digit
        sys.stdout.write('EXEC Record ' + recording_path + '.wav,3,30,y\n')
        sys.stdout.flush()
        # We await Asterisk's response
        result = sys.stdin.readline().strip()

        if result.startswith("200 result="):

            # Please wait while I search for the extension number.
            agi.appexec('MP3Player', wait_message)
           
            #DEBUG
            agi.verbose("Successful Recording",2)

            # Once everything is fine, we send the audio to OpenAI Whisper to convert it to Text
            openai.api_key = OPENAI_API_KEY
            audio_file = open(recording_path + ".wav", "rb")
            transcript = client.audio.transcriptions.create(
                model="whisper-1", 
                file=audio_file
            )
            chatgpt_question = transcript.text
            chatgpt_question_agi = chatgpt_question.replace('\n', ' ') 

            # If nothing is recorded, Whisper returns "you", so we have to ask again.
            if chatgpt_question == "you":
                # Your message is too short, please try again.
                agi.appexec('MP3Player', short_message)
                agi.verbose("Message too short",2)
                sys.exit(1)

            #DEBUG
            agi.verbose("AUDIO TRANSCRIPT: " + chatgpt_question_agi,2)

            if free_dial == "1":
                # Remove Space and point
                chatgpt_question_remove = ''.join(['' if c in [' ', '.'] else c for c in chatgpt_question_agi])
                chatgpt_question_get_number = re.findall(r'\d+', chatgpt_question_remove)
                # If the user mentions a number in the question, it transfers them to that number immediately.
                if len(chatgpt_question_get_number) >= 1:
                    extension_number = chatgpt_question_get_number[0]
                    agi.verbose("EXTENSION NUMBER: " + extension_number,2)
                    # Transferring your call, please hold.
                    agi.appexec('MP3Player', transfer_message)
                    # Priority to use
                    priority = "1"
                    # Make the transfer
                    agi.set_context(context)
                    agi.set_extension(extension_number)
                    agi.set_priority(priority)
                    sys.exit(1)


            # create our Q&A chain
            pdf_qa = ConversationalRetrievalChain.from_llm(
                ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo'),
                retriever=vectordb.as_retriever(search_kwargs={'k': 1}),
                return_source_documents=True,
                verbose=False
            )

            chat_history = []
            result = pdf_qa(
                {"question": f"{OPENAI_AIOPERATOR_INSTRUCTIONS}: '{chatgpt_question}'", "chat_history": chat_history})

            response = result["answer"]

            #DEBUG
            response_agi = response.replace('\n', ' ')
            agi.verbose("OPERATOR AI RESPONSE: " + response_agi,2)

            extensions = re.findall(r'\d+', response_agi)
            if len(extensions) >= 1:
                extension_number = extensions[0]
                agi.verbose("EXTENSION NUMBER: " + extension_number,2)
                # Transferring your call, please hold.
                agi.appexec('MP3Player', transfer_message)
                # Priority to use
                priority = "1"
                # Make the transfer
                agi.set_context(context)
                agi.set_extension(extension_number)
                agi.set_priority(priority)
                sys.exit(1)
            else: # Ask Again
                agi.verbose("Extension number not found.")
                # I'm sorry, we were unable to find the information you requested. Please try again.
                agi.appexec('MP3Player', user_not_found)
                sys.exit(1)
        else:
            agi.verbose("Error while recording: %s" % result)

    except Exception as e:
        agi.verbose("ERROR:" + str(e))

if __name__ == "__main__":
    main()

Once the operator-ai-embedded.py file has been saved, we must give it execution permissions.

chmod +x operator-ai-embedded.py
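You can also exercise the retrieval chain outside of a call. This minimal sketch uses the same chain configuration as the script (the question string is hypothetical):

#!/usr/bin/env python3
# Offline test of the embedded Q&A chain, outside of any call
import os
from dotenv import load_dotenv
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain

load_dotenv("/var/lib/asterisk/agi-bin/.env")

vectordb = Chroma(persist_directory=os.environ.get('PATH_TO_DATABASE'),
                  embedding_function=OpenAIEmbeddings())
pdf_qa = ConversationalRetrievalChain.from_llm(
    ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo'),
    retriever=vectordb.as_retriever(search_kwargs={'k': 1}),
    return_source_documents=True)

# Hypothetical question; mirrors what the AGI sends after transcription
result = pdf_qa({"question": "What is the extension for the sales department?",
                 "chat_history": []})
print(result["answer"])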

The Record() parameters and options, along with the notes on the "q" option, the "silence" value, and the Azure voices, are the same as described in section 1.4 and apply to this script as well.

1.6.- Creating the Dial Plan

Now we will create the dial-plan to access the AGI.

cd /etc/asterisk/vitalpbx
nano extensions__71-operator-ai.conf

Copy and paste the following content into your file.

;This is an example of how to use the AI Operator

;For English
exten => *885,1,Answer()
 same => n,Set(INVALIDATTEMPTS=0)
 same => n,MP3Player(/var/lib/asterisk/sounds/op_ai_welcome-en.mp3)
 same => n(AskAgain),AGI(operator-ai.py,${UNIQUEID},"cos-all","en","Azure","1")
 same => n,Set(INVALIDATTEMPTS=$[${INVALIDATTEMPTS}+1])
 same => n,GotoIf($[${INVALIDATTEMPTS}>=3]?invalid)
 same => n,Goto(AskAgain)
 same => n(invalid),Goto(app-termination,hangup,1)
 same => n,Hangup()

;For Spanish
exten => *886,1,Answer()
 same => n,Set(INVALIDATTEMPTS=0)
 same => n,MP3Player(/var/lib/asterisk/sounds/op_ai_welcome-es.mp3)
 same => n(AskAgain),AGI(operator-ai.py,${UNIQUEID},"cos-all","es","Azure","1")
 same => n,Set(INVALIDATTEMPTS=$[${INVALIDATTEMPTS}+1])
 same => n,GotoIf($[${INVALIDATTEMPTS}>=3]?invalid)
 same => n,Goto(AskAgain)
 same => n(invalid),Goto(app-termination,hangup,1)
 same => n,Hangup()

Now reload the Asterisk dialplan; you can then call *885 for English or *886 for Spanish.

asterisk -rx "dialplan reload"
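A note on the last AGI argument ("1"): it enables free dialing, meaning that if the transcription already contains a number, the script extracts it with a regular expression and transfers immediately, without consulting the Assistant. This stand-alone sketch mirrors that extraction (the transcript string is hypothetical):

#!/usr/bin/env python3
# Stand-alone demo of the digit extraction used by the free-dial feature
import re

transcript = "Please connect me to extension 1 0 1."  # hypothetical Whisper output

# Remove spaces and periods so "1 0 1." parses as a single number, then collect digit runs
cleaned = ''.join('' if c in (' ', '.') else c for c in transcript)
numbers = re.findall(r'\d+', cleaned)

if numbers:
    print("Transferring to extension", numbers[0])  # -> 101
else:
    print("No extension mentioned; the question goes to the Assistant instead")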

1.7.- Creating voice guides

To record our own prompt, we are going to create the following script.

cd /var/lib/asterisk/agi-bin
nano record-prompt.py

Copy and paste the following content into your file.

#!/usr/bin/env python3
import sys
import os
import time
from dotenv import load_dotenv 
from pydub import AudioSegment
import azure.cognitiveservices.speech as speechsdk

load_dotenv("/var/lib/asterisk/agi-bin/.env")
AZURE_SPEECH_KEY = os.environ.get('AZURE_SPEECH_KEY')
AZURE_SERVICE_REGION = os.environ.get('AZURE_SERVICE_REGION')

# The format to record a prompt is as follows:
# ./record-prompt.py file-name "Text to record" language
# file-name --> the file name without the .mp3 extension; remember that in the AI Operator script, the welcome audio is welcome-en (English) / welcome-es (Spanish), and the wait audio is wait-en (English) / wait-es (Spanish).
# language --> can be "en-US" or "es-ES"
# If you want to add more languages you must modify the scripts

# Read the command-line arguments
audio_name = sys.argv[1] if len(sys.argv) > 1 else None
audio_text = sys.argv[2] if len(sys.argv) > 2 else None
language = sys.argv[3] if len(sys.argv) > 3 else None

if audio_name is None:
    print("No filename provided for the recording.")
    sys.exit(1)

if audio_text is None:
    print("No text to record audio.")
    sys.exit(1)

if language == "es-ES":
    azure_language = "es-ES" 
    azure_voice_name = "es-ES-ElviraNeural"
else:
    azure_language = "en-US" 
    azure_voice_name = "en-US-JennyNeural"

audio_path = f"/var/lib/asterisk/sounds/{audio_name}.mp3"
print(audio_path)

def main():

    # Sets API Key and Region
    speech_config = speechsdk.SpeechConfig(subscription=AZURE_SPEECH_KEY, region=AZURE_SERVICE_REGION)

    # Sets the synthesis output format.
    # The full list of supported format can be found here:
    # https://docs.microsoft.com/azure/cognitive-services/speech-service/rest-text-to-speech#audio-outputs
    speech_config.set_speech_synthesis_output_format(speechsdk.SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3)

    # Select synthesis language and voice
    # Set either the `SpeechSynthesisVoiceName` or `SpeechSynthesisLanguage`.
    speech_config.speech_synthesis_language = azure_language 
    speech_config.speech_synthesis_voice_name = azure_voice_name

    # Creates a speech synthesizer using file as audio output.
    # Replace with your own audio file name.
    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)
    result = speech_synthesizer.speak_text_async(audio_text).get()
 
    # Despite the method name, save_to_wav_file writes the stream in the MP3 output format selected above
    stream = speechsdk.AudioDataStream(result)
    stream.save_to_wav_file(audio_path)

    # Path to the original MP3 file and path for the trimmed file
    original_file = audio_path
    trimmed_file = "/tmp/tmp.mp3"

    # Load the original audio file in MP3 format
    audio = AudioSegment.from_mp3(original_file)

    # Get the total duration of the file in milliseconds
    total_duration = len(audio)

    # Calculate the new duration, trimming the trailing silence the TTS engine leaves at the end
    new_duration = total_duration - 750  # Subtract 750 milliseconds

    # Trim the audio file
    trimmed_audio = audio[:new_duration]

    # Save the trimmed file as MP3
    trimmed_audio.export(trimmed_file, format="mp3")

    # Remove the original file
    os.remove(original_file)

    # Rename the trimmed file to the original file name
    os.rename(trimmed_file, original_file)

if __name__ == "__main__":
    main()


We proceed to give execution permissions.

chmod +x record-prompt.py

The format to record a prompt is as follows:

./record-prompt.py file-name "Text to record" language

  • file-name –> the file name without the .mp3 extension; remember that in the AI Operator script, the welcome audio is welcome-en (English) / welcome-es (Spanish), and the wait audio is wait-en (English) / wait-es (Spanish).
  • language –> can be "en-US" or "es-ES".
  • If you want to add more languages, you must modify the scripts.

 

Below we show an example of how you should use the script to record the prompt.
./record-prompt.py op_ai_welcome-en "I am your AI Operator, after hearing the tone, could you please tell me the name of the person or the area you wish to communicate with?" "en-US"
./record-prompt.py op_ai_wait-en "Wait a moment please." "en-US"
./record-prompt.py op_ai_transfer-en "Transferring your call, please hold." "en-US"
./record-prompt.py op_ai_short-message-en "Your message is too short, please try again." "en-US"
./record-prompt.py op_ai_user_not_found-en "I'm sorry, we were unable to find the information you requested. Please try again." "en-US"
./record-prompt.py op_ai_welcome-es "Soy su Operador de IA, después de escuchar el tono, ¿podría decirme el nombre de la persona o el área con la que desea comunicarse?" "es-ES"
./record-prompt.py op_ai_wait-es "Espere un momento por favor." "es-ES"
./record-prompt.py op_ai_transfer-es "Transfiriendo su llamada, por favor espere." "es-ES"
./record-prompt.py op_ai_short-message-es "Tu mensaje es demasiado corto, inténtalo de nuevo." "es-ES"
./record-prompt.py op_ai_user_not_found-es "Lo sentimos, no pudimos encontrar la información que solicitaste. Inténtalo de nuevo." "es-ES"

Limitations of AI Agents in Telephony

  1. Data Dependency: The effectiveness of an AI agent heavily relies on the quality and volume of data it has been trained on. If the data set is limited or biased, the agent might malfunction or make erroneous decisions.
  2. Complexity and Cost: Implementing AI solutions in telephony systems may require significant investments in terms of hardware, development, and training.
  3. Privacy Concerns: AI agents processing calls or messages might have access to personal or sensitive information. This raises concerns about data privacy and security.
  4. Limited Human Interaction: While AI agents can handle many tasks autonomously, there are still situations that require the human touch. Over-reliance on AI can lead to customer frustration if they can’t connect with a real person when needed.
  5. Adaptability and Learning: While AI can learn and adapt over time, it may initially not be prepared to handle atypical or emerging situations that weren’t present in its training data.
  6. Resource Consumption: Some AI solutions, especially the more advanced ones, might require a significant amount of computational resources, influencing infrastructure and operational costs.
  7. Errors and Misunderstandings: AI agents might misinterpret verbal commands or contexts, especially in noisy environments or with varied accents and dialects.
  8. Updates and Maintenance: AI technology evolves quickly. This means implemented solutions might require frequent updates, implying a constant commitment to resources.
  9. Ethical Considerations: Using AI in telephony might lead to ethical debates, especially concerning recordings, sentiment analysis, and other aspects that might be perceived as intrusive.
  10. Delay in response: Sometimes you will notice a delay in the response as it is necessary to convert the question from audio to text (Whisper), send it to ChatGPT, wait for the response and then convert the response to audio again.

 

Despite these limitations, it’s undeniable that AI holds the potential to transform the telephony industry, offering significant improvements in efficiency and customer experience. However, it’s crucial to address these limitations to ensure successful and user-centric implementation.

Conclusion

Implementing AI operators in call management offers numerous benefits, from improving efficiency and customer experience to reducing costs and gathering valuable data. Businesses adopting this technology are well-positioned to lead in the digital transformation era, offering superior customer service and more agile operations.

Are you ready to experience the revolution in call management? Implement the AI operator in your VitalPBX and transform communication in your company.

Source Code

The complete source code is available in the GitHub repository used by the install script above: https://github.com/VitalPBX/vitalpbx_operator_ai_chatgpt
