Giving IBM’s Watson Speech-To-Text a try on FreePBX
If you’ve ever read computer-transcribed voicemails, you know they can range from utterly unintelligible to downright hilarious. But oftentimes you can glean just enough from the transcript to get a sense of what the call is about. Plus, you can easily tell which of your colleagues and friends are, in fact, robots from another planet by their suspiciously accurate voicemail transcripts.
Welcoming your robot overlord:
Starting with instructions we found on Nerdvittles, we set out to amuse ourselves. Assuming you already have voicemail-to-email with attachments working, the process works something like this:
- Set up an IBM Bluemix account
- Compile and install some dependencies
- Grab some scripts, make a few tweaks
- Tell FreePBX to use the scripts to send voicemails
We’ve seen various recipes for connecting FreePBX and/or Asterisk with speech-to-text (STT) engines before. This particular method was surprisingly easy, and the results are not just entertaining, but also pretty useful. Though not entirely free, IBM’s STT service is exceptionally cheap. The first 1,000 minutes per month are free, and it’s only 2¢ per minute thereafter.
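To put the pricing in perspective, here is a minimal sketch that estimates a monthly bill from the figures quoted above (1,000 free minutes, then 2¢ per minute). The function name `stt_cost_cents` is ours, and the rates are hard-coded from this article, so check IBM's current pricing before relying on it.

```shell
# Estimate a monthly Watson STT bill in cents.
# Assumes the pricing quoted in this article: 1,000 free minutes,
# then 2 cents per minute. Takes whole minutes as its only argument.
stt_cost_cents() {
    minutes=$1
    free_minutes=1000
    cents_per_minute=2
    if [ "$minutes" -le "$free_minutes" ]; then
        echo 0
    else
        echo $(( (minutes - free_minutes) * cents_per_minute ))
    fi
}

# Example: 1,500 minutes of voicemail leaves 500 billable minutes.
stt_cost_cents 1500
```

In other words, an office would need more than 16 hours of voicemail in a month before paying anything.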
Perhaps a bit too literal
While we find IBM’s STT engine to be nearly as good at actually interpreting human speech as Google’s, it’s (perhaps unsurprisingly) far less contextually accurate. For example, whereas Google would interpret a person speaking a series of 10 digits as a phone number and transcribe it as “4145551234”, Watson will transcribe it as “four one four five five five one two three four”. Sometimes it gets confused by sound-alikes, and “2025551234” becomes “two oh two five five five won too three four”.
Check out an example.
What I said: “Hello, this is Nate from freepbxhosting.com testing Watson’s transcription capabilities. Please call me back when you get this at 414-555-1234. Thanks!”
What Watson heard:
hello this is Nate from free PP X. hosting dot com testing Watson’s transcription capabilities please call me back when you get this at four one four five five five one two three four thanks
What Google heard:
“Hello, this is nate from free PBx hosting.com testing Google Transcription capabilities. Please call me back when you get this at 414 555 1234. Thanks.”
All in all, not bad. As expected, Watson and Google both had problems with a somewhat uncommon proper name (FreePBX) but both were able to produce a usable result. Bear in mind I was taking care to speak clearly and at an even pace. Your results will almost certainly vary, significantly and often hilariously.
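If the spelled-out digits bother you, a post-processing step could turn them back into numerals. The sketch below is not part of the scripts used in this article. It is a hypothetical filter, `normalize_digits`, that collapses any run of seven or more consecutive digit words (including the sound-alikes "oh", "won" and "too") into a number and leaves shorter runs alone so ordinary sentences are not mangled. The threshold of seven is an arbitrary assumption.

```shell
# Collapse long runs of spoken digits into numerals.
# Reads text on stdin, writes the normalized text to stdout.
normalize_digits() {
    awk '
    BEGIN {
        split("zero one two three four five six seven eight nine", names, " ")
        for (i = 1; i <= 10; i++) digit[names[i]] = i - 1
        # Sound-alikes seen in transcripts, plus "oh" for zero.
        digit["oh"] = 0; digit["won"] = 1; digit["too"] = 2
    }
    # Emit the buffered run: numerals if it is long enough to be a
    # phone number, otherwise the original words unchanged.
    function flush_run() {
        if (n >= 7) out = out sep nums
        else if (n > 0) out = out sep words
        if (n > 0) sep = " "
        n = 0; nums = ""; words = ""
    }
    {
        out = ""; sep = ""; n = 0; nums = ""; words = ""
        for (i = 1; i <= NF; i++) {
            w = tolower($i)
            if (w in digit) {
                nums = nums digit[w]
                words = (n ? words " " : "") $i
                n++
            } else {
                flush_run()
                out = out sep $i
                sep = " "
            }
        }
        flush_run()
        print out
    }'
}

echo "call me back at four one four five five five one two three four thanks" | normalize_digits
```

Words like "to" and "for" are deliberately left out of the lookup table, since they show up in normal speech far more often than in phone numbers.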
The script that intercepts the voicemail notification, grabs the audio, and sends it to IBM is based on a script originally written to transcode the WAV email attachment to MP3, hence the LAME dependency. At some point down the road, we might alter the script to remove the MP3 conversion (and with it the LAME dependency), as it’s largely unnecessary for our purposes. But some people might appreciate it.
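For a rough idea of what the "send it to IBM" step involves, here is a sketch of posting a WAV file to the Speech to Text REST endpoint with curl. This is not the contents of the sendmailmp3-bluemix script. The variable names, the function names and the default URL are our assumptions: use the `url`, `username` and `password` values from your own Bluemix service credentials. The transcript extraction is a crude text match that assumes the transcript contains no escaped quotes.

```shell
# Placeholders: replace with the values from your Bluemix STT credentials.
STT_URL="${STT_URL:-https://stream.watsonplatform.net/speech-to-text/api/v1/recognize}"
STT_USERNAME="${STT_USERNAME:-your-username}"
STT_PASSWORD="${STT_PASSWORD:-your-password}"

# POST a WAV file to the recognize endpoint and print the raw JSON reply.
transcribe_wav() {
    curl -s -X POST \
        -u "${STT_USERNAME}:${STT_PASSWORD}" \
        -H "Content-Type: audio/wav" \
        --data-binary @"$1" \
        "$STT_URL"
}

# Pull every "transcript" value out of the JSON reply, one per line.
# Crude on purpose: a real script might use a JSON parser instead.
extract_transcripts() {
    grep -o '"transcript": *"[^"]*"' | sed 's/^"transcript": *"//; s/"$//'
}

# Usage (requires network access and valid credentials):
#   transcribe_wav /path/to/message.wav | extract_transcripts
```

Keeping the HTTP call and the parsing in separate functions makes it easy to test the parsing against a saved response without spending any of your free minutes.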
We did have to make a few minor adjustments to make this procedure work properly on the latest versions of FreePBX 13 and 14.
The Procedure:
Get Bluemix.
Before you start, you need to create an IBM Bluemix account, set up an organization & space, and add the Watson Speech to Text service. You’ll need to start a free 30-day trial, after which point you’ll need to give IBM your credit card info if you want to keep using the service. Once you set up Bluemix, grab your STT service credentials and read on.
Install Dependencies: Lame (FreePBX 13 only; lame is included in FreePBX 14):
- Download: wget http://sourceforge.net/projects/lame/files/lame/3.99/lame-3.99.5.tar.gz/download
- Extract: tar xf download
- Compile & Install:
cd lame-3.99.5
./configure
make
make install
Dos2Unix:
yum install -y dos2unix
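Before moving on, it can save some head-scratching to confirm that both tools are actually on the PATH. A minimal sketch, where `check_deps` is a helper name of our own:

```shell
# Report any command from the argument list that is not on the PATH.
# Returns 0 if everything was found, 1 otherwise.
check_deps() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

check_deps lame dos2unix || echo "Install the missing tools before continuing."
```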
Set up FreePBX:
1. SSH into the server and run the following to install the sendmail scripts:
cd /usr/local/sbin
wget https://freepbxhosting.com/wp-content/uploads/sendmailmp3-bluemix.tar.gz
tar zxvf sendmailmp3-bluemix.tar.gz
rm -f sendmailmp3-bluemix.tar.gz
2. Edit sendmailmp3-bluemix and insert Bluemix STT credentials on lines 28 and 29. Save file.
3. Choose Settings -> Voicemail Admin -> Settings in the GUI:
4. In the format field, insert:
wav|wav49
5. Go to the Email tab
6. In the mailcmd field, insert:
/usr/local/sbin/sendmailmp3-bluemix
7. Click Submit to save your settings and then click “Apply Changes” to reload FreePBX.
8. Place a test call to the extension and record a voicemail when prompted. Your message should be transcribed and delivered via email.
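If the test message arrives without a transcript, one thing worth checking is whether the two GUI settings made it into the Asterisk configuration. The sketch below assumes FreePBX writes them to /etc/asterisk/voicemail.conf as `format` and `mailcmd` (the standard Asterisk option names). The helper `check_vm_setting` is ours, and it does a plain string comparison, so a trailing comment on the line would count as a mismatch.

```shell
# Succeed if the last "key = value" line for the given key in the config
# file matches the expected value exactly.
# $1 = config file, $2 = key, $3 = expected value
check_vm_setting() {
    actual=$(sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1)
    [ "$actual" = "$3" ]
}

conf=/etc/asterisk/voicemail.conf   # assumed location
if [ -r "$conf" ]; then
    check_vm_setting "$conf" format 'wav|wav49' \
        || echo "format is not wav|wav49"
    check_vm_setting "$conf" mailcmd /usr/local/sbin/sendmailmp3-bluemix \
        || echo "mailcmd does not point at sendmailmp3-bluemix"
fi
```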
Credits & Acknowledgements:
- Instructions based on this Nerdvittles article: http://nerdvittles.com/?p=21703
- sendmailmp3: https://github.com/NicolasBernaerts/debian-scripts/blob/master/asterisk/sendmailmp3