We are currently embarking on a new project that involves implementing a text-to-speech bot for delivering appointment reminder messages via a call campaign. To better plan our project, we are interested in obtaining cost estimates as well as time estimates for its implementation.
Does anyone know if Genesys offers a cost calculator or any resources that can help us estimate the costs and time required for this particular project?
I have created a calculator for almost all of the resources that we provide to our customers. Basically, for TTS, if you assume that the average speaker uses 1000 characters per minute, you can take the average IVR time and discount it by 15% for user input to get the minutes of TTS that might be used, then multiply those minutes by 1000 to get the number of characters per session. Now multiply that by the number of calls per month and divide by 1,000,000, and you will get the units you need to pay for TTS ($5.00 standard and $20 advanced, per million characters). The other option is to take all the TTS prompts for a given flow, put them into Word, and plug the character count into the same equation. Of course this is just an estimate, but if you do the calculation, you will find that TTS is not really an inexpensive solution after all.
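For anyone who wants to plug in their own numbers, here is a rough Python sketch of the estimate described above. The function and constant names are made up for illustration (this is not a Genesys tool), and the 1000 characters per minute, 15% discount, and per-million prices are just the assumptions from this post, so check them against your own contract.

```python
# Rough TTS cost estimate, following the rule of thumb in the post above.
# All names here are hypothetical; the rates are assumptions from the thread.

CHARS_PER_MINUTE = 1000          # assumed average speaking rate
USER_INPUT_DISCOUNT = 0.15       # share of IVR time spent on caller input
PRICE_PER_MILLION = {"standard": 5.00, "advanced": 20.00}  # USD per 1M chars


def chars_per_session(avg_ivr_minutes: float) -> float:
    """Estimate TTS characters per call from the average IVR time."""
    tts_minutes = avg_ivr_minutes * (1 - USER_INPUT_DISCOUNT)
    return tts_minutes * CHARS_PER_MINUTE


def chars_from_prompts(prompts: list[str]) -> int:
    """Alternative: count the characters of the flow's actual TTS prompts."""
    return sum(len(p) for p in prompts)


def monthly_tts_cost(chars_per_call: float, calls_per_month: int,
                     tier: str = "standard") -> float:
    """Characters per call x calls per month, billed per million characters."""
    units = chars_per_call * calls_per_month / 1_000_000
    return units * PRICE_PER_MILLION[tier]


if __name__ == "__main__":
    # Example inputs (invented): 2-minute average IVR time, 10,000 calls/month.
    per_call = chars_per_session(2.0)
    for tier in PRICE_PER_MILLION:
        cost = monthly_tts_cost(per_call, 10_000, tier)
        print(f"{tier}: ${cost:,.2f} per month")
```

With those example inputs, that is 1.7 minutes of TTS per call, about 1,700 characters per session, and 17 million characters a month. If you already have the prompt text, pass the result of chars_from_prompts to monthly_tts_cost instead of the time-based estimate.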
One thing to consider is a hybrid approach where the most common prompts are recorded and only some of the feedback to the caller is TTS. Some companies think they can just record their TTS output as prompts, but that is usually against the copyright terms that TTS engine companies have in place.
Adding on to Robert's excellent recommendation of a hybrid approach, prototyping your flow with TTS and then later converting the "static" phrases/prompts to pre-recorded user prompts is a super easy way to go from working prototype to cost-efficient production flow, as well.
In Architect, you can even see which user prompts are TTS-only, vs. ones you have localized recordings for.
And taking it a step further, if you haven't already looked at using a cloud platform to generate your prompts, that speeds things up even more (especially if your shop is/was using outside voice talent to record). My company uses speech.microsoft.com for this, and for this use case of generating the user prompts, it's nearly impossible to even use it enough to go into their paid tier. We also use them for TTS, so we do hit the paid tier there, but for generating user prompts you'd upload to Genesys, we've never gone past their free tier.
Another benefit of the hybrid approach is reducing the lag time (dead air) you get while the system is fetching the audio for that TTS. It's close to realtime, but not quite, and the lag is noticeable if your TTS blurb is more than a couple of words.