Genesys Cloud - Main

  • 1.  Capture Caller Speech Input - USA Address

    Posted 21 days ago

    I'm building a call flow where the caller provides a spoken input containing a full USA address. My initial idea was to use the Collect Input action to capture the entire speech input, convert it to text, and store it in a variable. Then, I planned to use external logic to parse the text into street number, street name, city, state, and ZIP code.
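
    The external parsing step described above could be sketched roughly as follows. This is a minimal illustration, not a Genesys Cloud feature: it assumes the transcript arrives as plain text with digits already written as numerals, and the names `parse_us_address`, `ADDRESS_PATTERN`, `STREET_SUFFIXES`, and `STATE_ABBREVIATIONS` are hypothetical. The suffix list and state table are deliberately partial.

```python
import re

# Hypothetical, deliberately partial lookup tables for illustration only.
STATE_ABBREVIATIONS = {"florida": "FL", "texas": "TX", "california": "CA", "new york": "NY"}
STREET_SUFFIXES = r"(?:street|st|avenue|ave|road|rd|boulevard|blvd|drive|dr|lane|ln|way|court|ct)"

# Assumed layout: <number> <street ... suffix> <city> <state> <ZIP>, commas optional.
ADDRESS_PATTERN = re.compile(
    r"^\s*(?P<number>\d+)\s+"
    r"(?P<street>.+?\b" + STREET_SUFFIXES + r")\b\.?,?\s+"
    r"(?P<city>[A-Za-z .]+?),?\s+"
    r"(?P<state>[A-Za-z]{2}|[A-Za-z]+(?:\s[A-Za-z]+)?),?\s+"
    r"(?P<zip>\d{5}(?:-\d{4})?)\s*$",
    re.IGNORECASE,
)


def parse_us_address(text):
    """Split a transcribed US address into components, or return None if it does not fit."""
    match = ADDRESS_PATTERN.match(text)
    if match is None:
        return None
    state = match.group("state").strip()
    if state.lower() in STATE_ABBREVIATIONS:
        # Normalize a known full state name to its two-letter code.
        state = STATE_ABBREVIATIONS[state.lower()]
    elif len(state) == 2:
        state = state.upper()
    return {
        "street_number": match.group("number"),
        "street_name": match.group("street").strip(),
        "city": match.group("city").strip(),
        "state": state,
        "zip": match.group("zip"),
    }


if __name__ == "__main__":
    print(parse_us_address("123 Main Street, Miami, Florida 12345"))
```

    A regex like this only handles tidy input. Real transcripts (spelled-out numbers, missing suffixes, unit numbers) would likely need a dedicated address-parsing library or validation service.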

    However, I've hit a roadblock: it seems Genesys Cloud doesn't provide a way to collect free-form speech input and store it as text in a variable.

    Has anyone implemented something similar? Any suggestions or best practices for achieving this requirement?


    #ArchitectandDesign
    #ConversationalAI(Bots,VirtualAgent,etc.)
    #Implementation
    #Telephony

    ------------------------------
    Mohamed Salim
    ------------------------------


  • 2.  RE: Capture Caller Speech Input - USA Address

    Posted 21 days ago

    Hi Mohamed,

    I think the Collect Input action is meant more for collecting DTMF or specific terms like "Account Balance" than for large amounts of text.

    You could use a bot flow with slots for Street Number, Street Name, City, etc. to capture the information.

    Maybe others in the community have a better way of doing this though.



    ------------------------------
    Sam Jillard
    Online Community Manager/Moderator
    Genesys - Employees
    ------------------------------



  • 3.  RE: Capture Caller Speech Input - USA Address

    Posted 13 days ago

    Challenge with Speech Input for Multiple Slot Types in a Single Utterance

    We've been working on improving the approach to capture speech input that includes business name, U.S. address, and purpose of the call within a single utterance. However, the Virtual Agent is struggling to correctly identify all slot types, particularly the Address SlotType, which fails consistently during live speech input.

    Even though NLU and regex validation tests pass in isolation, the issue occurs when customers provide free-form speech. For example:

    "I am calling from Walmart located at 123 Main Street Miami Florida 12345 to follow up for my shipments."

    In this case:

    • BusinessName and CallerIntent SlotTypes are assigned correctly.
    • AddressSlotType is not being recognized.

    Since we allow callers to speak naturally without restrictions, they often provide all slot values in one sentence. This flexibility seems to be causing recognition challenges for the address component.
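
    For illustration, one way to think about the single-utterance case is to isolate the address span first and treat the text on either side as business name and call purpose. The sketch below is plain post-processing logic, not Virtual Agent configuration. It assumes the raw utterance text is available to external logic (which this thread does not confirm), that the address starts with a street number and ends with a ZIP code, and that callers use a cue such as "located at". The names `split_utterance`, `ADDRESS_SPAN`, `LEAD_IN`, and `LOCATION_CUE` are hypothetical.

```python
import re

# Assumption: the address runs from the first number to the first 5-digit ZIP after it.
ADDRESS_SPAN = re.compile(r"\b\d+\s+.*?\b\d{5}(?:-\d{4})?\b")
# Assumption: callers open with "I am calling from" / "I'm calling from".
LEAD_IN = re.compile(r"^\s*(?:i am|i'm)\s+calling\s+from\s+", re.IGNORECASE)
# Assumption: the business name is followed by "located at" or "at".
LOCATION_CUE = re.compile(r"\s*,?\s*\b(?:located\s+at|at)\s*$", re.IGNORECASE)


def split_utterance(utterance):
    """Return business name, address, and purpose from one utterance, or None."""
    match = ADDRESS_SPAN.search(utterance)
    if match is None:
        return None
    before = utterance[:match.start()]
    after = utterance[match.end():]
    # Strip the opening phrase and the trailing location cue around the business name.
    business = LOCATION_CUE.sub("", LEAD_IN.sub("", before)).strip()
    return {
        "business_name": business,
        "address": match.group(0),
        "purpose": after.strip(" .,"),
    }


if __name__ == "__main__":
    print(split_utterance(
        "I am calling from Walmart located at 123 Main Street Miami Florida 12345 "
        "to follow up for my shipments."
    ))
```

    This breaks down if the business name itself contains digits or the caller omits the ZIP code, so it is a starting point for experiments rather than a production parser.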



    ------------------------------
    Mohamed Salim Nagoor Ghani
    ------------------------------



  • 4.  RE: Capture Caller Speech Input - USA Address
    Best Answer

    Posted 12 days ago
    Edited by Mohamed Salim Nagoor Ghani 8 days ago

    Quick questions @Mohamed Salim Nagoor Ghani:

    Have you tried using an AI-Powered Slot Type for the AddressSlotType within your Virtual Agent? If so, can you tell where it's failing when you replay your phone call using flow execution history? Also worth keeping the Virtual Agent slot authoring recommendations and limitations in mind.



    ------------------------------
    Brian T. Jones | Ascension | Senior Specialist - Technology
    ------------------------------



  • 5.  RE: Capture Caller Speech Input - USA Address

    Posted 10 days ago
    Edited by Mohamed Salim Nagoor Ghani 8 days ago

    Hi Brian, 

    Your suggestion works well for collecting just the address, using an AI-powered slot and the "Ask for Slot" block in the bot flow. But the requirement is to collect all three inputs (name, address, and call purpose) in one utterance. The business name, address, and call purpose in the caller utterance below have to be captured and stored in variables for further logic.

    Caller Speech Input "I am calling from XYZ Company located at 123 Main Street, Miami, Florida 12345 for my CNC machine failure."

    Initially, I tried to meet the exact requirement by using the "Ask for Intent" block and assigning a slot type to each keyword in the utterances. For the speech input given above, the BusinessSlotType and CallerIntentSlotType are assigned, but the StreetAddressSlotType is not. I tried several options (Regex, List, Dynamic List) for StreetAddressSlotType without success; the value is never set.

    After multiple attempts, I ended up collecting the input in two steps: "Ask for Intent" to collect the business name and call purpose, and "Ask for Slot" to collect the business address. This works well, as you suggested, but it doesn't meet the key requirement.



    ------------------------------
    Mohamed Salim Nagoor Ghani
    ------------------------------



  • 6.  RE: Capture Caller Speech Input - USA Address

    Posted 10 days ago

    This would be a good case for Virtual Agents with Guides, since the guide can parse the utterance into multiple variables.



    ------------------------------
    Steve Alix
    EDCi
    ------------------------------



  • 7.  RE: Capture Caller Speech Input - USA Address

    Posted 10 days ago

    Hi Steven, 

    Thank you for your input. We were able to meet this requirement using AI Guide; however, due to token usage cost considerations, we had to discontinue that approach. As an alternative, we tried implementing a BOT virtual agent using the "Ask for Intent" block and storing values in the following slot types: BusinessNameSlotType (list), CallerIntentSlotType (list), and BusinessAddressSlotType (list and RegEx).

    While BusinessNameSlotType is consistently set successfully, BusinessAddressSlotType does not always get populated, even when the caller provides an address that already exists in its list.

    I am able to collect everything with two prompts: first I collect the business name and caller intent using "Ask for Intent", and then I ask for the address using "Ask for Slot". But this does not meet our requirement of a single utterance.



    ------------------------------
    Mohamed Salim Nagoor Ghani
    ------------------------------