Hi David,
The good news is that you're not going crazy! This is totally normal behavior. What you discovered is exactly how NLU models learn, and you actually figured out the solution on your own!
Here's What's Going On:
Even though your RegEx slot knows what a valid pattern looks like (4 digits), the NLU model still needs to see multiple real-world examples to understand how people actually say it. When you only had "ext 1234" as a training phrase, the model was like "okay, I've seen THIS exact thing" but didn't know that other 4-digit combos would work the same way. As you added those 6-8 more examples, the lightbulb went on - "Ohhhh, it's 'ext' plus ANY 4-digit number!"
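Just to make that concrete: the RegEx slot itself is only validating the shape of the value, not teaching the intent model anything. A minimal sketch of what your slot's pattern is effectively checking (illustrative Python, not Genesys configuration — the `\d{4}` pattern is assumed from your "4 numeric digits" description):

```python
import re

# Assumed RegEx slot pattern: exactly four numeric digits, e.g. "1234".
EXTENSION_PATTERN = re.compile(r"^\d{4}$")

def is_valid_extension(text: str) -> bool:
    """Return True if the text is exactly four digits."""
    return bool(EXTENSION_PATTERN.fullmatch(text))

print(is_valid_extension("1234"))  # True
print(is_valid_extension("161"))   # False: only 3 digits
```

The slot pattern answers "is this a valid extension?" — it's the training phrases that answer "does this utterance mean the Dial by extension intent?", which is why more examples help.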
The Sweet Spot:
Genesys recommends 10-20 diverse examples per intent to get things working smoothly. Mix it up with different phrasings:
- "extension 1234"
- "ext 5678"
- "transfer me to extension 2020"
- "I need ext 9876"
- "connect me to extension 3456"
More tips here: https://help.genesys.cloud/articles/best-practices-to-build-and-test-your-natural-language-understanding/
About "extension" becoming "ext": Yeah, speech recognition loves to abbreviate! You handled it perfectly by adding both versions to your training phrases. That's exactly the kind of real-world quirk your training data should cover.
You figured this out perfectly - RegEx slots just need more examples than List slots because they're dealing with variable content. Keep doing what you're doing and those confidence scores will keep improving!
------------------------------
Josh Coyle
Senior Professional Services Consultant
------------------------------
Original Message:
Sent: 04-09-2026 12:15
From: Dave Halderman
Subject: NLU confidence score issues when mapping slots to utterances
I've been fighting with an issue this morning that I haven't noticed before. The goal is for a bot to be able to handle when the caller says "extension 1234". The extension currently could be any of 16 different 4-digit extensions. I wanted to make it more flexible for the future, so the slot type is a RegEx slot that's looking for 4 numeric digits. That part works great. I created a "Dial by extension" intent in my bot and added "extension 1234" as an utterance with the "1234" mapped to the Extension slot.
When I tested it, it was asking me to confirm every time. That was annoying, so I looked at the utterance history to see what it was hearing me say. I was clearly saying "extension 1610", but it was showing "ext 1610" in the utterance history. I'm not sure why it was abbreviating that, but I figured I'd just go add "ext 1234" to my intent, map the numerical part to the slot, and solve the confidence issue. That didn't work. I used the NLU testing to enter "ext " followed by my different extensions. They all got different confidence scores, and all were low enough that it would ask me to confirm every time. When I entered "ext 1234", I got a confidence score of 1, like I expected all of them to get.
I thought when I mapped a slot in an utterance like this, it could do better at abstracting what I was looking for. I was trying to tell it that it was going to hear "ext" or "extension" followed by 4 numeric digits, not literally "1234". Is this not how it works? I entered 6-8 more utterances in my intent. Each had a different extension number as an example. With each one I entered, my NLU confidence scores got better for all of the others that I hadn't entered. Is this normal behavior? I typically use a List slot type with a fairly small list of values. I had never noticed those cases behaving like this.
#ArchitectandDesign
#ConversationalAI(Bots,VirtualAgent,etc.)
------------------------------
Dave Halderman
Business Analyst
------------------------------