Hi David,
The good news is that you're not going crazy! This is totally normal behaviour. What you discovered is exactly how NLU models learn, and you actually figured out the solution on your own!
Here's What's Going On:
Even though your RegEx slot knows what a valid pattern looks like (4 digits), the NLU model still needs to see multiple real-world examples to understand how people actually say it. When you only had "ext 1234" as a training phrase, the model had seen that one exact utterance and had no evidence that other 4-digit combos should be treated the same way. Once you added those 6-8 more examples, the lightbulb went on: "Ohhhh, it's 'ext' plus ANY 4-digit number!"
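To make the split concrete, here is a rough Python sketch of what the RegEx side does on its own. The pattern below is an assumption (all we know from this thread is that the slot matches 4 digits), and the function name is made up for illustration. It is plain Python, not anything from the Genesys tooling.

```python
import re

# Assumed pattern: a standalone run of exactly 4 digits.
# Your actual slot definition may differ.
EXTENSION_PATTERN = re.compile(r"\b\d{4}\b")

def find_extension(utterance):
    """Return the first standalone 4-digit number in the utterance, or None."""
    match = EXTENSION_PATTERN.search(utterance)
    return match.group(0) if match else None

print(find_extension("ext 1234"))         # a valid 4-digit value
print(find_extension("I need ext 9876"))  # same pattern, different phrasing
print(find_extension("ext 12345"))        # 5 digits, so no standalone 4-digit match
```

Notice that the pattern only answers "is this a valid value?". It says nothing about which phrasings signal that the caller wants an extension, and that part is what the training phrases teach the model.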
The Sweet Spot:
Genesys recommends 10-20 diverse examples per intent to get things working smoothly. Mix it up with different phrasings:
- "extension 1234"
- "ext 5678"
- "transfer me to extension 2020"
- "I need ext 9876"
- "connect me to extension 3456"
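If typing out 10-20 variations by hand gets tedious, a small script can draft them for you. This is only a sketch: the templates are the phrasings listed above, the function name and defaults are made up for illustration, and the output is meant to be reviewed and pasted into the intent manually (it does not call any Genesys API).

```python
import random

# Phrasings taken from the examples above; add your own real-world variants.
TEMPLATES = [
    "extension {ext}",
    "ext {ext}",
    "transfer me to extension {ext}",
    "I need ext {ext}",
    "connect me to extension {ext}",
]

def build_training_phrases(count=15, seed=0):
    """Cycle through the templates, filling each with a random 4-digit number."""
    rng = random.Random(seed)  # fixed seed so reruns give the same list
    phrases = []
    for i in range(count):
        template = TEMPLATES[i % len(TEMPLATES)]
        ext = str(rng.randint(1000, 9999))  # always 4 digits
        phrases.append(template.format(ext=ext))
    return phrases

for phrase in build_training_phrases():
    print(phrase)
```

Cycling through the templates keeps the "ext" and "extension" forms balanced, and varying the digits gives the model evidence that the number itself is the part that changes.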
More tips here: https://help.genesys.cloud/articles/best-practices-to-build-and-test-your-natural-language-understanding/
About "extension" coming through as "ext": Yeah, speech recognition loves to abbreviate! You handled it perfectly by adding both versions. That's exactly the kind of real-world quirk your training data should cover.
The short version: RegEx slots need more training examples than List slots because the value itself varies from one utterance to the next, so the model has to see enough different values to generalize. Keep doing what you're doing and those confidence scores should keep improving!
------------------------------
Josh Coyle
Senior Professional Services Consultant
------------------------------