Environment info: IC 4.0 SU4, Nuance Recognizer 10 off-server, connects using MRCP.
I am using the following grammar file:
<grammar xmlns="http://www.w3.org/2001/06/grammar"
version="1.0" xml:lang="en-US" root="Menu"
tag-format="swi-semantics/1.0">
<rule id="Menu">
<item repeat="0-1"> Uh </item>
<item repeat="0-1"> I </item>
<item repeat="0-1"> want to </item>
<item repeat="0-1"> wanna </item>
<item repeat="0-1"> would like to </item>
<item repeat="0-1"> Let me </item>
<one-of>
<item>
<ruleref uri="#NoAccount"/>
<tag>_value='NoAccount'</tag>
</item>
<item>
<ruleref uri="#Transfer"/>
<tag>_value='Transfer'</tag>
</item>
</one-of>
<item repeat="0-1"> Please </item>
</rule>
<rule id="NoAccount">
<one-of>
<item>Don't have</item>
<item>Don't know</item>
<item>No account</item>
</one-of>
<item repeat="0-1">it</item>
</rule>
<rule id="Transfer">
<item repeat="0-1">Transfer</item>
<item repeat="0-1">Talk</item>
<item repeat="0-1">me</item>
<item repeat="0-1">to</item>
<item repeat="0-1">a</item>
<item repeat="0-1">an</item>
<item repeat="0-1">operator</item>
<item repeat="0-1">live person</item>
<item repeat="0-1">someone</item>
<item repeat="0-1">somebody</item>
<item repeat="0-1">person</item>
<item repeat="0-1">human</item>
</rule>
</grammar>
I3 fails to load this as a pre-registered grammar:
Snippet from E:\I3\IC\Logs\2014-01-29\RecoSubsystem.ininlog
Message Timestamp (UTC-05:00) Topic Thread Level
GrammarManagerImpl::RegisterGrammar() : Registering grammar failed: ParseError -- Line=28, Col=36, Event='error.badfetch.grammar.syntax', Message="[Line 28, Col 36]: Invalid characters in token: "Don't"", ErrorText="Invalid characters in token: "Don't"" 15:17:42.1008113_0012 GrammarManager 0x734 21
Nuance doesn't seem to mind this; in fact, according to their documentation, they support it:
Escaped characters
All grammars (and any embedded ECMAScript code) must respect characters reserved by the XML standard. For example, the ampersand "&" functions as an escape character: any XML or GrXML parser will interpret it as the beginning of code, rather than as the ampersand character itself.
To represent special characters, you must "escape" them: encode them so they will be interpreted correctly. The basic code for each such character consists of an ampersand followed by a letter or number code, and ending in a semi-colon.
The characters that must be escaped to ensure correct interpretation include:
Character name    XML Code
quote (")         &quot;
apostrophe (')    &apos;
ampersand (&)     &amp;
less than (<)     &lt;
greater than (>)  &gt;
For example, to encode the company name AT&T, you would have to represent the ampersand with its XML code equivalent:
<item>AT&amp;T</item>
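As a sanity check on the escaping rule quoted above, here is a minimal Python sketch using only the standard library. It is not part of Nuance or IC, and `grxml_escape` is a helper name made up for this illustration. It escapes the five reserved characters and then shows that an XML parser decodes `&apos;` back to a plain apostrophe, so whatever reads the grammar after XML parsing should see the same token text whether or not the apostrophe was escaped.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def grxml_escape(text):
    """Escape the five XML-reserved characters (hypothetical helper).

    escape() handles &, < and > by default; the quote and apostrophe
    are added through the extra-entities mapping.
    """
    return escape(text, {'"': "&quot;", "'": "&apos;"})


escaped_item = "<item>" + grxml_escape("Don't have") + "</item>"
literal_item = "<item>Don't have</item>"

# Show the escaped markup as it would appear in the grammar file.
print(escaped_item)

# After XML parsing, both forms carry identical character data.
same_text = ET.fromstring(escaped_item).text == ET.fromstring(literal_item).text
print("same text after parsing:", same_text)
```

If both forms decode to the same text, escaping alone is unlikely to change how a validator that inspects the decoded token treats the apostrophe, although that is an inference and not something either vendor's documentation confirms here.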
Their parser tool certainly doesn't mind it:
D:\Nuance\amd64\bin>parseTool.exe "D:\Grammars\BJWCEscape.grxml" -utt -test_sentences -debug_output
20140129152502254| 960|||| ** WARNING **| 0| SWI_SUCCESS| success| findDefaultLanguage | Default language 'en.us' autocomputed!
20140129152511986| 960|||| ** ERROR => SVC AFFECT **| 27100| SWIREC| SWIrec API| LockFeatureMulti | Licensing: no speech license available.
20140129152512009| 960|||| ** WARNING **| 27100| SWIREC| SWIrec API| LockFeatureMulti | No valid license allocated.
20140129152512029| 960|||| ** ERROR => SVC AFFECT **| 27100| SWIREC| SWIrec API| LockFeature | Licensing: Unable to lock feature 'osr_nl_u'.
20140129152512062| 960|||| ** WARNING **| 604| UNKERR| unknown error| nuan:SBinet |
PROG parsetool.exe:
arg <spec-filename> == D:\Grammars\BJWCEscape.grxml
arg <-test_sentences> == -test_sentences
arg <-debug_output> == -debug_output
arg <-utt> == -utt
next sentence: don't have it
Parsing 'don't have it' with uri 'D:\Grammars\BJWCEscape.grxml'...
Step 0: Menu
Enviro: {SWI_vars:{}NoAccount:{SWI_literal:Don't have it SWI_spoken:Don't have it SWI_confidence:1 SWI_literalConfidence:1 }}
Input : {}
Script: _value='NoAccount'
Result: {_value:NoAccount }
Parse 0: {{Don't have it NoAccount} Menu}
<result>
<interpretation grammar="ParseToolGrammar" confidence="100">
Don't have it
<instance>
<_value confidence="100">
NoAccount
<SWI_literal>
Don't have it
</SWI_literal>
<SWI_grammarName>
ParseToolGrammar
</SWI_grammarName>
<SWI_meaning>
{_value:NoAccount}
</SWI_meaning>
</instance>
</interpretation>
</result>
Parse successful, line 1
next sentence:
However, RecoGrammarValidatorU.exe doesn't like it:
D:\I3\IC\Server>RecoGrammarValidatorU.exe "D:\I3\IC\ASRGrammars\BJWCEscape.grxml"
Opening file "D:\I3\IC\ASRGrammars\BJWCEscape.grxml"
MIME type indicates GrXML grammar (application/srgs+xml)
Parsing grammar...
Parse Error:
Line: 28
Column: 36
Event: error.badfetch.grammar.syntax
Error Text: Invalid characters in token: "Don't"
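To see exactly which characters sit at a reported line and column, something like the following Python sketch could help. It lists the characters around a position with their Unicode code points, which would reveal, for example, a typographic apostrophe (U+2019) where an ASCII one (U+0027) was expected. The function name `describe_position` and the sample text are made up for illustration, and treating the validator's line and column as 1-based is an assumption.

```python
import unicodedata


def describe_position(text, line, col, width=10):
    """List characters around a (line, col) position with their code points.

    Assumes line and col are 1-based, which may not match the validator.
    """
    row = text.splitlines()[line - 1]
    start = max(col - 1 - width, 0)
    window = row[start:col - 1 + width]
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in window
    ]


# Stand-in text; for a real check, read the grammar file instead, e.g.
# open(path, encoding="utf-8").read()
sample = "<one-of>\n  <item>Don't have</item>\n</one-of>"

for ch, code, name in describe_position(sample, line=2, col=12, width=3):
    print(repr(ch), code, name)
```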
Any suggestions?