I believe I've found an issue with the managed speech recognition libraries. The short problem description is that I am unable to use SRGS/.grxml files that contain "ruleref" elements pointing to other files with System.Speech.Recognition.SpeechRecognitionEngine (InProc). They load fine when using SpeechRecognizer.
After debugging (and finally making a custom build of System.Speech.dll), I've narrowed down on the issue. When you use an InProc recognizer, RecognizerBase.cs sets a custom grammar loader (ISpGrammarResourceLoader). Unfortunately, the RecognizerBase.LoadResource method that is used will receive a null value for the last argument ("pbstrRedirectUrl"), and cause a null exception when it tries to do a split on the null string. This also explains why SpeechRecognizer does not have any issues with the same files.
The same issue happens with both 3.5 and 4.0 versions of the Framework.
I can’t get a speech recognition project I started in 2008 to correctly load the grammar definition files (.grxml / SRGS) in my current environment (Windows 7 64-bit) with SpeechRecognitionEngine.
Well, turns out that the shared SpeechRecognizer will correctly load the exact same GRXML files with external ruleref definitions. After hunting through the .NET source code, the culprit might be in the RecognizerBase.cs method named LoadSapiGrammarFromCfg(). It sets a custom grammar loader for SAPI only when using an InProc recognizer (which SpeechRecognitionEngine is). The comments even state “The rulerefs will be resolved locally.”
So, I have the following options:
- Continue debugging .NET to see if the behavior has changed, and I’m simply missing some element in the XML file. This doesn’t seem likely, as ProcMonitor clearly states that only the first file even gets a open/read attempt. I’ve tried setting the base URI, used absolute paths, did a quick attempt at loading from a URL. The problem is made worse by the fact that I can’t step through the .NET code in either Visual Studio 2008 or 2010. Fixing that will take its own time… I could try using the .NET Framework code (fingers crossed all the required files are available) directly.
- Switch to using SpeechRecognizer. I don’t want this, because I’d end up with a broader dictionary which will reduce accuracy.
- Merge my GRXML files. Yuck. There’s a reason for ruleref support in the files. In addition, the app is meant to be extensible, and having to merge files just makes things harder for the developer and end users.
- Go down the path of using SpeechLib/SAPI, but looking at the amount of code in the Framework, this seems totally redundant.
- SapiErrorInvalidImport A rule reference to an imported grammar cannot be resolved.