|Thread title||Replies||Last modified|
|But how to actually use Simon for voice recognition?||7||20:47, 9 April 2015|
I have spent several days with Simon. Very very cool. Have installed it, trained it, downloaded the Firefox scenario, installed the Mouseless browsing addon for Firefox, read the manual. Very comprehensive preparation. Well done.
But how do I actually use it for voice recognition?
It's running with the main page (Scenarios, Training, Acoustic Model, Recognition) open. The VU meter is fluttering. "Volume is correct" is checked with a green check mark. "Connected and activated" is shown, and "Finished" is at 100%.
So, now what? How to actually use it?
Yes, thank you, but I did not find anything about actually using it, just how to set it up. Maybe I don't have the right manual - the downloadable PDF?
I've just gone through the link you gave, which is what I have. I see how to make recordings, I've done that. I see how to train it, I've done that. Like I said, I've done all the precursors, but cannot find anything that says, for example:
In order to control Firefox click on xxxx, say yyy etc.
In order to convert speech to text, start up kate and click zzz
Easy when you know how, I'm sure. But I don't see anything that shows how.
Well, several hours later, here is what I've started to write to help augment the manual. It turns out that there is a LOT of stuff that has to happen after the installation wizard stops at the Overview screen. Mostly, installing a recognition engine (HTK or CMU, neither of which succeeds for me) and installing, configuring, and running plugins, which of course cannot happen until the HTK/CMU issue is resolved.
The CMU issue is that SphinxTrain seems not to be included in the openSUSE 12.3 repositories (although SphinxBase and pocketsphinx are) and then trying to install from tar.gz fails with a weird error.
The HTK issue is that you have to download the source and build it yourself, and again an error, this time at line 77 of the Makefile, prevents further progress.
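For anyone hitting the same Makefile error: the "missing separator (did you mean TAB instead of 8 spaces?)" message quoted in the draft further down is what GNU make prints when a recipe line begins with spaces instead of a tab character. Below is a minimal Python sketch that lists lines starting with eight spaces so they can be inspected. The function name and the sample text are made up for illustration and are not part of HTK or Simon. It may also flag harmless continuation lines, and it is not claimed to be the fix for the HTK build.

```python
def find_space_indented_lines(makefile_text, width=8):
    """Return 1-based numbers of lines that begin with `width` spaces.

    GNU make requires recipe lines to begin with a TAB, so lines that
    begin with a run of spaces are candidates for the error above.
    """
    suspects = []
    for number, line in enumerate(makefile_text.splitlines(), start=1):
        if line.startswith(" " * width):
            suspects.append(number)
    return suspects


# Hypothetical sample: line 2 is indented with spaces, line 3 with a tab.
sample = "all:\n        echo built\n\techo ok\n"
print(find_space_indented_lines(sample))
```

To check a real file, read its contents into a string and pass that to the function. Any flagged recipe line would need its leading spaces replaced with a single tab.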
I'll plug at it a bit more, but suspect it is beyond my current abilities.
Here is the draft of the manual addendum I'm working on:
Using Simon
To use Simon you have to install various elements and do some training by following the First Use Wizard. When you're done with this you arrive at what is called the Overview screen. This comprises four sections: Scenarios, Training, Acoustic Model, and Recognition.
You need to do a number of things beyond the Wizard. For starters, you need to choose a recognition model for the simond server. There are two choices: CMUSphinx and HTK Julius. Both are third-party applications.
The openSUSE repositories do not contain sphinxtrain, and trying to install it directly from a tar.gz fails with:
undefined reference to `lineiter_lineno'
We are not able to find the source of that error.
We then install the HTK Julius model as described at: http://userbase.kde.org/Simon/Installation#Optional:_HTK_installation_2
But it fails with:
Makefile:77: *** missing separator (did you mean TAB instead of 8 spaces?). Stop.
Examining the Makefile at line 77 shows that the argument $(BOOK) is used instead of what should be $(HTKBOOK), but correcting that does not correct the error. So we are pretty much confounded until we can install a recognition engine.
Beyond that, however, you need to install some plugins to proceed any further. You can install plugins from the Overview screen by clicking on the second button under the Scenarios section. In my case it is Open “Standard”. This brings up a new screen labeled Direct Execution Of Simon Commands. On the left there are 5 icons: Word List, Training, Grammar, Context, and Commands. There are two panels to the right. There is a button called Manage Plugins under the first panel.
Click on Manage Plugins and this brings up the Manage Actions screen. This has three tabs: General, Lists, and Autorun. Clicking on the Add button brings up an Add Actions screen. This has a list of all the available plugins. Dictation is the second one. Highlight the desired plugin and click OK. It will be added to the Manage Actions list on the left, which adds the plugin to the command list.
All of these plugins are described in the manual, beginning at section 4.8, Commands, on page 86.
So definitely a labour of love, but a work in progress, unless I'm seriously missing something: how to install the engines...
Kind regards, Andy
Simon developer here.
You are right in saying that you need a recognizer (3rd party application). I suspect that you installed the "unstable" package from the OBS? I just looked at it and the package is quite broken (*at least* a few missing dependencies). I reported the issue(s) to the maintainer of the package.
From what I can see, you should still be able to get PocketSphinx to run, if you manually install it after you install Simon (again, this should be a dependency). If you can live with a static base model (no training), you need neither the HTK nor SphinxTrain. PocketSphinx (for SPHINX models) and Julius (for HTK models) will suffice.
If you have problems compiling the HTK or SphinxTrain, please notify the appropriate maintainers. This is sadly out of my control. FWIW, Nickolay, the maintainer of SPHINX, is extremely responsive and very helpful. You can find him on the IRC channel #cmusphinx on Freenode (nickname: nshm).
As for the "what to do after installation" bit: if you have installed some scenarios and a fitting base model (look at the tags: "[EN/H4W]..." scenarios need the "[EN/H4W] ..." base model, etc.), Simon should auto-activate right after completing the wizard (EN/H4W requires PocketSphinx). From there on you can say the commands that are defined in the scenarios you downloaded. The scenario description usually contains a list of them. E.g.: http://kde-files.org/content/show.php/%5BEN%2BH4W%5D+Firefox?content=156100
Please note that at the moment, there is no free dictation ("speech to text"), just commands.
Best regards, Peter