Simon/What do I need?

    From KDE UserBase Wiki
    Revision as of 18:03, 17 August 2012 by Claus chr (talk | contribs)

    To get Simon to recognize speech and react to it, you need what is called a speech model.

    A speech model describes how your voice sounds, which words exist, how they sound, and which word combinations ("sentences" or "structures") exist.

    A speech model basically consists of two parts:

    • Language model: Describes all existing words and what sentences are grammatically correct
    • Acoustic model: Describes how words sound

    You need both these components to get Simon to recognize your voice.
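
    As a rough illustration of how the two parts work together, here is a toy sketch in Python. It is not Simon's internal format: the class names, the word categories, and the phoneme spellings are all made up for this example, and a plain pronunciation table stands in for a real acoustic model, which holds statistical data about sounds.

```python
# Toy illustration only; this is NOT how Simon stores its models.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class LanguageModel:
    # Known words, each mapped to a (hypothetical) grammatical category.
    vocabulary: Dict[str, str] = field(default_factory=dict)
    # Allowed sentences, each written as a sequence of categories.
    structures: List[Tuple[str, ...]] = field(default_factory=list)

    def accepts(self, sentence: str) -> bool:
        words = sentence.lower().split()
        if any(word not in self.vocabulary for word in words):
            return False  # unknown word
        categories = tuple(self.vocabulary[word] for word in words)
        return categories in self.structures


@dataclass
class AcousticModel:
    # Stand-in for real acoustic data: word -> made-up phoneme list.
    pronunciations: Dict[str, List[str]] = field(default_factory=dict)

    def covers(self, word: str) -> bool:
        return word.lower() in self.pronunciations


@dataclass
class SpeechModel:
    language: LanguageModel
    acoustic: AcousticModel

    def can_recognize(self, sentence: str) -> bool:
        # Both parts are required: a valid sentence AND known sounds.
        return self.language.accepts(sentence) and all(
            self.acoustic.covers(word) for word in sentence.split()
        )


language = LanguageModel(
    vocabulary={"computer": "Trigger", "open": "Verb", "firefox": "Noun"},
    structures=[("Trigger", "Verb", "Noun")],
)
acoustic = AcousticModel(
    pronunciations={
        "computer": ["k", "ax", "m", "p", "y", "uw", "t", "er"],
        "open": ["ow", "p", "ax", "n"],
        "firefox": ["f", "ay", "er", "f", "aa", "k", "s"],
    }
)
model = SpeechModel(language, acoustic)

print(model.can_recognize("Computer open Firefox"))
print(model.can_recognize("Open computer Firefox"))
```

    In this sketch the first sentence matches the only defined structure and every word has a pronunciation entry, while the second uses known words in an order that no structure allows.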

    Language Model

    In most cases you only need to install the appropriate scenario for your use case to set up your language model.

    To create your own language model, you can use Simon to add, edit, and remove words and grammar structures.

    To make adding words easier, you can import a shadow dictionary.

    Acoustic Model

    To create your own acoustic model, you can simply read the training texts that come with your selected scenarios a few times.

    If you are creating your own scenario, you can easily create training texts yourself. See the Simon manual for details.

    You can, however, use static or adapted base models instead, either to avoid using the HTK or to improve the recognition rate.