Simon is licensed under the GPL, making it free and open source software.
However, like most larger software projects, we use a lot of third-party products and pragmatically decided to use those that seemed to work best for the time being. This is why one of our dependencies and the training modules use external software that is not licensed in a GPL-compatible way.
- 1 Disclaimer
- 2 Julius
- 3 HTK
- 3.1 Usage in Simon
- 3.2 The Problem
- 3.3 Solution
I am not a lawyer. As such, everything written here is just my interpretation of various licenses and complicated legal situations.
I might be wrong.
Usage in Simon
Simon uses the Large Vocabulary Continuous Speech Recognition Engine Julius for the recognition.
Julius, while also free and open source software, uses the original 4-clause BSD license, which, according to GNU, is a recognized free software license but is not compatible with the GPL.
Simon's license includes a special exception that permits linking to Julius.
Below is the affected part of the license. The added part begins with "In addition, as a special exception" and ends with "then also delete it here."
3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
In addition, as a special exception, the copyright holders give permission to link the code of portions of this program with the Julius library under certain conditions as described in each individual source file, and distribute linked combinations including the two. You must obey the GNU General Public License in all respects for all of the code used other than Julius. If you modify file(s) with this exception, you may extend this exception to your version of the file(s), but you are not obligated to do so. If you do not wish to do so, delete this exception statement from your version. If you delete this exception statement from all source files in the program, then also delete it here.
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code.
To the best of my knowledge, the same issue has been handled in the same way with OpenSSL.
Usage in Simon
For speech recognition to work, you need a speech model, which consists of two parts. The first describes which words, sentences, etc. exist and which sounds they are composed of ("language model"). The second describes how these phonemes sound ("acoustic model").
Simon builds and manages the language model on its own (with the help of the Julius LVCSR in some places).
If you want to create your own acoustic model instead of just using a predefined one or want to adapt a general model to your voice, Simon provides the ability to use the HTK for that.
The HTK license is not a free license.
You are allowed to browse and edit the source code, but you cannot redistribute the downloaded version of the HTK.
However, you are encouraged to file bug reports and re-distribute patches if you write them.
The HTK is never linked directly to Simon but is started as an external program.
This way, both the GPL and the HTK license can be fulfilled to the letter.
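The difference between linking and starting an external program can be illustrated with a minimal Python sketch. It is not Simon's actual code: the helper name `run_external_tool` is hypothetical, and `HERest` (one of the HTK command-line tools) is used only as an example of a tool that may or may not be installed.

```python
import shutil
import subprocess
from typing import Optional, Sequence


def run_external_tool(
    tool: str, args: Sequence[str] = ()
) -> Optional[subprocess.CompletedProcess]:
    """Run `tool` as a separate process; return None if it is not installed."""
    path = shutil.which(tool)
    if path is None:
        # Tool is absent: the caller can disable the feature that needs it.
        return None
    # The tool runs in its own process; nothing is linked into the caller.
    return subprocess.run(
        [path, *args], capture_output=True, text=True, check=False
    )


if __name__ == "__main__":
    # Assumption: "HERest" stands in for whichever HTK tool would be called.
    result = run_external_tool("HERest")
    print("training available:", result is not None)
```

In such a design the calling program only depends on the tool at run time, so it keeps working (with the corresponding feature disabled) when the tool is missing.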
Simon is functional without the HTK
Since Simon 0.3, many features of Simon no longer depend on the HTK at all.
Even without ever downloading the HTK, the user can set up a completely working speech recognition system including custom use case scenarios, commands and many of the advanced features.
The only part of Simon that doesn't work without the HTK is the user specific acoustic model training.
The following example usage of Simon doesn't need the HTK:
- User downloads Simon
- User downloads the Firefox, Amarok and window management scenarios to control Mozilla Firefox, Amarok and the window manager
- User downloads the Voxforge GPL acoustic model for English and sets it to be a static base model
The user can then control his computer through voice commands.
The following example usage of Simon still doesn't need the HTK:
- The user would like to change the command "Start Browser" to "Launch Internet Explorer". This involves:
  - Adding the new words to the language model
  - Creating a new grammar structure allowing Simon to recognize this sentence
  - Recompiling the language model
  - Adding a new command "Launch Internet Explorer" to start the browser
The user can then use "Launch Internet Explorer" to start the browser.
On the other hand, the following example usage of Simon does require the HTK:
- The recognition performance is poor, so the user decides to train the acoustic model further through the integrated training procedure.
In short: Even without the HTK, Simon still provides much more control over the speech model than comparable open source solutions such as GnomeVoiceControl, which provides no way to change the acoustic model at all.
Speech models generated with the HTK can be free
Speech models created with the HTK do not inherit the HTK license.
The HTK license does not limit the redistribution of the models created with it.
The license covers these parts (direct quote):
All source code, object or executable code, associated technical documentation and any data files in this HTK distribution.
Models created with the software are not covered by this license. In fact, the official FAQ clearly says so (direct quote):
Can I build & sell products based on HTK3?
Yes. You can for example use HTK3 to train models that are then used in your products.
There are free acoustic models available
As models created by the HTK can be free and open source, there are of course free acoustic models already available.
In fact, this wiki contains a list of free and open source acoustic models that can be used with Simon.
The HTK file format is not proprietary
The models created by the HTK use a quite simple ASCII file format that is very well documented.
You can, for example, compile an acoustic model using the free and open source SPHINX speech recognition toolkit and convert it to the HTK format.
Multiple model converters between SPHINX and HTK models are available (a quick Google search turns up a lot more).
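To give a feel for how approachable a plain-text model format is, here is a small Python sketch that lists the model names found in such a file. The sample text is a hand-written, heavily abbreviated fragment in the style of HTK model definitions (where, as far as I know, each model is introduced by a `~h "name"` line); it is not a complete or valid model, and the function name `list_hmm_names` is hypothetical.

```python
import re
from typing import List

# Hand-written, heavily abbreviated fragment in the style of an HTK model file.
# It is NOT a complete or valid model; it only illustrates the plain-text layout.
SAMPLE = '''~h "ah"
<BEGINHMM>
<NUMSTATES> 5
<ENDHMM>
~h "sil"
<BEGINHMM>
<NUMSTATES> 5
<ENDHMM>
'''

# A line of the form: ~h "name"
HMM_HEADER = re.compile(r'^~h\s+"([^"]+)"\s*$')


def list_hmm_names(text: str) -> List[str]:
    """Return the names of all model definitions introduced by a ~h line."""
    names = []
    for line in text.splitlines():
        match = HMM_HEADER.match(line)
        if match:
            names.append(match.group(1))
    return names


print(list_hmm_names(SAMPLE))  # prints ['ah', 'sil']
```

Because the file is ordinary text, a converter or inspection tool needs nothing beyond standard string handling to read its structure.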
The HTK is interchangeable
Simon is only very loosely bound to the HTK.
Technically, the free and open source SPHINX speech recognition toolkit could replace both the HTK and Julius.
Simon has been designed for both of these components to be interchangeable.
The only reason SPHINX support has not yet been added is that the documentation of SPHINX is, in my opinion, not as good as that of the HTK, and the University of Technology Graz, which helps us with the speech recognition part of Simon, has never used CMU SPHINX before.