05.05.06
ASR for Programmers
So, I was thinking about this a while ago. The big problem with speech recognition is, well, it sucks. Accuracy for general transcription just isn’t good enough for widespread adoption. It’s currently used primarily in areas with an extremely limited grammar and lexicon, and for things like that it seems to work pretty well. Programmers, meanwhile, type a lot; in fact, they often end up with carpal tunnel syndrome and related problems. Coincidentally, programming languages are, by definition, rigorously defined and limited languages. They have clearly defined grammars and lexicons, and the lack of these properties is precisely why natural languages are so darn difficult for speech recognition systems. So using speech recognition to dictate computer programming languages should be easy, right?

I got motivated and spent a while looking around online for some kind of programmer’s dictation application or plugin. I didn’t find one, but that doesn’t mean there isn’t anything out there, so if anyone finds anything, feel free to let me know. I did find SpeechClipse, but that’s a plugin for editing commands and such, not dictation.

So, as of now, I think I have a project to start this summer. Sphinx-4 is a well-organized, well-documented ASR system implemented in Java. The Java Language Specification is, well, a complete syntactic description of Java. I just have to reformat the grammar, create a language model from it, and set up Sphinx to use that language model. Also, I just love Eclipse, so I figure I’ll try to do all of this as an Eclipse plugin. There are tons of good tutorials. Here is one I like.
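To make the "reformat the grammar" step a bit more concrete, here is a rough sketch of what I have in mind. It assumes the restricted language is handed to the recognizer as a JSGF grammar (one of the grammar formats Sphinx-4 can read) rather than as a statistical language model, and it does not call Sphinx itself. The class name `SpokenJavaGrammar`, the tiny vocabulary, and the spoken forms like "open brace" are all made up for illustration; a real version would be derived from the Java grammar, not typed in by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative only: a toy spoken-form vocabulary for a sliver of Java. */
public class SpokenJavaGrammar {

    /** Hypothetical mapping from what the programmer says to the token to emit. */
    static Map<String, String> spokenToToken() {
        Map<String, String> vocab = new LinkedHashMap<>();
        vocab.put("class", "class");
        vocab.put("return", "return");
        vocab.put("open brace", "{");
        vocab.put("close brace", "}");
        vocab.put("semicolon", ";");
        return vocab;
    }

    /** Builds JSGF text whose single public rule accepts one or more spoken tokens. */
    static String toJsgf(String grammarName, Map<String, String> vocab) {
        StringJoiner alternatives = new StringJoiner(" | ");
        for (String spoken : vocab.keySet()) {
            alternatives.add(spoken);
        }
        return "#JSGF V1.0;\n"
             + "grammar " + grammarName + ";\n"
             + "public <tokens> = ( " + alternatives + " )+;\n";
    }

    /** Turns already-recognized phrases into source text, separated by spaces. */
    static String transcribe(String[] phrases, Map<String, String> vocab) {
        StringJoiner source = new StringJoiner(" ");
        for (String phrase : phrases) {
            String token = vocab.get(phrase);
            if (token == null) {
                throw new IllegalArgumentException("Unknown phrase: " + phrase);
            }
            source.add(token);
        }
        return source.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vocab = spokenToToken();
        // The grammar text a recognizer would be configured with.
        System.out.print(toJsgf("spokenjava", vocab));
        // What would be typed for a recognized phrase sequence.
        System.out.println(transcribe(
                new String[] {"class", "open brace", "close brace"}, vocab));
    }
}
```

The point of keeping the spoken forms and the emitted tokens in one table is that the grammar and the text that gets typed can’t drift apart. Identifiers are the obviously hard part, since they aren’t in any fixed lexicon, and this sketch ignores them completely.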