Speech Recognition & Programming [closed]

https://stackoverflow.com/questions/1408874

speech-recognition

05-07-2019

Question

Has anyone had success with Dragon NaturallySpeaking voice recognition software when it comes to programming?

I am wondering because I think it would be a lot faster than typing by hand, and easier on my carpal tunnel.

Day to day I program in the Visual Basic 6 IDE and the Visual Studio 2008 IDE with Team Explorer, write emails, and chat over Windows Live IM.

I need a command-based interface where I can bind voice commands to keystrokes and switch between modes such as spelling, saying words, and saying words without spaces.

Any comments are much appreciated.

Solution

I think that "voice programming" and "programming by voice" work better as search terms than "speech recognition programming". It has been tried but has not yet caught on.

Here is an open-source project: VoiceCode. Here is a video of it in action. VoiceCode seemed to have been inactive for more than a year but appears to be active again.
Here is another open-source project: ShortTalk and EmacsListen. Here is a video of it in action.
Another option that comes up in searches is Harmonia.

The first-hand accounts I've read all seem to agree that programming by voice can be tough on the vocal cords. Then they go on to say that it is getting better and that a really usable system may be right around the corner. The first time I read that was in the late 1990s...

OTHER TIPS

I tried to program using general-purpose speech recognition and came to the conclusion that programming is too far from regular spoken language. You need a specific grammar that is tailored to coding (not necessarily language specific). As a result of this experience I looked into programming using speech recognition. It's still only a proof of concept, but to some extent I believe it is doable.

Things to consider:

If you are healthy and can code at full speed with both hands, you will be faster with a keyboard/mouse. I type at around 60 wpm and there's no way I can go faster with voice. However, I'm a very slow typist with only one hand. I believe that you can decrease the strain on your arms considerably by using voice commands to assist your typing, as opposed to going voice only.
There are activities within a programming IDE that are not coding/typing. Being able to perform many of these tasks using voice should further reduce strain.
Not everyone works in an environment where it is feasible to sit and talk to the computer.

A short video of the proof of concept is on YouTube: http://www.youtube.com/watch?v=x3Lm9nrFeMk
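To make the idea of a coding-specific grammar more concrete, here is a minimal Python sketch. It is not the proof of concept from the video. The phrase table, the `camel` and `snake` commands, and the `done` terminator are all made up for illustration, and the input is assumed to be text that a recognizer has already produced.

```python
# Hypothetical spoken vocabulary: phrase -> symbol. Illustrative only;
# not taken from Dragon, VoiceCode, or any other real tool.
SYMBOLS = {
    "open brace": "{",
    "close brace": "}",
    "open paren": "(",
    "close paren": ")",
    "semicolon": ";",
    "equals": "=",
}


def camel(words):
    """['max', 'retry', 'count'] -> 'maxRetryCount'"""
    return words[0] + "".join(w.capitalize() for w in words[1:])


def snake(words):
    """['default', 'limit'] -> 'default_limit'"""
    return "_".join(words)


# Hypothetical commands for "saying words without spaces".
CASERS = {"camel": camel, "snake": snake}


def transcribe(utterance):
    """Turn already-recognized words into a list of code tokens."""
    words = utterance.lower().split()
    tokens = []
    i = 0
    while i < len(words):
        # Try the longest symbol phrase first (two words, then one).
        for n in (2, 1):
            phrase = " ".join(words[i:i + n])
            if i + n <= len(words) and phrase in SYMBOLS:
                tokens.append(SYMBOLS[phrase])
                i += n
                break
        else:
            if words[i] in CASERS:
                # Collect identifier words up to the "done" terminator.
                end = i + 1
                while end < len(words) and words[end] != "done":
                    end += 1
                parts = words[i + 1:end]
                if parts:
                    tokens.append(CASERS[words[i]](parts))
                i = end + 1
            else:
                # Anything else is passed through as a literal word.
                tokens.append(words[i])
                i += 1
    return tokens


print(transcribe("if open paren x close paren open brace"))
# ['if', '(', 'x', ')', '{']
```

The point of the sketch is that the vocabulary is small and closed, which is much easier for a recognizer than free dictation. A real system would also need editing and navigation commands, which are not shown here.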

Dictation usually relies on a language model (a statistical model of which word sequences are likely, used to choose among candidate transcriptions). Unfortunately, the language of programming simply isn't a good match for a model built for English, so your recognition error rate would be quite high.
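The mismatch can be illustrated with a toy word-level bigram model. This is only a sketch under heavy assumptions: the three training sentences are made up, add-one smoothing is used for simplicity, and real dictation engines are far more sophisticated. It only shows that a model trained on English text finds spoken code less expected than ordinary English.

```python
import math
from collections import Counter

# Tiny made-up "English" training text; a real dictation engine is trained
# on vastly more data.
ENGLISH = [
    "please send the report to the team",
    "i will send the email today",
    "the team will review the report",
]


def train(sentences):
    """Count bigrams and their left contexts, with <s> marking sentence start."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        vocab.update(words[1:])
        for prev, word in zip(words, words[1:]):
            bigrams[(prev, word)] += 1
            contexts[prev] += 1
    return bigrams, contexts, len(vocab) + 1  # +1 slot for unseen words


def avg_logprob(model, sentence):
    """Average add-one-smoothed log-probability per word (higher = more expected)."""
    bigrams, contexts, vocab_size = model
    words = ["<s>"] + sentence.split()
    total = 0.0
    for prev, word in zip(words, words[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size))
    return total / (len(words) - 1)


model = train(ENGLISH)
english_score = avg_logprob(model, "send the report today")
code_score = avg_logprob(model, "open paren i equals zero semicolon")
print(english_score > code_score)  # True
```

A recognizer that leans on such a model will tend to prefer English-sounding alternatives over the code tokens you actually said.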

Spacing and navigation are the least of your worries; you could build a set of macros that take advantage of Visual Studio's knowledge of your code (go to method, etc.).

IM and emails would be handled well by Dragon NaturallySpeaking (DNS), or by Windows Speech Recognition, for that matter.

I developed RSI (tenosynovitis, similar to carpal tunnel) in both wrists a few years ago, so I can certainly understand the need to switch to speech for coding.

Unfortunately there's really not a lot out there that gets the job done in a decent way. As has already been mentioned, code navigation is extremely frustrating by voice alone, and the wide array of unusual characters we programmers need doesn't help matters either.

I personally used Dragon NaturallySpeaking for around 3 months but eventually decided that it simply wouldn't work as a long-term solution. A physiotherapist suggested that I try an ergonomic keyboard, specifically a Maltron (with the Maltron layout). I used to be crippled with pain on a standard keyboard, and I can now code pain-free all day long. They offer (or used to offer) a rental scheme so that you can try it out. Even if you're not in a position to use a keyboard now, it might be worth considering in the future.

I think that voice recognition can help reduce the number of keystrokes required for programming. I am using Dragon NaturallySpeaking to write PHP code, and I have created a number of commands to output frequently used statements. As mentioned by others, navigation within the code is a difficulty. I would advise anyone with repetitive strain injury to try to minimise their programming in as many ways as possible. For example, think carefully about what you want to do before you sit down at your monitor. Use pen and paper to write pseudocode. Make your code as reusable as possible. Stick to best programming practices. Get away from your screen; read books. Vary your work position; I lie on the floor with my iPad. Try Android voice recognition for answering short emails or text messages; it's free and multilingual, and pretty accurate in a quiet environment. Stand up and walk around. Think about getting someone else to do your programming for you.
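As a rough illustration of commands that output frequently used statements, here is a small Python sketch of the lookup-and-fill step. It is not how Dragon commands are actually defined. The command names and the PHP templates are hypothetical, and the spoken command is assumed to have been recognized already.

```python
from string import Template

# Hypothetical voice-command names mapped to PHP snippets. In string.Template,
# "$$" is a literal dollar sign and "${name}" is a placeholder.
SNIPPETS = {
    "for each loop": Template("foreach ($$${array} as $$${item}) {\n}"),
    "echo line": Template("echo $$${var} . PHP_EOL;"),
    "new function": Template("function ${name}() {\n}"),
}


def expand(command, **fields):
    """Return the PHP text for a spoken command, filling in its placeholders."""
    if command not in SNIPPETS:
        raise KeyError(f"no snippet bound to command: {command!r}")
    return SNIPPETS[command].substitute(**fields)


print(expand("for each loop", array="rows", item="row"))
# foreach ($rows as $row) {
# }
```

One short phrase replaces a few dozen keystrokes, including the brackets and dollar signs that are awkward to dictate one at a time.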

I developed tenosynovitis in both wrists and I've used Dragon for about two years to do basic typing. I have basic programming ability, but I've found it extremely cumbersome to use Dragon for coding, which has resulted in me choosing a different career path. I use a Microsoft ergonomic keyboard and an Evoluent mouse, which help but don't allow for endless hours of typing and mousing.

I think a library of commands for Dragon could be written (for each language), but it couldn't become a true substitute for a keyboard.

I'm not sure whether speech recognition will really be able to solve your problem. Aren't there just too many symbols that are rare in natural language but common in programming (curly brackets, semicolons, quotation marks)?

But what will probably hamper the experience most is that -- unlike normal text -- code is seldom written in a linear manner but involves jumping between lines, methods, and classes (at least that's what I often do). Of course one might find additional spoken commands for this as well, but I guess the overall experience would not be too satisfying.

See the following link for more details on PSPD (phase space point distribution): http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5072009

This paper presents a method for extracting the phase space point distribution parameter to improve the accuracy of speech recognition systems. It suggests a speech recognition method that uses nonlinear or chaotic signal processing techniques to extract time-domain phase space features.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow