Which language should I use to write speech recognition software?

https://stackoverflow.com/questions/664984

21-08-2019

Question

I want to write basic speech recognition software that can convert speech to text. I would like to know which language is best suited to writing such software. Is Java suited to this job?

Edit: Thank you all for the responses. I want to build a tool for a college project. I don't want to write it from scratch; I just want to demo the power of speech recognition. The tool should simply type whatever the user says into a text editor such as Notepad. It need not be very accurate. I just want to experiment and learn the various algorithms behind speech recognition, as I find this field very interesting.

Thank you, Deepak

Solution

Java may be suitable for the interface, but speech recognition requires seriously raw grunt. I'd choose a compiled, close-to-the-metal language like C for the actual recognition engine.

This is not something to be undertaken lightly, by the way. There's an awful lot of theory you'll need to learn even before you begin. Myself, I would license one of the existing engines if possible, and concentrate on building a decent product around it.

That's if your intent is to build a product. If you just want to experiment, by all means write your own. It'll be fun (up to a point :-).

OTHER TIPS

My students are using Sphinx. It is written in Java (a port from C++, I believe). It might not be suitable for what you want (I think you would need to create your own dictionary), but it is worth checking out.

I agree with Pax that this is potentially quite a big project, and that the most practical solution is probably to just licence an existing engine.

If the scope of what you want to do is just to distinguish between a few previously known possible utterances, it's a significantly smaller project, but still a considerable one.

But... if you decide you really really really do want to start developing your own, I can't see a reason not to use Java. The idea that "C is faster" is largely a myth (or based on out-of-date information).
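To give a feel for the smaller project mentioned above (telling a few known utterances apart), here is a minimal Java sketch of template matching with dynamic time warping (DTW). DTW is one classic technique chosen here for illustration; the answer does not name it. The class `TemplateMatcher`, its method names, and the toy one-number-per-frame sequences are all hypothetical. A real system would compare vectors of acoustic features per frame rather than single numbers.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of any library.
final class TemplateMatcher {

    /** DTW distance between two 1-D sequences, using absolute difference as the local cost. */
    public static double dtw(double[] a, double[] b) {
        int n = a.length, m = b.length;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) {
            Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double local = Math.abs(a[i - 1] - b[j - 1]);
                // Cheapest way to reach (i, j): diagonal match, or stretch either sequence.
                double best = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = local + best;
            }
        }
        return cost[n][m];
    }

    /** Returns the label of the stored template closest to the input, or null if there are none. */
    public static String classify(double[] input, Map<String, double[]> templates) {
        String bestLabel = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> e : templates.entrySet()) {
            double d = dtw(input, e.getValue());
            if (d < bestDist) {
                bestDist = d;
                bestLabel = e.getKey();
            }
        }
        return bestLabel;
    }

    public static void main(String[] args) {
        // Toy "recordings": made-up numbers standing in for per-frame features.
        Map<String, double[]> templates = new LinkedHashMap<>();
        templates.put("yes", new double[] {0, 1, 2, 1, 0});
        templates.put("no", new double[] {2, 2, 0, 0, 2});

        // The same shape as "yes", but spoken more slowly (some frames repeated).
        double[] spoken = {0, 0, 1, 2, 2, 1, 0};
        System.out.println(classify(spoken, templates));
    }
}
```

The point of DTW here is that the same word spoken faster or slower still lines up with its template, which a plain frame-by-frame comparison would not handle.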

I agree with almost everything Pax said, so I'm going to be contrarian and argue for the opposite. The conventional wisdom is that speech recognition "requires seriously raw grunt", and that may be because it is true.

But it also may be that everyone believes that because that's how it's always been done. Arguing from the fact that the human brain doesn't do huge amounts of brute force data churning to recognize speech, I would suggest that there exist clever feature extraction algorithms to do the job much more efficiently.

If that is the case, and if you seek to find such an algorithm, a higher-level language may be better suited to the task. Anything you lose in efficiency you'll more than make up for in algorithmic expressiveness.

That said, he's probably right.
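As a concrete picture of what "feature extraction" can mean, here is a minimal Java sketch that reduces raw audio samples to two numbers per frame: short-time energy and zero-crossing rate. These two features are chosen only because they are simple to show; they are not the clever algorithm this answer speculates about, and the class `FrameFeatures` and its methods are hypothetical names. Samples are assumed to be already normalized to the range [-1, 1].

```java
// Hypothetical helper for illustration only; not part of any library.
final class FrameFeatures {

    /** Mean squared amplitude of samples[start .. start + len). */
    public static double energy(double[] samples, int start, int len) {
        double sum = 0.0;
        for (int i = start; i < start + len; i++) {
            sum += samples[i] * samples[i];
        }
        return sum / len;
    }

    /** Fraction of adjacent sample pairs in the frame whose signs differ. */
    public static double zeroCrossingRate(double[] samples, int start, int len) {
        int crossings = 0;
        for (int i = start + 1; i < start + len; i++) {
            if ((samples[i - 1] >= 0) != (samples[i] >= 0)) {
                crossings++;
            }
        }
        return (double) crossings / (len - 1);
    }

    /**
     * Splits the signal into non-overlapping frames and returns
     * {energy, zeroCrossingRate} for each one. Trailing samples that
     * do not fill a whole frame are ignored.
     */
    public static double[][] extract(double[] samples, int frameLen) {
        if (frameLen < 2) {
            throw new IllegalArgumentException("frameLen must be at least 2");
        }
        int frames = samples.length / frameLen;
        double[][] features = new double[frames][2];
        for (int f = 0; f < frames; f++) {
            int start = f * frameLen;
            features[f][0] = energy(samples, start, frameLen);
            features[f][1] = zeroCrossingRate(samples, start, frameLen);
        }
        return features;
    }

    public static void main(String[] args) {
        // A rapidly alternating frame followed by a flat, quieter frame.
        double[] samples = {1, -1, 1, -1, 0.5, 0.5, 0.5, 0.5};
        for (double[] frame : extract(samples, 4)) {
            System.out.println("energy=" + frame[0] + " zcr=" + frame[1]);
        }
    }
}
```

Thousands of samples per second shrink to a handful of numbers per frame, and everything after this step works on those numbers rather than on the raw audio.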

I think that Java can be a good option; it all depends on how you will receive the input. There are some nice libraries for sound in Java.

The language is not going to be the problem, because it will be a matter of recognizing the patterns. If Java is the language you are most familiar with, I would use it.
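As one example of receiving the input in Java, the standard `javax.sound.sampled` package can read from a microphone. The sketch below is a rough illustration under some assumptions: the class name `MicCapture` and its methods are hypothetical, the 16 kHz, 16-bit mono format is simply a common choice for speech, and `record` only works on a machine that has an input device supporting that format.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.TargetDataLine;

// Hypothetical helper for illustration only.
final class MicCapture {

    // Assumed format: 16 kHz, 16-bit, mono, signed, little-endian PCM.
    public static final AudioFormat FORMAT = new AudioFormat(16000f, 16, 1, true, false);

    /** Records roughly the given number of milliseconds from the default input device. */
    public static byte[] record(int millis) throws LineUnavailableException {
        TargetDataLine line = AudioSystem.getTargetDataLine(FORMAT);
        try {
            line.open(FORMAT);
            line.start();
            int frames = (int) (FORMAT.getFrameRate() * millis / 1000);
            byte[] buffer = new byte[frames * FORMAT.getFrameSize()];
            int offset = 0;
            while (offset < buffer.length) {
                int n = line.read(buffer, offset, buffer.length - offset);
                if (n <= 0) {
                    break; // line stopped or closed before the buffer was filled
                }
                offset += n;
            }
            line.stop();
            return buffer;
        } finally {
            line.close();
        }
    }

    /** Converts 16-bit little-endian signed PCM bytes to samples in [-1, 1). */
    public static double[] toSamples(byte[] pcm) {
        double[] samples = new double[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int low = pcm[2 * i] & 0xFF;   // low byte, treated as unsigned
            int high = pcm[2 * i + 1];     // high byte, keeps its sign
            samples[i] = ((high << 8) | low) / 32768.0;
        }
        return samples;
    }

    public static void main(String[] args) throws LineUnavailableException {
        // Needs a working microphone; throws if no matching input line is available.
        double[] samples = toSamples(record(2000));
        System.out.println("Captured " + samples.length + " samples");
    }
}
```

Getting the audio in takes only a few lines; the recognition that follows is where the real work lies.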

Java is Turing complete, so it can handle any programming job. Whether you want to do something in Java is entirely up to you.

We had moderate success with the Sphinx framework, which is written in Java, but the real hard work lies in understanding the algorithms and math involved in the area, and then in fine-tuning the engine to your particular needs.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow