As I know, there often used transition to the frequency domain by fast Fourier transform (FFT) and it analyzing. Also need some dictionary of speeched words for frequency correlation.
Please see this links:
CMU Sphinx have java implementation.
David Wagner have a good article and matlab implementation.
P.S. Ohh, if you speak in russian, why you don't read this article - very simple, with java examples.
P.P.S. Honestly, I never use this framework, but if you have only a superficial knowledge about speech recognition, robust and easyest way is to use existing complete solutions like frameworks or libraries, otherwise you need spend time to possess the necessary knowledge threshold. In this case you can read this article.