Question

The Android platform lets us record audio buffers using the AudioRecord object. What we get is an array of shorts containing the sampled values. Fine, now if I say "one two three four five six seven" into the mic, here is the result (see pic).

What I want is to split the buffer into 7 parts, one for each word. I'm no signal processing expert, so what would be the common way to do that? My eye can split the waveform instantly, so an algorithm for this should exist. It doesn't even seem necessary to filter the signal in this case.

EDIT: What I'm building is a way for a user to input a numeric challenge by voice, so only digits are pronounced, and the user is required to leave a short pause between consecutive digits.
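
To make the idea concrete: given the required pause between digits, one common starting point is short-time energy thresholding. Cut the buffer into small frames, compute the RMS of each frame, and treat a long enough run of quiet frames as a word boundary. The sketch below is only an illustration of that approach, not a confirmed solution; the class name `PauseSplitter`, the method `split`, and the parameters `frameSize`, `rmsThreshold` and `minSilenceFrames` are all hypothetical, and the threshold would have to be tuned against real recordings and background noise.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a PCM buffer at pauses using per-frame RMS energy.
public class PauseSplitter {

    /**
     * Returns {start, end} sample index pairs (end exclusive), one per loud region.
     *
     * samples          16-bit PCM samples, e.g. as read from AudioRecord
     * frameSize        samples per analysis frame (assumed, e.g. about 20 ms of audio)
     * rmsThreshold     frames with RMS below this count as silence (assumed, needs tuning)
     * minSilenceFrames consecutive silent frames required to end a segment
     */
    public static List<int[]> split(short[] samples, int frameSize,
                                    double rmsThreshold, int minSilenceFrames) {
        List<int[]> segments = new ArrayList<>();
        int frames = samples.length / frameSize; // trailing partial frame is ignored
        int start = -1;     // first sample of the open segment, -1 if none is open
        int silentRun = 0;  // silent frames seen since the last loud frame

        for (int f = 0; f < frames; f++) {
            long sumSq = 0;
            for (int i = f * frameSize; i < (f + 1) * frameSize; i++) {
                sumSq += (long) samples[i] * samples[i];
            }
            double rms = Math.sqrt((double) sumSq / frameSize);

            if (rms >= rmsThreshold) {
                if (start < 0) {
                    start = f * frameSize; // a new segment begins at this frame
                }
                silentRun = 0;
            } else if (start >= 0) {
                silentRun++;
                if (silentRun >= minSilenceFrames) {
                    // Close the segment right after the last loud frame.
                    int end = (f - silentRun + 1) * frameSize;
                    segments.add(new int[] {start, end});
                    start = -1;
                    silentRun = 0;
                }
            }
        }
        if (start >= 0) {
            // Buffer ended while a segment was still open.
            segments.add(new int[] {start, (frames - silentRun) * frameSize});
        }
        return segments;
    }
}
```

For example, at a 16 kHz sample rate a `frameSize` of 320 corresponds to 20 ms frames, and `minSilenceFrames` of 10 would require roughly 200 ms of quiet before a boundary is declared. If the call returns seven segments for the recording above, each pair can be used to copy the matching slice out of the buffer. A fixed threshold is the weakest part of this sketch; estimating the noise floor from the first few frames is a typical refinement.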

[Image: waveform of the recorded words]

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow