How do I interpret audio encoded binary data?

https://stackoverflow.com/questions/10455143

05-06-2021

Question

I have built a little program that encodes binary data into a sound. For example the following binary input:

00101101

will produce a 'sound' like this:

################..S.SS.S################

where each character represents a constant unit of time. # stands for an 880 Hertz sine wave, which marks the start and end of the transmission; . stands for silence, representing the zeroes; and S stands for a 440 Hertz sine wave, representing the ones. Obviously, the part in the middle is much longer in practice.

The essence of my question is: How can I invert this operation?

The sound file is transmitted to the recipient by simply playing it back and recording it. That means I am not trying to decode the original sound file, which would be easy.

Obviously I have to analyze the recorded data with respect to frequency. But how? I have read a bit about Fourier Transform but I am quite lost here.

I am not sure where to start but I know that this is not trivial and probably requires quite some knowledge about signal processing. Can somebody point me in the right direction?

BTW: I am doing this in Ruby (I know, it's slow - it's just a proof of concept) but the problem itself is not programming language specific so any answers are very welcome.

Solution

Your problem is essentially that of demodulating an FSK-modulated signal. I would recommend implementing a correlation bank with one correlator tuned to each frequency; it is a lot faster than an FFT, if speed is one of your concerns.
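A minimal Ruby sketch of such a correlation bank, under some assumptions that are not in the question: the recording is already available as an array of floats, the sample rate is 44100 Hz, each symbol lasts 4410 samples, and the symbol boundaries are already aligned. The names `tone_energy`, `classify` and `decode`, and the threshold value, are made up for illustration. Because the phase of a recorded tone is unknown, each block is correlated against both a cosine and a sine at the candidate frequency and the two results are combined.

```ruby
SAMPLE_RATE    = 44_100 # assumed sample rate in Hz
SYMBOL_SAMPLES = 4_410  # assumed symbol length (0.1 s)

# Non-coherent correlator: normalized energy of `samples` at `freq` Hz.
# A full-scale sine at exactly `freq` gives roughly 0.25, silence gives 0.
def tone_energy(samples, freq, sample_rate)
  i_sum = 0.0
  q_sum = 0.0
  samples.each_with_index do |s, n|
    phase = 2.0 * Math::PI * freq * n / sample_rate
    i_sum += s * Math.cos(phase)
    q_sum += s * Math.sin(phase)
  end
  (i_sum**2 + q_sum**2) / samples.length**2
end

# Map one symbol-length block to '#', 'S' or '.'.
# The threshold is a guess and would need tuning on real recordings.
def classify(block, threshold = 0.01)
  e440 = tone_energy(block, 440.0, SAMPLE_RATE)
  e880 = tone_energy(block, 880.0, SAMPLE_RATE)
  return '.' if [e440, e880].max < threshold
  e880 > e440 ? '#' : 'S'
end

# Decode a whole recording, one symbol-length block at a time.
def decode(samples)
  samples.each_slice(SYMBOL_SAMPLES).map { |block| classify(block) }.join
end
```

In a real recording the block boundaries are not known in advance, so the 880 Hz marker would probably have to be located first and the blocks aligned to where it ends.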

Other tips

If you know the frequencies and the modulation rate, you can try using 2 sliding Goertzel filters for FSK demodulation.
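As a rough illustration, here is the Goertzel recurrence in Ruby. The function names, the window length and the hop size are assumptions made for this sketch. Note that `sliding_goertzel` below simply re-runs a block Goertzel on overlapping windows, which is the simplest approximation of a sliding filter and not the true recursive sliding update.

```ruby
# Goertzel algorithm: squared magnitude of the DFT bin nearest `freq`.
def goertzel_power(samples, freq, sample_rate)
  coeff = 2.0 * Math.cos(2.0 * Math::PI * freq / sample_rate)
  s1 = 0.0
  s2 = 0.0
  samples.each do |x|
    s0 = x + coeff * s1 - s2
    s2 = s1
    s1 = s0
  end
  s1 * s1 + s2 * s2 - coeff * s1 * s2
end

# Power at `freq` for each window of `window` samples, advancing by `hop`.
# Returns one value per window start position.
def sliding_goertzel(samples, freq, sample_rate, window, hop)
  (0..samples.length - window).step(hop).map do |start|
    goertzel_power(samples[start, window], freq, sample_rate)
  end
end
```

Running one such filter at 440 Hz and one at 880 Hz over the recording gives two power curves; comparing them window by window (and against a silence threshold) indicates which symbol is present at each point in time.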

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow