Question

I am developing an Android application that needs to send short (<60 second) voice messages to a server.

File size is very important because we don't want to eat up data plans. Sound quality only matters to the extent that the message must be intelligible, so it should need significantly less bandwidth than music files.

Which of the standard Android audio encoders (http://developer.android.com/reference/android/media/MediaRecorder.AudioEncoder.html) and file formats (http://developer.android.com/reference/android/media/MediaRecorder.OutputFormat.html) are likely to be best for this application?

Any hints on good starting places for bit rates, etc. would be welcome as well.

We need to ultimately be able to play them on Windows and iOS, but it's okay if that takes some back-end conversion. There doesn't seem to be an efficient cross-platform format/encoder so that's where we'll put in the work.

Solution

AMR is aimed precisely at speech compression, and is the codec most commonly used for normal circuit-switched voice calls.
The narrow-band variant (AMR-NB, 8 kHz sample rate) is still the most widely used and should be supported on pretty much any mobile phone you can find.
The wide-band variant (AMR-WB, 16 kHz sample rate) offers better quality and is preferred if the target device supports it and you can spare the bandwidth.
AMR-NB bitrates range from 4.75 to 12.2 kbit/s, and AMR-WB bitrates from 6.6 to 23.85 kbit/s.
I'm not sure if there are any media players for Windows that handle .3GP files with AMR audio directly (VLC might). There are converters that can be used, though.
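
As a rough illustration of the AMR option, here is a minimal sketch of configuring MediaRecorder for AMR-NB in a 3GP container. The class name, method name, and output path are hypothetical, the 12.2 kbit/s rate is simply the top AMR-NB mode, and the code assumes the app already holds the RECORD_AUDIO permission. It needs the Android SDK to compile and has not been tuned for any particular device.

```java
import android.media.MediaRecorder;
import java.io.IOException;

// Hypothetical helper, not part of any library.
final class VoiceMessageRecorder {

    // Starts an AMR-NB recording capped at 60 seconds.
    // The caller is responsible for calling stop() and release() on the result.
    static MediaRecorder startAmrNbRecording(String outputPath) throws IOException {
        // The no-argument constructor is deprecated from API 31 but still available.
        MediaRecorder recorder = new MediaRecorder();
        recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
        // The output format must be set before the encoder.
        recorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
        recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
        recorder.setAudioChannels(1);            // mono is enough for speech
        recorder.setAudioSamplingRate(8000);     // AMR-NB is defined at 8 kHz
        recorder.setAudioEncodingBitRate(12200); // assumed starting point: 12.2 kbit/s
        recorder.setMaxDuration(60_000);         // stop automatically after 60 s
        recorder.setOutputFile(outputPath);
        recorder.prepare();
        recorder.start();
        return recorder;
    }
}
```

For the wide-band variant, the same sketch would presumably use MediaRecorder.AudioEncoder.AMR_WB with a 16000 Hz sampling rate and a bitrate in the AMR-WB range, on devices that support it.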

HE-AAC (v1) could also be used for speech encoding; however, this page suggests that encoding support is limited to Android 4.1 and above. Suitable settings might be a 16 kHz sample rate at 64 kbps.
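
To put the bitrates mentioned above in terms of data-plan cost, the following sketch estimates the payload size of a 60-second message at a constant bitrate. It ignores container overhead and any silence suppression, so the numbers are only ballpark figures. The class and method names are made up for illustration.

```java
// Hypothetical helper for back-of-the-envelope size estimates.
class VoiceMessageSize {

    // Approximate audio payload in bytes for a constant-bitrate stream
    // (bits per second times seconds, divided by 8); container overhead is ignored.
    static long estimateBytes(int bitsPerSecond, int seconds) {
        return (long) bitsPerSecond * seconds / 8;
    }

    public static void main(String[] args) {
        int seconds = 60;
        String[] labels = {"AMR-NB 4.75 kbit/s", "AMR-NB 12.2 kbit/s",
                           "AMR-WB 23.85 kbit/s", "HE-AAC 64 kbit/s"};
        int[] rates = {4750, 12200, 23850, 64000};
        for (int i = 0; i < rates.length; i++) {
            System.out.printf("%-20s %7d bytes%n", labels[i], estimateBytes(rates[i], seconds));
        }
    }
}
```

By this estimate, a full minute of AMR-NB at 12.2 kbit/s is about 91,500 bytes, while the same minute at 64 kbit/s is about 480,000 bytes, roughly five times larger.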

Licensed under: CC-BY-SA with attribution