You have to slide a time window over your data (say, 0.25 seconds' worth) and compute the root mean square (RMS) of each window to decide whether that stretch of audio is silent. Exactly how many bytes make up 0.25 seconds depends on the sample rate of your audio, along with the number of channels and the size of each sample.
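As a rough sketch of that window-size arithmetic: the class name `WindowSize`, the method `bytesForDuration`, and the example sample rates are illustrative assumptions, not something from your code. It assumes raw, uncompressed PCM.

```java
final class WindowSize {
    /**
     * Number of bytes that cover the given duration of raw PCM audio.
     * All parameters are assumptions about your stream's format.
     */
    static int bytesForDuration(double seconds, int sampleRate,
                                int channels, int bytesPerSample) {
        // One frame holds one sample per channel.
        int frames = (int) Math.round(seconds * sampleRate);
        return frames * channels * bytesPerSample;
    }

    public static void main(String[] args) {
        // 8-bit mono at 8000 Hz: 0.25 s is 2000 bytes
        System.out.println(bytesForDuration(0.25, 8000, 1, 1));
        // 8-bit mono at 44100 Hz: 0.25 s is 11025 bytes
        System.out.println(bytesForDuration(0.25, 44100, 1, 1));
    }
}
```

For 8-bit mono data the byte count equals the sample count, so the result can be used directly as the `length` argument of a windowed calculation over a `byte[]`.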
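So, assuming you have your data in byte[] audioData, and that audio data is signed 8-bit PCM, you'd compute the RMS like below and then compare the result against a silence threshold. Note that with 8-bit samples the RMS can never exceed 128, so the threshold has to be a small value that you tune against your own recordings; a value like 1000 only makes sense for 16-bit samples.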
    /** Computes the RMS volume of a window of samples, measured around the window's mean. */
    public double volumeRMS(int start, int length) {
        long sum = 0;
        int end = start + length;
        int len = length;
        // Clamp the window so it does not run past the end of the data.
        if (end > audioData.length) {
            end = audioData.length;
            len = end - start;
        }
        // Empty (or out-of-range) window: report zero volume.
        if (len <= 0) {
            return 0;
        }
        // First pass: the mean of the window, i.e. its DC offset.
        for (int i = start; i < end; i++) {
            sum += audioData[i];
        }
        double average = (double) sum / len;
        // Second pass: sum of squared deviations from the mean.
        double sumMeanSquare = 0;
        for (int i = start; i < end; i++) {
            double f = audioData[i] - average;
            sumMeanSquare += f * f;
        }
        double averageMeanSquare = sumMeanSquare / len;
        double rootMeanSquare = Math.sqrt(averageMeanSquare);
        return rootMeanSquare;
    }
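To tie the pieces together, here is a minimal sketch of sliding the window over the whole buffer. The class `SilenceDetector`, the method `findSilentWindows`, the step size, and the threshold of 5.0 are all hypothetical choices for illustration; the RMS calculation is the same as `volumeRMS` above, except that the array is passed in as a parameter so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

final class SilenceDetector {
    /** Same calculation as volumeRMS above, with the data passed in explicitly. */
    static double volumeRMS(byte[] audioData, int start, int length) {
        int end = Math.min(start + length, audioData.length);
        int len = end - start;
        if (len <= 0) {
            return 0;
        }
        long sum = 0;
        for (int i = start; i < end; i++) {
            sum += audioData[i];
        }
        double average = (double) sum / len;
        double sumSquares = 0;
        for (int i = start; i < end; i++) {
            double d = audioData[i] - average;
            sumSquares += d * d;
        }
        return Math.sqrt(sumSquares / len);
    }

    /** Returns the start offsets of all windows whose RMS falls below the threshold. */
    static List<Integer> findSilentWindows(byte[] audioData, int windowBytes,
                                           int stepBytes, double threshold) {
        if (windowBytes <= 0 || stepBytes <= 0) {
            throw new IllegalArgumentException("window and step must be positive");
        }
        List<Integer> silent = new ArrayList<>();
        for (int start = 0; start < audioData.length; start += stepBytes) {
            if (volumeRMS(audioData, start, windowBytes) < threshold) {
                silent.add(start);
            }
        }
        return silent;
    }

    public static void main(String[] args) {
        // Synthetic 8-bit data: 2000 samples of silence, then 2000 loud samples.
        byte[] audio = new byte[4000];
        for (int i = 2000; i < 4000; i++) {
            audio[i] = (byte) ((i % 2 == 0) ? 100 : -100);
        }
        // The threshold of 5.0 is an arbitrary illustrative value; tune it on real audio.
        System.out.println(findSilentWindows(audio, 2000, 2000, 5.0)); // prints [0]
    }
}
```

Using a step smaller than the window (for example half of it) gives overlapping windows and finer resolution on where silence starts and ends, at the cost of more computation.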