Buffering requires prediction, but "prediction is very difficult, especially if it's about the future." :-)
With a simple fixed-size buffer, the buffer size determines the latency from when you hit play until you hear audio. If, as a user, you can tolerate a long delay there, set a big buffer accordingly. If not, many of the better buffering algorithms (likely including your phone's voice channel) vary the playback rate: they start almost immediately and play audio back slightly slower than the nominal rate until a large buffer has built up. Because a live stream arrives at the nominal rate, playing slower consumes less audio than arrives, and the surplus accumulates in the buffer. If you have that kind of control over your audio hardware, it's the best solution--you can slowly build up several MB of buffer without increasing the latency from clicking play to hearing audio. Users normally don't notice moderate rate changes--most US over-the-air radio stations speed up song playback by 2%+ to fit in more commercials, and almost no one notices. At 5% many people do notice. Once the buffer reaches your target size, you can return to the nominal rate and enjoy uninterrupted playback.
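As a rough illustration of the rate-variation idea, here is a minimal Python sketch. It is a toy simulation, not a real audio pipeline: the 3% slowdown, the 10-second target, and the function names are all assumptions made for the example, and it assumes the network delivers exactly one second of audio per second with no jitter.

```python
NOMINAL_RATE = 1.0
SLOW_RATE = 0.97         # assumed: 3% slower, below the ~5% where many listeners notice
TARGET_BUFFER_S = 10.0   # assumed target, in seconds of buffered audio


def choose_playback_rate(buffered_s, target_s=TARGET_BUFFER_S):
    """Play slightly slow until the buffer reaches the target, then play at nominal."""
    return SLOW_RATE if buffered_s < target_s else NOMINAL_RATE


def simulate(duration_s, step_s=1.0):
    """Simulate a live stream that arrives in real time.

    Returns a list of (elapsed_s, rate, buffered_s) tuples, one per step.
    """
    buffered_s = 0.0
    elapsed_s = 0.0
    history = []
    while elapsed_s < duration_s:
        rate = choose_playback_rate(buffered_s)
        # step_s seconds of audio arrive; rate * step_s seconds are played out.
        buffered_s += step_s - rate * step_s
        elapsed_s += step_s
        history.append((elapsed_s, rate, buffered_s))
    return history


if __name__ == "__main__":
    for elapsed_s, rate, buffered_s in simulate(400)[::50]:
        print(f"t={elapsed_s:5.0f}s  rate={rate:.2f}  buffer={buffered_s:5.2f}s")
```

At a 3% slowdown the buffer grows by only 0.03 seconds per second of playback, so reaching 10 seconds takes about 10 / 0.03, or roughly 333 seconds. That is the trade-off: startup stays fast, but the safety margin builds slowly.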
There are many schemes that try to learn the ideal buffer size, but your local wifi and the reliability of the site serving the stream make a one-size-fits-all algorithm difficult. People will point to YouTube, Netflix, Hulu, etc., but those aren't live, so that's a different problem: on-demand content can be downloaded ahead of the playback position faster than real time. Twitch.tv does have live content, and it has a buffering latency on start.