Question

When I try to create pipeline that uses H264 to transmit video, I get some enormous delay, up to 10 seconds to transmit video from my machine to... my machine! This is unacceptable for my goals and I'd like to consult StackOverflow over what I (or someone else) do wrong.

I took pipelines from gstrtpbin documentation page and slightly modified them to use Speex:

This is sender pipeline: #!/bin/sh

gst-launch -v gstrtpbin name=rtpbin \
        v4l2src ! ffmpegcolorspace ! ffenc_h263 ! rtph263ppay ! rtpbin.send_rtp_sink_0 \
                  rtpbin.send_rtp_src_0 ! udpsink host=127.0.0.1 port=5000                            \
                  rtpbin.send_rtcp_src_0 ! udpsink host=127.0.0.1 port=5001 sync=false async=false    \
                  udpsrc port=5005 ! rtpbin.recv_rtcp_sink_0                           \
        pulsesrc ! audioconvert ! audioresample  ! audio/x-raw-int,rate=16000 !    \
                  speexenc bitrate=16000 ! rtpspeexpay ! rtpbin.send_rtp_sink_1                   \
                  rtpbin.send_rtp_src_1 ! udpsink host=127.0.0.1 port=5002                            \
                  rtpbin.send_rtcp_src_1 ! udpsink host=127.0.0.1 port=5003 sync=false async=false    \
                  udpsrc port=5007 ! rtpbin.recv_rtcp_sink_1

Receiver pipeline:

!/bin/sh

gst-launch -v\
    gstrtpbin name=rtpbin                                          \
    udpsrc caps="application/x-rtp,media=(string)video, clock-rate=(int)90000, encoding-name=(string)H263-1998" \
            port=5000 ! rtpbin.recv_rtp_sink_0                                \
        rtpbin. ! rtph263pdepay ! ffdec_h263 ! xvimagesink                    \
     udpsrc port=5001 ! rtpbin.recv_rtcp_sink_0                               \
     rtpbin.send_rtcp_src_0 ! udpsink port=5005 sync=false async=false        \
    udpsrc caps="application/x-rtp,media=(string)audio, clock-rate=(int)16000, encoding-name=(string)SPEEX, encoding-params=(string)1, payload=(int)110" \
            port=5002 ! rtpbin.recv_rtp_sink_1                                \
        rtpbin. ! rtpspeexdepay ! speexdec ! audioresample ! audioconvert ! alsasink \
     udpsrc port=5003 ! rtpbin.recv_rtcp_sink_1                               \
     rtpbin.send_rtcp_src_1 ! udpsink host=127.0.0.1 port=5007 sync=false async=false

Those pipelines, a combination of H263 and Speex, work fine enough. I snap my fingers near camera and micropohne and then I see movement and hear sound at the same time.

Then I changed pipelines to use H264 along the video path.

The sender becomes: #!/bin/sh

gst-launch -v gstrtpbin name=rtpbin \
        v4l2src ! ffmpegcolorspace ! x264enc bitrate=300 ! rtph264pay ! rtpbin.send_rtp_sink_0 \
                  rtpbin.send_rtp_src_0 ! udpsink host=127.0.0.1 port=5000                            \
                  rtpbin.send_rtcp_src_0 ! udpsink host=127.0.0.1 port=5001 sync=false async=false    \
                  udpsrc port=5005 ! rtpbin.recv_rtcp_sink_0                           \
        pulsesrc ! audioconvert ! audioresample  ! audio/x-raw-int,rate=16000 !    \
                  speexenc bitrate=16000 ! rtpspeexpay ! rtpbin.send_rtp_sink_1                   \
                  rtpbin.send_rtp_src_1 ! udpsink host=127.0.0.1 port=5002                            \
                  rtpbin.send_rtcp_src_1 ! udpsink host=127.0.0.1 port=5003 sync=false async=false    \
                  udpsrc port=5007 ! rtpbin.recv_rtcp_sink_1

And receiver becomes: #!/bin/sh

gst-launch -v\
    gstrtpbin name=rtpbin                                          \
    udpsrc caps="application/x-rtp,media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264" \
            port=5000 ! rtpbin.recv_rtp_sink_0                                \
        rtpbin. ! rtph264depay ! ffdec_h264 ! xvimagesink                    \
     udpsrc port=5001 ! rtpbin.recv_rtcp_sink_0                               \
     rtpbin.send_rtcp_src_0 ! udpsink port=5005 sync=false async=false        \
    udpsrc caps="application/x-rtp,media=(string)audio, clock-rate=(int)16000, encoding-name=(string)SPEEX, encoding-params=(string)1, payload=(int)110" \
            port=5002 ! rtpbin.recv_rtp_sink_1                                \
        rtpbin. ! rtpspeexdepay ! speexdec ! audioresample ! audioconvert ! alsasink \
     udpsrc port=5003 ! rtpbin.recv_rtcp_sink_1                               \
     rtpbin.send_rtcp_src_1 ! udpsink host=127.0.0.1 port=5007 sync=false async=false

This is what happen under Ubuntu 10.04. I didn't noticed such huge delays on Ubuntu 9.04 - the delays there was in range 2-3 seconds, AFAIR.

Was it helpful?

Solution

With some help from "Sharktooth" in #x264 on Freenode, I found out that the git version of gst-plugins-ugly supports the "zero-latency" preset.

http://cgit.freedesktop.org/gstreamer/gst-plugins-ugly

I tweaked your example to set "x264enc pass=qual quantizer=20 tune=zerolatency", and the latency seems to stay at 0.7 - 0.9 seconds. I can't figure out how to get it any lower.

OTHER TIPS

Something in there is buffering, most likely the encoder. The more data it has to work with, the more effective compression it can achieve. I'm not familiar with that encoder, but there is usually a setting for the amount of buffering.

x264 by default buffers the input to have more data to work with. The increase of delay from Ubuntu 10.04 is most probably because it was stuck at an old x264 version, before the introduction of --mbtree and --rc-lookahead.

In This page of Mewiki you can see how to calculate the latency, in number of frames, and the following, on what you should disable first to reduce the latency:

Reducing x264's latency is possible, but reduces quality. If you want no latency, set --tune zerolatency. If you can handle a small latency (ie under 1 second), it is well worth tuning the options to allow this. Here is a series of steps you can follow to incrementally reduce latency. Stop when your latency is low enough:

  1. Start with defaults
  2. Kill sync-lookahead
  3. Drop rc-lookahead to no less than ~10
  4. Drop threads to a lower value (i.e. say 6 instead of 12)
  5. Use sliced threads
  6. Disable rc-lookahead
  7. Disable b-frames
  8. Now you're at --tune zerolatency

So, you should first try to add to your command-line something like

sync-lookahead=0, rc-lookahead=10 (I'm not sure of the formatting of the command-line in your application)

This should shave most of the latency, at a low compression efficiency cost. If you have many cores (an quadcore with HT, for example), it may also be worth do number 4., at a little speed cost.

Use tune=zerolatency, as per Sharktooth advice, if this still isn't enough.

More on the topic: http://x264dev.multimedia.cx/archives/249

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top