Problems synchronizing libavformat/ffmpeg with x264 and RTP

https://stackoverflow.com//questions/11691921

12-12-2019

Question

I've been working on some streaming software that takes live feeds from various kinds of cameras and streams them over the network using H.264. To accomplish this, I'm using the x264 encoder directly (with the "zerolatency" preset) and feeding NALs, as they become available, to libavformat to pack into RTP (ultimately RTSP). Ideally, this application should be as close to real time as possible. For the most part, this has been working well.

Unfortunately, there is some sort of synchronization issue: any video playback on clients shows a few smooth frames, followed by a short pause, then more frames, and so on. Additionally, there appears to be a delay of roughly 4 seconds. This happens with every video player I've tried: Totem, VLC, and basic GStreamer pipes.

I've boiled it all down to a fairly small test case:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>   // malloc, free
#include <string.h>   // memset
#include <unistd.h>
#include <x264.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>

#define WIDTH       640
#define HEIGHT      480
#define FPS         30
#define BITRATE     400000
#define RTP_ADDRESS "127.0.0.1"
#define RTP_PORT    49990

struct AVFormatContext* avctx;
struct x264_t* encoder;
struct SwsContext* imgctx;

uint8_t test = 0x80;


void create_sample_picture(x264_picture_t* picture)
{
    // create a frame to store in
    x264_picture_alloc(picture, X264_CSP_I420, WIDTH, HEIGHT);

    // fake image generation
    // disregard how wrong this is; just writing a quick test
    int strides = WIDTH / 8;
    uint8_t* data = malloc(WIDTH * HEIGHT * 3);
    memset(data, test, WIDTH * HEIGHT * 3);
    test = (test << 1) | (test >> (8 - 1));

    // scale the image
    sws_scale(imgctx, (const uint8_t* const*) &data, &strides, 0, HEIGHT,
              picture->img.plane, picture->img.i_stride);

    // release the temporary source buffer (it was leaked on every frame)
    free(data);
}

int encode_frame(x264_picture_t* picture, x264_nal_t** nals)
{
    // encode a frame
    x264_picture_t pic_out;
    int num_nals;
    int frame_size = x264_encoder_encode(encoder, nals, &num_nals, picture, &pic_out);

    // ignore bad frames
    if (frame_size < 0)
    {
        return frame_size;
    }

    return num_nals;
}

void stream_frame(uint8_t* payload, int size)
{
    // initialize a packet
    AVPacket p;
    av_init_packet(&p);
    p.data = payload;
    p.size = size;
    p.stream_index = 0;
    p.flags = AV_PKT_FLAG_KEY;
    p.pts = AV_NOPTS_VALUE;
    p.dts = AV_NOPTS_VALUE;

    // send it out
    av_interleaved_write_frame(avctx, &p);
}

int main(int argc, char* argv[])
{
    // initialize ffmpeg
    av_register_all();

    // set up image scaler
    // (in-width, in-height, in-format, out-width, out-height, out-format, scaling-method, 0, 0, 0)
    imgctx = sws_getContext(WIDTH, HEIGHT, PIX_FMT_MONOWHITE,
                            WIDTH, HEIGHT, PIX_FMT_YUV420P,
                            SWS_FAST_BILINEAR, NULL, NULL, NULL);

    // set up encoder presets
    x264_param_t param;
    x264_param_default_preset(&param, "ultrafast", "zerolatency");

    param.i_threads = 3;
    param.i_width = WIDTH;
    param.i_height = HEIGHT;
    param.i_fps_num = FPS;
    param.i_fps_den = 1;
    param.i_keyint_max = FPS;
    param.b_intra_refresh = 0;
    param.rc.i_bitrate = BITRATE;
    param.b_repeat_headers = 1; // whether to repeat headers or write just once
    param.b_annexb = 1;         // place start codes (1) or sizes (0)

    // initialize
    x264_param_apply_profile(&param, "high");
    encoder = x264_encoder_open(&param);

    // at this point, x264_encoder_headers can be used, but it has had no effect

    // set up streaming context. a lot of error handling has been omitted
    // for brevity, but this should be pretty standard.
    avctx = avformat_alloc_context();
    struct AVOutputFormat* fmt = av_guess_format("rtp", NULL, NULL);
    avctx->oformat = fmt;

    snprintf(avctx->filename, sizeof(avctx->filename), "rtp://%s:%d", RTP_ADDRESS, RTP_PORT);
    if (url_fopen(&avctx->pb, avctx->filename, URL_WRONLY) < 0)
    {
        perror("url_fopen failed");
        return 1;
    }
    struct AVStream* stream = av_new_stream(avctx, 1);

    // initialize codec
    AVCodecContext* c = stream->codec;
    c->codec_id = CODEC_ID_H264;
    c->codec_type = AVMEDIA_TYPE_VIDEO;
    c->flags = CODEC_FLAG_GLOBAL_HEADER;
    c->width = WIDTH;
    c->height = HEIGHT;
    c->time_base.den = FPS;
    c->time_base.num = 1;
    c->gop_size = FPS;
    c->bit_rate = BITRATE;
    avctx->flags = AVFMT_FLAG_RTP_HINT;

    // write the header
    av_write_header(avctx);

    // make some frames
    for (int frame = 0; frame < 10000; frame++)
    {
        // create a sample moving frame
        x264_picture_t* pic = (x264_picture_t*) malloc(sizeof(x264_picture_t));
        create_sample_picture(pic);

        // encode the frame
        x264_nal_t* nals;
        int num_nals = encode_frame(pic, &nals);

        if (num_nals < 0)
            printf("invalid frame size: %d\n", num_nals);

        // send out NALs
        for (int i = 0; i < num_nals; i++)
        {
            stream_frame(nals[i].p_payload, nals[i].i_payload);
        }

        // free up resources
        x264_picture_clean(pic);
        free(pic);

        // stream at approx 30 fps
        printf("frame %d\n", frame);
        usleep(33333);
    }

    return 0;
}

This test shows black lines on a white background that should move smoothly to the left. It was written for FFmpeg 0.6.5, but the problem can be reproduced on 0.8 and 0.10 (from what I've tested so far). I've taken a few shortcuts in error handling to keep this example as short as possible while still showing the problem, so please excuse some of the rough code. I should also note that while an SDP is not used here, I have already tried using one, with similar results. The test can be compiled with:

gcc -g -std=gnu99 streamtest.c -lswscale -lavformat -lx264 -lm -lpthread -o streamtest

It can be played with GStreamer directly:

gst-launch udpsrc port=49990 ! application/x-rtp,payload=96,clock-rate=90000 ! rtph264depay ! decodebin ! xvimagesink

You should immediately notice the stuttering. A common "fix" I've seen all over the Internet is to add sync=false to the pipeline:

gst-launch udpsrc port=49990 ! application/x-rtp,payload=96,clock-rate=90000 ! rtph264depay ! decodebin ! xvimagesink sync=false

This makes playback smooth (and nearly real time), but it is a non-solution and only works with GStreamer. I'd like to fix the problem at the source. I've been able to stream with almost identical parameters using plain ffmpeg and had no problems:

ffmpeg -re -i sample.mp4 -vcodec libx264 -vpre ultrafast -vpre baseline -b 400000 -an -f rtp rtp://127.0.0.1:49990 -an

So clearly I'm doing something wrong. But what is it?

Solution

1) You didn't set a PTS on the frames you send to libx264 (you should probably be seeing "non-strictly-monotonic PTS" warnings).
2) You didn't set PTS/DTS on the packets you send to libavformat's RTP muxer. I'm not 100% sure it needs to be set, but I'd guess it would be better: from the source code, RTP appears to use the PTS.
3) IMHO usleep(33333) is bad. It makes the encoder stall for that time as well (increasing latency), whereas you could be encoding the next frame during that time even if you don't need to send it over RTP yet.
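
The three points above can be sketched in plain C. This is a minimal sketch, not a tested fix for the question's program. The helper names (frame_pts_90k, frame_offset_ns, frame_deadline, wait_for_frame) are hypothetical. It assumes a constant 30 fps and a 90 kHz RTP clock, as in the GStreamer caps used in the question.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for clock_nanosleep */
#endif
#include <stdint.h>
#include <time.h>

#define FPS        30      /* frame rate used by the question's test program */
#define RTP_CLOCK  90000   /* assumed RTP clock rate for the video stream */

/* Timestamp of frame n in 1/RTP_CLOCK units (hypothetical helper). */
static int64_t frame_pts_90k(int64_t n)
{
    return n * RTP_CLOCK / FPS;
}

/* Offset of frame n from the stream start, in nanoseconds (hypothetical helper). */
static int64_t frame_offset_ns(int64_t n)
{
    return n * INT64_C(1000000000) / FPS;
}

/* Absolute CLOCK_MONOTONIC deadline of frame n, given the stream start time. */
static struct timespec frame_deadline(struct timespec start, int64_t n)
{
    int64_t ns = (int64_t)start.tv_nsec + frame_offset_ns(n);
    struct timespec t;
    t.tv_sec  = start.tv_sec + (time_t)(ns / 1000000000);
    t.tv_nsec = (long)(ns % 1000000000);
    return t;
}

/* Sleep until frame n is due. Returns 0 on success or an error number.
   If the deadline has already passed, this returns immediately, so time
   spent encoding is absorbed instead of being added to a fixed delay. */
static int wait_for_frame(struct timespec start, int64_t n)
{
    struct timespec t = frame_deadline(start, n);
    return clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &t, NULL);
}
```

In the question's loop, this would mean setting picture->i_pts to the frame index before x264_encoder_encode (assuming the encoder timebase is 1/FPS), and setting p.pts and p.dts to frame_pts_90k(frame) in stream_frame instead of AV_NOPTS_VALUE. That assumes the RTP muxer uses a 1/90000 stream time base; if it does not, the value can be rescaled with av_rescale_q. With the zerolatency tune there should be no B-frames, so using the same value for PTS and DTS is a reasonable assumption. The usleep(33333) call would be replaced by wait_for_frame(start, frame + 1), with start read once from clock_gettime(CLOCK_MONOTONIC, ...) before the loop.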

P.S. BTW, you didn't set param.rc.i_rc_method to X264_RC_ABR, so libx264 will use CRF 23 instead and ignore your "param.rc.i_bitrate = BITRATE". It can also be a good idea to use VBV when encoding for sending over a network.
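
The rate-control remark can be illustrated with a small sketch. The rc_settings struct and make_rc_settings helper are hypothetical and not part of x264; they only compute the numbers to copy into the real x264_param_t fields named in the comments. Note that x264 expresses bitrates in kbit/s, so the question's BITRATE of 400000 would need dividing by 1000. The half-second buffer in the usage comment is an assumption for illustration, not a recommendation from the answer.

```c
#include <stdint.h>

/* Values to copy into x264_param_t's rc fields (hypothetical holder,
   not an x264 type). Rates are in kbit/s, the buffer size in kbit. */
struct rc_settings {
    int bitrate_kbps;     /* -> param.rc.i_bitrate */
    int vbv_max_kbps;     /* -> param.rc.i_vbv_max_bitrate */
    int vbv_buffer_kbit;  /* -> param.rc.i_vbv_buffer_size */
};

/* Derive ABR + VBV settings from a target bitrate in bit/s and the
   amount of buffering, in milliseconds, the receiver is assumed to allow. */
static struct rc_settings make_rc_settings(int bitrate_bps, int buffer_ms)
{
    struct rc_settings rc;
    rc.bitrate_kbps    = bitrate_bps / 1000;
    rc.vbv_max_kbps    = rc.bitrate_kbps;  /* cap the peak at the average rate */
    rc.vbv_buffer_kbit = (int)((int64_t)rc.bitrate_kbps * buffer_ms / 1000);
    return rc;
}

/* Possible use in the question's main(), before x264_encoder_open():
 *
 *   struct rc_settings rc = make_rc_settings(BITRATE, 500);
 *   param.rc.i_rc_method        = X264_RC_ABR;
 *   param.rc.i_bitrate          = rc.bitrate_kbps;
 *   param.rc.i_vbv_max_bitrate  = rc.vbv_max_kbps;
 *   param.rc.i_vbv_buffer_size  = rc.vbv_buffer_kbit;
 */
```

A smaller buffer keeps the instantaneous rate closer to the target, at the cost of quality on complex frames; the right value depends on the network and is left open here.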

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow