How does one encode a series of images into H264 using the x264 C API?

https://stackoverflow.com/questions/2940671

05-10-2019

Question

How does one use the x264 C API to encode RGB images into H264 frames? I have already created a sequence of RGB images; how can I now transform that sequence into a sequence of H264 frames? In particular, how do I encode this sequence of RGB images into a sequence of H264 frames consisting of a single initial H264 keyframe followed by dependent H264 frames?

Solution

First of all: check out the x264.h file; it contains more or less the reference for every function and struct. The x264.c file you can find in the download contains a sample implementation. Most people say to base yourself on that one, but I find it rather complex for beginners; it is a good example to fall back on, though.

First you set up some parameters, of type x264_param_t; a good site describing the parameters is http://mewiki.project357.com/wiki/X264_Settings . Also have a look at the x264_param_default_preset function, which lets you target certain features without needing to understand all of the (sometimes quite complex) parameters. Also call x264_param_apply_profile afterwards (you will likely want the "baseline" profile).

This is some example setup from my code:

x264_param_t param;
x264_param_default_preset(&param, "veryfast", "zerolatency");
param.i_threads = 1;
param.i_width = width;
param.i_height = height;
param.i_fps_num = fps;
param.i_fps_den = 1;
// Intra refresh:
param.i_keyint_max = fps;
param.b_intra_refresh = 1;
//Rate control:
param.rc.i_rc_method = X264_RC_CRF;
param.rc.f_rf_constant = 25;
param.rc.f_rf_constant_max = 35;
//For streaming:
param.b_repeat_headers = 1;
param.b_annexb = 1;
x264_param_apply_profile(&param, "baseline");

After this you can initialize the encoder as follows:

x264_t* encoder = x264_encoder_open(&param);
x264_picture_t pic_in, pic_out;
x264_picture_alloc(&pic_in, X264_CSP_I420, w, h);

x264 expects YUV420P data (I guess some other formats too, but that is the common one). You can use libswscale (from ffmpeg) to convert images to the right format. Initializing it looks like this (I assume RGB data with 24bpp).

struct SwsContext* convertCtx = sws_getContext(in_w, in_h, PIX_FMT_RGB24, out_w, out_h, PIX_FMT_YUV420P, SWS_FAST_BILINEAR, NULL, NULL, NULL);
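To make it clearer what the conversion step produces, here is a minimal hand-written sketch of an RGB24 to planar YUV 4:2:0 conversion. It is not what libswscale does internally: the function names (rgb24_to_i420, rgb_to_y, rgb_to_u, rgb_to_v) are hypothetical, the planes are assumed to be tightly packed (stride equal to the plane width), the dimensions are assumed to be even, and the integer BT.601 limited-range coefficients are an assumption about the colour matrix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Integer approximation of BT.601 limited range (assumed matrix). */
static uint8_t rgb_to_y(int r, int g, int b) {
    return (uint8_t)(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
}

/* The +128 chroma offset is folded in as (128 << 8) before the shift,
   so the shifted value is never negative. */
static uint8_t rgb_to_u(int r, int g, int b) {
    return (uint8_t)((-38 * r - 74 * g + 112 * b + 128 + (128 << 8)) >> 8);
}

static uint8_t rgb_to_v(int r, int g, int b) {
    return (uint8_t)((112 * r - 94 * g - 18 * b + 128 + (128 << 8)) >> 8);
}

/* Hypothetical helper: packed RGB24 in, three tightly packed planes out.
   y_plane holds width*height bytes; u_plane and v_plane hold
   (width/2)*(height/2) bytes each. */
void rgb24_to_i420(const uint8_t *rgb, int width, int height,
                   uint8_t *y_plane, uint8_t *u_plane, uint8_t *v_plane) {
    assert(width % 2 == 0 && height % 2 == 0);

    /* Luma: one sample per pixel. */
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            const uint8_t *p = rgb + 3 * ((size_t)y * width + x);
            y_plane[(size_t)y * width + x] = rgb_to_y(p[0], p[1], p[2]);
        }
    }

    /* Chroma: one sample per 2x2 block, computed from the block average. */
    for (int y = 0; y < height; y += 2) {
        for (int x = 0; x < width; x += 2) {
            int r = 0, g = 0, b = 0;
            for (int dy = 0; dy < 2; dy++) {
                for (int dx = 0; dx < 2; dx++) {
                    const uint8_t *p =
                        rgb + 3 * ((size_t)(y + dy) * width + (x + dx));
                    r += p[0];
                    g += p[1];
                    b += p[2];
                }
            }
            r = (r + 2) / 4;
            g = (g + 2) / 4;
            b = (b + 2) / 4;
            size_t c = (size_t)(y / 2) * (width / 2) + (size_t)(x / 2);
            u_plane[c] = rgb_to_u(r, g, b);
            v_plane[c] = rgb_to_v(r, g, b);
        }
    }
}
```

The three output planes correspond to what the encoder reads from pic_in.img.plane[0..2]. In real code the strides stored in pic_in.img.stride should be honoured rather than assumed equal to the plane width.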

Encoding is then as simple as this; for each frame do:

// data is a pointer to your RGB buffer
int srcstride = w*3; //RGB stride is just 3*width
sws_scale(convertCtx, &data, &srcstride, 0, h, pic_in.img.plane, pic_in.img.stride);
x264_nal_t* nals;
int i_nals;
int frame_size = x264_encoder_encode(encoder, &nals, &i_nals, &pic_in, &pic_out);
if (frame_size >= 0)
{
    // OK
}
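With b_annexb = 1, the encoded bytes form an Annex B stream in which every NAL unit is preceded by a 00 00 01 or 00 00 00 01 start code. To check that the output really is "one keyframe followed by dependent frames", one option is to scan the bytes and read each NAL unit type. The helper below is a minimal sketch with a hypothetical name (annexb_nal_types); it relies only on the H.264 NAL header layout, where the type is the low 5 bits of the first byte (1 = non-IDR slice, 5 = IDR slice, 7 = SPS, 8 = PPS).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scan an Annex B buffer, store the nal_unit_type of
   each NAL unit found (at most max_types of them), and return how many
   NAL units were seen in total. */
size_t annexb_nal_types(const uint8_t *buf, size_t len,
                        uint8_t *types, size_t max_types) {
    size_t count = 0;
    size_t i = 0;
    /* A match needs three start-code bytes plus one header byte. */
    while (i + 3 < len) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            if (count < max_types) {
                types[count] = buf[i + 3] & 0x1F;
            }
            count++;
            i += 3; /* continue scanning at the NAL header byte */
        } else {
            i++;
        }
    }
    return count;
}
```

A possible use is to call it on the bytes of each encoded frame: a frame reporting type 5 is an IDR keyframe, while frames reporting only type 1 depend on earlier frames.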

I hope this gets you going ;) I spent a long time on it myself to get started. x264 is an insanely powerful but sometimes complex piece of software.

Edit: when you use other parameters there will be delayed frames; this is not the case with my parameters (mostly because of the zerolatency tune). When frames are delayed, frame_size will sometimes be zero, and you will have to keep calling x264_encoder_encode as long as x264_encoder_delayed_frames does not return 0. For this functionality you should take a deeper look into x264.c and x264.h.

Other tips

I've uploaded an example that generates raw yuv frames and then encodes them using x264. The full code can be found here: https://gist.github.com/roxlu/6453908

FFmpeg 2.8.6 C runnable example

Using FFmpeg as a wrapper for x264 is a good idea, as it exposes a uniform API for multiple encoders. So if you ever need to change formats, you can change just one parameter instead of learning a new API.

The example synthesizes and encodes some colorful frames generated by generate_rgb.

Controlling the frame type (I, P, B) to have as few keyframes as possible (ideally only the first) is discussed here: https://stackoverflow.com/a/36412909/895245 As mentioned there, I don't recommend it for most applications.

The key lines that do the frame type control here are:

/* Minimal distance of I-frames. This is the maximum value allowed,
or else we get a warning at runtime. */
c->keyint_min = 600;

if (frame->pts == 1) {
    frame->key_frame = 1;
    frame->pict_type = AV_PICTURE_TYPE_I;
} else {
    frame->key_frame = 0;
    frame->pict_type = AV_PICTURE_TYPE_P;
}

We can then verify the frame type with:

ffprobe -select_streams v \
    -show_frames \
    -show_entries frame=pict_type \
    -of csv \
    tmp.h264

as mentioned at: https://superuser.com/questions/885452/extracting-the-index-of-key-frames-from-a-video-using-ffmpeg
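To turn that ffprobe listing into a quick pass/fail check, the output can be tallied by frame type. The sketch below assumes that, with -of csv and frame=pict_type, each frame is printed as a line of the form "frame,I", "frame,P" or "frame,B"; other lines are ignored. The type PictTypeCounts and the function count_pict_types are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Tally of picture types seen in the ffprobe CSV text. */
typedef struct {
    int i;
    int p;
    int b;
    int other;
} PictTypeCounts;

/* Hypothetical helper: count lines of the assumed form "frame,<type>". */
PictTypeCounts count_pict_types(const char *csv) {
    PictTypeCounts counts = {0, 0, 0, 0};
    const char *line = csv;
    while (*line) {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        if (len >= 7 && strncmp(line, "frame,", 6) == 0) {
            switch (line[6]) {
                case 'I': counts.i++; break;
                case 'P': counts.p++; break;
                case 'B': counts.b++; break;
                default:  counts.other++; break;
            }
        }
        /* Move to the next line, or stop at the terminating NUL. */
        line = end ? end + 1 : line + len;
    }
    return counts;
}
```

For a stream with a single keyframe followed by dependent frames, the expected result is i == 1 with every other frame counted under p (or b, if B-frames are enabled).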

Preview of the generated output.

main.c

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#include <libavcodec/avcodec.h>
#include <libavutil/imgutils.h>
#include <libavutil/opt.h>
#include <libswscale/swscale.h>

static AVCodecContext *c = NULL;
static AVFrame *frame;
static AVPacket pkt;
static FILE *file;
struct SwsContext *sws_context = NULL;

static void ffmpeg_encoder_set_frame_yuv_from_rgb(uint8_t *rgb) {
    const int in_linesize[1] = { 3 * c->width };
    sws_context = sws_getCachedContext(sws_context,
            c->width, c->height, AV_PIX_FMT_RGB24,
            c->width, c->height, AV_PIX_FMT_YUV420P,
            0, 0, 0, 0);
    sws_scale(sws_context, (const uint8_t * const *)&rgb, in_linesize, 0,
            c->height, frame->data, frame->linesize);
}

uint8_t* generate_rgb(int width, int height, int pts, uint8_t *rgb) {
    int x, y, cur;
    rgb = realloc(rgb, 3 * sizeof(uint8_t) * height * width);
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            cur = 3 * (y * width + x);
            rgb[cur + 0] = 0;
            rgb[cur + 1] = 0;
            rgb[cur + 2] = 0;
            if ((frame->pts / 25) % 2 == 0) {
                if (y < height / 2) {
                    if (x < width / 2) {
                        /* Black. */
                    } else {
                        rgb[cur + 0] = 255;
                    }
                } else {
                    if (x < width / 2) {
                        rgb[cur + 1] = 255;
                    } else {
                        rgb[cur + 2] = 255;
                    }
                }
            } else {
                if (y < height / 2) {
                    rgb[cur + 0] = 255;
                    if (x < width / 2) {
                        rgb[cur + 1] = 255;
                    } else {
                        rgb[cur + 2] = 255;
                    }
                } else {
                    if (x < width / 2) {
                        rgb[cur + 1] = 255;
                        rgb[cur + 2] = 255;
                    } else {
                        rgb[cur + 0] = 255;
                        rgb[cur + 1] = 255;
                        rgb[cur + 2] = 255;
                    }
                }
            }
        }
    }
    return rgb;
}

/* Allocate resources and write header data to the output file. */
void ffmpeg_encoder_start(const char *filename, int codec_id, int fps, int width, int height) {
    AVCodec *codec;
    int ret;

    codec = avcodec_find_encoder(codec_id);
    if (!codec) {
        fprintf(stderr, "Codec not found\n");
        exit(1);
    }
    c = avcodec_alloc_context3(codec);
    if (!c) {
        fprintf(stderr, "Could not allocate video codec context\n");
        exit(1);
    }
    c->bit_rate = 400000;
    c->width = width;
    c->height = height;
    c->time_base.num = 1;
    c->time_base.den = fps;
    c->keyint_min = 600;
    c->pix_fmt = AV_PIX_FMT_YUV420P;
    if (codec_id == AV_CODEC_ID_H264)
        av_opt_set(c->priv_data, "preset", "slow", 0);
    if (avcodec_open2(c, codec, NULL) < 0) {
        fprintf(stderr, "Could not open codec\n");
        exit(1);
    }
    file = fopen(filename, "wb");
    if (!file) {
        fprintf(stderr, "Could not open %s\n", filename);
        exit(1);
    }
    frame = av_frame_alloc();
    if (!frame) {
        fprintf(stderr, "Could not allocate video frame\n");
        exit(1);
    }
    frame->format = c->pix_fmt;
    frame->width  = c->width;
    frame->height = c->height;
    ret = av_image_alloc(frame->data, frame->linesize, c->width, c->height, c->pix_fmt, 32);
    if (ret < 0) {
        fprintf(stderr, "Could not allocate raw picture buffer\n");
        exit(1);
    }
}

/*
Write trailing data to the output file
and free resources allocated by ffmpeg_encoder_start.
*/
void ffmpeg_encoder_finish(void) {
    uint8_t endcode[] = { 0, 0, 1, 0xb7 };
    int got_output, ret;
    do {
        fflush(stdout);
        ret = avcodec_encode_video2(c, &pkt, NULL, &got_output);
        if (ret < 0) {
            fprintf(stderr, "Error encoding frame\n");
            exit(1);
        }
        if (got_output) {
            fwrite(pkt.data, 1, pkt.size, file);
            av_packet_unref(&pkt);
        }
    } while (got_output);
    fwrite(endcode, 1, sizeof(endcode), file);
    fclose(file);
    avcodec_close(c);
    av_free(c);
    av_freep(&frame->data[0]);
    av_frame_free(&frame);
}

/*
Encode one frame from an RGB24 input and save it to the output file.
Must be called after ffmpeg_encoder_start, and ffmpeg_encoder_finish
must be called after the last call to this function.
*/
void ffmpeg_encoder_encode_frame(uint8_t *rgb) {
    int ret, got_output;
    ffmpeg_encoder_set_frame_yuv_from_rgb(rgb);
    av_init_packet(&pkt);
    pkt.data = NULL;
    pkt.size = 0;
    if (frame->pts == 1) {
        frame->key_frame = 1;
        frame->pict_type = AV_PICTURE_TYPE_I;
    } else {
        frame->key_frame = 0;
        frame->pict_type = AV_PICTURE_TYPE_P;
    }
    ret = avcodec_encode_video2(c, &pkt, frame, &got_output);
    if (ret < 0) {
        fprintf(stderr, "Error encoding frame\n");
        exit(1);
    }
    if (got_output) {
        fwrite(pkt.data, 1, pkt.size, file);
        av_packet_unref(&pkt);
    }
}

/* Represents the main loop of an application which generates one frame per loop. */
static void encode_example(const char *filename, int codec_id) {
    int pts;
    int width = 320;
    int height = 240;
    uint8_t *rgb = NULL;
    ffmpeg_encoder_start(filename, codec_id, 25, width, height);
    for (pts = 0; pts < 100; pts++) {
        frame->pts = pts;
        rgb = generate_rgb(width, height, pts, rgb);
        ffmpeg_encoder_encode_frame(rgb);
    }
    ffmpeg_encoder_finish();
}

int main(void) {
    avcodec_register_all();
    encode_example("tmp.h264", AV_CODEC_ID_H264);
    encode_example("tmp.mpg", AV_CODEC_ID_MPEG1VIDEO);
    return 0;
}

Compile and run with:

gcc -o main.out -std=c99 -Wextra main.c -lavcodec -lswscale -lavutil
./main.out
ffplay tmp.mpg
ffplay tmp.h264

Tested on Ubuntu 16.04. GitHub upstream.

Licensed under: CC-BY-SA with attribution
