Decoding mJPEG with libavcodec

https://stackoverflow.com/questions/23443322

14-07-2023

Question

I am creating a video conference application. I have discovered that webcams (at least the 3 I have) provide higher resolutions and frame rates in the mJPEG output format. So far I have been using YUY2, converted to I420 for compression with x264. To transcode mJPEG to I420, I need to decode it first. I am trying to decode images from the webcam with libavcodec. This is my code.

Initialization:

// mJPEG to I420 conversion
AVCodecContext * _transcoder = nullptr;
AVFrame * _outputFrame;
AVPacket _inputPacket;

avcodec_register_all();
_outputFrame = av_frame_alloc();
av_frame_unref(_outputFrame);
av_init_packet(&_inputPacket);

AVCodec * codecDecode = avcodec_find_decoder(AV_CODEC_ID_MJPEG);

_transcoder = avcodec_alloc_context3(codecDecode);
avcodec_get_context_defaults3(_transcoder, codecDecode);
_transcoder->flags2 |= CODEC_FLAG2_FAST;
_transcoder->pix_fmt = AVPixelFormat::AV_PIX_FMT_YUV420P;
_transcoder->width = width;
_transcoder->height = height;
avcodec_open2(_transcoder, codecDecode, nullptr);

Decoding:

_inputPacket.size = size;
_inputPacket.data = data;

int got_picture;
int decompressed_size = avcodec_decode_video2(_transcoder, _outputFrame, &got_picture, &_inputPacket);

But so far, what I am getting is a green screen. Where am I wrong?

UPD: I have enabled libavcodec logging, but there are no warnings or errors. I have also discovered that _outputFrame has AV_PIX_FMT_YUVJ422P as its format, and a colorspace value that does not match any of the values in libavcodec's enums (the actual value is 156448160).
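
A note on the symptom: a uniformly green picture is what an all-zero YUV buffer looks like after conversion to RGB, so a green screen often means the encoder was handed a frame that was never filled with valid data. Whether that is exactly what happened here is an assumption; the post does not confirm it. The sketch below uses the full-range BT.601 (JFIF) conversion for a single pixel. The names `Rgb`, `clampToByte` and `yuvToRgb` are made up for illustration and are not part of libavcodec.

```cpp
#include <algorithm>
#include <cstdint>

// One 8-bit RGB pixel (hypothetical helper type).
struct Rgb {
    std::uint8_t r, g, b;
};

// Round to nearest and clamp into the 0..255 range.
static std::uint8_t clampToByte(double v) {
    return static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, v + 0.5)));
}

// Full-range BT.601 (JFIF) YCbCr -> RGB for a single pixel.
Rgb yuvToRgb(std::uint8_t y, std::uint8_t u, std::uint8_t v) {
    const double cb = u - 128.0;  // chroma is stored with an offset of 128
    const double cr = v - 128.0;
    return Rgb{
        clampToByte(y + 1.402 * cr),
        clampToByte(y - 0.344136 * cb - 0.714136 * cr),
        clampToByte(y + 1.772 * cb)
    };
}
```

With these formulas, `yuvToRgb(0, 0, 0)` gives R=0, G=135, B=0, which is green, whereas true black is Y=0 with U=V=128.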

Solution

After suggestions from the comments, I came up with a working solution.

Initialization:

av_init_packet(&_inputPacket);

AVCodec * codecDecode = avcodec_find_decoder(AV_CODEC_ID_MJPEG);

_transcoder = avcodec_alloc_context3(codecDecode);
avcodec_get_context_defaults3(_transcoder, codecDecode);
avcodec_open2(_transcoder, codecDecode, nullptr);

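// swscale context init; the source format is the one the decoder was observed
// to produce for these webcams (AV_PIX_FMT_YUVJ422P)
_mJPEGconvertCtx = sws_getContext(width, height, AV_PIX_FMT_YUVJ422P,
        width, height, AV_PIX_FMT_YUV420P, SWS_FAST_BILINEAR, nullptr, nullptr, nullptr);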

// x264 pic init
x264_picture_t _pic_in;
x264_picture_alloc(&_pic_in, X264_CSP_I420, width, height);
_pic_in.img.i_csp = X264_CSP_I420 ;
_pic_in.img.i_plane = 3;
_pic_in.img.i_stride[0] = width;
_pic_in.img.i_stride[1] = width / 2;
_pic_in.img.i_stride[2] = width / 2;

_pic_in.i_type = X264_TYPE_AUTO ;
_pic_in.i_qpplus1 = 0;
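
The swscale context above converts between two planar layouts of the same picture size: the decoder's 4:2:2 output and the 4:2:0 (I420) input that x264 expects. The difference is only in the chroma planes, and can be sketched as plain arithmetic. This is a minimal sketch assuming 8-bit samples and unpadded rows; `PlaneSize`, `PlanarLayout`, `layout422` and `layout420` are hypothetical names, and real frames may use larger strides than the minimum computed here.

```cpp
#include <cstddef>

// Dimensions of one plane of a planar YUV image, in 8-bit samples.
struct PlaneSize {
    int width;
    int height;
    std::size_t bytes() const {
        return static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
    }
};

// The three planes of a planar YUV image.
struct PlanarLayout {
    PlaneSize y, u, v;
    std::size_t totalBytes() const { return y.bytes() + u.bytes() + v.bytes(); }
};

// 4:2:2 planar (layout of AV_PIX_FMT_YUVJ422P): chroma is halved horizontally only.
PlanarLayout layout422(int width, int height) {
    const PlaneSize chroma{(width + 1) / 2, height};
    return {{width, height}, chroma, chroma};
}

// 4:2:0 planar (I420 / AV_PIX_FMT_YUV420P): chroma is halved in both directions.
PlanarLayout layout420(int width, int height) {
    const PlaneSize chroma{(width + 1) / 2, (height + 1) / 2};
    return {{width, height}, chroma, chroma};
}
```

For a 1280x720 picture, `layout422` totals 1,843,200 bytes and `layout420` totals 1,382,400 bytes, so a 4:2:2 frame cannot simply be handed over as I420. Note also that the `width / 2` chroma stride set on `_pic_in` equals the rounded-up chroma width only when `width` is even.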

Transcoding:

_inputPacket.size = size;
_inputPacket.data = data;

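int got_picture = 0;

// decode; the return value is negative on error, and got_picture stays 0
// when no complete frame was produced
int decoded = avcodec_decode_video2(_transcoder, _outputFrame, &got_picture, &_inputPacket);

// transform only when a frame was actually decoded
if (decoded >= 0 && got_picture) {
    sws_scale(_mJPEGconvertCtx, _outputFrame->data, _outputFrame->linesize, 0, height,
            _pic_in.img.plane, _pic_in.img.i_stride);
}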

Afterwards, _pic_in is used directly by x264. The image is fine, but the transcoding times are horrible at higher resolutions.
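
Since source and destination have the same dimensions, the conversion step only has to copy the luma plane and halve the number of chroma rows. The following is a minimal sketch of that operation in plain C++, not the swscale implementation and not something from the original post. `Plane`, `ConstPlane`, `copyPlane`, `halveChromaRows` and `convert422pTo420p` are hypothetical names, and whether this is faster than `sws_scale` has not been measured. It also leaves sample values untouched, so full-range (YUVJ) input stays full-range; the assumption is that x264 would then have to be told so, for example through its VUI `b_fullrange` parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// One writable 8-bit plane: first row plus distance between rows in bytes.
struct Plane {
    std::uint8_t* data;
    int stride;
};

// Read-only counterpart of Plane.
struct ConstPlane {
    const std::uint8_t* data;
    int stride;
};

// Copy a plane row by row, honouring different source and destination strides.
static void copyPlane(ConstPlane src, Plane dst, int width, int height) {
    for (int row = 0; row < height; ++row) {
        std::memcpy(dst.data + static_cast<std::ptrdiff_t>(row) * dst.stride,
                    src.data + static_cast<std::ptrdiff_t>(row) * src.stride,
                    static_cast<std::size_t>(width));
    }
}

// Average each vertical pair of chroma rows (with rounding) into one output row.
static void halveChromaRows(ConstPlane src, Plane dst, int chromaWidth, int srcHeight) {
    const int dstHeight = (srcHeight + 1) / 2;
    for (int row = 0; row < dstHeight; ++row) {
        const std::uint8_t* top = src.data + static_cast<std::ptrdiff_t>(2 * row) * src.stride;
        // For an odd source height, the last output row reuses its single source row.
        const std::uint8_t* bottom = (2 * row + 1 < srcHeight) ? top + src.stride : top;
        std::uint8_t* out = dst.data + static_cast<std::ptrdiff_t>(row) * dst.stride;
        for (int col = 0; col < chromaWidth; ++col) {
            out[col] = static_cast<std::uint8_t>((top[col] + bottom[col] + 1) / 2);
        }
    }
}

// Convert planar 4:2:2 to planar 4:2:0; planes are ordered Y, U, V.
void convert422pTo420p(const ConstPlane src[3], const Plane dst[3], int width, int height) {
    assert(width > 0 && height > 0);
    const int chromaWidth = (width + 1) / 2;
    copyPlane(src[0], dst[0], width, height);
    halveChromaRows(src[1], dst[1], chromaWidth, height);
    halveChromaRows(src[2], dst[2], chromaWidth, height);
}
```

In the code above, the source planes would correspond to `_outputFrame->data` with `_outputFrame->linesize`, and the destination planes to `_pic_in.img.plane` with `_pic_in.img.i_stride`.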

Licensed under: CC-BY-SA with attribution
