How can arbitrary frames be discarded from an MP4 video using Media Foundation?

https://stackoverflow.com/questions/22204623

09-06-2023
|

Question

I am currently trying to implement an algorithm which can quickly discard unwanted frames from an MP4 video when encoding to another MP4 (using Media Foundation).

The encoding part doesn't seem to bad - the "Source Reader plus Sink Writer" approach is nice and fast. You basically just have to create an IMFSourceReader and an IMFSinkWriter, set the source native media type on the writer, yada, yada, yada, and just loop: source.ReadSample(&sample) --> writer.WriteSample(&sample). The WriteSample() calls can be conditioned on whether they're "! 2 b discarded".

That naive approach is no good if you consider that the samples read will be "predicted-frames", a.k.a., P-frames in the H.264 encoded streams. Dropping any preceding "intra-coded picture frame" (I-frame or key-frame) before that will result in garbled video.

So, my question is, is it possible to introduce an I-frame (somehow) into the sink writer before resuming the sample writing in the sink writer?

Doing something with the MFSampleExtension_CleanPoint attribute doesn't seem to help. I could manually create an IMFSample (via MFCreateSample), but getting it in the right H.264 format might be tricky.

Any ideas? Or thoughts on other approaches to dropping frames during encoding?

Solution

I think that this is not possible without reenconding the video! The Reference between P and I Frames are in the h264 Bitstream and not in the container (MP4). You can only safely skip frames, which are not referenced from other frames:

last P-Frames of a GOP (before the next I-Frame)
B-Frames

Normaly these Frames are not referrenced, but they can be! This dependes on the encoder-settings used to create the h264 stream

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow