Question

For video, the marker bit of an RTP packet usually indicates the last packet of a frame. Given that, is it guaranteed that I will receive one frame per packet, or can a packet contain more than one?

In that case, beyond depacketization, would I have to write a parser to separate the H.264 frames?

If a single RTP packet can carry more than one frame, is it possible to get a piece of the next frame? Or are all frames within the RTP packet complete, even when there is more than one?

Best regards,


Solution

RFC 6184, "RTP Payload Format for H.264 Video", answers these questions. Both cases occur: one RTP packet can carry two or more NAL units (aggregation), and one NAL unit can be fragmented over two or more RTP packets.

See quotes below:

5.7.1. Single-Time Aggregation Packet (STAP)

A single-time aggregation packet (STAP) SHOULD be used whenever NAL units are aggregated that all share the same NALU-time.

and

5.8. Fragmentation Units (FUs)

This payload type allows fragmenting a NAL unit into several RTP packets. Doing so on the application layer instead of relying on lower-layer fragmentation (e.g., by IP) has the following advantages:
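To make the two cases concrete, here is a minimal depacketizer sketch in Python. It is not taken from the RFC or from any library: the class name `H264Depacketizer` and its `push` method are made up for illustration. It assumes the RTP header has already been stripped, that packets arrive in order with no loss, and that only single NAL unit packets (types 1-23), STAP-A (type 24) and FU-A (type 28) are in use. A real receiver would also check RTP sequence numbers and drop incomplete fragments.

```python
from typing import Iterator, List, Optional

STAP_A = 24  # aggregation: several whole NAL units in one RTP payload
FU_A = 28    # fragmentation: one NAL unit spread over several payloads


class H264Depacketizer:
    """Hypothetical helper: turns RTP payloads into whole NAL units."""

    def __init__(self) -> None:
        self._fragments: Optional[bytearray] = None  # FU-A being reassembled

    def push(self, payload: bytes) -> List[bytes]:
        """Feed one RTP payload; return the NAL units completed by it."""
        if not payload:
            return []
        nal_type = payload[0] & 0x1F  # low 5 bits of the first byte
        if 1 <= nal_type <= 23:
            return [payload]  # single NAL unit packet
        if nal_type == STAP_A:
            return list(self._split_stap_a(payload))
        if nal_type == FU_A:
            return self._push_fu_a(payload)
        return []  # STAP-B, MTAP and FU-B are not handled in this sketch

    @staticmethod
    def _split_stap_a(payload: bytes) -> Iterator[bytes]:
        offset = 1  # skip the STAP-A header byte
        while offset + 2 <= len(payload):
            size = int.from_bytes(payload[offset:offset + 2], "big")
            offset += 2
            if size == 0 or offset + size > len(payload):
                break  # malformed packet, stop parsing
            yield payload[offset:offset + size]
            offset += size

    def _push_fu_a(self, payload: bytes) -> List[bytes]:
        if len(payload) < 2:
            return []
        indicator, header = payload[0], payload[1]
        start = bool(header & 0x80)
        end = bool(header & 0x40)
        if start:
            # Rebuild the NAL header: F and NRI bits come from the FU
            # indicator, the NAL type from the FU header.
            nal_header = (indicator & 0xE0) | (header & 0x1F)
            self._fragments = bytearray([nal_header])
        elif self._fragments is None:
            return []  # start fragment was never seen, ignore
        self._fragments += payload[2:]
        if end:
            nal = bytes(self._fragments)
            self._fragments = None
            return [nal]
        return []
```

Each call to `push` returns zero or more complete NAL units, so no separate bitstream parser is needed to find NAL unit boundaries. Note that the returned units carry no Annex B start codes; whether to prepend them depends on what the decoder expects. Grouping NAL units into frames can then rely on the RTP timestamp and the marker bit.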

Licensed under: CC-BY-SA with attribution