I am trying to learn enough about H.264, RTP, RTSP and encapsulation file formats to develop a video recording application.
Specifically, what should I read to understand the problem?
I want to be able to answer the following questions:
I want to be able to answer these questions at a fairly low level, so that I can implement software that performs some of these processes (capturing RTP streams, rebroadcasting joined MP4s).
Background
The goal is to record video from a network camera onto disk. The camera has an RTSP server that provides an H.264 encoded stream which it sends via RTP to a player. I have successfully played the stream using VLC, but would like to customize the process.
The "raw" video stream is a sequence of NAL units, per the H.264 specification. Neither over RTSP nor in an MP4 file do you get this stream "as is".
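To make "a sequence of NAL units" a bit more concrete: every NAL unit starts with a one-byte header that tells you what kind of unit it is. Here is a minimal sketch of decoding that byte. The names `NalHeader` and `parse_nal_header` are made up for illustration and do not come from any library.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    forbidden_zero_bit: int  # 1 bit, must be 0 in a valid stream
    nal_ref_idc: int         # 2 bits, 0 means not used for reference
    nal_unit_type: int       # 5 bits, e.g. 1 = non-IDR slice, 5 = IDR slice, 7 = SPS, 8 = PPS


def parse_nal_header(first_byte: int) -> NalHeader:
    """Split the first byte of an H.264 NAL unit into its three fields."""
    return NalHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x01,
        nal_ref_idc=(first_byte >> 5) & 0x03,
        nal_unit_type=first_byte & 0x1F,
    )


if __name__ == "__main__":
    # 0x67 is a typical first byte of a sequence parameter set (SPS).
    print(parse_nal_header(0x67))
```

The `nal_unit_type` field is what a recorder would look at to find the SPS and PPS units and the IDR frames where a recording can safely start.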
Over an RTSP connection you typically receive NAL units fragmented across RTP packets, and you need to depacketize them (no, you cannot simply concatenate the payloads). The RTP payload format for H.264 is defined in RFC 6184.
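As a rough illustration of what depacketizing involves, here is a minimal sketch that handles the three most common payload layouts of the H.264 RTP payload format: single NAL unit packets (types 1 to 23), STAP-A aggregation packets (type 24) and FU-A fragmentation units (type 28). It assumes the RTP header has already been stripped and that packets arrive in order without loss; a real receiver would also have to check RTP sequence numbers. The class name `H264Depacketizer` and its `push` method are hypothetical.

```python
from typing import List, Optional


class H264Depacketizer:
    """Reassembles NAL units from RTP payloads (RTP header already removed)."""

    def __init__(self) -> None:
        self._fragment: Optional[bytearray] = None  # FU-A reassembly buffer

    def push(self, payload: bytes) -> List[bytes]:
        """Feed one RTP payload, get back zero or more complete NAL units."""
        if not payload:
            return []
        nal_type = payload[0] & 0x1F
        if 1 <= nal_type <= 23:   # single NAL unit packet
            self._fragment = None
            return [bytes(payload)]
        if nal_type == 24:        # STAP-A: several NAL units in one packet
            return self._split_stap_a(payload)
        if nal_type == 28:        # FU-A: one NAL unit spread over several packets
            return self._push_fu_a(payload)
        return []                 # other packet types are ignored in this sketch

    @staticmethod
    def _split_stap_a(payload: bytes) -> List[bytes]:
        nals = []
        offset = 1                # skip the STAP-A header byte
        while offset + 2 <= len(payload):
            size = int.from_bytes(payload[offset:offset + 2], "big")
            offset += 2
            if size == 0 or offset + size > len(payload):
                break             # malformed packet, stop parsing
            nals.append(bytes(payload[offset:offset + size]))
            offset += size
        return nals

    def _push_fu_a(self, payload: bytes) -> List[bytes]:
        if len(payload) < 2:
            return []
        indicator, fu_header = payload[0], payload[1]
        if fu_header & 0x80:      # start bit: rebuild the original NAL header
            nal_header = (indicator & 0xE0) | (fu_header & 0x1F)
            self._fragment = bytearray([nal_header])
        elif self._fragment is None:
            return []             # start fragment was missed, drop this one
        self._fragment += payload[2:]
        if fu_header & 0x40:      # end bit: the NAL unit is complete
            nal = bytes(self._fragment)
            self._fragment = None
            return [nal]
        return []
```

The FU-A branch shows why plain concatenation fails: each fragment carries two extra header bytes, and the original NAL header byte has to be rebuilt from the FU indicator and the FU header before the fragment data is appended.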
An MP4 file is a container format with its own structure (boxes), so you cannot simply write NAL units into such a file; you have to do what is called multiplexing.
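To give a feel for the box structure, here is a minimal sketch of the two lowest-level pieces of multiplexing: framing a payload as a box (32-bit big-endian size, four-character type, then the payload) and storing NAL units with a length prefix instead of a start code. It assumes 4-byte length prefixes, which is the common choice but is actually declared in the file's `avcC` box. The helper names `make_box` and `nals_to_sample` are made up for illustration. This is not a complete muxer: a playable file also needs `ftyp` and `moov` boxes with sample tables and the SPS/PPS.

```python
import struct
from typing import Iterable


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Frame a payload as a box: 4-byte size (header included), 4-byte type, data."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly four bytes")
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


def nals_to_sample(nals: Iterable[bytes]) -> bytes:
    """Join the NAL units of one frame, each preceded by its 4-byte length."""
    return b"".join(struct.pack(">I", len(nal)) + nal for nal in nals)


if __name__ == "__main__":
    # One frame made of a single (fake) IDR slice, wrapped in a media data box.
    sample = nals_to_sample([b"\x65\x01\x02"])
    mdat = make_box(b"mdat", sample)
    print(mdat.hex())
```

The sizes and offsets of the samples inside `mdat` are what the tables in `moov` have to describe, which is why the NAL units cannot just be appended to the file.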