How can I play H264 RTSP stream data in VLC after dumping it to a file with Wireshark?

Backstory

I have an IP camera on my LAN. I connected to it with VLC (CTRL + N, then entering rtsp://192.168.1.10, the camera's configured IP address), and once VLC was connected I started sniffing the traffic with Wireshark. After about 30 seconds I clicked "Follow TCP stream" in Wireshark and dumped the raw traffic sent from the camera to my computer.

My goal

I want to take the RTSP stream dumped from Wireshark (or another network sniffer), extract the H264 data, and play it afterwards in VLC.

What I don't want to do

  1. I don't want to dump the stream using VLC
  2. I don't want to dump the stream using FFmpeg

Things I have tried

So I've already managed to parse the RTSP and RTP headers, and understand where the H264 payload starts, for example:

00000000     24 00 05 A0 80 60 35 85 14 0C 1F DE 43 13 B7 58
00000010     7C 81 9A 26 26 64 33 FF 7C 11 99 87 4B 15 FA 06 
00000020     06 47 12 C5 6E 39 56 56 82 0D E0 1F 8D 1B 8A A7

At offset 0x00000000 there is the 4-byte RTSP interleaved frame header 24 00 05 A0 (the '$' magic byte, the channel, and a 16-bit length). Unpacking it in Python:

import struct

# ">BBH" consumes exactly 4 bytes, so slice the header out of the dump first
magic, channel, length = struct.unpack(">BBH", data[:4])
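As an illustration of how the whole dump could be walked, here is a minimal Python 3 sketch (the snippets in this question are Python 2). The function name iter_interleaved_frames is hypothetical, and the sketch assumes the buffer starts exactly at a '$' frame and contains no RTSP text responses in between; it simply stops at the first byte that is not '$'.

```python
import struct

def iter_interleaved_frames(buf):
    """Yield (channel, payload) for each '$'-framed block in buf."""
    offset = 0
    while offset + 4 <= len(buf):
        magic, channel, length = struct.unpack_from(">BBH", buf, offset)
        if magic != 0x24:
            # Assumption: anything that is not '$' (e.g. RTSP text) ends the walk
            break
        payload = buf[offset + 4 : offset + 4 + length]
        if len(payload) < length:
            # The capture was cut off in the middle of a frame
            break
        yield channel, payload
        offset += 4 + length

# The four header bytes from the hex dump above
magic, channel, length = struct.unpack(">BBH", bytes.fromhex("240005A0"))
```

For the header in the dump, this gives the magic byte 0x24, channel 0, and a length of 0x05A0 (1440) bytes for the RTP packet that follows.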

After that, between offsets 0x00000004 and 0x00000010, I have the 12-byte RTP header, which I parse as:

# ">BBHII" consumes exactly 12 bytes: offsets 0x04 up to (not including) 0x10
bits_header_a, bits_header_b, sequence_number, timestamp, ssrc_identifier = struct.unpack(">BBHII", data[4:16])

and from the offset 0x00000010 I have the actual payload.
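To make the header fields explicit, here is a small Python 3 sketch that splits the two bit-field bytes and computes the payload offset. The name parse_rtp and the dictionary keys are hypothetical. The fixed 12-byte offset only holds when the CSRC count is 0 and the extension bit is clear, which is the case in the dump above but is an assumption for other packets, so the sketch handles both.

```python
import struct

def parse_rtp(packet):
    """Split one RTP packet into a header dict and its payload bytes."""
    b0, b1, seq, timestamp, ssrc = struct.unpack_from(">BBHII", packet, 0)
    # Each CSRC entry adds 4 bytes after the fixed 12-byte header
    offset = 12 + 4 * (b0 & 0x0F)
    if b0 & 0x10:
        # X bit: a header extension (16-bit id, 16-bit length in words) follows
        _profile, ext_words = struct.unpack_from(">HH", packet, offset)
        offset += 4 + 4 * ext_words
    end = len(packet)
    if b0 & 0x20:
        # P bit: the last byte holds the number of padding bytes
        end -= packet[-1]
    header = {
        "version": b0 >> 6,
        "marker": b1 >> 7,
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
    return header, packet[offset:end]

# RTP header from the dump plus the first 4 payload bytes (packet truncated)
sample = bytes.fromhex("80603585140C1FDE4313B758" "7C819A26")
header, payload = parse_rtp(sample)
```

On the sample bytes this yields version 2, payload type 96 and sequence number 0x3585, and the payload begins with 7C 81: an FU-A indicator (type 28) followed by an FU header with the start bit set and NAL type 1.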

I looked at the RFC and at the great answer given on this question, but I haven't quite managed to understand what I'm doing wrong when reconstructing the H264 data from the actual RTP payload. Here is part of a larger piece of Python code that I run on every payload (frame) I previously parsed, to create the data frames:

first_byte = "{0:>08s}".format(bin(ord(frame[0]))[2:])
second_byte = "{0:>08s}".format(bin(ord(frame[1]))[2:])
rest_of_data = frame[2:]


# First byte
nal_unit_a = int(first_byte[:3], 2)
fragment_type = int(first_byte[3:], 2)

# Second byte
start_bit = int(second_byte[0], 2)
end_bit = int(second_byte[1], 2)
nal_unit_b = int(second_byte[3:], 2)


# Video data
if 0x1C == fragment_type:

    # Middle frame, just add
    if 0 == start_bit and 0 == end_bit:
        total_data += rest_of_data
        continue

    # First frame in sequence
    elif 1 == start_bit:
        print " [*] Found start frame"
        nal_unit = chr((nal_unit_a << 5) | nal_unit_b)
        if ord(nal_unit) != ( (ord(frame[0]) & 0xe0)  |  (ord(frame[1]) & 0x1F)):
            print "Unexpected"
        total_data += "\x00\x00\x00\x01"
        total_data += nal_unit
        total_data += rest_of_data
        continue

    # End
    elif 0 == start_bit and 1 == end_bit:
        print " [*] Found end frame"
        total_data += rest_of_data

I'm reconstructing the NALU as I think I should, and I run this code on every payload, but the output is not playable in VLC (even after setting the VLC demux module to force H264 parsing). I'm pretty sure I am missing something: I have not handled anything related to timestamps or to the H264 stream itself, so the problem may lie there. The only (nal_unit_b, nal_unit_a) pairs I get are (5, 3) and (1, 3), which reconstruct to the NAL unit header bytes 0x65 and 0x61.
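For comparison, here is a hedged Python 3 sketch of the same reassembly written with bit masks, following the packet types in RFC 6184 (single NAL unit, STAP-A, FU-A). The name depacketize is hypothetical. It assumes the payloads arrive in sequence order with no loss, and it ignores timestamps and the other aggregation and fragmentation types. Note that a decoder also needs SPS and PPS NAL units (types 7 and 8); if the camera never sends them in-band, a stream made only of types 1 and 5 may still not play.

```python
START_CODE = b"\x00\x00\x00\x01"

def depacketize(payloads):
    """Turn in-order RTP H.264 payloads into an Annex B byte stream."""
    out = bytearray()
    for p in payloads:
        if len(p) < 2:
            continue  # too short to carry a NAL unit
        nal_type = p[0] & 0x1F
        if 1 <= nal_type <= 23:
            # Single NAL unit packet: the payload already is the NAL unit
            out += START_CODE + p
        elif nal_type == 24:
            # STAP-A: repeated (16-bit size, NAL unit) pairs after 1 byte
            i = 1
            while i + 2 <= len(p):
                size = int.from_bytes(p[i : i + 2], "big")
                out += START_CODE + p[i + 2 : i + 2 + size]
                i += 2 + size
        elif nal_type == 28:
            # FU-A: byte 0 is the FU indicator, byte 1 is the FU header
            fu_header = p[1]
            if fu_header & 0x80:
                # Start bit: rebuild the original NAL header (F/NRI + type)
                out += START_CODE + bytes([(p[0] & 0xE0) | (fu_header & 0x1F)])
            out += p[2:]
    return bytes(out)

# Three FU-A fragments (start, middle, end) with made-up body bytes
fragments = [b"\x7c\x81\xaa", b"\x7c\x01\xbb", b"\x7c\x41\xcc"]
annex_b = depacketize(fragments)
```

With these three fragments the result is one start code followed by the header byte 0x61 and the bodies AA BB CC, which matches the fragment handling in the snippet above.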

My questions

  1. I know that at the beginning of the RTSP session some data (the SDP) is sent that describes how to parse the stream. But I'm pretty sure all of that data should also exist in the H264 payload, so the payload should be playable even without the SDP. Alternatively, I can always assume how the data is encoded if I know for certain which IP camera vendor I am communicating with. Is that correct?
  2. Is my Python code snippet above correct (as far as the parsing goes)?
  3. Why do I only receive video data packets (fragment_type = 0x1C) in my data stream and nothing else?
  4. When I simply connect to the IP camera using VLC and click Tools-->Codec Information, the codec shown is: Codec: H264 - MPEG-4 AVC (part 10) (h264). I assumed the stream should be parsed as H264. I also tried parsing it as MPEG, but that failed. Was I correct to parse it as H264?
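Regarding question 1, RFC 6184 defines an optional sprop-parameter-sets parameter in the SDP fmtp line that carries the SPS and PPS NAL units as comma-separated base64 strings. Whether this camera uses it or sends the parameter sets in-band is not known here. The Python 3 sketch below shows how such a line could be turned into Annex B bytes to prepend to the dumped stream; the function name parameter_sets_from_fmtp is hypothetical, and the fmtp line is built from placeholder bytes, not from real parameter sets of any camera.

```python
import base64

START_CODE = b"\x00\x00\x00\x01"

def parameter_sets_from_fmtp(fmtp_line):
    """Return Annex B bytes for the NAL units in sprop-parameter-sets."""
    if fmtp_line.startswith("a=fmtp:"):
        # Drop the "a=fmtp:<payload type>" prefix, keep the parameters
        fmtp_line = fmtp_line.partition(" ")[2]
    out = b""
    for param in fmtp_line.split(";"):
        # partition splits on the first "=", so base64 padding stays intact
        key, _, value = param.strip().partition("=")
        if key == "sprop-parameter-sets":
            for encoded in value.split(","):
                out += START_CODE + base64.b64decode(encoded)
    return out

# Placeholder bytes only: the first byte carries NAL type 7 (SPS) or 8 (PPS)
fake_sps = bytes([0x67, 0x42, 0x00, 0x1E])
fake_pps = bytes([0x68, 0xCE, 0x3C, 0x80])
line = (
    "a=fmtp:96 packetization-mode=1; sprop-parameter-sets="
    + base64.b64encode(fake_sps).decode("ascii")
    + ","
    + base64.b64encode(fake_pps).decode("ascii")
)
prefix = parameter_sets_from_fmtp(line)
```

If the SDP from the capture contains such a parameter, writing the returned bytes before the reassembled NAL units is one way to give the decoder its parameter sets.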
python
wireshark
h.264
rtsp
ip-camera
asked on Stack Overflow Oct 17, 2018 by Orlox

0 Answers

Nobody has answered this question yet.


User contributions licensed under CC BY-SA 3.0