How to process raw UDP packets so that they can be decoded by a decoder filter in a DirectShow source filter


With the UDP packets you receive pieces of an H.264 stream, which you are expected to depacketize into H.264 NAL units, which in turn you push into the DirectShow pipeline from your filter.

The NAL units will be formatted as DirectShow media samples, and possibly also as part of the media type (SPS/PPS NAL units).

The depacketization steps are described in RFC 6184 - RTP Payload Format for H.264 Video. This is the payload part of RTP traffic, which is defined by RFC 3550 - RTP: A Transport Protocol for Real-Time Applications.

Clear, but not exactly short.

Long story:

There is an H264 / MPEG-4 source.
I can connect to this source with the RTSP protocol.
I can get raw UDP packets with the RTP protocol.
Then I send those raw UDP packets to a Decoder [h264 / mpeg-4] [DS Source Filter].
But those "raw" UDP packets cannot be decoded by the Decoder [h264 / mpeg-4] filter.

In short:

How do I process the raw UDP data so that it can be decoded by the H264 / MPEG-4 decoder filter? Can anyone clearly identify the steps I have to perform on the H264 / MPEG stream?

Extra info:

I can do this with FFmpeg... but I really cannot figure out how FFmpeg processes the raw data so that it can be decoded by a decoder.

Piece of cake!

1. Get the data

As far as I can see, you already know how to do that (start the RTSP session, SETUP an RTP/AVP/UDP;unicast; transport, and receive user datagrams)... but if in doubt, ask.

Regardless of the transport (UDP or TCP), the data format is basically the same:

RTP data: [RTP Header - 12bytes][Video data]
UDP: [RTP Data]
TCP: [$ - 1byte][Transport Channel - 1byte][RTP data length - 2bytes][RTP data]

So, to get the data from UDP, you only need to strip the first 12 bytes, which are the RTP header. But be careful: you need that header to get the video timing information and, for MPEG4, the packetization information!
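As a minimal sketch of the step above, the following Python function reads the fields of the RTP header before discarding it. The names `RtpHeader` and `parse_rtp_header` are hypothetical, chosen for illustration. The sketch assumes the RFC 3550 layout; note that the header is exactly 12 bytes only when the packet carries no CSRC entries and no header extension, so both are skipped here, while padding is not handled.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool          # MARKER_BIT, used later for depacketization
    payload_type: int
    sequence_number: int  # lets you detect lost or reordered packets
    timestamp: int        # video timing information
    ssrc: int
    payload_offset: int   # where the video data starts


def parse_rtp_header(datagram: bytes) -> RtpHeader:
    """Parse the fixed RTP header and locate the start of the payload."""
    if len(datagram) < 12:
        raise ValueError("datagram is shorter than the 12-byte RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", datagram[:12])
    offset = 12 + 4 * (b0 & 0x0F)          # skip optional CSRC entries
    if b0 & 0x10:                          # header extension present
        if len(datagram) < offset + 4:
            raise ValueError("truncated RTP header extension")
        (ext_words,) = struct.unpack("!H", datagram[offset + 2:offset + 4])
        offset += 4 + 4 * ext_words
    return RtpHeader(b0 >> 6, bool(b1 & 0x80), b1 & 0x7F, seq, ts, ssrc, offset)
```

The video data of a datagram is then `datagram[header.payload_offset:]`, and `header.marker` and `header.timestamp` remain available for the later steps.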

For TCP, read byte by byte until you get the $ byte. Then read the next byte, which is the transport channel the following data belongs to (when the server answers the SETUP request with Transport: RTP/AVP/TCP;unicast;interleaved=0-1, it means that VIDEO DATA will have TRANSPORT_CHANNEL = 0 and VIDEO RTCP DATA will have TRANSPORT_CHANNEL = 1). You want the VIDEO DATA, so expect 0... Then read one short (2 bytes) that gives the length of the RTP data that follows, read that many bytes, and from there do the same as for UDP.
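The TCP case can be sketched in Python as follows. The function name `split_interleaved` is hypothetical, and the sketch assumes the buffer holds only interleaved frames; a real client also has to cope with RTSP text responses arriving on the same socket, which this does not attempt.

```python
from typing import List, Tuple


def split_interleaved(buffer: bytes) -> Tuple[List[Tuple[int, bytes]], bytes]:
    """Split a TCP receive buffer into (channel, rtp_data) frames.

    Returns the complete frames plus the unconsumed tail, which the caller
    should prepend to the next chunk read from the socket.
    """
    frames = []
    pos = 0
    while True:
        start = buffer.find(b"$", pos)     # look for the '$' marker byte
        if start < 0:
            return frames, b""
        if len(buffer) - start < 4:        # header itself is incomplete
            return frames, buffer[start:]
        channel = buffer[start + 1]
        length = int.from_bytes(buffer[start + 2:start + 4], "big")
        end = start + 4 + length
        if end > len(buffer):              # RTP data not fully received yet
            return frames, buffer[start:]
        frames.append((channel, buffer[start + 4:end]))
        pos = end
```

With interleaved=0-1, the video RTP packets are the frames whose channel is 0; everything on channel 1 is RTCP and can be handled separately.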

2. Depacketize the data

H264 and MPEG4 data are usually packetized (in the SDP there is a packetization-mode parameter that can take the values 0, 1 and 2; what each of them means, and how to depacketize it, you can see HERE) because there is a network limit on how much an endpoint can send in one piece over TCP or UDP, called the MTU. It is usually 1500 bytes or less. So if a video frame is larger than that (and it usually is), it must be fragmented (packetized) into MTU-sized fragments. This can be done by the encoder/streamer on both TCP and UDP transport, or you can rely on IP to fragment and reassemble the video frame on the other side... the former is much better if you want smooth, error-tolerant video over UDP and TCP.
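To find out which packetization mode a stream uses, you can read it from the a=fmtp line of the SDP. The sketch below is only illustrative: `parse_fmtp` is a hypothetical helper, and the sample line is made up. It relies on the RFC 6184 rule that a missing packetization-mode means mode 0.

```python
from typing import Dict


def parse_fmtp(line: str) -> Dict[str, str]:
    """Parse 'a=fmtp:<pt> key=value; key=value' into a dictionary."""
    _, _, params = line.partition(" ")     # drop the 'a=fmtp:<pt>' prefix
    result = {}
    for item in params.split(";"):
        # Split on the first '=' only, so base64 padding in values survives.
        key, sep, value = item.strip().partition("=")
        if sep:
            result[key] = value
    return result


def packetization_mode(fmtp: Dict[str, str]) -> int:
    """Return the packetization mode, defaulting to 0 when it is absent."""
    return int(fmtp.get("packetization-mode", "0"))
```

For example, `packetization_mode(parse_fmtp("a=fmtp:96 packetization-mode=1; profile-level-id=42e01f"))` gives 1, the non-interleaved mode in which fragmented frames appear.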

H264: To check whether the RTP data (which arrived over UDP or interleaved over TCP) contains a fragment of a larger H264 video frame, you must know what a fragment looks like when it is packetized:

H264 FRAGMENT

First byte: [ 3 NAL UNIT BITS | 5 FRAGMENT TYPE BITS ]
Second byte: [ START BIT | END BIT | RESERVED BIT | 5 NAL UNIT BITS ]
Other bytes: [ ... VIDEO FRAGMENT DATA ... ]

Now, get the first VIDEO DATA into a byte array called Data and extract the following information:

int fragment_type = Data[0] & 0x1F;
int nal_type = Data[1] & 0x1F;
int start_bit = Data[1] & 0x80;
int end_bit = Data[1] & 0x40;

If fragment_type == 28, the video data that follows is a fragment of a video frame. The next check is whether start_bit is set; if it is, this fragment is the first in a sequence. You use it to reconstruct the NAL byte of the original unit (an IDR slice, for example) by taking the first 3 bits of the first payload byte (3 NAL UNIT BITS) and combining them with the last 5 bits of the second payload byte (5 NAL UNIT BITS), which gives a byte like this: [3 NAL UNIT BITS | 5 NAL UNIT BITS]. Then write that NAL byte first into an empty buffer, followed by the VIDEO FRAGMENT DATA of that fragment.

If start_bit and end_bit are both 0, just write the VIDEO FRAGMENT DATA (skipping the first two payload bytes, which identify the fragment) into the buffer.

If start_bit is 0 and end_bit is 1, this is the last fragment: write its VIDEO FRAGMENT DATA (again skipping the first two bytes that identify the fragment) into the buffer, and you now have your reconstructed video frame!

Keep in mind that RTP data has the RTP header in the first 12 bytes, that if the frame is fragmented you never write the first two payload bytes into the defragmentation buffer, and that you need to reconstruct the NAL byte and write it first. If you mess something up here, the picture will be partial (half of it will be gray or black, or you will see artifacts).
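The fragment handling described above can be sketched in Python like this. The class name `FuAReassembler` is hypothetical; the sketch takes RTP payloads (header already stripped), assumes packets arrive in order, and does not use sequence numbers to detect loss, which a real receiver should do.

```python
from typing import Optional


class FuAReassembler:
    """Rebuild NAL units from type-28 fragments, one RTP payload at a time."""

    def __init__(self) -> None:
        self._buffer = bytearray()
        self._active = False   # True once a start fragment has been seen

    def push(self, payload: bytes) -> Optional[bytes]:
        """Feed one payload; return a complete NAL unit when one is ready."""
        if len(payload) < 2:
            return None
        fragment_type = payload[0] & 0x1F
        if fragment_type != 28:
            # Not a fragment: types 1-23 are complete NAL units already.
            self._active = False
            return bytes(payload) if 1 <= fragment_type <= 23 else None
        start_bit = payload[1] & 0x80
        end_bit = payload[1] & 0x40
        if start_bit:
            # Rebuild the NAL byte: top 3 bits of byte 0, low 5 bits of byte 1.
            nal_byte = (payload[0] & 0xE0) | (payload[1] & 0x1F)
            self._buffer = bytearray([nal_byte])
            self._active = True
        elif not self._active:
            return None        # start fragment was missed; drop this one
        self._buffer += payload[2:]   # never copy the two fragment bytes
        if end_bit:
            self._active = False
            return bytes(self._buffer)
        return None
```

Feeding the three fragments `7C 85 ..`, `7C 05 ..` and `7C 45 ..` in that order returns one NAL unit that begins with the rebuilt byte `65` (an IDR slice) on the third call.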

MPEG4: This one is easy. You need to check the MARKER_BIT in the RTP header. That bit is set (1) if the video data represents a whole video frame, and it is 0 if the video data is a fragment of a video frame. So, to depacketize, you need to look at the MARKER_BIT. If it is 1, that is all: just read the video data bytes.

WHOLE FRAME:

[MARKER = 1]

PACKETIZED FRAME:

[MARKER = 0], [MARKER = 0], [MARKER = 0], [MARKER = 1]

The first packet with MARKER_BIT=0 is the first fragment of a video frame; all the packets that follow, up to and including the first one with MARKER_BIT=1, are fragments of the same video frame. So what you need to do is:

While MARKER_BIT=0, put the VIDEO DATA into the depacketization buffer
Put the next VIDEO DATA, where MARKER_BIT=1, into the same buffer
The depacketization buffer now contains one whole MPEG4 frame
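The three steps above amount to a small accumulator. Here is a minimal Python sketch; `assemble_frames` is a hypothetical name, and the input is assumed to be in-order (marker, video data) pairs taken from packets whose RTP header has already been parsed.

```python
from typing import Iterable, Iterator, Tuple


def assemble_frames(packets: Iterable[Tuple[bool, bytes]]) -> Iterator[bytes]:
    """Yield whole MPEG4 frames from (marker_bit, video_data) pairs."""
    buffer = bytearray()
    for marker, video_data in packets:
        buffer += video_data       # every packet contributes to the frame
        if marker:                 # MARKER_BIT=1 closes the current frame
            yield bytes(buffer)
            buffer = bytearray()
```

A frame sent as a single packet has MARKER_BIT=1 right away, so it is yielded unchanged; a fragmented frame is yielded once its last fragment arrives.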

3. Process the data for the decoder (NAL byte stream)

When you have depacketized video frames, you need to build a NAL byte stream. It has the following format:

H264: 0x000001[SPS], 0x000001[PPS], 0x000001[VIDEO FRAME], 0x000001...
MPEG4: 0x000001[Visual Object Sequence Start], 0x000001[VIDEO FRAME]

RULES:

Every frame MUST be prefixed with the 3-byte code 0x000001, regardless of the codec
Every stream MUST start with CONFIGURATION INFO: for H264 these are the SPS and PPS frames, in that order (sprop-parameter-sets in the SDP), and for MPEG4 the VOS frame (the config parameter in the SDP)

So you need to build a configuration buffer for H264 and MPEG4 prefixed with the 3 bytes 0x000001, send it first, and then prefix every depacketized video frame with the same 3 bytes and send it to the decoder.
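For H264, the rules above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that sprop-parameter-sets is a comma-separated list of base64-encoded NAL units listed with the SPS first, as it appears in the SDP; the MPEG4 config parameter is not covered here.

```python
import base64

START_CODE = b"\x00\x00\x01"


def config_from_sprop(sprop_parameter_sets: str) -> bytes:
    """Build the configuration buffer: start code + each parameter set."""
    out = bytearray()
    for item in sprop_parameter_sets.split(","):
        out += START_CODE + base64.b64decode(item)
    return bytes(out)


def to_byte_stream(frame: bytes) -> bytes:
    """Prefix one depacketized frame (NAL unit) with the start code."""
    return START_CODE + frame
```

You would send `config_from_sprop(...)` to the decoder once at the start, then `to_byte_stream(frame)` for every depacketized frame.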

If you need any clarification, just comment... :)

I have an implementation of this @ https://net7mma.codeplex.com/

Here is the relevant code

using System;
using System.IO;
using System.Linq;

/// <summary>
/// Implements Packetization and Depacketization of packets defined in <see href="https://tools.ietf.org/html/rfc6184">RFC6184</see>.
/// </summary>
public class RFC6184Frame : Rtp.RtpFrame
{
    /// <summary>
    /// Start code written before every nal unit in the byte stream
    /// </summary>
    static byte[] NalStart = { 0x00, 0x00, 0x01 };

    public RFC6184Frame(byte payloadType) : base(payloadType) { }

    public RFC6184Frame(Rtp.RtpFrame existing) : base(existing) { }

    public RFC6184Frame(RFC6184Frame f) : this((Rtp.RtpFrame)f) { Buffer = f.Buffer; }

    public System.IO.MemoryStream Buffer { get; set; }

    /// <summary>
    /// Creates any <see cref="Rtp.RtpPacket"/>'s required for the given nal
    /// </summary>
    /// <param name="nal">The nal</param>
    /// <param name="mtu">The mtu</param>
    public virtual void Packetize(byte[] nal, int mtu = 1500)
    {
        if (nal == null || nal.Length == 0 || mtu <= 2) return;

        int nalLength = nal.Length;

        if (nalLength >= mtu)
        {
            //FU indicator: F and NRI bits of the original nal header, type 28 (FU - A)
            byte fuIndicator = (byte)((nal[0] & 0xE0) | 28);

            //The nal unit type travels in the low 5 bits of the FU header
            byte nalType = (byte)(nal[0] & 0x1F);

            //The original nal header is not sent, the receiver reconstructs it
            int offset = 1;

            //Leave room for the FU indicator and the FU header
            int size = mtu - 2;

            while (offset < nalLength)
            {
                //Start bit only on the first fragment
                bool start = offset == 1;

                //Set the end bit (and the marker) if no more data remains
                bool end = offset + size >= nalLength;

                byte fuHeader = (byte)((start ? 0x80 : 0) | (end ? 0x40 : 0) | nalType);

                byte[] FUI = new byte[] { fuIndicator, fuHeader };

                //Add the packet
                Add(new Rtp.RtpPacket(2, false, false, end, PayloadTypeByte, 0, SynchronizationSourceIdentifier, HighestSequenceNumber + 1, 0, FUI.Concat(nal.Skip(offset).Take(size)).ToArray()));

                //Move the offset
                offset += size;
            }
        }
        //Should check for first byte to be 1 - 23?
        else Add(new Rtp.RtpPacket(2, false, false, true, PayloadTypeByte, 0, SynchronizationSourceIdentifier, HighestSequenceNumber + 1, 0, nal));
    }

    /// <summary>
    /// Creates <see cref="Buffer"/> with a H.264 RBSP from the contained packets
    /// </summary>
    public virtual void Depacketize()
    {
        bool sps, pps, sei, slice, idr;
        Depacketize(out sps, out pps, out sei, out slice, out idr);
    }

    /// <summary>
    /// Parses all contained packets and writes any contained Nal Units in the RBSP to <see cref="Buffer"/>.
    /// </summary>
    /// <param name="containsSps">Indicates if a Sequence Parameter Set was found</param>
    /// <param name="containsPps">Indicates if a Picture Parameter Set was found</param>
    /// <param name="containsSei">Indicates if Supplemental Enhancement Information was found</param>
    /// <param name="containsSlice">Indicates if a Slice was found</param>
    /// <param name="isIdr">Indicates if a IDR Slice was found</param>
    public virtual void Depacketize(out bool containsSps, out bool containsPps, out bool containsSei, out bool containsSlice, out bool isIdr)
    {
        containsSps = containsPps = containsSei = containsSlice = isIdr = false;

        DisposeBuffer();

        this.Buffer = new MemoryStream();

        //Get all packets in the frame
        foreach (Rtp.RtpPacket packet in m_Packets.Values.Distinct())
        {
            //ProcessPacket resets its out parameters, so accumulate the results per packet
            bool sps, pps, sei, slice, idr;

            ProcessPacket(packet, out sps, out pps, out sei, out slice, out idr);

            containsSps |= sps;
            containsPps |= pps;
            containsSei |= sei;
            containsSlice |= slice;
            isIdr |= idr;
        }

        //Order by DON?
        this.Buffer.Position = 0;
    }

    /// <summary>
    /// Records what kind of nal unit was found and writes its start code to <see cref="Buffer"/>.
    /// </summary>
    void WriteStartCode(int nalType, ref bool containsSps, ref bool containsPps, ref bool containsSei, ref bool containsSlice, ref bool isIdr)
    {
        switch (nalType)
        {
            case 1: containsSlice = true; break;
            case 5: isIdr = true; break;
            case 6: containsSei = true; break;
            case 7: containsSps = true; break; //7 is the Sequence Parameter Set
            case 8: containsPps = true; break; //8 is the Picture Parameter Set
        }

        //SEI, SPS and PPS get an extra leading zero byte (4 byte start code)
        if (nalType >= 6 && nalType <= 8) Buffer.WriteByte(0);

        //Write the start code
        Buffer.Write(NalStart, 0, 3);
    }

    /// <summary>
    /// Depacketizes a single packet.
    /// </summary>
    /// <param name="packet"></param>
    /// <param name="containsSps"></param>
    /// <param name="containsPps"></param>
    /// <param name="containsSei"></param>
    /// <param name="containsSlice"></param>
    /// <param name="isIdr"></param>
    internal protected virtual void ProcessPacket(Rtp.RtpPacket packet, out bool containsSps, out bool containsPps, out bool containsSei, out bool containsSlice, out bool isIdr)
    {
        containsSps = containsPps = containsSei = containsSlice = isIdr = false;

        //Starting at offset 0
        int offset = 0;

        //Obtain the data of the packet (without source list or padding)
        byte[] packetData = packet.Coefficients.ToArray();

        //Cache the length
        int count = packetData.Length;

        //Must have more than 2 bytes
        if (count <= 2) return;

        //Determine if the forbidden bit is set and the type of nal from the first byte
        byte firstByte = packetData[offset];

        //bool forbiddenZeroBit = ((firstByte & 0x80) >> 7) != 0;

        byte nalUnitType = (byte)(firstByte & Common.Binary.FiveBitMaxValue);

        //o The F bit MUST be cleared if all F bits of the aggregated NAL units are zero; otherwise, it MUST be set.
        //if (forbiddenZeroBit && nalUnitType <= 23 && nalUnitType > 29) throw new InvalidOperationException("Forbidden Zero Bit is Set.");

        //Determine what to do
        switch (nalUnitType)
        {
            //Reserved - Ignore
            case 0:
            case 30:
            case 31:
                {
                    return;
                }
            case 24: //STAP - A
            case 25: //STAP - B
            case 26: //MTAP - 16
            case 27: //MTAP - 24
                {
                    //Move to Nal Data
                    ++offset;

                    //Todo Determine if need to Order by DON first.
                    //EAT DON for ALL BUT STAP - A
                    if (nalUnitType != 24) offset += 2;

                    //Consume the rest of the data from the packet
                    while (offset + 2 <= count)
                    {
                        //Determine the nal unit size which does not include the size field itself
                        int tmp_nal_size = Common.Binary.Read16(packetData, offset, BitConverter.IsLittleEndian);

                        offset += 2;

                        //If the nal had data then write it
                        if (tmp_nal_size > 0)
                        {
                            //SKIP DOND and TSOFFSET
                            if (nalUnitType == 26) offset += 3; //MTAP - 16
                            else if (nalUnitType == 27) offset += 4; //MTAP - 24

                            //Stop on a malformed packet rather than reading past the end
                            if (offset + tmp_nal_size > count) break;

                            //Read the nal header but don't move the offset
                            byte nalHeader = (byte)(packetData[offset] & Common.Binary.FiveBitMaxValue);

                            WriteStartCode(nalHeader, ref containsSps, ref containsPps, ref containsSei, ref containsSlice, ref isIdr);

                            //Write the nal header and data
                            Buffer.Write(packetData, offset, tmp_nal_size);

                            //Move the offset past the nal
                            offset += tmp_nal_size;
                        }
                    }

                    return;
                }
            case 28: //FU - A
            case 29: //FU - B
                {
                    /*
                     Informative note: When an FU-A occurs in interleaved mode, it always follows an FU-B, which sets its DON.
                     Informative note: If a transmitter wants to encapsulate a single NAL unit per packet and transmit packets out of their decoding order, STAP-B packet type can be used.
                    */

                    //Read the Header
                    byte FUHeader = packetData[++offset];

                    bool Start = ((FUHeader & 0x80) >> 7) > 0;

                    //bool End = ((FUHeader & 0x40) >> 6) > 0;

                    //bool Receiver = (FUHeader & 0x20) != 0;

                    //if (Receiver) throw new InvalidOperationException("Receiver Bit Set");

                    //Move to data
                    ++offset;

                    //Todo Determine if need to Order by DON first.
                    //DON Present in FU - B
                    if (nalUnitType == 29) offset += 2;

                    //Determine the fragment size
                    int fragment_size = count - offset;

                    //If the size was valid
                    if (fragment_size > 0)
                    {
                        //If the start bit was set
                        if (Start)
                        {
                            //Reconstruct the nal header
                            //Use the first 3 bits of the first byte and last 5 bits of the FU Header
                            byte nalHeader = (byte)((firstByte & 0xE0) | (FUHeader & Common.Binary.FiveBitMaxValue));

                            //Could have been SPS / PPS / SEI, check the type bits only
                            WriteStartCode(nalHeader & Common.Binary.FiveBitMaxValue, ref containsSps, ref containsPps, ref containsSei, ref containsSlice, ref isIdr);

                            //Write the re-constructed header
                            Buffer.WriteByte(nalHeader);
                        }

                        //Write the data of the fragment.
                        Buffer.Write(packetData, offset, fragment_size);
                    }

                    return;
                }
            default:
                {
                    //Single nal unit: 6 is SEI, 7 and 8 are SPS and PPS
                    WriteStartCode(nalUnitType, ref containsSps, ref containsPps, ref containsSei, ref containsSlice, ref isIdr);

                    //Write the nal header and data
                    Buffer.Write(packetData, offset, count - offset);

                    return;
                }
        }
    }

    internal void DisposeBuffer()
    {
        if (Buffer != null)
        {
            Buffer.Dispose();
            Buffer = null;
        }
    }

    public override void Dispose()
    {
        if (Disposed) return;
        base.Dispose();
        DisposeBuffer();
    }

    //To go to an Image...
    //Look for a SliceHeader in the Buffer
    //Decode Macroblocks in Slice
    //Convert Yuv to Rgb
}

There are also implementations of various other RFCs, which help get the media playing in a MediaElement or in other software, or simply save it to disk.

Writing to a container format is in progress.