An important concept to start with: in a video call, each endpoint advertises
its receive capabilities, and the sending endpoint uses them to decide how to
encode. As a result, the video streams can be asymmetric. This differs from
audio calls, where both endpoints need to agree on common audio stream
parameters (codec, DTMF, etc.).
Video Stream Negotiation
Several attributes are negotiated in the SDP for a video stream.
Bandwidth Attribute
The bandwidth attribute appears in the SDP body as b=<modifier>:<value>.
It specifies the maximum receive bandwidth supported by the endpoint.
The bandwidth attribute can be present in the session section and/or the media
sections of the SDP body.
Three types of bandwidth modifier can appear in the bandwidth line of the SDP
body:
- Transport Independent Application Specific (TIAS) in bps: Bandwidth does NOT include the lower layers (e.g. RTP bandwidth only)
- Application Specific (AS) in kbps: Bandwidth includes the lower layers (e.g. TCP/UDP and IP)
- Conference Total (CT): Max Bandwidth that a Conference Session will use
Example:
o=CiscoSystemsCCM-SIP 161095 1 IN IP4 10.58.9.6
s=SIP Call
b=TIAS:6000000   <-- Transport Independent Application Specific bandwidth (RTP) in bits/sec
b=AS:6000   <-- Application Specific bandwidth (RTP/UDP/IP) in kbps
t=0 0
m=audio 16444 RTP/AVP 102 103 104 9 105 106 0 8 101
b=TIAS:64000
… attributes of multiple audio codecs in the offer …
m=video 16446 RTP/AVP 98 99
b=TIAS:6000000
For this endpoint, the maximum media stream bandwidths that can be received are:
- 6 Mbps for all voice and video streams, including UDP and IP headers (AS session bandwidth)
- 64 kbps for voice RTP traffic, not including UDP and IP headers (TIAS audio)
- 6 Mbps for video RTP traffic, not including UDP and IP headers (TIAS video)
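Because TIAS is expressed in bits/sec while AS and CT are expressed in kbps, it helps to normalise the values before comparing them. The following is a minimal Python sketch of that step; the function name parse_bandwidth and the layout of the result are illustrative assumptions, not part of any SIP stack.

```python
def parse_bandwidth(sdp: str) -> dict:
    """Collect b= lines per SDP section, converting every value to bits/sec.

    Hypothetical helper: sections are keyed "session" or by the media type
    taken from the m= line. AS and CT are carried in kbps, TIAS in bps.
    """
    kbps_modifiers = {"AS", "CT"}
    section = "session"
    result = {}
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # "m=video 16446 RTP/AVP 98 99" -> "video"
            section = line[2:].split()[0]
        elif line.startswith("b="):
            modifier, _, value = line[2:].partition(":")
            factor = 1000 if modifier in kbps_modifiers else 1
            result.setdefault(section, {})[modifier] = int(value) * factor
    return result


sdp = """\
s=SIP Call
b=TIAS:6000000
b=AS:6000
t=0 0
m=audio 16444 RTP/AVP 102 103 104 9 105 106 0 8 101
b=TIAS:64000
m=video 16446 RTP/AVP 98 99
b=TIAS:6000000
"""

for section, values in parse_bandwidth(sdp).items():
    print(section, values)
```

Note that this sketch keys media sections by type, so an SDP body with two m=video sections (for example main video and slides) would need a different key, such as the section index.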
Video Codec Attributes
The video codecs advertised by each endpoint are the codecs it wishes to
receive. In the SDP body below, multiple codecs are advertised in order of
preference (98 is H264, followed by 99, which is H263).
…
m=video 16446 RTP/AVP 98 99
c=IN IP4 10.58.9.86
b=TIAS:6000000
a=rtpmap:98 H264/90000   <-- H.264, sampling rate 90000 Hz
a=fmtp:98 profile-level-id=428016;packetization-mode=1;max-mbps=245000;max-fs=9000;max-cpb=200;max-br=5000;max-rcmd-nalu-size=3456000;max-smbps=245000;max-fps=6000
a=rtpmap:99 H263-1998/90000   <-- H.263 version 2, sampling rate 90000 Hz
a=fmtp:99 QCIF=1;CIF=1;CIF4=1;CUSTOM=352,240,1
a=rtcp-fb:* nack pli
a=rtcp-fb:* ccm tmmbr
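To make the relationship between the m= line, the rtpmap lines and the fmtp lines concrete, here is a small Python sketch that returns the offered video codecs in preference order. The function name parse_video_codecs and the dictionary layout are assumptions made for illustration, and the fmtp line in the sample is shortened.

```python
def parse_video_codecs(media_lines):
    """Return the codecs of one m=video section in preference order.

    Hypothetical helper: assumes every fmtp parameter has the form key=value.
    """
    order, rtpmap, fmtp = [], {}, {}
    for line in media_lines:
        if line.startswith("m=video"):
            # m=video <port> <proto> <payload type> <payload type> ...
            order = line.split()[3:]
        elif line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            rtpmap[pt] = encoding
        elif line.startswith("a=fmtp:"):
            pt, params = line[len("a=fmtp:"):].split(None, 1)
            fmtp[pt] = dict(
                p.strip().split("=", 1) for p in params.split(";") if p.strip()
            )
    return [
        {"pt": pt, "encoding": rtpmap.get(pt), "fmtp": fmtp.get(pt, {})}
        for pt in order
    ]


lines = [
    "m=video 16446 RTP/AVP 98 99",
    "a=rtpmap:98 H264/90000",
    "a=fmtp:98 profile-level-id=428016;packetization-mode=1;max-fs=9000",
    "a=rtpmap:99 H263-1998/90000",
    "a=fmtp:99 QCIF=1;CIF=1;CIF4=1;CUSTOM=352,240,1",
]
codecs = parse_video_codecs(lines)
for codec in codecs:
    print(codec["pt"], codec["encoding"], codec["fmtp"])
```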
H264
The H264 codec has two layers: the Video Coding Layer (VCL) and the Network
Abstraction Layer (NAL).
The VCL creates a coded representation of the video image by partitioning the
video frame into slices, with each slice partitioned into macroblocks
(rectangular blocks of pixel samples). The slices are carried in NAL units
(NALUs). Depending on the packetization mode advertised by the endpoints, a
single NALU or multiple NALUs can be encapsulated in a single RTP packet.
Note: Multiple RTP packets can represent a single frame.
Let's go deeper into H264.
a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=428016;packetization-mode=1;max-mbps=245000;max-fs=9000;max-cpb=200;max-br=5000;max-rcmd-nalu-size=3456000;max-smbps=245000;max-fps=6000
profile-level-id=428016: the Profile-Level-ID describes the minimum set of
features/capabilities that are supported by this endpoint.
The remaining parameters describe the features and capabilities beyond those
of the profile-level-id that are supported by this endpoint:
- packetization-mode=1
- max-mbps=245000
- max-fs=9000
- max-cpb=200
- max-br=5000
- max-rcmd-nalu-size=3456000
- max-smbps=245000
- max-fps=6000
The Profile-Level-ID consists of 6 hex digits. The first 4 hex digits define
the profile-id (2 digits for the profile itself, followed by 2 digits of
profile constraint flags), while the last 2 hex digits define the level.
In our case the profile-id is 4280 and the level is 16. The Profile-Level-ID
must be symmetrical for the call.
The Profile-ID describes the subset of coding tools that the endpoint's codec
supports. The profile ID 4280 represents the Baseline profile (BP, 66; hex 42
is 66 in decimal), which supports encoding features such as Flexible
Macroblock Ordering, Arbitrary Slice Ordering and Redundant Slices.
The profile level describes the resolution, frame rate and bit rate that the
endpoint can support. Level 16 in hex, which is 22 in decimal, represents
level 2.2 = 352 x 480 pixels @ 30 frames per second.
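The hex arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name and the returned keys are assumptions, and it ignores special cases such as level 1b.

```python
def decode_profile_level_id(value: str) -> dict:
    """Split a 6-hex-digit profile-level-id into its three bytes.

    Hypothetical helper: byte 1 is the profile, byte 2 holds the profile
    constraint flags, byte 3 is the level multiplied by 10.
    """
    if len(value) != 6:
        raise ValueError("profile-level-id must be 6 hex digits")
    profile_idc = int(value[0:2], 16)
    profile_iop = int(value[2:4], 16)
    level_idc = int(value[4:6], 16)
    return {
        "profile_idc": profile_idc,
        "profile_iop": profile_iop,
        "level_idc": level_idc,
        "level": level_idc / 10,
    }


decoded = decode_profile_level_id("428016")
print(decoded)
```

For 428016 this yields profile 66 (Baseline), constraint flags 0x80 and level 22, that is, level 2.2.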
- packetization-mode=1: Values (0, 1, 2). 0 = a single NALU is sent in each RTP packet, no fragments. 1 = multiple NALUs can be sent in decoding order, fragments allowed. 2 = multiple NALUs can be sent out of decoding order, fragments allowed. The negotiated packetization mode for the call must be symmetrical.
- max-mbps=245000: Max decoding speed = max macroblocks/sec = 245000 (Baseline profile level 2.2 value = 20250)
- max-fs=9000: Max frame size = 9000 macroblocks (Baseline profile level 2.2 value = 1620)
- max-cpb=200: Max Coded Picture Buffer size = 200 kbits (Baseline profile level 2.2 value = 4000 kbits)
- max-br=5000: Max video bit rate = 5000 kbps (Baseline profile level 2.2 value = 4000 kbps)
- max-rcmd-nalu-size=3456000: Max NALU size (bytes) that the receiver can handle
- max-smbps=245000: Max static macroblock processing rate, in macroblocks/second
- max-fps=6000: Max frames per second in 1/100s of a frame/second = 60 fps (Baseline profile level 2.2 value = 30 fps)
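To get a feel for what the max-fs and max-mbps values allow, the macroblock counts can be converted into resolutions and frame rates. The sketch below assumes the standard 16 x 16 pixel H.264 macroblock; the function names and the resolutions tried are illustrative choices, not values taken from the SDP.

```python
import math


def macroblocks(width: int, height: int) -> int:
    """Number of 16x16 macroblocks needed to cover one frame."""
    return math.ceil(width / 16) * math.ceil(height / 16)


def fits(width: int, height: int, fps: int, max_fs: int, max_mbps: int) -> bool:
    """True if a stream stays within the advertised max-fs and max-mbps."""
    mbs = macroblocks(width, height)
    return mbs <= max_fs and mbs * fps <= max_mbps


# Receive capabilities advertised above: max-fs=9000, max-mbps=245000
print(macroblocks(1920, 1080), fits(1920, 1080, 30, 9000, 245000))

# Baseline level 2.2 defaults quoted above: max-fs=1620, max-mbps=20250
print(macroblocks(352, 480), fits(352, 480, 30, 1620, 20250))
```

With these numbers a 1920 x 1080 frame needs 8160 macroblocks, so 1080p at 30 fps fits within max-fs=9000 and max-mbps=245000, while the level 2.2 defaults only cover 352 x 480 at 30 fps.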
Offer (H.264 and H.263 offered)
a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=428016;packetization-mode=1;max-mbps=245000;max-fs=9000;max-cpb=200;max-br=5000;max-rcmd-nalu-size=3456000;max-smbps=245000;max-fps=6000
a=rtpmap:99 H263-1998/90000
a=fmtp:99 QCIF=1;CIF=1;CIF4=1;CUSTOM=352,240,1
Answer (H.264 selected; profile-level-id and packetization-mode are symmetric attributes, the max-* values are asymmetric attributes)
a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=428016;packetization-mode=1;max-mbps=108000;max-fs=3600;max-cpb=200;max-br=5000;max-rcmd-nalu-size=1382400;max-smbps=108000;max-fps=6000
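The difference between symmetric and asymmetric attributes can be checked mechanically. The following Python sketch compares the H.264 fmtp lines of the offer and the answer; the helper names and the SYMMETRIC tuple are assumptions made for this example, based on the two attributes the text says must match.

```python
# Attributes that must match in offer and answer (assumption based on the text).
SYMMETRIC = ("profile-level-id", "packetization-mode")


def fmtp_params(fmtp_line: str) -> dict:
    """Turn 'a=fmtp:98 k1=v1;k2=v2' into {'k1': 'v1', 'k2': 'v2'}."""
    params = fmtp_line.split(None, 1)[1]
    return dict(p.strip().split("=", 1) for p in params.split(";") if p.strip())


def compare_offer_answer(offer_line: str, answer_line: str):
    """Return (symmetric attributes that differ, asymmetric value pairs)."""
    offer, answer = fmtp_params(offer_line), fmtp_params(answer_line)
    mismatched = [k for k in SYMMETRIC if offer.get(k) != answer.get(k)]
    asymmetric = {
        k: (offer[k], answer[k])
        for k in offer
        if k in answer and k not in SYMMETRIC and offer[k] != answer[k]
    }
    return mismatched, asymmetric


offer = ("a=fmtp:98 profile-level-id=428016;packetization-mode=1;"
         "max-mbps=245000;max-fs=9000;max-cpb=200; max-br=5000;"
         "max-rcmd-nalu-size=3456000;max-smbps=245000;max-fps=6000")
answer = ("a=fmtp:98 profile-level-id=428016;packetization-mode=1;"
          "max-mbps=108000;max-fs=3600;max-cpb=200; max-br=5000;"
          "max-rcmd-nalu-size=1382400;max-smbps=108000;max-fps=6000")

mismatched, asymmetric = compare_offer_answer(offer, answer)
print("symmetric attributes that differ:", mismatched)
print("asymmetric receive capabilities:", asymmetric)
```

For this offer/answer pair no symmetric attribute differs, while max-mbps, max-fs, max-rcmd-nalu-size and max-smbps differ per direction.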
RTCP Attributes
Video endpoints use RTCP packets as a feedback mechanism for rate adaptation
when packet loss or congestion is encountered. The RTCP feedback mechanism is
negotiated during call establishment, as part of the video negotiation.
Looking at the SDP body, we can see the following RTCP attributes:
a=rtcp-fb:* nack pli
- "rtcp-fb": RTP Control Protocol (RTCP) feedback
- "*": RTCP feedback for any of the offered video codecs
- NACK: Negative Acknowledgement, indicates the loss of one or more RTP packets
- PLI: Picture Loss Indication
a=rtcp-fb:* ccm tmmbr
- "rtcp-fb": RTCP feedback
- "*": RTCP feedback for any of the offered video codecs
- "ccm": indicates support of codec control using RTCP feedback messages
- "tmmbr": indicates support of the Temporary Maximum Media Stream Bit Rate Request/Notification
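As an illustration of how these attribute lines break down, here is a short Python sketch that groups the rtcp-fb entries by payload type. The function name parse_rtcp_fb and the result shape are assumptions made for this example.

```python
def parse_rtcp_fb(lines):
    """Group a=rtcp-fb lines by payload type ('*' means any offered codec).

    Hypothetical helper: returns {payload type: [(feedback type, [params])]}.
    """
    feedback = {}
    for line in lines:
        if not line.startswith("a=rtcp-fb:"):
            continue
        fields = line[len("a=rtcp-fb:"):].split()
        payload_type, fb_type, fb_params = fields[0], fields[1], fields[2:]
        feedback.setdefault(payload_type, []).append((fb_type, fb_params))
    return feedback


fb = parse_rtcp_fb([
    "a=rtpmap:98 H264/90000",
    "a=rtcp-fb:* nack pli",
    "a=rtcp-fb:* ccm tmmbr",
])
print(fb)
```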
BFCP Video Attributes
BFCP video stream negotiation works the same way as main video stream
negotiation. However, two attributes in the SDP body distinguish the BFCP
video stream from the main video stream: content and label.
For the main video stream the content parameter is main, while for the BFCP
video stream the content parameter is slides (desktop sharing).
a=content:main
a=label:11
Or
a=label:12
a=content:slides
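The label parameter of the BFCP video stream is very important: it is mapped
to the floor ID in the BFCP attributes. In the example below, a=floorid:2
mstrm:12 refers to the media stream with a=label:12.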
…
a=rtpmap:99 H263-1998/90000
a=fmtp:99 QCIF=1;CIF=1;CIF4=1;CUSTOM=352,240,1
a=label:12
a=content:slides
a=rtcp-fb:* nack pli
a=rtcp-fb:* ccm tmmbr
m=application 5070 UDP/BFCP *
c=IN IP4 10.58.9.86
a=floorctrl:c-s
a=floorid:2 mstrm:12
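The label-to-floor mapping can be resolved with a small amount of parsing. The Python sketch below is one possible way to do it; the function name map_floors is an assumption, and the second m=video port (16448) in the sample is made up for illustration because the original SDP is truncated.

```python
import re


def map_floors(sdp_lines):
    """Map each BFCP floor ID to the media section whose label it references."""
    sections, floors = [], {}
    current = {}
    for line in sdp_lines:
        if line.startswith("m="):
            current = {"media": line[2:].split()[0]}
            sections.append(current)
        elif line.startswith("a=label:"):
            current["label"] = line[len("a=label:"):]
        elif line.startswith("a=content:"):
            current["content"] = line[len("a=content:"):]
        else:
            match = re.match(r"a=floorid:(\d+) mstrm:(\S+)", line)
            if match:
                floors[match.group(1)] = match.group(2)
    by_label = {s["label"]: s for s in sections if "label" in s}
    return {floor: by_label.get(label) for floor, label in floors.items()}


floor_map = map_floors([
    "m=video 16446 RTP/AVP 98 99",
    "a=content:main",
    "a=label:11",
    "m=video 16448 RTP/AVP 98 99",  # hypothetical port for the slides stream
    "a=label:12",
    "a=content:slides",
    "m=application 5070 UDP/BFCP *",
    "a=floorctrl:c-s",
    "a=floorid:2 mstrm:12",
])
print(floor_map)
```

Here floor 2 resolves to the video stream with label 12, whose content is slides.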