WebRTC’s SDP in depth

Originally posted on 2024-10-17

TL;DR - skip to the a=candidate lines; the first IP address and port pair you see on each line is the address and port that the other peer will try to connect to

SDP is ugly. Few people will disagree about that. Here’s what a partial SDP message looks like:

v=0
o=- 903082692346368527 1705546069 IN IP4 0.0.0.0
s=-
t=0 0
a=fingerprint:sha-256 1D:F8:74:EE:C4:82:0C:EE:7B:EE:CF:EE:68:60:05:6B:BB:FF:3F:2B:BD:13:9F:FB:DE:AA:D7:BB:AE:CC:15:18
a=extmap-allow-mixed
a=group:BUNDLE 0
m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
a=setup:actpass
a=mid:0
a=sendrecv
a=sctp-port:5000
a=ice-ufrag:aXYxyYyxYxxYEXPC
a=ice-pwd:RyclbDCifzzzlzNVzRzzzzzzzzzzzVSfW
a=candidate:1993901060 1 udp 2130706431 192.168.0.159 54956 typ host
a=candidate:1993901060 2 udp 2130706431 192.168.0.159 54956 typ host

It is horrible. But here is some guidance to help you make sense of what you’re looking at when you’re using WebRTC and SDP.

Lines?

Yes, SDP is a text format that consists of a bunch of lines. The spec says each line ends with a carriage return followed by a line feed (CRLF), but in practice lines can be separated by a carriage return and/or a line feed. You should be flexible here because, like Forrest Gump said, “you never know what you’re gonna get”.
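Here is a minimal sketch of tolerant line splitting in Python, assuming the SDP arrives as one string. The name split_sdp_lines is made up for this example and is not part of any WebRTC library.

```python
import re


def split_sdp_lines(sdp: str) -> list[str]:
    """Split an SDP blob on CRLF, a lone CR, or a lone LF, dropping empty lines."""
    return [line for line in re.split(r"\r\n|\r|\n", sdp) if line]


# A deliberately messy blob that mixes all three separators.
mixed = "v=0\r\ns=-\nt=0 0\ra=mid:0\r\n"
print(split_sdp_lines(mixed))
```

A regular expression is used instead of str.splitlines() because splitlines() also breaks on characters such as form feeds, which are not SDP line separators.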

Attributes?

The first character of each line tells you what kind of line it is. A line that starts with a= is an attribute line. Attribute lines let applications attach extra data to SDP without that data having to be part of the core standard.
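As a rough sketch, pulling the type character and the attribute name out of a line could look like this in Python. Both function names are invented for this example.

```python
from typing import Optional, Tuple


def parse_sdp_line(line: str) -> Tuple[str, str]:
    """Return (type, value) for a line such as 'a=mid:0'."""
    kind, sep, value = line.partition("=")
    if sep != "=" or len(kind) != 1:
        raise ValueError(f"not an SDP line: {line!r}")
    return kind, value


def parse_attribute(value: str) -> Tuple[str, Optional[str]]:
    """Split the value of an a= line into (name, value).

    Flag attributes such as 'sendrecv' have no value, so None is returned.
    """
    name, sep, rest = value.partition(":")
    return name, (rest if sep else None)


kind, value = parse_sdp_line("a=sctp-port:5000")
print(kind, parse_attribute(value))
print(parse_attribute("sendrecv"))
```

Only the first colon is treated as the separator, so a value that contains colons of its own (a fingerprint, for example) stays in one piece.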

Candidates?

To connect with WebRTC you need Interactive Connectivity Establishment (ICE) candidates. These contain information about your current network connection that needs to be sent to the other party to establish a connection.

If you like reading RFCs, have at it.

Host candidates

Candidates denoted as typ host are candidates that your WebRTC stack came up with on its own. These are typically local IP addresses like 192.168.1.5. A host candidate record looks like this:

a=candidate:1993901060 1 udp 2130706431 192.168.0.19 54956 typ host
a=candidate:1993901060 2 udp 2130706431 192.168.0.19 54956 typ host

1993901060 is the foundation. This value is a string between 1 and 32 characters. If two lines have the same foundation value then they are the same type (host in this case), have the same base (the local address the candidate is sent from, which for a host candidate is the candidate itself), and in the case of server reflexive candidates they are guaranteed to come from the same STUN server.

1 is the component ID. A value of 1 means that this candidate is going to be used for RTP streams

A component ID of 2 means that this candidate is going to be used for RTCP streams

udp means that the candidate uses UDP for the transport layer

2130706431 is the priority of the candidate. Candidates are used/tested starting from the highest priority value and going down to the lowest.

192.168.0.19 is the IP address that the other peer should try to connect to

54956 is the UDP port that the other peer should try to connect to

typ host indicates that this is a host candidate
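The field-by-field breakdown above can be turned into a small parser. This is only a sketch in Python: the IceCandidate class and parse_candidate function are names made up for this post, and anything after the typ value is simply collected as key/value pairs.

```python
from dataclasses import dataclass, field


@dataclass
class IceCandidate:
    foundation: str
    component: int
    transport: str
    priority: int
    address: str
    port: int
    typ: str
    extensions: dict = field(default_factory=dict)


def parse_candidate(line: str) -> IceCandidate:
    """Parse an 'a=candidate:...' line into its fields."""
    body = line.strip()
    if body.startswith("a="):
        body = body[2:]
    if not body.startswith("candidate:"):
        raise ValueError(f"not a candidate line: {line!r}")
    parts = body[len("candidate:"):].split()
    if len(parts) < 8 or parts[6] != "typ":
        raise ValueError(f"malformed candidate line: {line!r}")
    # Whatever follows the type comes as name/value pairs.
    extensions = dict(zip(parts[8::2], parts[9::2]))
    return IceCandidate(
        foundation=parts[0],
        component=int(parts[1]),
        transport=parts[2].lower(),
        priority=int(parts[3]),
        address=parts[4],
        port=int(parts[5]),
        typ=parts[7],
        extensions=extensions,
    )


c = parse_candidate("a=candidate:1993901060 1 udp 2130706431 192.168.0.19 54956 typ host")
print(c.address, c.port, c.typ)
```

The address and port fields are the pair from the TL;DR: they are what the other peer will try to reach.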

Server reflexive candidates

Candidates denoted as typ srflx are candidates that your WebRTC stack got from a STUN server. These are the public IP addresses and ports that your system can’t know on its own without asking a system with a public IP address.

If you’ve specified one or more STUN servers, the WebRTC stack will make binding requests to each server. The servers will then respond with binding response messages that contain the public IP address and the source port that the binding request came from, as seen by the server.
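To make the binding exchange less abstract, here is a sketch in Python of what goes over the wire, based on the STUN message layout in RFC 5389. It only handles IPv4 and the XOR-MAPPED-ADDRESS attribute, it does no networking, and the function names are invented for this example.

```python
import os
import socket
import struct
from typing import Optional, Tuple

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id: Optional[bytes] = None) -> bytes:
    """Build a 20-byte binding request: type, length 0, cookie, 12-byte transaction ID."""
    tid = transaction_id if transaction_id is not None else os.urandom(12)
    if len(tid) != 12:
        raise ValueError("transaction ID must be 12 bytes")
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def parse_binding_response(data: bytes) -> Tuple[str, int]:
    """Return (public_ip, public_port) from a binding success response."""
    if len(data) < 20:
        raise ValueError("too short to be a STUN message")
    msg_type, length, cookie = struct.unpack("!HHI", data[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a binding success response")
    offset, end = 20, min(20 + length, len(data))
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", data[offset:offset + 4])
        value = data[offset + 4:offset + 4 + attr_len]
        # value[1] is the address family; 0x01 means IPv4.
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            addr = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return addr, port
        # Attributes are padded to a multiple of 4 bytes.
        offset += 4 + attr_len + (-attr_len % 4)
    raise ValueError("no IPv4 XOR-MAPPED-ADDRESS attribute found")


print(build_binding_request(bytes(12)).hex())
```

Sending the request bytes to a STUN server over UDP and passing the reply to parse_binding_response would give you the address and port that end up in a srflx candidate. A real client would also check that the transaction ID in the reply matches the one it sent.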

Because NAT remaps source ports, the binding response makes it possible for us to establish direct connectivity with another peer by figuring out what this remapped value is. This works most of the time unless you’re behind symmetric NAT. Symmetric NAT creates a new mapping for every distinct combination of source IP, source port, destination IP, and destination port, so the port the STUN server saw is not the port another peer would see.
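A toy model in Python shows why symmetric NAT breaks this trick. It only models how public ports get assigned, not filtering or timeouts, and every address and port in it is made up for illustration.

```python
import itertools


class ToyNat:
    """Hands out public ports, keyed by source only or by source plus destination."""

    def __init__(self, symmetric: bool, first_port: int = 50000):
        self.symmetric = symmetric
        self._ports = itertools.count(first_port)
        self._mappings = {}

    def public_port(self, src, dst):
        # A symmetric NAT allocates a fresh mapping per destination.
        key = (src, dst) if self.symmetric else src
        if key not in self._mappings:
            self._mappings[key] = next(self._ports)
        return self._mappings[key]


def ports_seen(symmetric: bool):
    """Return the public port seen by a STUN server and by a peer."""
    nat = ToyNat(symmetric)
    local = ("192.168.0.19", 38542)
    stun = ("203.0.113.1", 3478)
    peer = ("198.51.100.7", 40000)
    return nat.public_port(local, stun), nat.public_port(local, peer)


print("non-symmetric:", ports_seen(False))
print("symmetric:    ", ports_seen(True))
```

In the non-symmetric case both destinations see the same public port, so the value learned from the STUN server is useful to the peer. In the symmetric case the two ports differ, and the candidate you advertised points at a mapping the peer cannot use.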

A server reflexive candidate record looks like this:

a=candidate:2394145186 1 udp 1694498815 45.134.142.197 52746 typ srflx raddr 0.0.0.0 rport 52746
a=candidate:2394145186 2 udp 1694498815 45.134.142.197 52746 typ srflx raddr 0.0.0.0 rport 52746

Everything here is the same as a host candidate except that the IP and port values are your public IP address and public port values.
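One visible difference is the priority: 2130706431 for the host example and 1694498815 for the server reflexive one. RFC 8445 recommends computing priority as 2^24 × type preference + 2^8 × local preference + (256 − component ID), with suggested type preferences of 126 for host and 100 for server reflexive. A small Python sketch that unpacks a priority under that assumption (stacks are free to pick other values, so treat the result as a hint):

```python
def decode_priority(priority: int) -> dict:
    """Unpack a candidate priority, assuming the RFC 8445 recommended formula."""
    return {
        "type_preference": priority >> 24,
        "local_preference": (priority >> 8) & 0xFFFF,
        "component_id": 256 - (priority & 0xFF),
    }


print(decode_priority(2130706431))  # host example
print(decode_priority(1694498815))  # srflx example
```

Both examples decode to the maximum local preference (65535); the type preference comes out as 126 for the host candidate and 100 for the server reflexive one, which is why host candidates get tried first.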

There are also the raddr and rport values. But fear not! They’re useless. They’re for debugging only and you can’t depend on their values.

Remember that the WebRTC stack probably made a request to the STUN server on some port other than UDP 52746 (let’s say it was UDP 38542), your NAT remapped it to UDP 52746, and now your WebRTC stack knows that UDP 38542 is mapped to UDP 52746.

The other peer can now send payloads to your public IP address, on UDP port 52746, and they’ll be received on your system on your private IP address on UDP port 38542.

Peer reflexive candidates

These are candidates that get discovered from the peer during connectivity checks instead of from a STUN server. They are not something you normally put in your SDP, so you will (almost) never see one there. Don’t bother with them.