WebRTC is only hard for the first few months

Originally posted on 2023-12-29

2024-01-03 - Every time I look at this I notice something different… stay tuned, some of the info about the receiver is incorrect!

2024-01-08 - Finally got back to this and clarified the different order of operations for the receiver in the “Remote description” section.

Achsually… it’s EASY!

I spent a long time learning all of this and as I move from platform to platform I don’t want to learn any of it ever again. Everywhere I look WebRTC is overcomplicated. Sometimes for good reason (trying to build a client with lots of bells and whistles that gives you cool nerdy info about your connection). Sometimes it’s just because people are confused.

Let’s try to make this as simple as possible.

WebRTC

WebRTC lets you connect two peers on a network to each other. Usually this is over the Internet, but it can also be used to establish connectivity in an isolated network, as long as you have a STUN server or some other way to generate ICE candidates. The peers exchange ICE candidates and then use that information to connect to each other.

ICE candidate

An ICE (Interactive Connectivity Establishment) candidate is a data structure that represents one way to reach a host: an IP address (public or private) and a port, along with some mapping information that another host can use to connect to it either directly, through a relay, or via NAT hole punching.
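
To make that a bit more concrete, here is a minimal sketch of what one candidate carries, assuming the usual "candidate:" attribute line that shows up inside SDP. The Candidate struct and the parseCandidate helper are hypothetical names made up for illustration; they are not part of any WebRTC library, and a real library parses this for you.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Candidate is a hypothetical, trimmed-down view of one ICE candidate.
type Candidate struct {
	Transport string // "udp" or "tcp"
	Address   string // IP address the peer may be reachable on
	Port      int
	Type      string // "host" (local), "srflx" (public, learned via STUN), "relay" (via TURN)
}

// parseCandidate reads a line shaped like:
//
//	candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
//
// It assumes any leading "a=" has already been stripped.
func parseCandidate(line string) (Candidate, error) {
	fields := strings.Fields(strings.TrimPrefix(line, "candidate:"))
	if len(fields) < 8 || fields[6] != "typ" {
		return Candidate{}, fmt.Errorf("unexpected candidate line: %q", line)
	}
	port, err := strconv.Atoi(fields[5])
	if err != nil {
		return Candidate{}, fmt.Errorf("bad port %q: %w", fields[5], err)
	}
	return Candidate{
		Transport: strings.ToLower(fields[2]),
		Address:   fields[4],
		Port:      port,
		Type:      fields[7],
	}, nil
}
```

Feeding it a made-up line such as "candidate:842163049 1 udp 1677729535 203.0.113.7 46154 typ srflx raddr 192.168.1.10 rport 46154" gives you a UDP candidate at 203.0.113.7:46154 of type srflx, which is the public side of a NAT mapping.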

Initiator and receiver

One of the parties needs to be the initiator. The other needs to be the receiver. It doesn’t matter which way data will flow since the channels are bidirectional. Therefore, this is decided arbitrarily, which is the best way to decide most things.

Peer connections

Both parties need a peer connection. This data structure contains the STUN and optional TURN server configuration information. In Golang it looks like this:

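// Assumes the Pion library: import "github.com/pion/webrtc/v3"
config := webrtc.Configuration{
	ICEServers: []webrtc.ICEServer{
		{
			// Public STUN server used to discover our public address.
			URLs: []string{"stun:stun.l.google.com:19302"},
		},
	},
}

peerConnection, err := webrtc.NewPeerConnection(config)
if err != nil {
	panic(err) // handle this properly in real code
}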

Offers and answers

The initiator needs to create an offer. The receiver needs to create an answer. Either one is basically just a session description that, once set, becomes the local description and collects the ICE candidates.

Then both the initiator and receiver need to set their local description. This gets the WebRTC implementation to start gathering candidates and listening on the ports and IPs that end up in the local description.

The order of operations is slightly different for initiator and receiver as we’ll see in the next section.

Remote description

A remote description is the other peer’s local description, ICE candidates included, in serialized form. The format is called SDP (Session Description Protocol). It’s old and ugly. Be glad there are libraries for this.

The initiator’s local description, serialized to SDP, is the receiver’s remote description.

The receiver’s local description, serialized to SDP, is the initiator’s remote description.

It’s important to note that the receiver needs to call SetRemoteDescription first, then CreateAnswer, and then finally SetLocalDescription. This is because creating an “answer” without knowing the remote party’s configuration doesn’t make much sense. Finally, setting the local description will kick things into action and match up the mutual candidates that make sense.

Transporting it

Now the fun part. You need to get these two descriptions from one peer to the other. This step is usually called signaling, and WebRTC deliberately doesn’t define how to do it.

Since nothing in life is easy, this exercise is left up to the reader.
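
That said, one low-tech option is to squash the description into a single line of text and move it however you like (copy/paste, chat, an HTTP request). This is a minimal sketch using only the standard library; the description struct is a hypothetical stand-in that mirrors the usual {"type", "sdp"} JSON shape, and encode/decode are made-up helper names.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
)

// description mirrors the usual JSON shape of a session description.
// It is a stand-in for the library's own type, used here for illustration.
type description struct {
	Type string `json:"type"` // "offer" or "answer"
	SDP  string `json:"sdp"`
}

// encode packs a description into one base64 line that survives copy/paste.
func encode(d description) (string, error) {
	raw, err := json.Marshal(d)
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(raw), nil
}

// decode reverses encode on the other peer.
func decode(s string) (description, error) {
	var d description
	raw, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return d, err
	}
	err = json.Unmarshal(raw, &d)
	return d, err
}
```

The base64 step is only there because SDP is multi-line text that tends to get mangled in transit; anything that delivers the bytes intact works just as well.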

Connecting

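Once each peer has the other’s description, it is time to set the remote description and establish connectivity. In Golang this is a call to SetRemoteDescription on the peer connection, passing in the description you received. The receiver already did this before creating its answer, so in practice this is the initiator applying the answer.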

Actually moving packets

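At this point you probably just want to see some data flowing. If all went well, the receiver can register a handler for the peerConnection.OnDataChannel event, which fires when the initiator’s data channel arrives.

And on the initiator you create the data channel with peerConnection.CreateDataChannel. Do this before creating the offer so the offer advertises the channel.

The receiver and initiator will then need handlers for the OnOpen and OnMessage events on their data channels. OnOpen tells you the channel is ready to send on, and OnMessage delivers whatever the other side sent.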

Conclusion?

So easy even GPT-4 Turbo could do it!