
Hooking into Erlang’s SSL to Perform DTLS-SRTP Handshake

Michał Śledź · Jan 29, 2025 · 6 min read

Before we start: the following solution does not fully fit into the WebRTC architecture, or at least that’s not how we perform the DTLS-SRTP handshake in Elixir WebRTC.

However, it shows that something can already be achieved using Erlang’s existing SSL module and its abstractions, and that we are not so far from getting rid of OpenSSL NIFs (like ex_dtls).

How does WebRTC encrypt data?

Let’s start with DTLS. You can think of it as TLS but for UDP. In particular, because UDP is not reliable, DTLS comes with its own retransmission mechanism.
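To make the retransmission part a bit more concrete, here is a minimal sketch of the timer schedule that RFC 6347 recommends for handshake retransmissions: start at 1 second, double the timeout on every retransmission, and cap it at 60 seconds. The RetransmitTimer module and its function names are made up for this illustration; they are not part of :ssl or ex_dtls.

```elixir
defmodule RetransmitTimer do
  # Hypothetical helper, written only for illustration.
  # Values follow the recommendation from RFC 6347 sec. 4.2.4.1.
  @initial_ms 1_000
  @max_ms 60_000

  # Timeout (in milliseconds) to wait before retransmission number `attempt`
  # (counted from 0): doubles each time and is capped at @max_ms.
  def timeout(attempt) when is_integer(attempt) and attempt >= 0 do
    min(@initial_ms * Integer.pow(2, attempt), @max_ms)
  end

  # Timeouts for the first `count` retransmissions.
  def schedule(count) when is_integer(count) and count >= 0 do
    Enum.map(0..(count - 1)//1, &timeout/1)
  end
end
```

For example, RetransmitTimer.schedule(4) gives the timeouts for the first four retransmissions of a lost handshake flight.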

Now, in WebRTC, DTLS plays two roles. First of all, it is used to encrypt arbitrary data that can be sent via WebRTC’s data channels (text, binary, etc.). Second of all, it is a source of the so-called keying material (cryptographically strong data), which is used to provide encryption keys for audio and video.

One can ask: if we already use DTLS for encrypting arbitrary data, why can’t we also use it for encrypting audio and video?

Audio and video are encapsulated into RTP packets, which include sequence numbers, timestamps, and codec information, among other things. In other words, RTP fills in the gaps left by UDP, enabling custom congestion controllers, detecting packet reordering, supporting retransmissions, and identifying codecs. DTLS records duplicate some of these fields (each record carries its own sequence number, for example), so one possible reason for encrypting media with SRTP rather than with DTLS MIGHT have been to avoid wasting network bandwidth.
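The RTP fields mentioned above can be pulled out with plain binary pattern matching. The sketch below parses only the fixed RTP header described in RFC 3550; the RTPHeader module is a made-up helper for this post (Elixir WebRTC uses a dedicated library for this), and any header extension is simply left in the :rest field.

```elixir
defmodule RTPHeader do
  # Hypothetical helper, written only for illustration.
  # Parses the fixed RTP header (RFC 3550 sec. 5.1) and skips the CSRC list.
  def parse(
        <<2::2, _padding::1, _extension::1, csrc_count::4, marker::1, payload_type::7,
          sequence_number::16, timestamp::32, ssrc::32, rest::binary>>
      ) do
    # Every CSRC identifier takes 4 bytes
    csrc_size = csrc_count * 4

    case rest do
      <<_csrcs::binary-size(csrc_size), rest::binary>> ->
        {:ok,
         %{
           marker: marker == 1,
           payload_type: payload_type,
           sequence_number: sequence_number,
           timestamp: timestamp,
           ssrc: ssrc,
           # header extension (if present) and payload are left untouched
           rest: rest
         }}

      _too_short ->
        {:error, :invalid_packet}
    end
  end

  # Anything that is too short or is not RTP version 2
  def parse(_other), do: {:error, :invalid_packet}
end
```

The payload type identifies the codec, while the sequence number and timestamp are what allow the receiver to detect reordering and losses.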

Erlang’s SSL module

The Erlang SSL module already allows performing a DTLS-SRTP handshake (DTLS-SRTP is an extension of regular DTLS that makes it usable in the context of RTP). Example code can look like this:

# server.exs
:ssl.start()
options = [
  protocol: :dtls,
  certs_keys: [
    %{certfile: "cert.pem", keyfile: "key.pem"}
  ],
  use_srtp: %{protection_profiles: [<<0x01::16>>]}
]

{:ok, socket} = :ssl.listen(4444, options)
{:ok, socket} = :ssl.transport_accept(socket)
{:ok, socket} = :ssl.handshake(socket)
:ok = :ssl.send(socket, "hello")

# client.exs
:ssl.start()

options = [
  # this is for demo purposes
  verify: :verify_none,
  protocol: :dtls,
  use_srtp: %{protection_profiles: [<<0x01::16>>]}
]

{:ok, socket} = :ssl.connect({127, 0, 0, 1}, 4444, options)

receive do
  msg -> dbg(msg)
end

The problem is that the :ssl module opens and maintains sockets on its own, which is undesirable in WebRTC. WebRTC uses the ICE protocol to establish a connection between two parties. In a nutshell, in ICE, both sides open a separate socket on every network interface available in the OS. Then, for every open socket, the ICE agent discovers its public IP address and sends it to the other side. At the end, both agents pair the received addresses with their own and check which pairs work.

Therefore, in WebRTC, we have to manage UDP sockets on our own. What’s more, the socket used for sending can change over time. For example, imagine that a check for one pair of addresses has passed. To speed up the whole process, ICE immediately allows sending data without waiting for the other checks to finish. However, if a check for another pair with better network characteristics passes, ICE will switch to it.
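The pairing step can be sketched as follows. This is only an illustration under a few assumptions: candidates are plain maps with a :priority field, the local agent is the controlling one, and the CandidatePairs module is made up for this post. The pair priority formula comes from RFC 8445 sec. 6.1.2.3; connectivity checks themselves are not modeled.

```elixir
defmodule CandidatePairs do
  # Hypothetical helper, written only for illustration.
  import Bitwise

  # Pair priority from RFC 8445 sec. 6.1.2.3, where `g` is the priority of the
  # controlling agent's candidate and `d` the priority of the controlled one.
  def priority(g, d) do
    (1 <<< 32) * min(g, d) + 2 * max(g, d) + if(g > d, do: 1, else: 0)
  end

  # Pairs every local candidate with every remote one and orders the pairs
  # from the highest to the lowest priority (the order in which they are checked).
  # Assumes the local agent is the controlling one.
  def form(local_candidates, remote_candidates) do
    pairs =
      for local <- local_candidates, remote <- remote_candidates do
        %{local: local, remote: remote, priority: priority(local.priority, remote.priority)}
      end

    Enum.sort_by(pairs, & &1.priority, :desc)
  end
end
```

A real ICE agent additionally prunes redundant pairs and keeps a state for each of them, which is left out here.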

The perfect solution

To solve this problem, we usually write Sans I/O code. From the :ssl module’s perspective, instead of reading and writing packets from/to a socket, it could read and write them from/to memory. It is then the user’s responsibility to deliver these packets to the other end and receive the response using their own socket. In OpenSSL, this is achieved using memory BIOs.
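To illustrate the idea, here is a toy Sans I/O state machine. It is not DTLS: the "hello" / "hello_ack" exchange and the SansIOHandshake module are invented for this example. The point is only the shape of the API: the module never touches a socket, it takes incoming packets as arguments and returns the packets that the caller should send.

```elixir
defmodule SansIOHandshake do
  # Toy, made-up protocol used only to show the Sans I/O shape:
  # the client sends "hello" and the server answers with "hello_ack".
  defstruct role: nil, state: :new

  def new(role) when role in [:client, :server], do: %__MODULE__{role: role}

  # Starts the handshake. Returns {packets_to_send, updated_handshake}.
  def start(%__MODULE__{role: :client, state: :new} = hs),
    do: {["hello"], %{hs | state: :waiting}}

  def start(%__MODULE__{role: :server, state: :new} = hs),
    do: {[], %{hs | state: :waiting}}

  # Feeds a received packet in. Returns {packets_to_send, updated_handshake}.
  def handle_packet(%__MODULE__{role: :server, state: :waiting} = hs, "hello"),
    do: {["hello_ack"], %{hs | state: :finished}}

  def handle_packet(%__MODULE__{role: :client, state: :waiting} = hs, "hello_ack"),
    do: {[], %{hs | state: :finished}}

  # Unexpected packets are ignored
  def handle_packet(hs, _unexpected), do: {[], hs}
end

# The caller moves packets between both sides. Here they are passed in memory,
# but they could just as well travel through a socket owned by an ICE agent.
{[hello], client} = SansIOHandshake.start(SansIOHandshake.new(:client))
{[], server} = SansIOHandshake.start(SansIOHandshake.new(:server))
{[ack], server} = SansIOHandshake.handle_packet(server, hello)
{[], client} = SansIOHandshake.handle_packet(client, ack)
```

Because all I/O stays with the caller, the same code works no matter which socket ICE ends up selecting.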

Alternatively, the :ssl module could accept a socket/port/pid that should be used for reading and writing data. And when we take a look at the documentation, that’s something that is already supported, but only for TCP sockets.

cb_info

The closest solution right now is a callback module that the :ssl module will use as the underlying transport implementation. In other words, a module that implements open, setopts, getopts, send, and other functions.

Let’s see how it works:

# server.exs
defmodule CustomUDPTransport do
  def open(port, opts) do
    :gen_udp.open(port, opts)
  end

  def controlling_process(socket, pid) do
    :gen_udp.controlling_process(socket, pid)
  end

  def setopts(socket, opts) do
    :inet.setopts(socket, opts)
  end

  def getopts(socket, opts) do
    :inet.getopts(socket, opts)
  end

  def port(socket) do
    :inet.port(socket)
  end

  def send(socket, host, port, packet) do
    :gen_udp.send(socket, host, port, packet)
  end

  def close(socket) do
    :gen_udp.close(socket)
  end
end

:ssl.start()

options = [
  cb_info: {CustomUDPTransport, :udp, :udp_closed, :udp_error},
  protocol: :dtls,
  certs_keys: [
    %{certfile: "cert.pem", keyfile: "key.pem"}
  ],
  use_srtp: %{protection_profiles: [<<0x01::16>>]}
]

{:ok, socket} = :ssl.listen(4444, options)
{:ok, socket} = :ssl.transport_accept(socket)
{:ok, socket} = :ssl.handshake(socket)

:ok = :ssl.send(socket, "hello")

# client.exs
# CustomUDPTransport has to be defined (or loaded) in this script as well
:ssl.start()

options = [
  # this is for demo purposes
  verify: :verify_none,
  protocol: :dtls,
  cb_info: {CustomUDPTransport, :udp, :udp_closed, :udp_error},
  use_srtp: %{protection_profiles: [<<0x01::16>>]}
]

{:ok, _socket} = :ssl.connect({127, 0, 0, 1}, 4444, options)

receive do
  msg -> dbg(msg)
end

We added a new option called cb_info. It specifies a module that implements the needed functions, as well as the message tags that the internal Erlang code should expect in messages coming from a socket returned by our custom open function.

This implementation just delegates all calls to the :gen_udp or :inet modules. Instead of calling :gen_udp.open, we could return the pid of the ICE agent and try to simulate an in-memory handshake, but I didn’t manage to make it work.

What’s more, on the server side, only the open and controlling_process functions from the CustomUDPTransport module are called.
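To see where the :udp data tag comes from, it helps to look at what a plain :gen_udp socket in active mode delivers to its owner. The snippet below is a small, standalone sketch (it does not involve :ssl at all) that sends one datagram over the loopback interface and matches on the resulting message.

```elixir
# Receiver in active mode: incoming datagrams arrive as messages
{:ok, receiver} = :gen_udp.open(0, [:binary, active: true])
{:ok, port} = :inet.port(receiver)

{:ok, sender} = :gen_udp.open(0, [:binary])
:ok = :gen_udp.send(sender, {127, 0, 0, 1}, port, "ping")

# The first element of the tuple is the data tag we passed in cb_info
received =
  receive do
    {:udp, ^receiver, _ip, _port, data} -> data
  after
    1_000 -> :timeout
  end

:gen_udp.close(sender)
:gen_udp.close(receiver)
```

A custom transport that does not wrap a real UDP socket would presumably have to produce messages of the same shape, using whatever tags were given in cb_info.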

Protection profile and keying material

If you still find this code useful, there are two more questions:

  1. how to choose a protection profile
  2. how to obtain keying material

Re. 1. A client can specify multiple protection profiles it supports. Every protection profile is a positive integer encoded as a two-byte binary, e.g. [<<0, 1>>, <<0, 2>>, <<0, 5>>]. Protection profiles are defined in different RFCs, e.g. RFC 5764 or RFC 7714.
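Since the profiles travel as two-byte binaries while the RFCs list them as numbers, a tiny conversion helper can be handy. The ProtectionProfile module below is a made-up sketch; the identifiers and names are the ones registered by RFC 5764 and RFC 7714.

```elixir
defmodule ProtectionProfile do
  # Hypothetical helper, written only for illustration.
  # 0x01, 0x02, 0x05, 0x06 come from RFC 5764; 0x07 and 0x08 from RFC 7714.
  @names %{
    0x01 => :srtp_aes128_cm_hmac_sha1_80,
    0x02 => :srtp_aes128_cm_hmac_sha1_32,
    0x05 => :srtp_null_hmac_sha1_80,
    0x06 => :srtp_null_hmac_sha1_32,
    0x07 => :srtp_aead_aes_128_gcm,
    0x08 => :srtp_aead_aes_256_gcm
  }

  # Two-byte binary (as used in the use_srtp option) -> integer
  def decode(<<id::16>>), do: id

  # Integer -> two-byte binary
  def encode(id) when id in 0..0xFFFF, do: <<id::16>>

  # Human-readable name of a profile, or :unknown
  def name(profile) when is_binary(profile), do: Map.get(@names, decode(profile), :unknown)
end
```

With this, ProtectionProfile.encode(0x01) produces the same binary as <<0x01::16>> used in the options above.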

The server has to choose one protection profile that is going to be used. This is done by pausing the handshake (using the [handshake: :hello] option) after receiving the Client Hello message and resuming it using handshake_continue with new options that contain a single, selected protection profile. Alternatively, the handshake can be canceled (e.g. when the client didn’t offer any protection profile supported by the server) using the handshake_cancel function. See this test for more.
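The selection logic itself can be sketched like this. The ProfileNegotiation module is made up for this post, and it assumes that the caller has already extracted the list of profiles offered by the client from the paused handshake (how exactly they are exposed is not shown here). Only :ssl.handshake_continue/2 and :ssl.handshake_cancel/1 are real :ssl functions.

```elixir
defmodule ProfileNegotiation do
  # Hypothetical helper, written only for illustration.

  # Picks the first profile from the server's preference list
  # that was also offered by the client.
  def choose(offered, supported) do
    case Enum.find(supported, &(&1 in offered)) do
      nil -> :error
      profile -> {:ok, profile}
    end
  end

  # `socket` is assumed to be a handshake paused with [handshake: :hello],
  # `offered` is assumed to be the list of profiles sent by the client.
  def continue_or_cancel(socket, offered, supported) do
    case choose(offered, supported) do
      {:ok, profile} ->
        # Resume with exactly one, selected protection profile
        :ssl.handshake_continue(socket, use_srtp: %{protection_profiles: [profile]})

      :error ->
        :ssl.handshake_cancel(socket)
        {:error, :no_common_profile}
    end
  end
end
```

Iterating over the server's list rather than the client's means that the server's preference order wins, which is a design choice of this sketch and not something mandated by :ssl.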

Re. 2. Since OTP 27, there is a dedicated function :ssl.export_key_materials. The easiest way will be to take a look at the code:

defmodule KeyMaterial do
  def export(socket, protection_profile) do
    {master_key_len, master_salt_len} = get_length(protection_profile)

    # See RFC 5764 sec. 4.2 for label and parsing explanation
    {:ok, key_materials} =
      :ssl.export_key_materials(socket, ["EXTRACTOR-dtls_srtp"], [:no_context], [
        2 * (master_key_len + master_salt_len)
      ])

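    # The result may come back as a list with one binary per label,
    # so flatten it into a single binary before splitting
    key_materials = IO.iodata_to_binary(key_materials)

    <<client_master_key::binary-size(master_key_len),
      server_master_key::binary-size(master_key_len),
      client_master_salt::binary-size(master_salt_len),
      server_master_salt::binary-size(master_salt_len)>> =
      key_materials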

    client_key_material = <<client_master_key::binary, client_master_salt::binary>>
    server_key_material = <<server_master_key::binary, server_master_salt::binary>>

    {client_key_material, server_key_material}
  end

  # RFC 3711 sec. 8.2
  def get_length(0x01), do: {16, 14}
  def get_length(0x02), do: {16, 14}
  # RFC 7714 sec. 12
  def get_length(0x07), do: {16, 12}
  def get_length(0x08), do: {32, 12}
end

Summary

Although the Erlang SSL module doesn’t fit our WebRTC implementation, it provides useful abstractions that, if expanded in the future, could fully replace ex_dtls.

The full code is available here.

Happy streaming!
