Crossing the Mixed Content Boundary: abusing STUN/TURN as a Communication Channel

TL;DR: You can send and receive data using TURN by encoding data into username and IPv6 UDP address.

Say that you have a HTTPS website. Modern web safety forbids it from accessing insecure parts of the network using HTTP. This is usually not an issue, since signing a HTTPS certificate for a website is fairly easy thanks to Let’s Encrypt.

However, there are situations where signing a certificate for a server is not feasible. For example, you cannot obtain a certificate for a server that is not on the public Internet.¹ If I have a server running at 192.168.1.100, there are not a lot of options to communicate with it from https://example.com.²

Examining Existing Options

With https to http communication not being available, the remaining options are

Using redirects, by redirecting users from https://example.com to http://192.168.1.100 and then back to https://example.com, we can encode information in the URL, and pass them back and forth.
- This is a lot like how SSO is implemented using SAML and OAuth2.
- This is not great for UX. Having to redirect a user back and forth each time we want to communicate is not good.
WebRTC
- WebRTC cannot by itself establish an connection. It needs a signaling server. This means that the local server would need to connect to the Internet. This may not be feasible or maybe you just don’t want to host a WebRTC signaling server.
Use WebRTC, but use redirects to handle signaling
- This is a lot better than option 1, as you don’t need to do redirects every time when communication is needed.
- However, this still requires a redirect each time the user visits the website since WebRTC sessions do not persist over navigations. So every time you go to https://example.com, the user needs to be redirected to and from http://192.168.1.100.

Can we do better than these options? Can we establish a non-TLS connection without redirects, user interaction, or a pre-existing communication channel? Given that we have WebRTC, it means that we just need to pass about a couple KBs of data bidirectionally to establish a connection – is there any way that allows a browser to pass small amount of data crossing the mixed content boundary?

The answer is yes, via STUN/TURN.

Smuggling Bytes using STUN/TURN

Both STUN and TURN are protocols of the Interactive Connectivity Establishment (ICE) suite, intended to establish connectivity though NAT.

The purpose of STUN is to get a public UDP address of a peer. If we use IPv4, That is 6 bytes of information, 4 bytes from the IP, and 2 bytes from the port (4 bytes usable if we eliminate invalid ports and IP addresses). If we use IPv6, we can get 18 bytes of information (16 bytes usable). The problem with STUN is that there’s no user configurable input to STUN. You can receive data, but not send data.

The sister protocol TURN is an extension to STUN. TURN allows allocating a UDP relay, in case a connection cannot be established via STUN. Since TURN is a relay server, it is a lot more expensive to run than STUN, therefore some form of authentication is usually required. TURN sends the username in plaintext. Given that the safe MTU is about 512 bytes and there’s overhead in the TURN protocol, it’s possible to send at least 256 bytes of data using a TURN username. We can receive 16 bytes of data from a TURN response, encoded in a IPv6 address and port number pair.

We are now able to send about 256 bytes of data (encoded in username) and receive 16 bytes of data (encoded in IPv6 address and port) by abusing TURN.

16 bytes in a response is obviously not sufficient, as a regular WebRTC SDP weighs about 1.5KB. So I developed a janky protocol that splits the request and response into smaller chunks.

The client first allocates a request and tells the server how big is request is going to be, encoded in the username. The server returns a 16 byte request ID.
The client incrementally sends the parts of the request in 256 byte chunks using the allocated request ID.
Once all parts are sent, the client tells the server to execute the request. The server returns the response size.
The client incrementally pulls the response from the server in 16 byte chunks.

Given that a WebRTC SDP is about 1.5KB. This is going to result in about 100 TURN request and response pairs, which translates into 400 UDP packets. We transmitted 3KB using 400 UDP packets, averaging an impressive 7.5 bytes transmitted per UDP packet.

This is not great. So I trimmed the response SDP down by removing unnecessary information (in particular, redundant candidates). This brought down the SDP size to about 700 bytes.³

This is still not great, so I added ZLIB compression using a pre-defined dictionary. I compressed the SDPs down to about 300 bytes each – now we only need to send about 80 UDP packets.

Now we can establish a communication channel by only using IP address + port.

Putting it All Together

I made a demo at https://codepen.io/gyf304/pen/oNrjoJj. The demo fetches a proxied version of https://example.com over TURN. It requires 0 user interaction (apart from clicking the “Execute” button) to communicate with a non-secure server from a secure origin.

I also published NPM packages @turnx/server and @turnx/client, in case people want to give it a spin. The @turnx/server package provides a server that allows you to proxy a HTTP/HTTPS website over TURN, and the @turnx/client package provides a client that implements a similar interface to fetch. The packages are not documented, but the demo above should be sufficient to get you going.

The source is available at https://github.com/gyf304/turnx.

It is also worth noting that this communication protocol is extremely inefficient, so the intended usage is to use this to establish an actual connection using WebRTC (e.g. by using WHIP).

Also: don’t do this over the public Internet.

You can technically sign a cert for a local address using techniques outlined in https://letsencrypt.org/docs/certificates-for-localhost/. But as the article also pointed out, distributing a private cert as part of a software package is a big no-no. ↩︎
Why would someone want to communicate to a LAN server from a secure HTTPS website? I can think of a few use-cases: a media app like Plex where logic is served from a HTTPS remote server, or a PWA, and the content is stored on a local server; a HTTPS website for upgrading the firmware of a LAN only appliance; or a web app to transfer files in a LAN and you don’t want to build a WebRTC signaling server. ↩︎
You can reduce this further. In fact, people have encoded an SDP in 106 bytes! (https://webrtchacks.com/the-minimum-viable-sdp/) ↩︎

Crossing the Mixed Content Boundary: abusing STUN/TURN as a Communication Channel

Examining Existing Options

Smuggling Bytes using STUN/TURN

Putting it All Together

Comments

Leave a Reply Cancel reply

More posts

Crossing the Mixed Content Boundary: abusing STUN/TURN as a Communication Channel

IPv4 Network Address Translation for Steam (NAT4S)

Janky NTP – Setting System Time Using HTTP

File System as Partition Table