Adaptive Proof of Work for Socket Connection Establishment

erik aronesty
Jul 30, 2021

When developing an enterprise-scale server cluster, one potential source of failure is the high cost of connection establishment. For example:

  • TLS connection negotiation is expensive
  • An attacker can make TLS connection negotiation significantly more expensive for the server by choosing to
    — Repeatedly renegotiate
    — Repeatedly connect

Many custom WebSocket implementations, layered on top of SSL, add further per-connection costs:

  • Authentication token verification
  • Payload verification

Combined, these expenses can make a connection roughly 100 times more costly for the server than for the client.

A simple solution to this problem is to require a “proof of work” (POW) at connection establishment.

  • The server calculates the number of inbound connections per second it is receiving
  • As this number grows larger, the server increases the proof-of-work requirements for successful connections
  • Proof of work is verified before handshakes, auth token verification and other expensive operations
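
The challenge, solve and verify steps can be sketched as follows. This is a minimal illustration rather than the scheme's actual wire format: it assumes a hashcash-style puzzle over SHA-256 (find a nonce so that the hash of challenge plus nonce starts with a given number of zero bits), and the function names `make_challenge`, `solve` and `verify` are hypothetical.

```python
import hashlib
import os


def make_challenge() -> bytes:
    """Server side: a fresh random challenge for each connection attempt."""
    return os.urandom(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash, cheap no matter how high the difficulty."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: brute-force search, about 2**difficulty hashes on average."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, difficulty=16)
    print("nonce accepted:", verify(challenge, nonce, 16))
```

The asymmetry is the point of the design: the client pays for the search, while the server pays for one hash before it commits to a handshake or token check.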

A simple atomic counter of the connections currently being negotiated is used to compute the POW required per connection, so the requirement adjusts in response to demand. Other adaptive signals, such as load average, can be used for higher-level connection negotiation schemes.
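
One way the counter could drive the difficulty is sketched below. The mapping is an assumption made for illustration, not something the scheme above specifies: no extra work up to `free_slots` concurrent negotiations, then one more required zero bit each time the load doubles, capped at `max_bits`. The class name and its parameters are hypothetical, and a lock stands in for an atomic counter because Python has no built-in atomic integer.

```python
import threading


class AdaptiveDifficulty:
    """Tracks in-flight negotiations and derives the POW difficulty from them."""

    def __init__(self, base_bits: int = 0, max_bits: int = 24, free_slots: int = 64):
        self._lock = threading.Lock()
        self._negotiating = 0
        self.base_bits = base_bits
        self.max_bits = max_bits
        self.free_slots = free_slots

    def begin(self) -> int:
        """Register a new connection attempt; return the difficulty to demand."""
        with self._lock:
            self._negotiating += 1
            return self._bits_for(self._negotiating)

    def end(self) -> None:
        """Call when a negotiation finishes, whether it succeeded or failed."""
        with self._lock:
            self._negotiating = max(0, self._negotiating - 1)

    def _bits_for(self, n: int) -> int:
        # (n - 1) // free_slots is 0 while load fits in the free slots;
        # its bit length then grows by one each time the load doubles.
        extra = ((n - 1) // self.free_slots).bit_length()
        return min(self.max_bits, self.base_bits + extra)


if __name__ == "__main__":
    limiter = AdaptiveDifficulty()
    bits = [limiter.begin() for _ in range(1000)]
    print("difficulty at 1, 65, 129, 1000 connections:",
          bits[0], bits[64], bits[128], bits[999])
```

Because each extra bit doubles the client's expected work, a logarithmic mapping like this makes client cost grow roughly in proportion to load; a steeper or gentler curve is a tuning choice.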

The net effect is that an attacker makes it more expensive for every client to connect, the attacker included. This spreads resource consumption across all connecting clients instead of concentrating it on the servers, which are no longer vulnerable to this kind of exhaustion.

A comparison of SSL handshakes with and without embedded adaptive proof of work (APOW) during a DDoS attack:

  • Without APOW: with 1000 legitimate clients, 10k connections per second from an attacker with a large pool of IP addresses degraded server performance until all connections and services began failing.
  • With SSL-APOW: with 1000 legitimate clients, APOW responded within 0.01 seconds, and 10k attempted connections per second from the attacker caused no performance degradation for connected clients. New clients saw a 0.25 second POW latency cost when connecting.
