Steve Lacey - Student ID: 07501160 - Source code available on GitHub

WebSockets

Introduction

The WebSocket specification brings TCP-style sockets to the web browser, and it is one of the most underappreciated innovations in HTML5. The benefits of the newly introduced WebSocket API are for the most part invisible to the end user, but over the course of the last decade developers have invented dozens of technologies to try to achieve asynchronous, bi-directional communication between web browsers and servers. Asynchronous JavaScript and XML (AJAX), HTTP streaming and Flash sockets are a few examples; all have weaknesses that eventually boil down to the simple fact that web browsers were not designed for bi-directional communication.

Until now, that is. WebSockets differ from these earlier technologies because they were designed from the very beginning to support full-duplex communication. Only a single connection is required, which greatly reduces bandwidth consumption and resource utilisation for both clients and servers and allows us to communicate more quickly and efficiently. Additionally, the WebSocket protocol was developed with existing technology in mind. WebSockets can be used over SSL and through proxies and firewalls, and they utilise the HTTP channel, meaning that existing infrastructure such as routers, proxies and load balancers should operate as normal.

Relevance to the World Wide Web

The relevance of this technology to the web as we know it is astounding. By having bi-directional communication between browsers and servers, we can push real-time data to our users. To quote the HTML5 specification lead (and Google employee) Ian Hickson:

Reducing kilobytes of data to 2 bytes…and reducing latency from 150ms to 50ms is far more than marginal. In fact, these two factors alone are enough to make Web Sockets seriously interesting to Google.

Ian Hickson

Now that we have WebSockets, we can build applications that react to the world around them in a fraction of the time previously possible. With the significant latency improvements gained by removing unnecessary overheads and polling, and when combined with Canvas and WebGL, a whole new generation of web applications and massively multiplayer games is on the horizon.

JavaScript API

WebSocket Events

The WebSocket API is still a draft, but the major modern browsers have already implemented most of the functionality, and even Microsoft has a prototype implementation in the works for Internet Explorer.

This specification defines an API that enables Web pages to use the WebSocket protocol for two-way communication with a remote host.

The WebSocket API Editor's Draft

The JavaScript API added as part of the HTML5 specification allows for fairly simple use of the WebSocket protocol by developers. Alongside read-only connection details such as 'url', 'protocol' and 'readyState', the WebSocket interface defines four event handlers and two methods:

[Constructor(in DOMString url, in optional DOMString protocols)]
[Constructor(in DOMString url, in optional DOMString[] protocols)]
interface WebSocket {
  readonly attribute DOMString url;

  // ready state
  const unsigned short CONNECTING = 0;
  const unsigned short OPEN = 1;
  const unsigned short CLOSING = 2;
  const unsigned short CLOSED = 3;
  readonly attribute unsigned short readyState;
  readonly attribute unsigned long bufferedAmount;

  // networking
  attribute Function onopen;
  attribute Function onmessage;
  attribute Function onerror;
  attribute Function onclose;
  readonly attribute DOMString protocol;
  void send(in DOMString data);
  void close();
};
WebSocket implements EventTarget;

These simple functions make it easy to get started with WebSockets. A simple example is the echo test hosted on websocket.org:

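// Open a connection to the public echo service.
var websocket = new WebSocket('ws://echo.websocket.org');

// Once the connection is open, send a message to the server.
websocket.onopen = function(event) {
  websocket.send('hello from client');
};

// Log whatever the server sends back.
websocket.onmessage = function(event) {
  console.log(event.data);
};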

Here, the client attempts to connect to a WebSocket server and, should the connection be successful, sends a message. The echo service simply repeats messages back across the connection, so on receipt of the message it sends the same message back to the client, which triggers the 'onmessage' handler and prints the message to the console. This example could be built upon, as I have shown below, by swapping out console logs for DOM alterations, or extended even further into a chat application by relaying messages between clients, as shown in my examples.
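One detail worth noting is that calling send() before the connection has opened is an error, which is why the echo example only sends from inside 'onopen'. A minimal sketch of guarding a send with the readyState constants from the interface above might look like this. The helper name sendIfOpen and the stand-in fakeSocket object are hypothetical and exist only so that the snippet runs without a network connection.

```javascript
// Ready-state values as defined in the WebSocket interface.
const READY_STATE = { CONNECTING: 0, OPEN: 1, CLOSING: 2, CLOSED: 3 };

// Hypothetical helper: send only when the socket is open.
// Returns true if the message was handed to the socket, false otherwise.
function sendIfOpen(socket, message) {
  if (socket.readyState !== READY_STATE.OPEN) {
    return false;
  }
  socket.send(message);
  return true;
}

// Stand-in for a real WebSocket (an assumption, so the sketch runs anywhere).
const fakeSocket = {
  readyState: READY_STATE.CONNECTING,
  sent: [],
  send: function (data) { this.sent.push(data); }
};

// Still connecting: the message is refused.
const sentWhileConnecting = sendIfOpen(fakeSocket, 'too early');

// Once open: the message goes through.
fakeSocket.readyState = READY_STATE.OPEN;
const sentWhileOpen = sendIfOpen(fakeSocket, 'hello from client');
```

In a browser, the same helper could be passed a real WebSocket object in place of fakeSocket.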

Server-Side

WebSocket servers are also relatively simple to set up. There is a vast collection of libraries, frameworks and even some native support spanning languages including:

Having some experience with Node.js and finding Ruby syntax somewhat difficult to grasp, for my prototypes I chose to make use of the Node WebSocket Server package by Micheil Smith.

Node WebSocket Server is a near specification compliant implementation of the server-side WebSocket Protocol. It is built on top of Node.js as a third-party module, and is designed to support various versions of the WebSocket protocol – which is currently still an IETF Draft.

Node WebSocket Server

The server's main roles are, amongst other things, to accept (or reject) clients, to maintain a unique identifier for each connection, and to listen for and broadcast messages. The core concept is to manage the rapid exchange of data between nodes.
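The roles described above can be sketched without reference to any particular library. The following is a rough illustration only: createHub, fakeConnection and every other name here are hypothetical, and this is not the API of Node WebSocket Server. It assumes only that each connection exposes a send() method.

```javascript
// Library-independent sketch of a broadcast hub (all names are hypothetical).
function createHub() {
  let nextId = 1;
  const connections = new Map(); // unique id -> connection

  return {
    // Accept a connection and give it a unique identifier.
    accept: function (connection) {
      const id = nextId++;
      connections.set(id, connection);
      return id;
    },
    // Forget a connection once it has closed.
    remove: function (id) {
      connections.delete(id);
    },
    // Relay a message to every connection except the sender.
    broadcast: function (senderId, message) {
      connections.forEach(function (connection, id) {
        if (id !== senderId) {
          connection.send(message);
        }
      });
    },
    // Number of connections currently tracked.
    size: function () {
      return connections.size;
    }
  };
}

// Stand-in connections that simply record what they are sent.
function fakeConnection() {
  return {
    received: [],
    send: function (data) { this.received.push(data); }
  };
}

const hub = createHub();
const alice = fakeConnection();
const bob = fakeConnection();
const aliceId = hub.accept(alice);
const bobId = hub.accept(bob);

// Alice's message reaches Bob but is not echoed back to Alice.
hub.broadcast(aliceId, 'hi from alice');
```

A real server would call accept when a client connects, remove when it disconnects, and broadcast from its message handler.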

Protocol

WebSocket Lifecycle

When loading a webpage geared for WebSocket interactions, a standard HTTP connection is initially used to load the mark-up, styling and JavaScript. The JavaScript then sends an HTTP request asking to upgrade to the WebSocket protocol, passing along some keys. After a complex handshake that ensures both sides fully support WebSocket, the server sends an HTTP response returning more keys and confirming the connection. This handshake is for the most part handled internally by the WebSocket JavaScript API, and once it is complete, the full power of the protocol is unleashed.
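To give a feel for what the server does with the keys, here is a small sketch of the key calculation. Note the assumption: the exact key exchange changed between drafts of the protocol, and this shows the calculation from the later finalised RFC 6455, which may differ from the draft described here. In that version the server appends a fixed GUID to the client's Sec-WebSocket-Key header, hashes the result with SHA-1 and returns it base64-encoded as Sec-WebSocket-Accept. The function name computeAcceptKey is hypothetical, and the snippet assumes Node.js for the 'crypto' module.

```javascript
// Node.js built-in crypto module (assumes the sketch is run under Node.js).
const nodeCrypto = require('crypto');

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Hypothetical helper: derive the Sec-WebSocket-Accept value
// from the client's Sec-WebSocket-Key value.
function computeAcceptKey(clientKey) {
  return nodeCrypto
    .createHash('sha1')
    .update(clientKey + WEBSOCKET_GUID)
    .digest('base64');
}

// Sample key taken from the worked example in RFC 6455.
const acceptKey = computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ==');
// acceptKey is 's3pPLMBiTxaQ9kYGzzhZRbK+xOo='
```

Because only a server that understands the protocol will produce the matching value, the client can tell that it is talking to a genuine WebSocket endpoint rather than an ordinary HTTP server.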

Once the connection is established there is no HTTP header overhead on each packet; the protocol overhead is in fact only two bytes per packet. Additionally, there's no huge XML encode / decode overhead either, so this is a great transport for low-latency data like gaming player positioning updates or speech.
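The two-byte figure can be made concrete with a rough sketch. It assumes the early draft framing, in which a text message is wrapped in a 0x00 start byte and a 0xFF end byte; later revisions of the protocol use a different frame layout. The helper name frameText and the sample message are hypothetical, and the snippet assumes Node.js for Buffer.

```javascript
// Assumption: early-draft text framing, 0x00 <UTF-8 payload> 0xFF.
function frameText(message) {
  const payload = Buffer.from(message, 'utf8');
  return Buffer.concat([Buffer.from([0x00]), payload, Buffer.from([0xff])]);
}

// A hypothetical player position update.
const message = 'x=10,y=20';
const frame = frameText(message);

// Bytes added by the framing, on top of the message itself.
const overhead = frame.length - Buffer.byteLength(message, 'utf8');
// overhead is 2, regardless of the message length
```

Compare this with an HTTP request, where a full set of headers accompanies every message.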

Whilst mainstream adoption of the protocol amongst modern browser vendors has been great, recent research found that the WebSocket protocol was vulnerable to attack. Adam Barth demonstrated some serious attacks against the protocol that could be used by an attacker to poison caches that sit in between the browser and the Internet. As a result of this, major vendors including Mozilla and Opera disabled WebSocket functionality by default to protect users until the protocol is fixed. However, at the time of writing, the protocol has been patched, and no doubt vendors will implement the changes and re-enable the protocol by default soon.

Degradation

There are a variety of options available when considering how to degrade WebSocket functionality for legacy browsers. As with many of the new technologies that came along with the new HTML specification, polyfilling is a popular choice for degradation.

A polyfill, or polyfiller, is a piece of code (or plugin) that provides the technology that you, the developer, expect the browser to provide natively. Flattening the API landscape if you will.

What is a Polyfill? by Remy Sharp

The basic premise is that those who need to degrade this functionality can write or utilise a wrapper library that makes use of the built-in WebSocket functionality when it is available natively, and reverts to one of the techniques designed for the last generation of browsers, such as long polling or Flash sockets, when it is not.
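The detection step at the heart of such a wrapper can be sketched in a few lines. This is an illustration under stated assumptions rather than how any particular library works: chooseTransport, the transport names and the hasFlash flag are all hypothetical, and the "scope" argument stands in for the browser's global object so that the snippet can be run outside a browser.

```javascript
// Hypothetical fallback list, in order of preference.
const fallbacks = [
  {
    name: 'flashsocket',
    // hasFlash is a made-up flag standing in for real Flash detection.
    isSupported: function (scope) { return scope.hasFlash === true; }
  },
  {
    name: 'long-polling',
    isSupported: function (scope) { return typeof scope.XMLHttpRequest !== 'undefined'; }
  }
];

// Prefer native WebSocket support; otherwise pick the first usable fallback.
function chooseTransport(scope, candidates) {
  if (typeof scope.WebSocket !== 'undefined') {
    return 'websocket';
  }
  for (const candidate of candidates) {
    if (candidate.isSupported(scope)) {
      return candidate.name;
    }
  }
  return null; // nothing suitable is available
}

// Stand-ins for a modern and a legacy browser environment.
const modernBrowser = { WebSocket: function () {}, XMLHttpRequest: function () {} };
const legacyBrowser = { XMLHttpRequest: function () {} };

const modernChoice = chooseTransport(modernBrowser, fallbacks);
const legacyChoice = chooseTransport(legacyBrowser, fallbacks);
```

In a browser, the global window object would be passed as the scope.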

A few of the fallbacks recommended by Paul Irish and others include:

Socket.IO appears to be the most mature of the selection, offering complete abstraction for both client and server-side code, as well as degradation to a wide selection of technologies. Obviously, the fallbacks will be no match for native WebSocket support, but they are a promising solution for developers who need that backwards compatibility.

Examples

I have created the following examples to demonstrate the basic functionality of WebSockets. They are for the most part written in raw JavaScript with the occasional dash of jQuery, and use Node.js server-side. All source code is available on GitHub.

The 'chat' and 'date' servers reside on ports 8081 and 8082 respectively, both of which are blocked on the UWE campus.

The echo example's server is hosted by websocket.org on port 80, and should work through the proxy.

WebSockets in Action

The following websites demonstrate the use of WebSockets in production environments, for a wide range of applications including massively multiplayer gaming, sketching and multiuser synchronising of data.

Evaluation

WebSockets represent the next evolution of web communications: a full-duplex, bidirectional communications channel that operates through a single socket over the Web. They provide a true standard that can be used to build scalable, real-time web applications, eliminating many of the problems associated with previous similar technologies, removing excessive overhead and dramatically reducing the complexity of the applications we build.

Whilst there are clearly considerations to be made when back-porting the technology to legacy browsers, the tools available to aid this process are ground-breaking. Additionally, the vendor involvement in writing the specification means that browsers either support it or do not; long gone are the days where these technologies are half-implemented. This allows a relatively simple feature detection and polyfilling process, rather than writing 'hacks' such as those for Internet Explorer CSS bugs as seen in the past.

References