Recently, I came across a technique to host a single page application and thought it was non-obvious and should be shared. The technique is useful especially if every ms counts and if you can work around its limitations. Thanks to a co-worker for introducing the concept to me and the AWS documentation for outlining how to do it.

What is a Single Page Application (SPA)?

An SPA is a modern approach to building web apps. In the older generation of sites, clicking a link took the user to a whole new page: the browser unloaded the current page and then loaded the next one. Much of the new page could be the same as the previous page (header, footer, sidebar), with only one content section changed.

At some point, web engineers realised that these DOM reloads were a bottleneck. Unloading and reloading much of the same content wasn’t fast enough for sites that wanted to reach the whole world efficiently, so a better way was needed. It then became common practice to load the initial page once and let a single page application alter the page with JavaScript, mutating it into the new page.

What is the common way to deploy an SPA?

First, let’s talk about the evolution of web pages. Initially, most of the Internet’s web pages were static: once they were created, they never changed. Web servers would host them as ordinary files, just like images. After some time, we started adding dynamic content to web pages, and they were no longer just static pages. This brought in more complicated server code, template engines, dependency injection, etc. At this time, servers still rendered the pages because they had access to all the data the previously static web pages needed. Then we started getting into AJAX, where pages became more lively, responded to click events, and loaded content without refreshing. You could still click certain links and get the DOM unload/load, but part of the site took advantage of AJAX or client-side loading.

After this phase, changing the entire website became the next logical step — continue absorbing more pages into a single page application and keep avoiding the costly DOM reload. The easiest transformation here again was to let the server render the first page, then respond to requests from the single page application. We got API-driven development, allowing us to connect many clients like mobile devices and now IoT. So as it stands from my viewpoint, the typical way is still to have the server render the initial page of the SPA, and then let the SPA modify the page to avoid the DOM reload.

What is a Content Distribution Network (CDN)?

To introduce the technique, we need to talk about CDNs. The time it takes to send data over the internet grows with the distance, the data size, and the number of routing hops you have to make (i.e. a smaller size, a shorter distance, and a more direct route can all reduce the time it takes to send data, among other factors). When all your servers are in, say, the US, sending data to a user in Asia or Europe is slower than sending it to a user in the US, if other things like bandwidth and download speeds stay constant.

To help solve this problem, CDNs provide servers distributed around the world that cache copies of data and respond to requests originating within their purview. When a request comes in, the CDN server checks its local cache; if the file isn’t there, it asks an origin server that you configure for a copy. It’s basically one big forwarding layer.
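
The lookup flow above can be sketched roughly as follows. This is a minimal model and not how any particular CDN is implemented: `EdgeCache`, `OriginFetcher`, and `originRequests` are hypothetical names, and the origin is a plain function standing in for an HTTP request.

```typescript
// Hypothetical model of a single CDN edge server acting as a pull-through cache.
// `OriginFetcher` stands in for an HTTP request to the origin you configured.
type OriginFetcher = (path: string) => string | undefined;

class EdgeCache {
  private store: Record<string, string> = {};
  private fetchFromOrigin: OriginFetcher;
  originRequests = 0; // how many times this edge had to go back to the origin

  constructor(fetchFromOrigin: OriginFetcher) {
    this.fetchFromOrigin = fetchFromOrigin;
  }

  get(path: string): string | undefined {
    // Cache hit: answer locally without contacting the origin.
    if (Object.prototype.hasOwnProperty.call(this.store, path)) {
      return this.store[path];
    }
    // Cache miss: forward to the origin and keep a copy for later requests.
    this.originRequests++;
    const body = this.fetchFromOrigin(path);
    if (body !== undefined) {
      this.store[path] = body;
    }
    return body;
  }
}
```

Requesting the same path twice from one `EdgeCache` contacts the origin only once; the second response comes from the edge’s own copy.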

It’s important to know that when you put something on the CDN, you are giving up some control over the data. If you had only one copy and you wanted to delete it, you would just delete it and know it’s gone (to some extent, i.e. irrespective of HTTP Cache Control or external services). If you have ten servers around the world, you either need to let their copies expire (a CDN is a giant cache) or you need to tell each of them you deleted it.
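
The two ways of getting rid of a distributed copy can be illustrated with a small sketch. Everything here is an assumption made for illustration: `CachedCopy`, `EdgeStore`, `lookup`, and `purgeEverywhere` are hypothetical names, and each edge is modelled as a plain object rather than a real server.

```typescript
// Hypothetical cached entry on one edge server; times are in milliseconds.
interface CachedCopy {
  body: string;
  expiresAtMs: number;
}

type EdgeStore = Record<string, CachedCopy>;

// Option 1: let the copy expire. The edge stops serving it once its time is up.
function lookup(edge: EdgeStore, path: string, nowMs: number): string | undefined {
  if (!Object.prototype.hasOwnProperty.call(edge, path)) {
    return undefined; // never cached here
  }
  const copy = edge[path];
  if (nowMs >= copy.expiresAtMs) {
    return undefined; // expired: the edge would go back to the origin
  }
  return copy.body;
}

// Option 2: tell every edge that the file is gone (an invalidation).
// Returns how many edges actually held a copy.
function purgeEverywhere(edges: EdgeStore[], path: string): number {
  let removed = 0;
  for (const edge of edges) {
    if (Object.prototype.hasOwnProperty.call(edge, path)) {
      delete edge[path];
      removed++;
    }
  }
  return removed;
}
```

Until one of the two happens on a given edge, that edge keeps answering with the old copy, which is the loss of control described above.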


Hosting your SPA on a CDN?

So the big thought here is that we go back to our static HTML ways. Static assets can go on a CDN because they don’t change often. With SPAs, you generally have an index.html file, an app.js, some CSS files, and other static assets like images. If you treat these files as static, you can allow them all to be cached by a CDN.

How to update your SPA on a CDN?

One of the hardest parts of hosting your SPA is actually updating it. Getting static assets onto a CDN isn’t hard. Determining how you want to update them is harder. Imagine you make a mistake and need to change your SPA immediately. Some CDNs have SLAs of up to 10 minutes before your previous file is successfully replaced. So just pushing invalidation requests to a CDN might not be enough.

The best method I can think of is a combination of things, one of which is to design your SPA so that it can tell whether it is outdated. Think about how auto-update clients work: they know there’s a new update and they ask you to install it. Sometimes this is centralised across applications in a common place (e.g. the App Store), and sometimes the application asks by itself. Regardless, if the client knows when to update, it can reload itself. You could even design it so that some client updates are soft reloads and some are hard reloads, letting the client decide or ask the user whether they’d like to refresh to get new features, avoid being blocked out due to an older version, etc. If you have this ability, you’ll be in a better place. How the SPA finds out it is out of date is up to you as well. You could keep a static file on web servers that you fully control and have the SPA request its URL. Or you could push an update signal to SPAs using technologies like WebSockets. As long as you have complete control over this channel, you can enable quick updates.
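
As a rough sketch of the “is this client outdated?” decision, assume the SPA fetches a small version manifest from a URL you control. The manifest shape, the field names `latest` and `minSupported`, and the dotted numeric version format are all assumptions made for this example, not something the technique prescribes.

```typescript
// Hypothetical manifest served from a server you fully control (not from the CDN).
interface VersionManifest {
  latest: string;       // newest deployed version, e.g. "1.4.0"
  minSupported: string; // oldest version still allowed to keep running
}

type UpdateAction = "none" | "soft-reload" | "hard-reload";

// Compares dotted numeric versions; returns -1, 0, or 1.
function compareVersions(a: string, b: string): number {
  const partsA = a.split(".").map(Number);
  const partsB = b.split(".").map(Number);
  const length = Math.max(partsA.length, partsB.length);
  for (let i = 0; i < length; i++) {
    // Missing parts count as 0, so "2" equals "2.0.0".
    const diff = (partsA[i] || 0) - (partsB[i] || 0);
    if (diff !== 0) {
      return diff < 0 ? -1 : 1;
    }
  }
  return 0;
}

function decideUpdate(running: string, manifest: VersionManifest): UpdateAction {
  if (compareVersions(running, manifest.minSupported) < 0) {
    return "hard-reload"; // too old to keep running: force the update
  }
  if (compareVersions(running, manifest.latest) < 0) {
    return "soft-reload"; // a newer version exists: let the client or user pick the moment
  }
  return "none";
}
```

The SPA could call `decideUpdate` on a timer after fetching the manifest, or whenever a push signal arrives; which channel delivers the manifest is independent of the decision itself.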

After you have an SPA that knows it is out of date, you’ll want a way for the SPA to update itself. Just reloading isn’t enough because, remember, the assets are all on the CDN, which you don’t have synchronous control over (true for AWS + Akamai). So how do we load the latest version of our SPA? For our initial app.js, which is really the SPA, we can add a versioning mechanism to uniquely identify deployments. Whatever the identifier is (numbers, letters, hex), if each versioned SPA you deploy gets uniquely named files, the CDN won’t have those files yet and will ask the origin server for them. This process makes immutable copies of your SPA, which is where a CDN’s advantages lie (static content). Changing the index.html is harder, though. One thing we could do is version it as well, like index.{hash}.html, but this won’t work for many servers that have rules like rewriting all root requests to /index.html. In that case, instead of using window.location.reload(), we redirect to another page. This could be less attractive visually (think facebook.com/index.3k23.html or facebook.com/v4930). The optimal thing would be to have control over the CDN, where you can add your own rewrite rules so that all new requests route to the current version’s index.html. A reload generally makes the browser revalidate the page instead of trusting its cached copy, so this should work.
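
One way to produce uniquely named files is to derive the name from the file contents. The sketch below is only illustrative: `shortHash` and `versionedName` are hypothetical helpers, and the small djb2-style hash is a dependency-free stand-in for the cryptographic digest a real build tool would more likely use.

```typescript
// Stand-in for a real content digest, kept dependency-free for this sketch.
function shortHash(content: string): string {
  let hash = 5381;
  for (let i = 0; i < content.length; i++) {
    // hash * 33 + charCode, kept within unsigned 32 bits
    hash = ((hash << 5) + hash + content.charCodeAt(i)) >>> 0;
  }
  // Left-pad to 8 hex digits so every name has the same shape.
  return ("0000000" + hash.toString(16)).slice(-8);
}

// "app.js" plus its contents becomes "app.<hash>.js".
function versionedName(fileName: string, content: string): string {
  const hash = shortHash(content);
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) {
    return fileName + "." + hash; // no extension to preserve
  }
  return fileName.slice(0, dot) + "." + hash + fileName.slice(dot);
}
```

Because the name changes whenever the contents change, a new deployment produces paths the CDN has never seen, and the old files can stay cached for as long as old clients still need them.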

Overall, here are some of the ways I can think of to update an SPA, and I’m happy to discuss more in the comments:

  1. Accept the invalidation duration, whatever it is. If you have up to a 10 minute wait, accept it and be more careful when you deploy (really, how often do you need this control?). You’ll still need auto-update for long-living clients.
  2. Forward URLs to new URLs. Come up with a nice URL scheme and forward the client when a hard reload is required. Taking the CDN as given, new URLs go to the origin server, giving you control.
  3. Wait or look for a CDN where you can add custom rewrites with a quick SLA (if you need it).
  4. Set up your own surgically placed CDN. Data centers are all over the world, including ones from Google, Amazon, Microsoft, IBM, and third party vendors where you can rent bare metal. Of course this comes with a higher setup cost, but you can write a script to invalidate, internally rewrite URLs (index.html to index.{version}.html), etc. You can also use clustering algorithms like K-Means to help determine the location of your servers, assuming you have users already.
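
The internal rewrite mentioned in options 3 and 4 could look something like the function below. It is a sketch under assumptions: `rewriteRequestPath` is a hypothetical hook, and how a rule like this gets installed (if at all) depends entirely on the CDN or servers you run.

```typescript
// Hypothetical rewrite rule run at the edge: requests for the entry page are
// mapped to the immutable, versioned index file of the current deployment.
function rewriteRequestPath(path: string, currentVersion: string): string {
  if (path === "/" || path === "/index.html") {
    return "/index." + currentVersion + ".html";
  }
  return path; // already versioned assets pass through unchanged
}
```

With a rule like this, switching every new request to a new deployment means changing only the `currentVersion` value, while the URL the user sees stays the same.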

Other Considerations

There are other concerns with hosting your SPA on a CDN. Things like security (CSRF), SEO, and client architecture can all be impacted. However, some of these depend on how you build your SPA, so they are beyond the scope of this article, but I’ll be happy to discuss them in the comments.

Reference Links