Andy Davies

Independent Web Performance Consultant

Exploring Site Speed Optimisations With WebPageTest and Cloudflare Workers

One of the questions clients often ask me is "What difference will the changes you're recommending make to our site's speed?"

And too often that can be a hard question to answer…

I can be pretty sure of the 'direction of travel' – shrinking resources should make them download faster, delaying 3rd-parties should make content appear sooner – but page load can be non-deterministic, and un-sharding domains, re-ordering resources or other changes sometimes lead to unexpected results.

Knowledge, experience and lots of testing can help us to prioritise what we think are the appropriate optimisations but often we have to wait until those changes make it to staging (or even live) before we can check the results.

WebPageTest and DevTools can give us clues that we're heading in the right direction but there's a gap that neither of them quite fills – a reliable testing environment that allows us to experiment and make changes to the page being tested.

When we worked together, Simon Hearne prototyped a proxy using mod_pagespeed that optimised pages and illustrated potential performance gains to customers (and accidentally siphoned away a UK airline's search traffic) but its optimisations were limited and it wasn't easy to use.

So last year, when Pat Meenan and Andrew Galloni started demonstrating what was possible using Cloudflare Workers as a proxy, I guessed it might fill the gap.

But it's taken me a little while to get around to experimenting with them...

Cloudflare Workers

Service Workers are often described as a programmable proxy in the browser – they can intercept and rewrite requests and responses, cache and synthesise responses, and much more.

Cloudflare Workers are a similar concept but instead of running in the browser they run on CDN edge nodes.

In addition to intercepting network requests, there's a HTMLRewriter class that targets DOM nodes using CSS selectors and triggers a handler when there's a match. The handlers can alter the matched elements, for example changing attributes, or even replacing an element's contents.
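
As a minimal sketch of that handler pattern, here's a handler that adds the native lazy-loading attribute to images that don't already declare one. The class name, the rewrite function and the broad img selector are my own illustration rather than anything from a real site, and in practice the selector would be narrowed to out-of-viewport images. hasAttribute and setAttribute are methods on the element that HTMLRewriter passes to the handler.

```javascript
// Hypothetical handler: adds loading="lazy" to any matched <img> that
// doesn't already have a loading attribute.
class LazyLoadImages {
  element(element) {
    if (!element.hasAttribute('loading')) {
      element.setAttribute('loading', 'lazy');
    }
  }
}

// Inside a worker the handler is attached to a selector, and the rewriter
// then streams the upstream response through it. HTMLRewriter only exists
// in the Workers runtime, so this function is only callable there.
function rewrite(response) {
  return new HTMLRewriter()
    .on('img', new LazyLoadImages())
    .transform(response);
}
```

Because a handler only depends on the element it's given, it can be exercised outside the Workers runtime with a small stand-in object that implements the same methods.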

Andrew Galloni's post – Prototyping optimizations with Cloudflare Workers and WebPageTest – for the 2019 Performance Advent Calendar gives a good overview and guide to get started with them.

How I'm Using Them

Key to the approach I'm using is WebPageTest's overrideHost script command. It allows requests to one domain to be rewritten to another, and sets an x-host HTTP header on the revised request.

In the example script below, any requests to www.example.com are rewritten to demo-proxy.asteno.workers.dev and the x-host header is set to www.example.com for those requests.

overrideHost www.example.com demo-proxy.asteno.workers.dev
navigate https://www.example.com/test-page.html

I start with a simple boilerplate worker and as the transforms tend to be bespoke for each site, I create a separate worker for each site I'm testing.

The boilerplate script for the worker follows this pattern:

  1. serves a robots.txt that disallows crawlers
  2. returns an error if the x-host header is missing
  3. if the request is for the predefined site, the browser is expecting a HTML response, and the x-bypass-transform header isn't set to true, uses a HTMLRewriter to modify the response
  4. otherwise, just proxies the request

/* Started from Pat's example in https://www.slideshare.net/patrickmeenan/getting-the-most-out-of-webpagetest */

/*
 * TODO
 * Add mimetype to robots.txt
 * Add a better doc check, perhaps use a header instead?
 */

const site = 'www.example.com';

addEventListener('fetch', event => {
 event.respondWith(handleRequest(event.request))
});

async function handleRequest(request) {

 const url = new URL(request.url);

 // Disallow crawlers

 if(url.pathname === "/robots.txt") {
   return new Response('User-agent: *\nDisallow: /', {status: 200});
 }

 // When overrideHost is used in a script, WPT sets x-host to original host i.e. site we want to proxy

 const host = request.headers.get('x-host');

 // Error if x-host header missing

 if(!host) {
   return new Response('x-host header missing', {status: 403});
 }

 url.hostname = host;

 const bypassTransform = request.headers.get('x-bypass-transform');

 const acceptHeader = request.headers.get('accept');

 // If it's the original document, and we don't want to bypass the rewrite of HTML
 // TODO will also select sub-documents e.g. iframes, from the same site :-(

 if(host === site &&
   (acceptHeader && acceptHeader.indexOf('text/html') >= 0) &&
   (!bypassTransform || bypassTransform.indexOf('true') === -1)) {

   const response = await fetch(url.toString(), request);

   return new HTMLRewriter()
     .on('selector', new exampleElementHandler())
     .transform(response);
 }

 // Otherwise just proxy the request

 return fetch(url.toString(), request)
}

/*
 * Placeholder handler: HTMLRewriter calls element() for every element that matches the selector
 */

class exampleElementHandler {
 element(element) {
   // Do something
 }
}
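
The TODO about sub-documents could be tackled with the Sec-Fetch-Dest request header, which recent Chromium-based browsers send as document for top-level navigations and iframe for framed documents. The sketch below is an assumption about how that check might look, not something I've wired into the boilerplate: the function names are hypothetical, and it falls back to the accept header when Sec-Fetch-Dest is absent.

```javascript
// Assumption: the test browser sends Fetch Metadata headers. Top-level
// navigations carry "sec-fetch-dest: document", iframes carry
// "sec-fetch-dest: iframe". Falls back to the accept check otherwise.
function isTopLevelDocument(headers) {
  const dest = headers.get('sec-fetch-dest');
  if (dest) {
    return dest === 'document';
  }
  const accept = headers.get('accept');
  return Boolean(accept && accept.includes('text/html'));
}

// Mirrors the boilerplate's condition: right site, transform not bypassed
// (any x-bypass-transform value containing "true" bypasses it), and the
// request is for the top-level document.
function shouldTransform(host, site, headers) {
  const bypass = headers.get('x-bypass-transform') || '';
  return host === site && !bypass.includes('true') && isTopLevelDocument(headers);
}
```

In the worker, the three-part condition in the if statement could then be replaced with shouldTransform(host, site, request.headers).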

Example Transforms

The transforms I'm using are fairly straightforward and mainly consist of unsharding domains, changing the order of the page, or delaying when a resource loads.

Sometimes it's possible to manipulate an existing element in the page, sometimes an element has to be deleted and a replacement inserted elsewhere in the page.

  • Unsharding Domains

Requesting frameworks, libraries etc from 3rd-party CDNs such as cdnjs, jsdelivr etc. is still very common across many of the customers I work with.

Requesting these from another origin involves creating a new connection and, as HTTP/2 prioritisation only works within a single connection, they may then compete for the network with other resources.

One of the first tests I try is directing these requests through the proxy, so they're on the same origin as the page too:

overrideHost www.example.com demo-proxy.asteno.workers.dev
overrideHost ajax.googleapis.com demo-proxy.asteno.workers.dev
navigate https://www.example.com/test-page.html

The proxy could be improved to cache these libraries on Cloudflare and remove the request to the origin for them – one of Pat Meenan's workers has an example of how to do this.
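
One way this might be done (a rough sketch of my own, not Pat's implementation) is to pass Cloudflare's cf fetch options for requests to known library hosts so the edge caches the responses. The host list, the one-day TTL and the function names are all assumptions for illustration.

```javascript
// Hypothetical list of third-party library hosts routed through the proxy.
const LIBRARY_HOSTS = ['ajax.googleapis.com', 'cdnjs.cloudflare.com', 'cdn.jsdelivr.net'];

// Returns extra fetch options for a given upstream host. The `cf` object is
// specific to Cloudflare Workers: cacheEverything caches the response
// regardless of content type, and cacheTtl is in seconds.
function cacheOptionsFor(host) {
  if (!LIBRARY_HOSTS.includes(host)) {
    return {};
  }
  return { cf: { cacheEverything: true, cacheTtl: 86400 } }; // one day
}

// The "just proxy the request" branch of the worker could then become:
async function proxy(url, request, host) {
  return fetch(new Request(url.toString(), request), cacheOptionsFor(host));
}
```

Requests for the site's own host fall through with no extra options, so the page itself is still fetched as before.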

  • Deferring inline scripts

Clients often use 3rd-party services that don't need to be loaded until the visitor has a usable page – sometimes these provide outward facing features such as chat or feedback widgets, other times they may be internal facing, session replay for example.

I'll often defer the load for these types of services by moving them into a Tag Manager, and initiating their insertion using the Window Loaded trigger in Google Tag Manager (GTM).

In one recent example, HotJar was loaded via an async snippet at the start of the head:

(function(h,o,t,j,a,r){
  h.hj=h.hj||function(){(h.hj.q=h.hj.q||[]).push(arguments)};
  h._hjSettings={hjid:xxxxxx,hjsv:x};
  a=o.getElementsByTagName('head')[0];
  r=o.createElement('script');r.async=1;
  r.src=t+h._hjSettings.hjid+j+h._hjSettings.hjsv;
  a.appendChild(r);
 })(window,document,'https://static.hotjar.com/c/hotjar-','.js?sv=');

To delay HotJar loading and simulate it being implemented via GTM, I wrapped the HotJar snippet in a native event handler for the window's load event.

class deferInlineScript {
  element(element) {

    const wrapperStart = "window.addEventListener('load', function() {";
    const wrapperEnd ="});";

    element.prepend(wrapperStart, {html: true});
    element.append(wrapperEnd,  {html: true});
  }
}
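
One edge case worth guarding against, which is my own assumption rather than something the HotJar snippet hit: if an inline script ends with a // comment and no trailing newline, the appended closing "});" would be swallowed by the comment. A variant of the handler that puts a newline either side of the original source avoids that. The names here are hypothetical.

```javascript
// The two halves of the wrapper. The newline at the start of WRAPPER_END
// stops a trailing // comment in the original script from commenting out
// the closing "});".
const WRAPPER_START = "window.addEventListener('load', function() {\n";
const WRAPPER_END = "\n});";

// Pure helper showing what the rewritten script body ends up as.
function wrapInLoadHandler(source) {
  return WRAPPER_START + source + WRAPPER_END;
}

class DeferInlineScriptSafely {
  element(element) {
    // html: true stops HTMLRewriter from HTML-escaping the JavaScript
    element.prepend(WRAPPER_START, { html: true });
    element.append(WRAPPER_END, { html: true });
  }
}
```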

  • Moving Third-Party Tags

Qubit's SmartServe is quite a large tag and even when loaded async competes for network bandwidth and CPU time in ways that impact performance.

One site I tested implemented the SmartServe tag near the top of the <head>, before any stylesheets.

<script src='//static.goqubit.com/smartserve-xxxx.js' async defer></script>

Its fetch was initiated soon after the page started loading and competed with higher-priority, render-blocking resources, so I wanted to move the element much later in the <head>.

This type of change becomes a two-stage process: one handler removes the script element and a second reinserts it just before the end of the head.

.on('script[src="//static.goqubit.com/smartserve-xxxx.js"]', new removeSmartServe())
.on('head', new reinsertSmartServe())

class removeSmartServe {
  element(element) {
    element.remove();
  }
}

class reinsertSmartServe {
  element(element) {
    var text = '<script src="//static.goqubit.com/smartserve-xxxx.js" async defer></script>';

    element.append(text, {html: true});
  }
}

Testing

In initial testing I tend to start with host overrides in WebPageTest, switch to curl or a browser while developing the HTML rewriting script, and finally switch back to WebPageTest to check before and after comparisons.

It's also an iterative process: I'll make some initial changes, test and refine until I'm happy with their impact, and then start around the loop again.

  • curl

To test the HTML rewriting using curl, both the x-host and accept headers need to be set appropriately.

curl -H "x-host: www.example.com" -H "accept: text/html" https://demo-proxy.asteno.workers.dev/test-page.html

Piping curl's output to a file or a utility like less makes it easier to read.

  • Browser

For in-browser testing of HTML rewriting I've been using Chrome, setting the x-host header with the ModHeader Extension and then loading the page via the proxy i.e. https://demo-proxy.asteno.workers.dev/test-page.html

This approach only allows the initial host to be overridden, so can't be used to unshard domains.

  • WebPageTest

Finally when I'm happy with the host overrides and HTML rewrites I switch back to WebPageTest and generate before (baseline) and after tests.

I've found that some sites get faster when proxied through Cloudflare's network, so I still use the proxy when generating a baseline for comparison, but set the x-bypass-transform header to true so the HTML transforms aren't applied.

setHeader x-bypass-transform: true
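
When running lots of comparisons it can help to generate the baseline and transformed scripts from the same settings so they can't drift apart. This is a small sketch of how that might be done. The helper name and its parameters are hypothetical, and the commands are space-separated to match the examples above.

```javascript
// Builds a WebPageTest script. `hosts` are the hostnames to route through
// the proxy; `bypass` adds the header that switches the transforms off.
function buildWptScript({ hosts, proxy, url, bypass = false }) {
  const lines = hosts.map(host => `overrideHost ${host} ${proxy}`);
  if (bypass) {
    lines.push('setHeader x-bypass-transform: true');
  }
  lines.push(`navigate ${url}`);
  return lines.join('\n');
}

const settings = {
  hosts: ['www.example.com'],
  proxy: 'demo-proxy.asteno.workers.dev',
  url: 'https://www.example.com/test-page.html',
};

const baselineScript = buildWptScript({ ...settings, bypass: true });
const transformedScript = buildWptScript(settings);
```

The two scripts differ only by the setHeader line, so any difference between the tests should come from the HTML transforms rather than from the proxy itself.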

Gotchas

A few issues have tripped me up while I was writing and testing proxies:

  • overrideHost and Service Workers

WebPageTest's overrideHost command doesn't seem to work with requests dispatched from a Service Worker and the request always seems to default back to the original host.

Reading the code and talking to Pat, it appears it should work, but I've not had time to debug the issue further yet.

  • overrideHost and non-Chromium browsers

I could only get overrideHost to work in Chromium-based browsers – Chrome, Mobile Chrome and Edge.

  • Fragile Selectors

When rewriting the HTML, I sometimes have to rely on fragile DOM queries, for example this selector to target the first script element in the head: head > script:nth-of-type(1).

And as there's currently no way to extract the contents of an element I can't test that the element that's been passed to the handler is the one I wanted to target.

More specific selectors, for example ones that use an id or src attribute, are more robust.
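
Where a fragile selector can't be avoided, one partial safeguard is to check whatever the handler can see, the tag name and attributes, before touching the element. The wrapper below is a hypothetical sketch of that idea. It can't inspect the element's contents, so it only reduces the risk of hitting the wrong element.

```javascript
// Hypothetical wrapper: only delegates to the real handler when the matched
// element passes a sanity check on its tag name and attributes.
class GuardedHandler {
  constructor(predicate, handler) {
    this.predicate = predicate;
    this.handler = handler;
  }

  element(element) {
    if (this.predicate(element)) {
      this.handler.element(element);
    }
  }
}

// Example check: an inline script, i.e. a <script> with no src attribute.
const isInlineScript = element =>
  element.tagName.toLowerCase() === 'script' && !element.hasAttribute('src');
```

It could be attached as .on('head > script:nth-of-type(1)', new GuardedHandler(isInlineScript, new deferInlineScript())), so an external script that ends up first in the head is left alone rather than wrapped.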

  • Differing DOMs

The DOM that HTMLRewriter is operating on is not the same DOM as viewed in the Elements tab in DevTools as the rewriter doesn't execute scripts, so by default the DOM queries can't be tested in the browser.

Using DevTools to block all requests except the one for the source HTML document and then checking the queries from the console is one way around this.

Closing Thoughts

Even though I've only used WebPageTest and Cloudflare Workers together on a few sites, it's clear that it's a powerful combination and it's likely to become a regular part of my client workflow.

At BrightonSEO I'm talking about Reducing the Speed Impact of Third-Party Tags and as much as I can talk about the theory, nothing beats a good demo.

For my demo I used a worker to re-write parts of the page and choreograph how 3rd-party tags were loaded. The changes improved Largest Contentful Paint by a second for OPI's product page (top row).

WebPageTest filmstrip comparing Opi before and after third-party tags have been choreographed

The filmstrip is for an uncached view of the page, and although there's still plenty of room for improvement in the initial render time, it illustrates how a proxy can be used to quickly evaluate changes before committing them to the development lifecycle.

There are plenty of other optimisations to try… from replacing an embedded YouTube player with a lazy-loaded version or adding the lazy-loading attribute to out-of-viewport images, through to using Cloudflare's image optimisation and text compression features to reduce payload sizes.

A few clients ask me to evaluate the performance impact of 3rd-party tags before they implement them. As part of this process I typically query the HTTP Archive to find another site that uses the same tag and then test that site with and without the tag. Using a proxy I could inject the tag into the client's site and see what impact it has.

As yet, I've not got as far as rewriting or replacing external scripts and stylesheets, or exploring how Cloudflare's cache and key-value store can be used in the testing process.

But if you'd like some more sophisticated examples of the types of optimisations that can be implemented using Cloudflare's Workers, Pat Meenan has a collection of examples on GitHub.

Further Reading

Prototyping optimizations with Cloudflare Workers and WebPageTest, Andrew Galloni, Dec 2019

Pat Meenan's collection of Cloudflare Workers

Cloudflare Workers documentation
