NavigationTiming in Practice

May 27th, 2015

NavigationTiming is a specification developed by the W3C Web Performance working group, with the goal of exposing accurate performance metrics that describe your visitor’s page load experience (via JavaScript).

NavigationTiming was the first specification developed by the working group, and exposes valuable performance metrics to JavaScript that were impossible to accurately measure before.

NavigationTiming is currently a Recommendation, which means that browser vendors are encouraged to implement it. As of May 2015, 81% of the world-wide browser market-share supports NavigationTiming (and more on that later).

There’s a follow-up spec, NavigationTiming2, that is currently being developed and improves on the original spec.

Let’s take a deep-dive into NavigationTiming.

How was it done before?

NavigationTiming exposes performance metrics to JavaScript that were never before available, such as your root HTML’s network timings. Prior to NavigationTiming, you could not measure your page’s DNS, TCP, request or response times because all of that occurred before your application (JavaScript) started up, and the browser did not expose them.

Before NavigationTiming was available, you could still estimate some performance metrics, such as how long it took for your page’s sub-resources to download. To do this, you can hook into the browser’s onload event, which is fired once all of the sub-resources on your page (such as JavaScript, CSS, IMGs and IFRAMES) have been downloaded.

Here’s sample code:

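<html><head><script>
// Record a timestamp as early as possible in the page
var start = new Date().getTime();

function onLoad() {
  // Time from when this script ran until all sub-resources finished loading
  var pageLoadTime = (new Date().getTime()) - start;
}

// document.body does not exist yet inside <head>, so listen on window
window.addEventListener("load", onLoad, false);
</script></head></html>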

What’s wrong with this?

First, it only measures the time from when the JavaScript runs to when the last sub-resource is downloaded.

If that’s all you’re interested in measuring, that’s fine, but there’s a large part of the user’s experience that you’ll be blind to.

Let’s review the main phases that the browser goes through when fetching your HTML:

  1. DNS resolve: Look up the domain name to find what IP address to connect to
  2. TCP connect: Connect to your server on port 80 (HTTP) or 443 (HTTPS) via TCP
  3. Request: Send a HTTP request, with headers and cookies
  4. Response: Wait for the server to start sending the content (back-end time)

It’s only after Phase 4 (Response) is complete that your HTML is parsed and your scripts would start running.

Phase 1-4 timings will vary depending on the network. One visitor might fetch your content in 100 ms while it might take another user, on a slower connection, 5,000 ms before they see your content. That delay translates into a much worse user-experience.

Thus if you’re only monitoring your application from JavaScript in the <HEAD> to onload (as in the snippet above), you are blind to a large part of the overall experience.

So the primitive approach above has several downsides:

  • It only measures the time from when the JavaScript runs to when the last sub-resource is downloaded
  • It misses the initial DNS lookup, TCP connection and HTTP request phases
  • new Date().getTime() is not reliable

Interlude – DOMHighResTimeStamp

What about #3? Why is new Date().getTime() (or Date.now() or +(new Date)) not reliable?

Let’s talk about another modern browser feature, DOMHighResTimeStamp.

DOMHighResTimeStamp is a new data type for performance interfaces. In JavaScript, it’s typed as a regular number primitive, but anything that exposes a DOMHighResTimeStamp is following several conventions.

Notably, DOMHighResTimeStamp is a monotonically non-decreasing timestamp with an epoch of navigationStart and sub-millisecond resolution. It is used by several W3C webperf performance specs, and can always be queried via window.performance.now();

Why not just use the Date object?

DOMHighResTimeStamp helps solve three shortcomings of Date. Let’s break its definition down:

  • monotonically non-decreasing means that every time you fetch a DOMHighResTimeStamp, its value will always be at least the same as when you accessed it last. It will never decrease.

  • timestamp with an epoch of navigationStart means its value is a timestamp whose basis (start) is window.performance.timing.navigationStart. Thus a DOMHighResTimeStamp of 10 means it’s 10 milliseconds after the time given by navigationStart

  • sub-millisecond resolution means the value has a resolution finer than one millisecond. In practice, DOMHighResTimeStamps will be a number with the milliseconds as whole-numbers and fractions of a millisecond represented after the decimal. For example, 1.5 means 1500 microseconds, while 100.123 means 100 milliseconds and 123 microseconds.

Each of these points addresses a shortcoming of the Date object. First and foremost, monotonically non-decreasing fixes a subtle issue with the Date object that you may not know exists. The problem is that Date simply exposes the value of your end-user’s clock, according to the operating system. While the majority of the time this is OK, the system clock can be influenced by outside events, even in the middle of when your app is running.

For example, when the user changes their clock, or an atomic clock service adjusts it, or daylight-savings kicks in, the system clock may jump forward, or even go backwards!

So imagine you’re performance-profiling your application by keeping track of a start and end timestamp of some event via the Date object. You track the start time… and then your end-user’s atomic clock kicks in and adjusts the time forward an hour… and now, from JavaScript Date’s point of view, it seems like your application just took an hour to do a simple task.

This can even lead to problems when doing statistical analysis of your performance data. Imagine if your monitoring tool is taking the mean value of operational times and one of your users’ clocks jumped forward 10 years. That outlier, while “true” from the point of view of Date, will skew the rest of your data significantly.

DOMHighResTimeStamp addresses this issue by guaranteeing it is monotonically non-decreasing. Every time you access performance.now(), you are guaranteed it will be at least equal to, if not greater than, the last time you accessed it.

You shouldn’t mix Date timestamps (which are Unix epoch based, so you get sample times like 1430700428519) with DOMHighResTimeStamps. If the user’s clock changes, and you mix both Date and DOMHighResTimeStamps, the former could be wildly different from the latter.

To help enforce this, DOMHighResTimeStamp is not Unix epoch based. Instead, its epoch is window.performance.timing.navigationStart (more details of which are below). This means that the values you get from it are the number of milliseconds, with a fractional part for the sub-millisecond resolution, since the page load started. As a benefit, this makes them easier to read than Date timestamps, since they’re relatively small and you don’t need to do (now - startTime) math to know when something started running.

DOMHighResTimeStamp is available in most modern browsers, including Internet Explorer 10+, Firefox 15+, Chrome 20+, Safari 8+ and Android 4.4+. If you want to be able to always get timestamps via window.performance.now(), you can use a polyfill. Note these polyfills will be millisecond-resolution timestamps with an epoch of “something” in unsupported browsers, since monotonically non-decreasing can’t be guaranteed and sub-millisecond isn’t available unless the browser supports it.
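As a rough idea of what such a fallback can look like, here is a minimal sketch. The createNow helper is a name made up for this example (it is not part of any spec or library), and the Date-based branch has exactly the limitations described above: millisecond resolution, and no protection against clock changes.

```javascript
// Returns a function that yields milliseconds since some start point.
// createNow is a hypothetical helper name used only for this sketch.
function createNow() {
  if (typeof performance !== "undefined" && typeof performance.now === "function") {
    // Native support: sub-millisecond and monotonically non-decreasing
    return function () { return performance.now(); };
  }

  // Fallback: the epoch is simply "when createNow() was called".
  // Millisecond resolution only, and it can jump if the system clock changes.
  var start = Date.now();
  return function () { return Date.now() - start; };
}

var now = createNow();

var t0 = now();
// ... the work you want to time goes here ...
var elapsed = now() - t0;
```

Because both branches return a small "milliseconds since start" number, calling code does not need to care which one it got.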

As a summary:

| | Date | DOMHighResTimeStamp |
| Accessed via | new Date().getTime() | performance.now() |
| Resolution | millisecond | sub-millisecond |
| Start | Unix epoch | performance.timing.navigationStart |
| Monotonically non-decreasing | No | Yes |
| Affected by user’s clock | Yes | No |
| Example | 1420147524606 | 3392.275999998674 |

Back to NavigationTiming

So, how do you access NavigationTiming data?

All of the performance metrics NavigationTiming exposes are available underneath the new window.performance DOM object that most of the W3C webperf specs utilize.

NavigationTiming’s metrics are primarily available underneath window.performance.navigation and window.performance.timing. The former provides performance characteristics (such as the type of navigation, or the number of redirects taken to get to the current page) while the latter exposes performance metrics (timestamps).

Here’s the WebIDL (definition) of the interfaces:

window.performance.navigation:

interface PerformanceNavigation {
  const unsigned short TYPE_NAVIGATE = 0;
  const unsigned short TYPE_RELOAD = 1;
  const unsigned short TYPE_BACK_FORWARD = 2;
  const unsigned short TYPE_RESERVED = 255;
  readonly attribute unsigned short type;
  readonly attribute unsigned short redirectCount;
};

window.performance.timing:

interface PerformanceTiming {
    readonly attribute unsigned long long navigationStart;
    readonly attribute unsigned long long unloadEventStart;
    readonly attribute unsigned long long unloadEventEnd;
    readonly attribute unsigned long long redirectStart;
    readonly attribute unsigned long long redirectEnd;
    readonly attribute unsigned long long fetchStart;
    readonly attribute unsigned long long domainLookupStart;
    readonly attribute unsigned long long domainLookupEnd;
    readonly attribute unsigned long long connectStart;
    readonly attribute unsigned long long connectEnd;
    readonly attribute unsigned long long secureConnectionStart;
    readonly attribute unsigned long long requestStart;
    readonly attribute unsigned long long responseStart;
    readonly attribute unsigned long long responseEnd;
    readonly attribute unsigned long long domLoading;
    readonly attribute unsigned long long domInteractive;
    readonly attribute unsigned long long domContentLoadedEventStart;
    readonly attribute unsigned long long domContentLoadedEventEnd;
    readonly attribute unsigned long long domComplete;
    readonly attribute unsigned long long loadEventStart;
    readonly attribute unsigned long long loadEventEnd;
};

The NavigationTiming Timeline

Each of the timestamps above corresponds with events in the timeline below:

NavigationTiming timeline

Note that each of the timestamps is Unix epoch-based, instead of being navigationStart-based like DOMHighResTimeStamps. This has been addressed in NavigationTiming2.

The entire process starts at timing.navigationStart. This is when your end-user started the navigation. They might have clicked on a link, or hit reload in their browser. The navigation.type property tells you what type of page-load it was: a regular navigation (link- or bookmark-click) (TYPE_NAVIGATE = 0), a reload (TYPE_RELOAD = 1), or a back-forward navigation (TYPE_BACK_FORWARD = 2). Each of these types of navigations will have different performance characteristics.
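If you want to segment your data by navigation type, a small lookup like the one below may help. This is only a sketch: describeNavigation and the label strings are names chosen for this example, while the numeric constants come from the PerformanceNavigation interface above.

```javascript
// Labels for the PerformanceNavigation.type constants (label text is arbitrary)
var NAV_TYPES = {
  0: "navigate",      // TYPE_NAVIGATE
  1: "reload",        // TYPE_RELOAD
  2: "back_forward",  // TYPE_BACK_FORWARD
  255: "reserved"     // TYPE_RESERVED
};

// `nav` is any object shaped like window.performance.navigation
function describeNavigation(nav) {
  var known = Object.prototype.hasOwnProperty.call(NAV_TYPES, nav.type);
  return {
    type: known ? NAV_TYPES[nav.type] : "unknown",
    redirects: nav.redirectCount
  };
}

// In a browser: describeNavigation(window.performance.navigation)
var example = describeNavigation({ type: 1, redirectCount: 0 });
```

The function takes the navigation object as an argument rather than reading window.performance directly, which keeps it easy to try with hand-written data.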

Around this time, the browser will also start to unload the previous page. If the previous page is the same origin (domain) as the current page, the timestamps of that document’s onunload event (start and end) will be filled in as timing.unloadEventStart and timing.unloadEventEnd. If the previous page was on another origin (or there was no previous page), these timestamps will be 0.

Next, in some cases, your site may go through one or more HTTP redirects before it reaches the final destination. navigation.redirectCount gives you an important insight into how many hops it took for your visitor to reach your page. 301 and 302 redirects each take time, so for performance reasons you should reduce the number of redirects to reach your content to 0 or 1. Unfortunately, due to security concerns, you do not have access to the actual URLs that redirected to this page, and it is entirely possible that a third-party site (not under your control) initiated the redirect. The difference between timing.redirectStart and timing.redirectEnd encompasses all of the redirects. If these values are 0, it means that either there were no redirects, or at least one of the redirects was from a different origin.

fetchStart is the next timestamp, and indicates the timestamp for the start of the fetch of the current page. If there were no redirects when loading the current page, this value should equal navigationStart. Otherwise, it should equal redirectEnd.

Next, the browser goes through the networking phases required to fetch HTML over HTTP. First the domain is resolved (domainLookupStart and domainLookupEnd), then a TCP connection is initiated (connectStart and connectEnd). Once connected, an HTTP request (with headers and cookies) is sent (requestStart). Once data starts coming back from the server, responseStart is filled in, and responseEnd is filled in when the last byte from the server is read.

Note that the request phase is the only one without an end timestamp (there is no requestEnd), as the browser does not have insight into when the server received the request.

Any of the above phases (DNS, TCP, request or response) might not take any time, such as when DNS was already resolved, a TCP connection is re-used or when content is served from disk. In this case, the timestamps should not be 0, but should reflect the timestamp that the phase started and ended, even if the duration is 0. For example, if fetchStart is at 1000 and a TCP connection is reused, domainLookupStart, domainLookupEnd, connectStart and connectEnd should all be 1000 as well.
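To make the zero-duration case concrete, here is a small sketch that turns the timestamps into per-phase durations. The networkPhases function and the sample numbers are invented for this example; the property names are the ones from the PerformanceTiming interface.

```javascript
// `t` is any object shaped like window.performance.timing
function networkPhases(t) {
  return {
    dns: t.domainLookupEnd - t.domainLookupStart,
    tcp: t.connectEnd - t.connectStart,
    request: t.responseStart - t.requestStart,
    response: t.responseEnd - t.responseStart
  };
}

// Hypothetical page load where DNS was cached and the TCP connection re-used,
// so the DNS and connect timestamps all equal fetchStart
var reused = {
  fetchStart: 1000,
  domainLookupStart: 1000,
  domainLookupEnd: 1000,
  connectStart: 1000,
  connectEnd: 1000,
  requestStart: 1002,
  responseStart: 1050,
  responseEnd: 1060
};

var phases = networkPhases(reused);
// phases.dns and phases.tcp are 0, phases.request is 48, phases.response is 10
```

The timestamps are non-zero, but the DNS and TCP durations still come out as 0, which is what you want to see in your data for a re-used connection.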

secureConnectionStart is an optional timestamp that is only supported in some browsers (notably, it is missing in Internet Explorer). If not 0, it represents the time that the SSL/TLS handshake started.

After responseStart, there are several timestamps that represent phases of the DOM’s lifecycle. These are domLoading, domInteractive, domContentLoadedEventStart, domContentLoadedEventEnd and domComplete.

domLoading, domInteractive and domComplete correspond to when the Document’s readyState is set to the corresponding loading, interactive and complete states.

domContentLoadedEventStart and domContentLoadedEventEnd correspond to when the DOMContentLoaded event fires on the document and when it has completed running.

Finally, once the body’s onload event fires, loadEventStart is filled in. Once all of the onload handlers are complete, loadEventEnd is filled in. Note this means if you’re querying window.performance.timing from within the onload event, loadEventEnd will be 0. You could work around this by querying the timestamps from a setTimeout(..., 10) fired from within the onload event, as in the code example below.
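If a single fixed delay feels fragile, one alternative is to poll until loadEventEnd is filled in. The sketch below is just one way to do that: whenLoadEventEnds, the 10 ms interval and the default of 50 tries are all arbitrary choices made for this example.

```javascript
// Calls `callback` with the timing object once loadEventEnd is non-zero,
// or after `maxTries` checks (in which case loadEventEnd may still be 0).
// `timing` is passed in so the helper can also be tried with plain objects.
function whenLoadEventEnds(timing, callback, maxTries) {
  var tries = maxTries === undefined ? 50 : maxTries;

  (function check() {
    if (timing.loadEventEnd > 0 || tries-- <= 0) {
      callback(timing);
      return;
    }
    setTimeout(check, 10);
  })();
}

// Browser usage: start polling from inside the onload event
if (typeof window !== "undefined" && window.performance && window.performance.timing) {
  window.addEventListener("load", function () {
    whenLoadEventEnds(window.performance.timing, function (t) {
      console.log("onload handlers took", t.loadEventEnd - t.loadEventStart, "ms");
    });
  }, false);
}
```

The try limit is there so the callback still fires in browsers that never fill in loadEventEnd.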

Note: Some browsers report 0 for some timestamps. This is a bug, as all same-origin timestamps should be filled in, but if you’re consuming this data, you may have to adjust for it.
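One way to adjust for it is to check for suspicious zeros before you trust a set of timings. This is a sketch, not a complete validator: missingTimestamps is a made-up helper, and the list of timestamps it checks is my assumption about which ones should be filled in for a normal page load.

```javascript
// Timestamps assumed (for this sketch) to be non-zero on a normal page load
var ALWAYS_EXPECTED = [
  "navigationStart", "fetchStart",
  "domainLookupStart", "domainLookupEnd",
  "connectStart", "connectEnd",
  "requestStart", "responseStart", "responseEnd"
];

// Returns the names of expected timestamps that are 0 or missing.
// `t` is any object shaped like window.performance.timing.
function missingTimestamps(t) {
  return ALWAYS_EXPECTED.filter(function (name) {
    return !t[name];
  });
}

// Hypothetical data where the browser reported 0 for requestStart
var suspicious = missingTimestamps({
  navigationStart: 1000, fetchStart: 1000,
  domainLookupStart: 1000, domainLookupEnd: 1000,
  connectStart: 1000, connectEnd: 1000,
  requestStart: 0, responseStart: 1050, responseEnd: 1060
});
// suspicious is ["requestStart"]
```

What you do with a non-empty result (drop the sample, flag it, or only skip the affected phases) depends on how you analyze the data.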

Browser vendors are also free to add their own additional timestamps to window.performance.timing. Here is the only currently known vendor-prefixed timestamp available:

  • msFirstPaint – Internet Explorer 9+ only, this event corresponds to when the first paint occurred within the document. It makes no guarantee about what content was painted — in fact, the paint could be just the “white out” prior to other content being displayed. Do not rely on this event to determine when the user started seeing actual content.

Example data

Here’s sample data from a page load:

// window.performance.navigation
redirectCount: 0
type: 0

// window.performance.timing
navigationStart: 1432762408327,
unloadEventEnd: 0,
unloadEventStart: 0,
redirectStart: 0,
redirectEnd: 0,
fetchStart: 1432762408648,
connectEnd: 1432762408886,
secureConnectionStart: 1432762408777,
connectStart: 1432762408688,
domainLookupStart: 1432762408660,
domainLookupEnd: 1432762408688,
requestStart: 1432762408886,
responseStart: 1432762409141,
responseEnd: 1432762409229,
domComplete: 1432762411136,
domLoading: 1432762409147,
domInteractive: 1432762410129,
domContentLoadedEventStart: 1432762410164,
domContentLoadedEventEnd: 1432762410263,
loadEventEnd: 1432762411140,
loadEventStart: 1432762411136

How to Use

All of the metrics exposed on the window.performance interface are available to your application via JavaScript. Here’s example code for gathering durations of the different phases of the main page load experience:

function onLoad() {
  if ('performance' in window && 'timing' in window.performance) {
    // Defer so loadEventEnd is filled in once the onload handlers finish
    setTimeout(function() {
      var t = window.performance.timing;
      // Durations (in milliseconds) of each phase of the page load
      var ntData = {
        redirect: t.redirectEnd - t.redirectStart,
        dns: t.domainLookupEnd - t.domainLookupStart,
        connect: t.connectEnd - t.connectStart,
        // secureConnectionStart is 0 or missing when unsupported or not on HTTPS
        ssl: t.secureConnectionStart ? (t.connectEnd - t.secureConnectionStart) : 0,
        request: t.responseStart - t.requestStart,
        response: t.responseEnd - t.responseStart,
        dom: t.loadEventStart - t.responseEnd,
        total: t.loadEventEnd - t.navigationStart
      };
    }, 0);
  }
}

window.addEventListener('load', onLoad, false);

With access to all of this performance data, you are free to do with it whatever you want. You could analyze it on the client, notifying you when there are problems. You could send 100% of the data to your back-end analytics server for later analysis. Or, you could hook the data into a DIY or commercial RUM solution that does this for you automatically.

Let’s explore all of these options:

DIY

There are many DIY / Open Source solutions out there that gather and analyze data exposed by NavigationTiming.

Here are some DIY ideas for what you can do with NavigationTiming:

  • Gather the performance.timing metrics on your own and alert you if they are over a certain threshold (warning: this could be noisy)
  • Gather the performance.timing metrics on your own and XHR every page-load’s metrics to your backend for analysis
  • Watch for any pages that resulted in one or more redirects via performance.navigation.redirectCount
  • Determine what percent of users go back-and-forth on your site via performance.navigation.type
  • Accurately monitor your app’s bootstrap time that runs in the body’s onload event via (loadEventEnd - loadEventStart)
  • Monitor the performance of your DNS servers
  • Measure DOM event timestamps without adding event listeners
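As a starting point for the "XHR every page-load’s metrics to your backend" idea, here is a minimal sketch. The /beacon URL, the toQueryString and sendMetrics helpers, and the choice of fields are all placeholders for this example; your backend will dictate the real format.

```javascript
// Serializes a flat object of metrics as form-encoded key=value pairs
function toQueryString(data) {
  return Object.keys(data).map(function (key) {
    return encodeURIComponent(key) + "=" + encodeURIComponent(data[key]);
  }).join("&");
}

// POSTs the metrics to `url` (a hypothetical collection endpoint)
function sendMetrics(url, data) {
  var xhr = new XMLHttpRequest();
  xhr.open("POST", url, true);
  xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
  xhr.send(toQueryString(data));
}

// Browser usage: gather a few metrics after onload and send them
if (typeof window !== "undefined" && window.performance && window.performance.timing) {
  window.addEventListener("load", function () {
    setTimeout(function () {
      var t = window.performance.timing;
      sendMetrics("/beacon", {
        dns: t.domainLookupEnd - t.domainLookupStart,
        total: t.loadEventEnd - t.navigationStart,
        redirects: window.performance.navigation.redirectCount,
        navType: window.performance.navigation.type
      });
    }, 0);
  }, false);
}
```

For example, toQueryString({ dns: 28, total: 2813 }) produces "dns=28&total=2813".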

Open-Source

There are some great projects out there that consume NavigationTiming information.

Boomerang, an open-source library developed by Philip Tellis, had a method for tracking performance metrics before NavigationTiming was supported in modern browsers. Today, it incorporates NavigationTiming data if available. It does all of the hard work of gathering various performance metrics, and lets you beacon (send) the data to a server of your choosing. (I am a contributor to the project).

To complement Boomerang, there are a couple of open-source servers that receive Boomerang data, such as Boomcatch and BoomerangExpress. In both cases, you’ll still be left to analyze the data on your own:

BoomerangExpress

To view NavigationTiming data for any site you visit, you can use this kaaes bookmarklet:

kaaes bookmarklet

SiteSpeed.io helps you track your site’s performance metrics and scores (such as PageSpeed and YSlow):

SiteSpeed.io

Finally, if you’re already using Piwik, there’s a plugin that gathers NavigationTiming data from your visitors:

"generation time" = responseEnd - requestStart

Piwik

Commercial Solutions

If you don’t want to build or manage a DIY / Open-Source solution to gather RUM metrics, there are many great commercial services available.

Disclaimer: I work at SOASTA, on mPulse and Boomerang

SOASTA mPulse, which gathers 100% of your visitor’s performance data:

SOASTA mPulse

Google Analytics Site Speed:

Google Analytics Site Speed

New Relic Browser:

New Relic Browser

NeuStar WPM:

NeuStar WPM

SpeedCurve:

SpeedCurve

There may be others as well — please leave a comment if you have experience using another service.

Availability

NavigationTiming is available in most modern browsers. According to caniuse.com, 81% of world-wide browser market share supports NavigationTiming, as of May 2015. This includes Internet Explorer 9+, Firefox 7+, Chrome 6+, Opera 15+, Android Browser 4+ and Mac Safari 8+.

CanIUse NavigationTiming

Notably absent? iOS Safari. NavigationTiming made a brief appearance in iOS 8.0, but was pulled from 8.1 for “performance reasons”. As of iOS 8.3, NavigationTiming is still missing in iOS Safari. This is highly unfortunate for mobile-focused websites, as it means owners cannot get accurate performance metrics of their visitors.

One thing that can help for mobile is Boomerang, which can capture performance metrics for all browsers, even those not supporting NavigationTiming, for everything beyond the first page view on your site. It does this by tracking timestamps in a cookie from page-to-page. However, there is no accurate way of tracking first-page network timings (beyond guess-timates).

I’m hoping Apple is hard at work bringing NavigationTiming back to Safari. You can check on the latest updates by viewing doesioshavenavigationtiming.com. Hah.

Tips

Some final tips to re-iterate if you want to use NavigationTiming data:

  • Use fetchStart instead of navigationStart, unless you’re interested in redirects, browser tab initialization time, etc.
  • loadEventEnd will be 0 until after the body’s onload event has finished (so you can’t measure it in the load event itself).
  • We don’t have an accurate way to measure the “request time”, as requestEnd is invisible to us (the server sees it).
  • secureConnectionStart isn’t available in Internet Explorer, and will be 0 in other browsers unless on an HTTPS link.
  • If your site is the home-page for a user, you may see some 0 timestamps. Timestamps up through the responseEnd event may be 0 duration because some browsers speculatively pre-fetch home pages (and don’t report the correct timings).
  • Some browsers report 0 for timestamps that should always be filled in, such as domainLookup*, connect* and requestStart. This is a bug, but you may need to detect and work around this if you’re analyzing data on your own.
  • If you’re going to be beaconing data to your back-end for analysis, if possible, send the data in the body’s onload event versus waiting for onbeforeunload. onbeforeunload isn’t 100% reliable, and may not fire in some browsers (such as iOS Safari).
  • Single-Page Apps: You’ll need a different solution for “soft” or “in-page” navigations (SOASTA has developed a Boomerang plugin for SPAs).
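A couple of these tips can be folded into one small helper. This is only a sketch: pageLoadTime is a name invented for this example, and falling back to navigationStart when fetchStart is 0 is my own choice, not something the spec asks for.

```javascript
// Returns the page load time in milliseconds, measured from fetchStart,
// or null if the onload handlers have not finished yet.
// `t` is any object shaped like window.performance.timing.
function pageLoadTime(t) {
  if (!t.loadEventEnd) {
    return null; // loadEventEnd is still 0, so it is too early to measure
  }

  // Assumption for this sketch: use navigationStart if fetchStart is 0
  var start = t.fetchStart || t.navigationStart;
  return t.loadEventEnd - start;
}

// Using the timestamps from the example data earlier in this post
var loadTime = pageLoadTime({
  navigationStart: 1432762408327,
  fetchStart: 1432762408648,
  loadEventEnd: 1432762411140
});
// loadTime is 2492
```

Measured from navigationStart instead, the same page load would be 2813 ms, so be consistent about which start point you report.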

NavigationTiming2

Currently just a draft, NavigationTiming2 builds on top of NavigationTiming, adding:

Head to the mailing list or view the spec on Github for more information.

Conclusion

NavigationTiming exposes valuable and accurate performance metrics that were not previously available to your web app. If you’re interested in measuring and monitoring the performance of your web app, NavigationTiming data is the first place you should look.

Next up: Interested in capturing the same network timings for all of the sub-resources on your page, such as images, JavaScript, and CSS? ResourceTiming is what you want.

Other articles in this series:

More resources:

Measuring the performance of your web apps

May 25th, 2015

You know that performance matters, right?

Just a few seconds slower and your site could be turning away thousands (or millions) of visitors. Don’t take my word for it: there are plenty of case studies, articles, findings, presentations, charts and more showing just how important it is to make your site load quickly. Google is even starting to shame-label slow sites. You don’t want to be that guy.

So how do you monitor and measure the performance of your web apps?

The performance of any system can be measured from several different points of view. Let’s take a brief look at three of the most common performance viewpoints for a web app: from the eyes of the developer, the server and the end-user.

This is the beginning of a series of articles that will expand upon the content given during my talk “Make it Fast: Using Modern Browser APIs to Monitor and Improve the Performance of your Web Applications” at CodeMash 2015.

Developer

The developer’s machine is the first line of defense in ensuring your web application is performing as intended. While developing your app, you are probably building, testing and addressing performance issues as you see them.

In addition to simply using your app, there are many tools you can use to measure how it’s performing. Some of my favorites are:

While ensuring everything is performing well on your development machine (which probably has tons of RAM, CPU and a quick connection to your servers) is a good first step, you also need to make sure your app is playing well with other services on your network, such as your web server, database, etc.

Server

Monitoring the server(s) that run your infrastructure (such as web, database, and other back-end services) is critical for a performance monitoring strategy. Many resources and tools have been developed to help engineers monitor what their servers are doing. Performance monitoring at the server level is critical for reliability (ensuring your core services are running) and scalability (ensuring your infrastructure is performing at the level you want).

From each of your servers’ points of view, there are several components that you can monitor to have visibility into how your infrastructure is performing. Some common monitoring and measuring tools are:

By putting these tools together, you can get a pretty good sense of how your overall infrastructure is performing.

End-user

So you’ve developed your app, deployed it to production, and have been monitoring your infrastructure closely to ensure all of your servers are performing smoothly.

Everything should be golden, right? Your end-users are having a fantastical experience and every one of them just loves visiting your site.

… clearly, that’s probably not the case. The majority of your end-users don’t surf the web on $3,000 development machines, using the latest cutting-edge browser on a low-latency link from your datacenter. A lot of your users are probably on a low-end tablet, on a cell network, 2,000 miles away from your datacenter.

The experience you’ve curated while developing your web app on your high-end development machine will probably be the best experience possible. All of your visitors will likely experience something worse, from not-a-noticeable-difference down to can’t-stand-how-slow-it-is-and-will-never-come-back.

Measuring performance from the server and the developer’s perspective is not the full story. In the end, the only thing that really matters is what your visitor sees, and the experience they have.

Just a few years ago, the web development community didn’t have a lot of tools available to monitor the performance from their end-users’ perspectives. Sure, you could capture simple JavaScript timestamps within your code:

var startTime = Date.now();
// do stuff
var elapsedTime = Date.now() - startTime;

You could spread this code throughout your app and listen for browser events such as onload, but simple timestamps don’t give a lot of visibility into the performance of your end-users.

In addition, since this style of timestamp/profiling is just JavaScript, you have zero visibility into the browser’s networking performance and what happened before the browser parsed your HTML and JavaScript.

W3C Webperf Working Group

To solve these issues, in 2010 the W3C (a standards body in charge of developing web standards such as HTML5, CSS, etc.) formed a new working group with the mission of giving developers the ability to assess and understand the performance characteristics of their web apps.

The W3C webperf working group is an organization whose members include Microsoft, Google, Mozilla, Opera, Facebook, Netflix, SOASTA and more. The working group collaboratively develops standards with the following goals:

  • Expose information that was not previously available

  • Give developers the tools they need to make their applications more efficient

  • Little to no overhead
  • Easy to understand APIs

Since its inception, the working group has published a number of standards, many of which are available in modern browsers today. Some of these standards are:

Other published standards include Page Visibility, requestAnimationFrame, setImmediate, and there are several work-in-progress standards such as Beacon API, Resource Hints, Frame Timing, Server Timing and Network Error Logging.

These standards, many of which are already implemented in modern browsers, give extremely valuable insight into the real-world performance of your end-users. Also called Real-User Monitoring (RUM), this data fills the critical information gap that existed when you could only accurately monitor performance at the developer- or server-level.

If you’re interested in learning more about the working group, head to their website, check out their mailing list or view their specs on Github.

The next few blog posts will go over some of these specs. We’ll discuss what they look like, the problems they solve, and how you can use them, both with DIY solutions or commercial products.

Next up:

JavaScript Module Patterns

March 17th, 2015

Presented at the Lansing JavaScript Meetup:

javascript-module-patterns

Slides are available on Slideshare or Github.

Make It Fast – CodeMash 2015

January 8th, 2015

Presented at CodeMash 2015:

Make It Fast - CodeMash 2015

Compressing ResourceTiming

November 7th, 2014

At SOASTA, we’re building tools and services to help our customers understand and improve the performance of their websites. Our mPulse product utilizes Real User Monitoring to capture data about page-load performance.

For browser-side data collection, mPulse uses Boomerang, which beacons every single page-load experience back to our real time analytics engine. Boomerang utilizes NavigationTiming when possible to relay accurate performance metrics about the page load, such as the timings of DNS, TCP, SSL and the HTTP response.

ResourceTiming is another important feature in modern browsers that gives JavaScript access to performance metrics about the page’s components fetched from the network, such as CSS, JavaScript and images. mPulse will soon be releasing a new feature that lets our customers view the complete waterfall of every visitor’s session, which can be a tremendous help in debugging performance issues.

The challenge with ResourceTiming is that it offers a lot of data if you want to beacon it all back to a server. For each resource, there’s data on:

  • URL
  • Initiating element (eg IMG)
  • Start time
  • Duration
  • Plus 11 other timestamps

Here’s an example of performance.getEntriesByType('resource') of a single resource:

{"responseEnd":2436.426999978721,"responseStart":2435.966999968514,
"requestStart":2435.7460000319406,"secureConnectionStart":0,
"connectEnd":2434.203000040725,"connectStart":2434.203000040725,
"domainLookupEnd":2434.203000040725,"domainLookupStart":2434.203000040725,
"fetchStart":2434.203000040725,"redirectEnd":0,"redirectStart":0,
"initiatorType":"internal","duration":2.2239999379962683,
"startTime":2434.203000040725,"entryType":"resource","name":"http://nicj.net/"}

JSON.stringify()’d, that’s 469 bytes for this one resource. Multiply that by each resource on your page, and you can quickly see that gathering and beaconing all of this data back to a server will take a lot of bandwidth and storage if you’re tracking this for every single visitor to your site. The HTTP Archive tells us that the average page is composed of 99 HTTP resources, with an average URL length of 85 bytes.

So for a rough estimate you could expect around 45 KB of ResourceTiming data per page load.
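The arithmetic behind that estimate is simple enough to write down. This sketch just multiplies the two figures quoted above (469 bytes for the sample entry and 99 resources per page); it does not try to account for URL lengths differing from the sample.

```javascript
var bytesPerResource = 469;  // size of the JSON-serialized sample entry above
var resourcesPerPage = 99;   // average resource count quoted from the HTTP Archive

var totalBytes = bytesPerResource * resourcesPerPage; // 46431
var totalKB = totalBytes / 1024;                      // roughly 45.3
```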

The Goal

We wanted to find a way to compress this data before we JSON serialize it and beacon it back to our server.

Philip Tellis, the author of Boomerang, and I have come up with several compression techniques that can reduce the above data to about 15% of its original size.

Techniques

Let’s start out with a single resource, as you get back from window.performance.getEntriesByType("resource"):

{  
  "responseEnd":323.1100000002698,
  "responseStart":300.5000000000000,
  "requestStart":252.68599999981234,
  "secureConnectionStart":0,
  "connectEnd":0,
  "connectStart":0,
  "domainLookupEnd":0,
  "domainLookupStart":0,
  "fetchStart":252.68599999981234,
  "redirectEnd":0,
  "redirectStart":0,
  "duration":71.42400000045745,
  "startTime":252.68599999981234,
  "entryType":"resource",
  "initiatorType":"script",
  "name":"http://foo.com/js/foo.js"
}

Step 1: Drop some attributes

We don’t need the following attributes:

  • entryType, which will always be resource
  • duration, which can always be calculated as responseEnd - startTime
  • fetchStart, which will always be startTime (with no redirects) or redirectEnd (with redirects)

{
  "responseEnd":323.1100000002698,
  "responseStart":300.5000000000000,
  "requestStart":252.68599999981234,
  "secureConnectionStart":0,
  "connectEnd":0,
  "connectStart":0,
  "domainLookupEnd":0,
  "domainLookupStart":0,
  "redirectEnd":0,
  "redirectStart":0,
  "startTime":252.68599999981234,
  "initiatorType":"script",
  "name":"http://foo.com/js/foo.js"
}
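
As a rough sketch of this step (not necessarily how Boomerang implements it), the three attributes can be filtered out of a plain-object copy of the entry. The names DROPPED_ATTRIBUTES and dropAttributes are hypothetical:

```javascript
// Attributes that can be reconstructed later, so they do not need to be sent
var DROPPED_ATTRIBUTES = ["entryType", "duration", "fetchStart"];

// Returns a copy of the entry without the redundant attributes.
// Assumption: `entry` is a plain object (for example a parsed JSON copy of a
// ResourceTiming entry), so that Object.keys() can see its attributes.
function dropAttributes(entry) {
    var trimmed = {};
    Object.keys(entry).forEach(function(key) {
        if (DROPPED_ATTRIBUTES.indexOf(key) === -1) {
            trimmed[key] = entry[key];
        }
    });
    return trimmed;
}

var trimmedEntry = dropAttributes({
    "responseEnd": 323.1100000002698,
    "fetchStart": 252.68599999981234,
    "duration": 71.42400000045745,
    "startTime": 252.68599999981234,
    "entryType": "resource",
    "initiatorType": "script",
    "name": "http://foo.com/js/foo.js"
});
// trimmedEntry keeps responseEnd, startTime, initiatorType and name
```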

Step 2: Change into a fixed-size array

Since we know all of the attributes ahead of time, we can change the object into a fixed-sized array. We’ll create a new object where each key is the URL, and its value is a fixed-sized array. We’ll take care of duplicate URLs later:

{ "name": [initiatorType, startTime, redirectStart, redirectEnd,
   domainLookupStart, domainLookupEnd, connectStart, secureConnectionStart, 
   connectEnd, requestStart, responseStart, responseEnd] }

With our data:

{ "http://foo.com/js/foo.js": ["script", 252.68599999981234, 0, 0,
   0, 0, 0, 0,
   0, 252.68599999981234, 300.5000000000000, 323.1100000002698] }
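
One way to sketch this conversion is shown below. FIELD_ORDER mirrors the attribute order listed above, while the function names toFixedArray and toUrlMap are made up for illustration:

```javascript
// Attribute order of the fixed-size array, as listed above
var FIELD_ORDER = [
    "initiatorType", "startTime", "redirectStart", "redirectEnd",
    "domainLookupStart", "domainLookupEnd", "connectStart",
    "secureConnectionStart", "connectEnd", "requestStart",
    "responseStart", "responseEnd"
];

// Converts one entry into an array whose positions follow FIELD_ORDER
function toFixedArray(entry) {
    return FIELD_ORDER.map(function(field) {
        return entry[field];
    });
}

// Builds the { url: [values] } object for a list of entries.
// Duplicate URLs simply overwrite each other here; they are handled later.
function toUrlMap(entries) {
    var result = {};
    entries.forEach(function(entry) {
        result[entry.name] = toFixedArray(entry);
    });
    return result;
}

var urlMap = toUrlMap([{
    "responseEnd": 323.1100000002698, "responseStart": 300.5,
    "requestStart": 252.68599999981234, "secureConnectionStart": 0,
    "connectEnd": 0, "connectStart": 0, "domainLookupEnd": 0,
    "domainLookupStart": 0, "redirectEnd": 0, "redirectStart": 0,
    "startTime": 252.68599999981234, "initiatorType": "script",
    "name": "http://foo.com/js/foo.js"
}]);
```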

Step 3: Drop microsecond timings

For our purposes, we don’t need sub-millisecond accuracy, so we can round all timings to whole milliseconds (the example below rounds down):

{ "http://foo.com/js/foo.js": ["script", 252, 0, 0, 0, 0, 0, 0, 0, 252, 300, 323] }
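
A small sketch of this step follows. The function name trimTimings is hypothetical, and Math.floor is chosen only because it reproduces the values shown above; Math.round would be an equally reasonable choice and would give 253 and 301 for the first two non-zero timings.

```javascript
// Drops the fractional part of every numeric value, leaving strings
// (such as the initiatorType) untouched.
function trimTimings(values) {
    return values.map(function(value) {
        return typeof value === "number" ? Math.floor(value) : value;
    });
}

var trimmedTimings = trimTimings(["script", 252.68599999981234, 0, 0,
    0, 0, 0, 0,
    0, 252.68599999981234, 300.5, 323.1100000002698]);
// → ["script", 252, 0, 0, 0, 0, 0, 0, 0, 252, 300, 323]
```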

Step 4: Trie

We can now use an optimized Trie to compress the URLs. A Trie is a tree structure in which keys that share a common prefix share a branch, so each common prefix (such as http://foo.com/) is stored only once.

Mark Holland and Mike McCall discussed this technique at Velocity this year.

Here’s an example with multiple resources:

{
    "http://": {
        "foo.com/": {
            "js/foo.js": ["script", 252, 0, 0, 0, 0, 0, 0, 0, 252, 300, 323],
            "css/foo.css": ["css", 300, 0, 0, 0, 0, 0, 0, 0, 305, 340, 500]
        },
        "other.com/other.css": [...]
    }
}
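
To make the idea concrete, here is a minimal sketch of building such a prefix-compressed structure from a flat { url: value } object. It is an illustration under assumptions rather than Boomerang's actual implementation: the names commonPrefix and buildTrie are made up, and a URL that is a complete prefix of another URL is stored under an empty-string key, which is a convention chosen only for this example.

```javascript
// Longest common prefix of two strings
function commonPrefix(a, b) {
    var i = 0;
    while (i < a.length && i < b.length && a.charAt(i) === b.charAt(i)) {
        i++;
    }
    return a.substring(0, i);
}

// Recursively groups keys by their shared prefix
function buildTrie(map) {
    var node = {};
    var groups = {};   // first character -> keys starting with that character

    Object.keys(map).forEach(function(key) {
        if (key === "") {
            // this key was a complete prefix of its siblings
            node[""] = map[key];
            return;
        }
        var first = key.charAt(0);
        (groups[first] = groups[first] || []).push(key);
    });

    Object.keys(groups).forEach(function(first) {
        var keys = groups[first];
        if (keys.length === 1) {
            // nothing to share: keep the remaining key as a leaf
            node[keys[0]] = map[keys[0]];
            return;
        }
        // several keys share at least one character: split off the prefix
        var prefix = keys.reduce(function(a, b) { return commonPrefix(a, b); });
        var rest = {};
        keys.forEach(function(key) {
            rest[key.substring(prefix.length)] = map[key];
        });
        node[prefix] = buildTrie(rest);
    });

    return node;
}

var trie = buildTrie({
    "http://foo.com/js/foo.js": ["script", 252, 0, 0, 0, 0, 0, 0, 0, 252, 300, 323],
    "http://foo.com/css/foo.css": ["css", 300, 0, 0, 0, 0, 0, 0, 0, 305, 340, 500],
    "http://other.com/other.css": ["css", 310, 0, 0, 0, 0, 0, 0, 0, 0, 0, 400]
});
// trie["http://"]["foo.com/"] now holds the "js/foo.js" and "css/foo.css" leaves
```

The timings for other.com are invented for the example; only the shape of the resulting object matters here.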

Step 5: Offset from startTime

If we offset all of the timestamps from startTime (which they should always be greater than or equal to), they may use fewer characters:

{
    "http://": {
        "foo.com/": {
            "js/foo.js": ["script", 252, 0, 0, 0, 0, 0, 0, 0, 0, 48, 71],
            "css/foo.css": ["css", 300, 0, 0, 0, 0, 0, 0, 0, 5, 40, 200]
        },
        "other.com/other.css": [...]
    }
}
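
The offsetting itself can be sketched as follows, operating on the fixed-size array where index 0 is the initiatorType and index 1 is startTime. The function name offsetFromStart is hypothetical:

```javascript
// Rewrites every timestamp after startTime as an offset from startTime.
// Timestamps that are 0 (not available) are left at 0.
function offsetFromStart(values) {
    var startTime = values[1];
    return values.map(function(value, index) {
        if (index < 2 || value === 0) {
            return value;   // initiatorType, startTime, or a missing timestamp
        }
        return value - startTime;
    });
}

var offsets = offsetFromStart(["script", 252, 0, 0, 0, 0, 0, 0, 0, 252, 300, 323]);
// → ["script", 252, 0, 0, 0, 0, 0, 0, 0, 0, 48, 71]
```

Note that a timestamp equal to startTime (requestStart here) becomes 0 and can no longer be told apart from a missing one.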

Step 6: Reverse the timestamps and drop any trailing 0s

The only two required timestamps in ResourceTiming are startTime and responseEnd. Other timestamps may be zero due to being a Cross-Origin resource, or a timestamp that was “zero” because it didn’t take any time offset from startTime, such as domainLookupStart if DNS was already resolved.

If we re-order the timestamps so that, after startTime, we put them in reverse order, we’re more likely to have the “zero” timestamps at the end of the array.

{ "name": [initiatorType, startTime, responseEnd, responseStart,
   requestStart, connectEnd, secureConnectionStart, connectStart,
   domainLookupEnd, domainLookupStart, redirectEnd, redirectStart] }

{
    "http://": {
        "foo.com/": {
            "js/foo.js": ["script", 252, 71, 48, 0, 0, 0, 0, 0, 0, 0, 0],
            "css/foo.css": ["css", 300, 200, 40, 5, 0, 0, 0, 0, 0, 0, 0]
        }
    }
}

Once we have all of the zero timestamps towards the end of the array, we can drop any repeating trailing zeros. When reading later, missing array values can be interpreted as zero.

{
    "http://": {
        "foo.com/": {
            "js/foo.js": ["script", 252, 71, 48],
            "css/foo.css": ["css", 300, 200, 40, 5]
        }
    }
}
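
Both halves of this step, plus the padding needed when reading the data back, can be sketched like this. The function names reverseAndTrim and padZeros are hypothetical, and the input is assumed to be the 12-element array from the previous step:

```javascript
// Reverses the timestamps that follow startTime and drops trailing zeros
function reverseAndTrim(values) {
    var head = values.slice(0, 2);                // initiatorType, startTime
    var timestamps = values.slice(2).reverse();   // responseEnd first, redirectStart last
    while (timestamps.length > 0 && timestamps[timestamps.length - 1] === 0) {
        timestamps.pop();
    }
    return head.concat(timestamps);
}

// When reading: missing trailing values are interpreted as zero
function padZeros(values) {
    var padded = values.slice();
    while (padded.length < 12) {
        padded.push(0);
    }
    return padded;
}

var trimmedArray = reverseAndTrim(["script", 252, 0, 0, 0, 0, 0, 0, 0, 0, 48, 71]);
// → ["script", 252, 71, 48]
var restoredArray = padZeros(trimmedArray);
// → ["script", 252, 71, 48, 0, 0, 0, 0, 0, 0, 0, 0]
```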

Step 7: Convert initiatorType into a lookup

Using a numeric lookup instead of a string will save some bytes for initiatorType:

var INITIATOR_TYPES = {
    "other": 0,
    "img": 1,
    "link": 2,
    "script": 3,
    "css": 4,
    "xmlhttprequest": 5
};

{
    "http://": {
        "foo.com/": {
            "js/foo.js": [3, 252, 71, 48],
            "css/foo.css": [4, 300, 200, 40, 5]
        }
    }
}
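
Applying the lookup might look like the sketch below. The function name encodeInitiatorType is hypothetical, and falling back to "other" for a type that is not in the table is an assumption made for this example:

```javascript
var INITIATOR_TYPES = {
    "other": 0,
    "img": 1,
    "link": 2,
    "script": 3,
    "css": 4,
    "xmlhttprequest": 5
};

// Replaces the initiatorType string at index 0 with its numeric code
function encodeInitiatorType(values) {
    var type = values[0];
    var known = Object.prototype.hasOwnProperty.call(INITIATOR_TYPES, type);
    // Assumption: unknown types are mapped to "other"
    var code = known ? INITIATOR_TYPES[type] : INITIATOR_TYPES.other;
    return [code].concat(values.slice(1));
}

var encoded = encodeInitiatorType(["script", 252, 71, 48]);
// → [3, 252, 71, 48]
```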

Step 8: Use Base36 for numbers

Base 36 is convenient because it usually produces shorter numbers than Base 10 and is built into JavaScript via Number.prototype.toString(36):

{
    "http://": {
        "foo.com/": {
            "js/foo.js": [3, "70", "1z", "1c"],
            "css/foo.css": [4, "8c", "5k", "14", "5"]
        }
    }
}
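
A short sketch of the conversion, with parseInt(value, 36) as the inverse when reading. The function name toBase36 is hypothetical, and the timings are assumed to be whole numbers already:

```javascript
// Converts every timing (everything after the initiatorType at index 0)
// into a base-36 string
function toBase36(values) {
    return values.map(function(value, index) {
        return index === 0 ? value : value.toString(36);
    });
}

var base36 = toBase36([3, 252, 71, 48]);
// → [3, "70", "1z", "1c"]

// Decoding on the server side
var decodedResponseEnd = parseInt("1z", 36);
// → 71
```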

Step 9: Compact the array into a string

Joining the array into a single comma-separated string, rather than serializing it as a JSON array, saves a few bytes during serialization. We’ll designate the first character as the initiatorType, followed immediately by startTime:

{
    "http://": {
        "foo.com/": {
            "js/foo.js": "370,1z,1c",
            "css/foo.css": "48c,5k,14,5"
        }
    }
}
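
As a sketch, compacting and expanding could look like the following. The function names compact and expand are hypothetical, and the code assumes the initiatorType code is always a single digit, which holds for the lookup table above:

```javascript
// Joins [initiatorType, startTime, ...timings] into one string, with the
// single-digit initiatorType as the first character
function compact(values) {
    return values[0] + values.slice(1).join(",");
}

// Reverses compact(): the first character is the initiatorType, the rest
// is a comma-separated list of base-36 timings
function expand(str) {
    var initiatorType = parseInt(str.charAt(0), 10);
    return [initiatorType].concat(str.substring(1).split(","));
}

var compacted = compact([3, "70", "1z", "1c"]);
// → "370,1z,1c"
var expanded = expand(compacted);
// → [3, "70", "1z", "1c"]
```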

Step 10: Multiple hits

Finally, if there are multiple hits to the same resource, the keys (URLs) in the Trie will conflict with each other.

Let’s fix this by concatenating multiple hits to the same URL via a special character such as pipe | (see foo.js below):

{
    "http://": {
        "foo.com/": {
            "js/foo.js": "370,1z,1c|390,1,2",
            "css/foo.css": "48c,5k,14,5"
        }
    }
}
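
One way to sketch this is to collect the compacted strings per URL before the Trie is built, appending with a pipe whenever the URL has been seen already. The function name addHit is hypothetical:

```javascript
// Records one compacted hit for a URL, joining repeated hits with "|"
function addHit(hitsByUrl, url, compacted) {
    if (Object.prototype.hasOwnProperty.call(hitsByUrl, url)) {
        hitsByUrl[url] += "|" + compacted;
    } else {
        hitsByUrl[url] = compacted;
    }
}

var hitsByUrl = {};
addHit(hitsByUrl, "http://foo.com/js/foo.js", "370,1z,1c");
addHit(hitsByUrl, "http://foo.com/css/foo.css", "48c,5k,14");
addHit(hitsByUrl, "http://foo.com/js/foo.js", "390,1,2");
// hitsByUrl["http://foo.com/js/foo.js"] → "370,1z,1c|390,1,2"

// When reading, split on the pipe to get the individual hits back
var hits = hitsByUrl["http://foo.com/js/foo.js"].split("|");
// → ["370,1z,1c", "390,1,2"]
```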

Step 11: Gzip or MsgPack

Applying gzip compression or MsgPack can give additional savings during transport and storage.

Results

Overall, the above techniques compress raw JSON.stringify(performance.getEntriesByType('resource')) to about 15% of its original size.

Taking a few sample pages:

  • Search engine home page
    • Raw: 1,000 bytes
    • Compressed: 172 bytes
  • Questions and answers page
    • Raw: 5,453 bytes
    • Compressed: 789 bytes
  • News home page
    • Raw: 32,480 bytes
    • Compressed: 4,949 bytes

How-To

These compression techniques have been added to the latest version of Boomerang.

I’ve also released a small library that does the compression as well as de-compression of the optimized result: resourcetiming-compression.js.

This article also appears on soasta.com.