Compressing UserTiming

December 10th, 2015

UserTiming is a modern browser performance API that gives developers the ability the mark important events (timestamps) and measure durations (timestamp deltas) in their web apps. For an in-depth overview of how UserTiming works, you can see my article UserTiming in Practice or read Steve Souders’ excellent post with several examples for how to use UserTiming to measure your app.

UserTiming is very simple to use. Let’s do a brief review. If you want to mark an important event, just call window.performance.mark(markName):

// log the beginning of our task
performance.mark("start");

You can call .mark() as many times as you want, with whatever markName you want. You can repeat the same markName as well.

The data is stored in the PerformanceTimeline. You query the PerformanceTimeline via methods like performance.getEntriesByName(markName):

// get the data back
var entry = performance.getEntriesByName("start");
// -> {"name": "start", "entryType": "mark", "startTime": 1, "duration": 0}

Pretty simple right? Again, see Steve’s article for some great use cases.

So let’s imagine you’re sold on using UserTiming. You start instrumenting you website, placing marks and measures throughout the life-cycle of your app. Now what?

The data isn’t useful unless you’re looking at it. On your own machine, you can query the PerformanceTimeline and see marks and measures in the browser developer tools. There are also third party services that give you a view of your UserTiming data.

What if you want to gather the data yourself? What if you’re interested in trending different marks or measures in your own analytics tools?

The easy approach is to simply fetch all of the marks and measures via performance.getEntriesByType(), stringify the JSON, and XHR it back to your stats engine.

But how big is that data?

Let’s look at some example data — this was captured from a website I was browsing:

{"duration":0,"entryType":"mark","name":"mark_perceived_load","startTime":1675.636999996641},
{"duration":0,"entryType":"mark","name":"mark_before_flex_bottom","startTime":1772.8529999985767},
{"duration":0,"entryType":"mark","name":"mark_after_flex_bottom","startTime":1986.944999996922},
{"duration":0,"entryType":"mark","name":"mark_js_load","startTime":2079.4459999997343},
{"duration":0,"entryType":"mark","name":"mark_before_deferred_js","startTime":2152.8769999968063},
{"duration":0,"entryType":"mark","name":"mark_after_deferred_js","startTime":2181.611999996676},
{"duration":0,"entryType":"mark","name":"mark_site_init","startTime":2289.4089999972493}]

That’s 657 bytes for just 7 marks. What if you want to log dozens, hundreds, or even thousands of important events on your page? What if you have a Single Page App, where the user can generate many events over the lifetime of their session?

Clearly, we can do better. The signal : noise ratio of stringified JSON isn’t that good. As a performance-conscientious developer, we should strive to minimize our visitor’s upstream bandwidth usage when sending our analytics packets.

Let’s see what we can do.

The Goal

Our goal is to reduce the size of an array of marks and measures down to a data structure that’s as small as possible so that we’re only left with a minimal payload that can be quickly beacon’d to a server for aggregate analysis.

For a similar domain-specific compression technique for ResourceTiming data, please see my post on Compressing ResourceTiming. The techniques we will discuss for UserTiming will build on some of the same things we can do for ResourceTiming data.

An additional goal is that we’re going to stick with techniques where the resulting compressed data doesn’t expand from URL encoding if used in a query-string parameter. This makes it easy to just tack on the data to an existing analytics or Real-User-Monitoring (RUM) beacon.

The Approach

There are two main areas of our data-structure that we can compress. Let’s take a single measure as an example:

{  
    "name":      "measureName",
    "entryType": "measure",
    "startTime": 2289.4089999972493,
    "duration":  100.12314141 
}

What data is important here? Each mark and measure has 4 attributes:

  1. Its name
  2. Whether it’s a mark or a measure
  3. Its start time
  4. Its duration (for marks, this is 0)

I’m going to suggest we can break these down into two main areas: The object and its payload. The object is simply the mark or measure’s name. The payload is its start time, and if its a measure, it’s duration. A duration implies that the object is a measure, so we don’t need to track that attribute independently.

Essentially, we can break up our UserTiming data into a key-value pair. Grouping by the mark or measure name let’s us play some interesting games, so the name will be the key. The value (payload) will be the list of start times and durations for each mark or measure name.

First, we’ll compress the payload (all of the timestamps and durations). Then, we can compress the list of objects.

So, let’s start out by compressing the timestamps!

Compressing the Timestamps

The first thing we want to compress for each mark or measure are its timestamps.

To begin with, startTime and duration are in millisecond resolution, with microseconds in the fraction. Most people probably don’t need microsecond resolution, and it adds a ton of byte size to the payload. A startTime of 2289.4089999972493 can probably be compressed down to just 2,289 milliseconds without sacrificing much accuracy.

So let’s say we have 3 marks to begin with:

{"duration":0,"entryType":"mark","name":"mark1","startTime":100},
{"duration":0,"entryType":"mark","name":"mark1","startTime":150},
{"duration":0,"entryType":"mark","name":"mark1","startTime":500}

Grouping by mark name, we can reduce this structure to an array of start times for each mark:

{ "mark1": [100, 150, 500] }

One of the truths of UserTiming is that when you fetch the entries via performance.getEntries(), they are in sorted order.

Let’s use this to our advantage, by offsetting each timestamp by the one in front of it. For example, the 150 timestamp is only 50ms away from the 100 timestamp before it, so its value can be instead set to 50. 500 is 350ms away from 150, so it gets set to 350. We end up with smaller integers this way, which will make compression easier later:

{ "mark1": [100, 50, 350] }

How can we compress the numbers further? Remember, one goal is to make the resulting data transmit easier on a URL (query string), so we mostly want to use the ASCII alpha-numeric set of characters.

One really easy way of reducing the number of bytes taken by a number in JavaScript is using Base-36 encoding. In other words, 0=0, 10=a, 35=z. Even better, JavaScript has this built-in to Integer.toString(36):

(35).toString(36)          == "z" (saves 1 character)
(99999999999).toString(36) == "19xtf1tr" (saves 3 characters)

Once we Base-36 encode all of our offset timestamps, we’re left with a smaller number of characters:

{ "mark1": ["2s", "1e", "9q"] }

Now that we have these timestamps offsets in Base-36, we can combine (join) them into a single string so they’re easily transmitted. We should avoid using the comma character (,), as it is one of the reserved characters of the URI spec (RFC 3986), so it will be escaped to %2C.

The list of non-URI-encoded characters is pretty small:

[0-9a-zA-Z] $ - _ . + ! * ' ( )

The period (.) looks a lot like a comma, so let’s go with that. Applying a simple Array.join("."), we get:

{ "mark1": "2s.1e.9q" }

So we’re really starting to reduce the byte size of these timestamps. But wait, there’s more we can do!

Let’s say we have some timestamps that came in at a semi-regular interval:

{"duration":0,"entryType":"mark","name":"mark1","startTime":100},
{"duration":0,"entryType":"mark","name":"mark1","startTime":200},
{"duration":0,"entryType":"mark","name":"mark1","startTime":300}

Compressed down, we get:

{ "mark1": "2s.2s.2s" }

Why should we repeat ourselves?

Let’s use one of the other non-URI-encoded characters, the asterisk (*), to note when a timestamp offset repeats itself:

  • A single * means it repeated twice
  • *[n] means it repeated n times.

So the above timestamps can be compressed further to:

{ "mark1": "2s*3" }

Obviously, this compression depends on the application’s characteristics, but periodic marks can be seen in the wild.

Durations

What about measures? Measures have the additional data component of a duration. For marks these are always 0 (you’re just logging a point in time), but durations are another millisecond attribute.

We can adapt our previous string to include durations, if available. We can even mix marks and measures of the same name and not get confused later.

Let’s use this data set as an example. One mark and two measures (sharing the same name):

{"duration":0,"entryType":"mark","name":"foo","startTime":100},
{"duration":100,"entryType":"measure","name":"foo","startTime":150},
{"duration":200,"entryType":"measure","name":"foo","startTime":500}

Instead of an array of Base36-encoded offset timestamps, we need to include a duration, if available. Picking another non-URI-encoded character, the under-bar (_), we can easily “tack” this information on to the end of each startTime.

For example, with a startTime of 150 (1e in Base-36) and a duration of 100 (2s in Base-36), we get a simple string of 1e_2s.

Combining the above marks and measures, we get:

{ "foo": "2s.1e_2s.9q_5k" }

Later, when we’re decoding this, we haven’t lost track of the fact that there are both marks and measures intermixed here, since only measures have durations.

Going back to our original example:

[{"duration":0,"entryType":"mark","name":"mark1","startTime":100},
{"duration":0,"entryType":"mark","name":"mark1","startTime":150},
{"duration":0,"entryType":"mark","name":"mark1","startTime":500}]

Let’s compare that JSON string to how we’ve compressed it (still in JSON form, which isn’t very URI-friendly):

{"mark1":"2s.1e.9q"}

198 bytes originally versus 21 bytes with just the above techniques, or about 10% of the original size.

Not bad so far.

Compressing the Array

Most sites won’t have just a single mark or measure name that they want to transmit. Most sites using UserTiming will have many different mark/measure names and values.

We’ve compressed the actual timestamps to a pretty small (URI-friendly) value, but what happens when we need to transmit an array of different marks/measures and their respective timestamps?

Let’s pretend there are 3 marks and 3 measures on the page, each with one timestamp. After applying timestamp compression, we’re left with:

{
    "mark1": "2s",
    "mark2": "5k",
    "mark3": "8c",
    "measure1": "2s_2s",
    "measure2": "5k_5k",
    "measure3": "8c_8c"
}

There are several ways we can compress this data to a format suitable for URL transmission. Let’s explore.

Using an Array

Remember, JSON is not URI friendly, mostly due to curly braces ({ }), quotes (") and colons (:) having to be escaped.

Even in a minified JSON form:

{"mark1":"2s","mark2":"5k","mark3":"8c","measure1
":"2s_2s","measure2":"5k_5k","measure3":"8c_8c"}
(98 bytes)

This is what it looks like after URI encoding:

%7B%22mark1%22%3A%222s%22%2C%22mark2%22%3A%225k%2
2%2C%22mark3%22%3A%228c%22%2C%22measure1%22%3A%22
2s_2s%22%2C%22measure2%22%3A%225k_5k%22%2C%22meas
ure3%22%3A%228c_8c%22%7D
(174 bytes)

Gah! That’s almost 77% overhead.

Since we have a list of known keys (names) and values, we could instead change this object into an “array” where we’re not using { } " : characters to delimit things.

Let’s use another URI-friendly character, the tilde (~), to separate each. Here’s what the format could look like:

[name1]~[timestamp1]~[name2]~[timestamp2]~[...]

Using our data:

mark1~2s~mark2~5k~mark3~8c~measure1~2s_2s~measure
2~5k_5k~measure3~8c_8c~
(73 bytes)

Note that this depends on your names not including a tilde, or, you can pre-escape tildes in names to %7E.

Using an Optimized Trie

That’s one way of compressing the data. In some cases, we can do better, especially if your names look similar.

One great technique we used in compressing ResourceTiming data is an optimized Trie. Essentially, you can compress strings anytime one is a prefix of another.

In our example above, mark1, mark2 and mark3 are perfect candidates, since they all have a stem of "mark". In optimized Trie form, our above data would look something closer to:

{
    "mark": {
        "1": "2s",
        "2": "5k",
        "3": "8c"
    },
    "measure": {
        "1": "2s_2s",
        "2": "5k_5k",
        "3": "8c_8c"
    }
}

Minified, this is 13% smaller than the original non-Trie data:

{"mark":{"1":"2s","2":"5k","3":"8c"},"measure":{"
1":"2s_2s","2":"5k_5k","3":"8c_8c"}}
(86 bytes)

However, this is not as easily compressible into a tilde-separated array, since it’s no longer a flat data structure.

There’s actually a great way to compress this JSON data for URL-transmission, called JSURL. Basically, the JSURL replaces all non-URI-friendly characters with a better URI-friendly representation. Here’s what the above JSON looks like regular URI-encoded:

%7B%22mark%22%3A%7B%221%22%3A%222s%22%2C%222%22%3
A%225k%22%2C%223%22%3A%228c%22%7D%2C%22measure%22
%3A%7B%22%0A1%22%3A%222s_2s%22%2C%222%22%3A%225k_
5k%22%2C%223%22%3A%228c_8c%22%7D%7D
(185 bytes)

Versus JSURL encoded:

~(m~(ark~(1~'2s~2~'5k~3~'8c)~easure~(1~'2s_2s~2~'
5k_5k~3~'8c_8c)))
(67 bytes)

This JSURL encoding of an optimized Trie reduces the bytes size by 10% versus a tilde-separated array.

Using a Map

Finally, if you know what your mark / measure names will be ahead of time, you may not need to transmit the actual names at all. If the set of your names is finite, and could maintain a map of name : index pairs, and only have to transmit the indexed value for each name.

Using the 3 marks and measures from before:

{
    "mark1": "2s",
    "mark2": "5k",
    "mark3": "8c",
    "measure1": "2s_2s",
    "measure2": "5k_5k",
    "measure3": "8c_8c"
}

What if we simply mapped these names to numbers 0-5:

{
    "mark1": 0,
    "mark2": 1,
    "mark3": 2,
    "measure1": 3,
    "measure2": 4,
    "measure3": 5
}

Since we no longer have to compress names via a Trie, we can go back to an optimized array. And since the size of the index is relatively small (values 0-35 fit into a single character), we can save some room by not having a dedicated character (~) that separates each index and value (timestamps).

Taking the above example, we can have each name fit into a string in this format:

[index1][timestamp1]~[index2][timestamp2]~[...]

Using our data:

02s~15k~28c~32s_2s~45k_5k~58c_8c
(32 bytes)

This structure is less than half the size of the optimized Trie (JSURL encoded).

If you have over 36 mapped name : index pairs, we can still accommodate them in this structure. Remember, at value 36 (the 37th value from 0), (36).toString(36) == 10, taking two characters. We can’t just use an index of two characters, since our assumption above is that the index is only a single character.

One way of dealing with this is by adding a special encoding if the index is over a certain value. We’ll optimize the structure to assume you’re only going to use 36 values, but, if you have over 36, we can accommodate that as well. For example, let’s use one of the final non-URI-encoded characters we have left over, the dash (-):

If the first character of an item in the array is:

  • 0-z (index values 0 – 35), that is the index value
  • -, the next two characters are the index (plus 36)

Thus, the value 0 is encoded as 0, 35 is encoded as z, 36 is encoded as -00, and 1331 is encoded as -zz. This gives us a total of 1331 mapped values we can use, all using a single or 3 characters.

So, given compressed values of:

{
    "mark1": "2s",
    "mark2": "5k",
    "mark3": "8c"
}

And a mapping of:

{
    "mark1": 36,
    "mark2": 37,
    "mark3": 1331
}

You could compress this as:

-002s~-015k~-zz8c

We now have 3 different ways of compressing our array of marks and measures.

We can even swap between them, depending on which compresses the best each time we gather UserTiming data.

Test Cases

So how do these techniques apply to some real-world (and concocted) data?

I navigated around the Alexa Top 50 (by traffic) websites, to see who’s using UserTiming (not many). I gathered any examples I could, and created some of my own test cases as well. With this, I currently have a corpus of 20 real and fake UserTiming examples.

Let’s first compare JSON.stringify() of our UserTiming data versus the culmination of all of the techniques above:

+------------------------------+
¦ Test    ¦ JSON ¦ UTC ¦ UTC % ¦
+---------+------+-----+-------¦
¦ 01.json ¦ 415  ¦ 66  ¦ 16%   ¦
+---------+------+-----+-------¦
¦ 02.json ¦ 196  ¦ 11  ¦ 6%    ¦
+---------+------+-----+-------¦
¦ 03.json ¦ 521  ¦ 18  ¦ 3%    ¦
+---------+------+-----+-------¦
¦ 04.json ¦ 217  ¦ 36  ¦ 17%   ¦
+---------+------+-----+-------¦
¦ 05.json ¦ 364  ¦ 66  ¦ 18%   ¦
+---------+------+-----+-------¦
¦ 06.json ¦ 334  ¦ 43  ¦ 13%   ¦
+---------+------+-----+-------¦
¦ 07.json ¦ 460  ¦ 43  ¦ 9%    ¦
+---------+------+-----+-------¦
¦ 08.json ¦ 91   ¦ 20  ¦ 22%   ¦
+---------+------+-----+-------¦
¦ 09.json ¦ 749  ¦ 63  ¦ 8%    ¦
+---------+------+-----+-------¦
¦ 10.json ¦ 103  ¦ 32  ¦ 31%   ¦
+---------+------+-----+-------¦
¦ 11.json ¦ 231  ¦ 20  ¦ 9%    ¦
+---------+------+-----+-------¦
¦ 12.json ¦ 232  ¦ 19  ¦ 8%    ¦
+---------+------+-----+-------¦
¦ 13.json ¦ 172  ¦ 34  ¦ 20%   ¦
+---------+------+-----+-------¦
¦ 14.json ¦ 658  ¦ 145 ¦ 22%   ¦
+---------+------+-----+-------¦
¦ 15.json ¦ 89   ¦ 48  ¦ 54%   ¦
+---------+------+-----+-------¦
¦ 16.json ¦ 415  ¦ 33  ¦ 8%    ¦
+---------+------+-----+-------¦
¦ 17.json ¦ 196  ¦ 18  ¦ 9%    ¦
+---------+------+-----+-------¦
¦ 18.json ¦ 196  ¦ 8   ¦ 4%    ¦
+---------+------+-----+-------¦
¦ 19.json ¦ 228  ¦ 50  ¦ 22%   ¦
+---------+------+-----+-------¦
¦ 20.json ¦ 651  ¦ 38  ¦ 6%    ¦
+---------+------+-----+-------¦
¦ Total   ¦ 6518 ¦ 811 ¦ 12%   ¦
+------------------------------+

Key:
* JSON      = JSON.stringify(UserTiming).length (bytes)
* UTC       = Applying UserTimingCompression (bytes)
* UTC %     = UTC bytes / JSON bytes

Pretty good, right? On average, we shrink the data down to about 12% of its original size.

In addition, the resulting data is now URL-friendly.

UserTiming-Compression.js

usertiming-compression.js (and its companion, usertiming-decompression.js) are open-source JavaScript modules (UserTimingCompression and UserTimingDecompression) that apply all of the techniques above.

They are available on Github at github.com/nicjansma/usertiming-compression.js.

These scripts are meant to provide an easy, drop-in way of compressing your UserTiming data. They compress UserTiming via one of the methods listed above, depending on which way compresses best.

If you have intimate knowledge of your UserTiming marks, measures and how they’re organized, you could probably construct an even more optimized data structure for capturing and transmitting your UserTiming data. You could also trim the scripts to only use the compression technique that works best for you.

Versus Gzip / Deflate

Wait, why did we go through all of this mumbo-jumbo when there are already great ways of compression data? Why not just gzip the stringified JSON?

That’s one approach. One challenge is there isn’t native support for gzip in JavaScript. Thankfully, you can use one of the excellent open-source libraries like pako.

Let’s compare the UserTimingCompression techniques to gzipping the raw UserTiming JSON:

+----------------------------------------------------+
¦ Test    ¦ JSON ¦ UTC ¦ UTC % ¦ JSON.gz ¦ JSON.gz % ¦
+---------+------+-----+-------+---------+-----------¦
¦ 01.json ¦ 415  ¦ 66  ¦ 16%   ¦ 114     ¦ 27%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 02.json ¦ 196  ¦ 11  ¦ 6%    ¦ 74      ¦ 38%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 03.json ¦ 521  ¦ 18  ¦ 3%    ¦ 79      ¦ 15%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 04.json ¦ 217  ¦ 36  ¦ 17%   ¦ 92      ¦ 42%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 05.json ¦ 364  ¦ 66  ¦ 18%   ¦ 102     ¦ 28%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 06.json ¦ 334  ¦ 43  ¦ 13%   ¦ 96      ¦ 29%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 07.json ¦ 460  ¦ 43  ¦ 9%    ¦ 158     ¦ 34%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 08.json ¦ 91   ¦ 20  ¦ 22%   ¦ 88      ¦ 97%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 09.json ¦ 749  ¦ 63  ¦ 8%    ¦ 195     ¦ 26%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 10.json ¦ 103  ¦ 32  ¦ 31%   ¦ 102     ¦ 99%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 11.json ¦ 231  ¦ 20  ¦ 9%    ¦ 120     ¦ 52%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 12.json ¦ 232  ¦ 19  ¦ 8%    ¦ 123     ¦ 53%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 13.json ¦ 172  ¦ 34  ¦ 20%   ¦ 112     ¦ 65%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 14.json ¦ 658  ¦ 145 ¦ 22%   ¦ 217     ¦ 33%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 15.json ¦ 89   ¦ 48  ¦ 54%   ¦ 91      ¦ 102%      ¦
+---------+------+-----+-------+---------+-----------¦
¦ 16.json ¦ 415  ¦ 33  ¦ 8%    ¦ 114     ¦ 27%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 17.json ¦ 196  ¦ 18  ¦ 9%    ¦ 81      ¦ 41%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 18.json ¦ 196  ¦ 8   ¦ 4%    ¦ 74      ¦ 38%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 19.json ¦ 228  ¦ 50  ¦ 22%   ¦ 103     ¦ 45%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ 20.json ¦ 651  ¦ 38  ¦ 6%    ¦ 115     ¦ 18%       ¦
+---------+------+-----+-------+---------+-----------¦
¦ Total   ¦ 6518 ¦ 811 ¦ 12%   ¦ 2250    ¦ 35%       ¦
+----------------------------------------------------+

Key:
* JSON      = JSON.stringify(UserTiming).length (bytes)
* UTC       = Applying UserTimingCompression (bytes)
* UTC %     = UTC bytes / JSON bytes
* JSON.gz   = gzip(JSON.stringify(UserTiming)).length
* JSON.gz % = JSON.gz bytes / JSON bytes

As you can see, gzip does a pretty good job of compressing raw JSON (stringified) – on average, reducing the size of to 35% of the original. However, UserTimingCompression does a much better job, reducing to 12% of overall size.

What if instead of gzipping the UserTiming JSON, we gzip the minified timestamp map? For example, instead of:

[{"duration":0,"entryType":"mark","name":"mark1","startTime":100},
{"duration":0,"entryType":"mark","name":"mark1","startTime":150},
{"duration":0,"entryType":"mark","name":"mark1","startTime":500}]

What if we gzipped the output of compressing the timestamps?

{"mark1":"2s.1e.9q"}

Here are the results:

+-----------------------------------+
¦ Test    ¦ UTC ¦ UTC.gz ¦ UTC.gz % ¦
+---------+-----+--------+----------¦
¦ 01.json ¦ 66  ¦ 62     ¦ 94%      ¦
+---------+-----+--------+----------¦
¦ 02.json ¦ 11  ¦ 24     ¦ 218%     ¦
+---------+-----+--------+----------¦
¦ 03.json ¦ 18  ¦ 28     ¦ 156%     ¦
+---------+-----+--------+----------¦
¦ 04.json ¦ 36  ¦ 46     ¦ 128%     ¦
+---------+-----+--------+----------¦
¦ 05.json ¦ 66  ¦ 58     ¦ 88%      ¦
+---------+-----+--------+----------¦
¦ 06.json ¦ 43  ¦ 43     ¦ 100%     ¦
+---------+-----+--------+----------¦
¦ 07.json ¦ 43  ¦ 60     ¦ 140%     ¦
+---------+-----+--------+----------¦
¦ 08.json ¦ 20  ¦ 33     ¦ 165%     ¦
+---------+-----+--------+----------¦
¦ 09.json ¦ 63  ¦ 76     ¦ 121%     ¦
+---------+-----+--------+----------¦
¦ 10.json ¦ 32  ¦ 45     ¦ 141%     ¦
+---------+-----+--------+----------¦
¦ 11.json ¦ 20  ¦ 37     ¦ 185%     ¦
+---------+-----+--------+----------¦
¦ 12.json ¦ 19  ¦ 35     ¦ 184%     ¦
+---------+-----+--------+----------¦
¦ 13.json ¦ 34  ¦ 40     ¦ 118%     ¦
+---------+-----+--------+----------¦
¦ 14.json ¦ 145 ¦ 112    ¦ 77%      ¦
+---------+-----+--------+----------¦
¦ 15.json ¦ 48  ¦ 45     ¦ 94%      ¦
+---------+-----+--------+----------¦
¦ 16.json ¦ 33  ¦ 50     ¦ 152%     ¦
+---------+-----+--------+----------¦
¦ 17.json ¦ 18  ¦ 37     ¦ 206%     ¦
+---------+-----+--------+----------¦
¦ 18.json ¦ 8   ¦ 23     ¦ 288%     ¦
+---------+-----+--------+----------¦
¦ 19.json ¦ 50  ¦ 53     ¦ 106%     ¦
+---------+-----+--------+----------¦
¦ 20.json ¦ 38  ¦ 51     ¦ 134%     ¦
+---------+-----+--------+----------¦
¦ Total   ¦ 811 ¦ 958    ¦ 118%     ¦
+-----------------------------------+

Key:
* UTC     = Applying full UserTimingCompression (bytes)
* TS.gz   = gzip(UTC timestamp compression).length
* TS.gz % = TS.gz bytes / UTC bytes

Even with pre-applying the timestamp compression and gzipping the result, gzip doesn’t beat the full UserTimingCompression techniques. Here, in general, gzip is 18% larger than UserTimingCompression. There are a few cases where gzip is better, notably in test cases with a lot of repeating strings.

Additionally, applying gzip requires your app include a JavaScript gzip library, like pako — whose deflate code is currently around 26.3 KB minified. usertiming-compression.js is much smaller, at only 3.9 KB minified.

Finally, if you’re using gzip compression, you can’t just stick the gzip data into a Query String, as URL encoding will increase its size tremendously.

If you’re already using gzip to compress data, it’s a decent choice, but applying some domain-specific knowledge about our data-structures give us better compression in most cases.

Versus MessagePack

MessagePack is another interesting choice for compressing data. In fact, its motto is “It’s like JSON. but fast and small.“. I like MessagePack and use it for other projects. MessagePack is an efficient binary serialization format that takes JSON input and distills it down to a minimal form. It works with any JSON data structure, and is very portable.

How does MessagePack compare to the UserTiming compression techniques?

MessagePack only compresses the original UserTiming JSON to 72% of its original size. Great for a general compression library, but not nearly as good as UserTimingCompression can do. Notably, this is because MessagePack is retaining the JSON strings (e.g. startTime, duration, etc) for each UserTiming object:

+--------------------------------------------------------+
¦         ¦ JSON ¦ UTC ¦ UTC % ¦ JSON.pack ¦ JSON.pack % ¦
+---------+------+-----+-------+-----------+-------------¦
¦ Total   ¦ 6518 ¦ 811 ¦ 12%   ¦ 4718      ¦ 72%         ¦
+--------------------------------------------------------+

Key:
* UTC         = Applying UserTimingCompression (bytes)
* UTC %       = UTC bytes / JSON bytes
* JSON.pack   = MsgPack(JSON.stringify(UserTiming)).length
* JSON.pack % = TS.pack bytes / UTC bytes

What if we just MessagePack the compressed timestamps? (e.g. {"mark1":"2s.1e.9q", ...})

+---------------------------------------+
¦ Test    ¦ UTC ¦ TS.pack  ¦ TS.pack %  ¦
+---------+-----+----------+------------¦
¦ 01.json ¦ 66  ¦ 73       ¦ 111%       ¦
+---------+-----+----------+------------¦
¦ 02.json ¦ 11  ¦ 12       ¦ 109%       ¦
+---------+-----+----------+------------¦
¦ 03.json ¦ 18  ¦ 19       ¦ 106%       ¦
+---------+-----+----------+------------¦
¦ 04.json ¦ 36  ¦ 43       ¦ 119%       ¦
+---------+-----+----------+------------¦
¦ 05.json ¦ 66  ¦ 76       ¦ 115%       ¦
+---------+-----+----------+------------¦
¦ 06.json ¦ 43  ¦ 44       ¦ 102%       ¦
+---------+-----+----------+------------¦
¦ 07.json ¦ 43  ¦ 43       ¦ 100%       ¦
+---------+-----+----------+------------¦
¦ 08.json ¦ 20  ¦ 21       ¦ 105%       ¦
+---------+-----+----------+------------¦
¦ 09.json ¦ 63  ¦ 63       ¦ 100%       ¦
+---------+-----+----------+------------¦
¦ 10.json ¦ 32  ¦ 33       ¦ 103%       ¦
+---------+-----+----------+------------¦
¦ 11.json ¦ 20  ¦ 21       ¦ 105%       ¦
+---------+-----+----------+------------¦
¦ 12.json ¦ 19  ¦ 20       ¦ 105%       ¦
+---------+-----+----------+------------¦
¦ 13.json ¦ 34  ¦ 33       ¦ 97%        ¦
+---------+-----+----------+------------¦
¦ 14.json ¦ 145 ¦ 171      ¦ 118%       ¦
+---------+-----+----------+------------¦
¦ 15.json ¦ 48  ¦ 31       ¦ 65%        ¦
+---------+-----+----------+------------¦
¦ 16.json ¦ 33  ¦ 40       ¦ 121%       ¦
+---------+-----+----------+------------¦
¦ 17.json ¦ 18  ¦ 21       ¦ 117%       ¦
+---------+-----+----------+------------¦
¦ 18.json ¦ 8   ¦ 11       ¦ 138%       ¦
+---------+-----+----------+------------¦
¦ 19.json ¦ 50  ¦ 52       ¦ 104%       ¦
+---------+-----+----------+------------¦
¦ 20.json ¦ 38  ¦ 40       ¦ 105%       ¦
+---------+-----+----------+------------¦
¦ Total   ¦ 811 ¦ 867      ¦ 107%       ¦
+---------------------------------------+

Key:
* UTC       = Applying full UserTimingCompression (bytes)
* TS.pack   = MsgPack(UTC timestamp compression).length
* TS.pack % = TS.pack bytes / UTC bytes

For our 20 test cases, MessagePack is about 7% larger than the UserTiming compression techniques.

Like using a JavaScript module for gzip, the most popular MessagePack JavaScript modules are pretty hefty, at 29.2 KB, 36.9 KB, and 104 KB. Compare this to only 3.9 KB minified for usertiming-compression.js.

Basically, if you have good domain-specific knowledge of your data-structures, you can often compress better than a general-case minimizer like gzip or MessagePack.

Conclusion

It’s fun taking a data-structure you want to work with and compressing it down as much as possible.

UserTiming is a great (and under-utilized) browser API that I hope to see get adopted more. If you’re already using UserTiming, you might have already solved the issue of how to capture, transmit and store these datapoints. If not, I hope these techniques and tools will help you on your way towards using the API.

Do you have ideas for how to compress this data down even further? Let me know!

usertiming-compression.js (and usertiming-decompression.js) are available on Github.

Resources

Forensic Tools for In-Depth Performance Investigations

October 15th, 2015

Another talk Philip Tellis and I gave at Velocity New York 2015.  Check it out on Slideshare:

Forensic Tools for In-Depth Performance Investigations

Measuring the Performance of Single Page Applications

October 15th, 2015

Philip Tellis and I recently gave this talk at Velocity New York 2015.  Check out the slides on Slideshare:

Measuring the Performance of Single Page Applications

The talk is available on YouTube from midwest.io:

This content was also recently made into an eBook:

SPA eBook

 

UserTiming in Practice

May 29th, 2015

UserTiming is a specification developed by the W3C Web Performance working group, with the goal of giving the developer a standardized interface to log timestamps (“marks”) and durations (“measures”).

UserTiming utilizes the PerformanceTimeline that we saw in ResourceTiming, but all of the UserTiming events are put there by the you the developer (or the third-party scripts you’ve included in the page).

UserTiming is currently a Recommendation, which means that browser vendors are encouraged to implement it. As of early 2015, 59% of the world-wide browser market-share support NavigationTiming.

How was it done before?

Prior to UserTiming, developers have been keeping track of performance metrics, such as logging timestamps and event durations by using simple JavaScript:

var start = new Date().getTime();
// do stuff
var now = new Date().getTime();
var duration = now - start;

What’s wrong with this?

Well, nothing really, but… we can do better.

First, as discussed previously, Date().getTime() is not reliable.

Second, by logging your performance metrics into the standard interface of UserTiming, browser developer tools and third-party analytics services will be able to read and understand your performance metrics.

Marks and Measures

Developers generally use two core ideas to profile their code. First, they keep track of timestamps for when events happen. They may log these timestamps (e.g. via Date().getTime()) into JavaScript variables to be used later.

Second, developers often keep track of durations of events. This is often done by taking the difference of two timestamps.

Timestamps and durations correspond to “marks” and “measures” in UserTiming terms. A mark is a timestamp, in DOMHighResTimeStamp format. A measure is a duration, the difference between two marks, also measured in milliseconds.

How to use

Creating a mark or measure is done via the window.performance interface:

partial interface Performance {
    void mark(DOMString markName);

    void clearMarks(optional  DOMString markName);

    void measure(DOMString measureName, optional DOMString startMark,
        optional DOMString endMark);

    void clearMeasures(optional DOMString measureName);
};

interface PerformanceEntry {
  readonly attribute DOMString name;
  readonly attribute DOMString entryType;
  readonly attribute DOMHighResTimeStamp startTime;
  readonly attribute DOMHighResTimeStamp duration;
};

interface PerformanceMark : PerformanceEntry { };

interface PerformanceMeasure : PerformanceEntry { };

A mark (PerformanceMark) is an example of a PerformanceEntry, with no additional attributes:

  • name is the mark’s name
  • entryType is "mark"
  • startTime is the time the mark was created
  • duration is always 0

A measure (PerformanceMeasure) is also an example of a PerformanceEntry, with no additional attributes:

  • name is the measure’s name
  • entryType is "measure"
  • startTime is the startTime of the start mark
  • duration is the difference between the startTime of the start and end mark

Example Usage

Let’s start by logging a couple marks (timestamps):

// mark
performance.mark("start"); 
// -> {"name": "start", "entryType": "mark", "startTime": 1, "duration": 0}

performance.mark("end"); 
// -> {"name": "end", "entryType": "mark", "startTime": 2, "duration": 0}

performance.mark("another"); 
// -> {"name": "another", "entryType": "mark", "startTime": 3, "duration": 0}
performance.mark("another"); 
// -> {"name": "another", "entryType": "mark", "startTime": 4, "duration": 0}
performance.mark("another"); 
// -> {"name": "another", "entryType": "mark", "startTime": 5, "duration": 0}

Later, you may want to compare two marks (start vs. end) to create a measure (called diff), such as:

performance.measure("diff", "start", "end");
// -> {"name": "diff", "entryType": "measure", "startTime": 1, "duration": 1}

Note that measure() always calculates the difference by taking the latest timestamp that was seen for a mark. So if you did a measure against the another marks in the example above, it will take the timestamp of the third call to mark("another"):

performance.measure("diffOfAnother", "start", "another");
// -> {"name": "diffOfAnother", "entryType": "measure", "startTime": 1, "duration": 4}

There are many ways to create a measure:

  • If you call measure(name), the startTime is assumed to be window.performance.timing.navigationStart and the endTime is assumed to be now.
  • If you call measure(name, startMarkName), the startTime is assumed to be startTime of the given mark’s name and the endTime is assumed to be now.
  • If you call measure(name, startMarkName, endMarkName), the startTime is assumed to be startTime of the given start mark’s name and the endTime is assumed to be the startTime of the given end mark’s name.

Some examples of using measure():

// log the beginning of our task (assuming now is '1')
performance.mark("start");
// -> {"name": "start", "entryType": "mark", "startTime": 1, "duration": 0}

// do work (assuming now is '2')
performance.mark("start2");
// -> {"name": "start2", "entryType": "mark", "startTime": 2, "duration": 0}

// measure from navigationStart to now (assuming now is '3')
performance.measure("time to get to this point");
// -> {"name": "time to get to this point", "entryType": "measure", "startTime": 0, "duration": 3}

// measure from "now" to the "start" mark (assuming now is '4')
performance.measure("time to do stuff", "start");
// -> {"name": "time to do stuff", "entryType": "measure", "startTime": 1, "duration": 3}

// measure from "start2" to the "start" mark
performance.measure("time from start to start2", "start", "start2");
// -> {"name": "time from start to start2", "entryType": "measure", "startTime": 1, "duration": 1}

Once a mark or measure has been created, you can query for all marks, all measures, or specific marks/measures via the PerformanceTimeline. Here’s a review of the PerformanceTimeline methods:

window.performance.getEntries();
window.performance.getEntriesByType(type);
window.performance.getEntriesByName(name, type);
  • getEntries(): Gets all entries in the timeline
  • getEntriesByType(type): Gets all entries of the specified type (eg resource, mark, measure)
  • getEntriesByName(name, type): Gets all entries with the specified name (eg URL or mark name). type is optional, and will filter the list to that type.

Here’s an example of using the PerformanceTimeline to fetch a mark:

// performance.getEntriesByType("mark");
[
    {
        "duration":0,
        "startTime":1
        "entryType":"mark",
        "name":"start"
    },
    {
        "duration":0,
        "startTime":2,
        "entryType":"mark",
        "name":"start2"
    },
    ...
]

// performance.getEntriesByName("time from start to start2", "measure");
[
    {
        "duration":1,
        "startTime":1,
        "entryType":"measure",
        "name":"time from start to start2"
    }
]

Standard Mark Names

There are a couple of mark names that are suggested by the W3C specification to have special meanings:

  • mark_fully_loaded: The time when the page is considered fully loaded as marked by the developer in their application
  • mark_fully_visible: The time when the page is considered completely visible to an end-user as marked by the developer in their application
  • mark_above_the_fold: The time when all of the content in the visible viewport has been presented to the end-user as marked by the developer in their application
  • mark_time_to_user_action: The time of the first user interaction with the page during or after a navigation, such as scroll or click, as marked by the developer in their application

By using these standardized names, other third-party tools can pick up on your meanings and treat them specially (for example, by overlaying them on your waterfall).

Obviously, you can use these mark names for anything you want, and don’t have to stick by the recommended meanings.

Benefits

So why would you use UserTiming over just Date().getTime()?

First, it uses the PerformanceTimeline, so marks and measures are in the PerformanceTimeline along with other events

Second, it uses DOMHighResTimestamp instead of Date so the timestamps have sub-millisecond resolution, and are monotonically non-decreasing (so aren’t affected by the client’s clock).

UserTiming is very efficient, as the native browser runtime should be able to do math quicker and store things more performantly than your JavaScript runtime can.

Developer Tools

UserTiming marks and measures are currently available in the Internet Explorer F12 Developer Tools. They are called “User marks” and are shown as upside-down red triangles below:

UserTiming in IE F12 Dev Tools

Chrome and Firefox do not yet show marks or measures in their Developer Tools.

If you call console.timeStamp("foo"), a marker will show up in Chrome and Firefox timelines. However, if you do both console.timeStamp() and performance.mark(), you will get two markers in Internet Explorer.

So for now, if you want a cross-browser compatible way of showing a single marker in dev tools, you would have to sniff the UA (which is generally not recommended). There is an open bug on Chromium to get UserTiming marks to show in the dev tools.

Use Cases

How could you use UserTiming? Here are some ideas:

  • Any place that you’re already logging timestamps or calculating durations could be switched to UserTiming
  • Easy way to add profiling events to your application
  • Note important scenario durations in your Performance Timeline
  • Measure important durations for analytics

Compressing

If you’re adding UserTiming instrumentation to your page, you probably also want to consume it. One way is to grab everything, package it up, and send it back to your own server for analysis.

In my UserTiming Compression article, I go over a couple ways of how to do this. Versus just sending the UserTiming JSON, usertiming-compression.js can reduce the byte size down to just 10-15% of the original.

Availability

UserTiming is available in most modern browsers. According to caniuse.com 59% of world-wide browser market share supports ResourceTiming, as of May 2015. This includes Internet Explorer 10+, Firefox 38+, Chrome 25+, Opera 15+ and Android Browser 4.4+.

CanIUse - UserTiming

As usual, Safari (iOS and Mac) doesn’t yet include support for UserTiming.

However, if you want to use UserTiming for everything, there are polyfills available that work 100% reliably in all browsers.

I have one such polyfill, UserTiming.js, available on Github.

DIY / Open Source / Commercial

If you want to use UserTiming, you could easily compress and beacon the data to your back-end for processing.

WebPageTest sends UserTiming to Google Analytics, Boomerang and SOASTA mPulse:

WebPageTest UserTiming

SOASTA mPulse collects UserTiming information for any Custom Timers you specify (I work at SOASTA, on mPulse and Boomerang):

SOASTA mPulse

Conclusion

UserTiming is a great interface to log your performance metrics into a standardized interface. As more services and browsers support UserTiming, you will be able to see your data in more and more places.

That wraps up our talk about how to monitor and measure the performance of your web apps. Hope you enjoyed it.

Other articles in this series:

More resources:

Updates

  • 2015-12-01: Added Compressing section

ResourceTiming in Practice

May 28th, 2015

ResourceTiming is a specification developed by the W3C Web Performance working group, with the goal of exposing accurate performance metrics about all of the resources downloaded during the page load experience, such as images, CSS and JavaScript.

ResourceTiming builds on top of the concepts of NavigationTiming and provides many of the same measurements, such as the timings of each resource’s DNS, TCP, request and response phases, along with the final “loaded” timestamp.

ResourceTiming takes its inspiration from resource Waterfalls. If you’ve ever looked at the Networking tab in Internet Explorer, Chrome or Firefox developer tools, you’ve seen a Waterfall before. A Waterfall shows all of the resources fetched from the network in a timeline, so you can quickly visualize any issues. Here’s an example from the Chrome Developer Tools:

ResourceTiming inspiration

ResourceTiming is currently a Working Draft, which means it is still a work-in-progress, but many browsers already support it. As of early 2015, 67% of the world-wide browser market-share supports ResourceTiming.

How was it done before?

Prior to ResourceTiming, you could measure the time it took to download resources on your page by hooking into the associated element’s onload event, such as for Images.

Take this example code:

var start = new Date().getTime();
var image1 = new Image();
var loadCallback = function() {
    var now = new Date().getTime();
    var latency = now - start;
    alert("End to end resource fetch: " + latency);
};

image1.onload = loadCallback;
image1.src = 'http://foo.com/image.png';

With the code above, the image is inserted into the DOM when the script runs, at which point it sets the start variable to the current time. The image’s onload attribute is set to loadCallback, which calculates how long it took for the resource to be fetched.

While this is one method for measuring the download time of an image, it’s not very practical.

First of all, it only measures the end-to-end download time, plus any overhead required for the browser to fire the onload callback. For images, this could also include the time it takes to parse and render the image. You cannot get a breakdown of DNS, TCP, SSL, request or response times with this method.

Another issue is with the use of Date.getTime(), which has some major drawbacks. See our discussion on DOMHighResTimeStamp in the NavigationTiming discussion for more details.

Most importantly, to use this method you have to construct your entire web app dynamically, at runtime. Dynamically adding all of the elements that would trigger resource fetches in <script> tags is not practical, nor performant. You would have to insert all <img>, <link rel="stylesheet">, and <script> tags to instrument everything. Doing this via JavaScript is not performant, and the browser cannot pre-fetch resources that would have otherwise been in the HTML.

Finally, it’s impossible to measure all resources that are fetched by the browser using this method. For example, it’s not possible to hook into stylesheets or fonts defined via @import or @font-face statements.

ResourceTiming addresses all of these problems.

How to use

ResourceTiming data is available via several methods on the window.performance interface:

window.performance.getEntries();
window.performance.getEntriesByType(type);
window.performance.getEntriesByName(name, type);

Each of these functions returns a list of PerformanceEntrys. getEntries() will return a list of all entries in the PerformanceTimeline (see below), while if you use getEntriesByType("resource") or getEntriesByName("foo", "resource"), you can limit your query to just entries of the type PerformanceResourceTiming, which inherits from PerformanceEntry.

That may sound confusing, but when you look at the array of ResourceTiming objects, they’ll simply have a combination of the attributes below. Here’s the WebIDL (definition) of a PerformanceEntry:

interface PerformanceEntry {
    readonly attribute DOMString name;
    readonly attribute DOMString entryType;

    readonly attribute DOMHighResTimeStamp startTime;
    readonly attribute DOMHighResTimeStamp duration;
};

Each PerformanceResourceTiming is a PerformanceEntry, so has the above attributes, as well as the attributes below:

interface PerformanceResourceTiming : PerformanceEntry {
    readonly attribute DOMString initiatorType;

    readonly attribute DOMHighResTimeStamp redirectStart;
    readonly attribute DOMHighResTimeStamp redirectEnd;
    readonly attribute DOMHighResTimeStamp fetchStart;
    readonly attribute DOMHighResTimeStamp domainLookupStart;
    readonly attribute DOMHighResTimeStamp domainLookupEnd;
    readonly attribute DOMHighResTimeStamp connectStart;
    readonly attribute DOMHighResTimeStamp connectEnd;
    readonly attribute DOMHighResTimeStamp secureConnectionStart;
    readonly attribute DOMHighResTimeStamp requestStart;
    readonly attribute DOMHighResTimeStamp responseStart;
    readonly attribute DOMHighResTimeStamp responseEnd;
};

Interlude: PerformanceTimeline

The PerformanceTimeline is a critical part of ResourceTiming, and is the interface that you use to fetch ResourceTiming data, as well as other performance information, such as UserTiming data.

The methods getEntries(), getEntriesByType() and getEntriesByName() that you saw above are the primary interfaces of the PerformanceTimeline. The idea is to expose all browser performance information via a standard interface.

All browsers that support ResourceTiming (or UserTiming) will also support the PerformanceTimeline.

Here are the primary methods:

  • getEntries(): Gets all entries in the timeline
  • getEntriesByType(type): Gets all entries of the specified type (eg resource, mark, measure)
  • getEntriesByName(name, type): Gets all entries with the specified name (eg URL or mark name). type is optional, and will filter the list to that type.

We’ll use the PerformanceTimeline to fetch ResourceTiming data (and UserTiming data in the next article).

Back to ResourceTiming

ResourceTiming takes its inspiration from the NavigationTiming timeline.

Here are the phases a single resource would go through during the fetch process:

ResourceTiming timeline

To fetch all of the resources on a page, you simply call one of the PerformanceTimeline methods:

var resources = window.performance.getEntriesByType("resource");

/* eg:
[
    {
        name: "https://www.foo.com/foo.png",
        entryType: "resource",
        startTime: 566.357000003336,
        duration: 4.275999992387369,
        initiatorType: "img",
        redirectEnd: 0,
        redirectStart: 0,
        fetchStart: 566.357000003336,
        domainLookupStart: 566.357000003336,
        domainLookupEnd: 566.357000003336,
        connectStart: 566.357000003336,
        secureConnectionStart: 0,
        connectEnd: 566.357000003336,
        requestStart: 568.4959999925923,
        responseStart: 569.4220000004862,
        responseEnd: 570.6329999957234
    }, ...
]
*/

Looking at these attributes, you can see they are a combination of the base PerformanceEntry attributes (i.e. name, entryType, startTime and duration) with additional PerformanceResourceTiming attributes that are specific to ResourceTiming.

Please note that all of the timestamps are DOMHighResTimeStamps, so they are relative to window.performance.timing.navigationStart. Thus a value of 500 means 500 milliseconds after the page load started.

Here is a description of all of the ResourceTiming attributes:

  • name is the fully-resolved URL of the attribute (relative URLs in your HTML will be expanded to include the full protocol, domain name and path)
  • entryType will always be "resource" for ResourceTiming entries
  • startTime is the time a resource started being fetched
  • initiatorType is the localName of the element that initiated the fetch of the resource (see below for details)
  • redirectStart and redirectEnd encompass the time it took to fetch any previous resources that redirected to the final one listed. If either timestamp is 0, there were no redirects, or one of the redirects wasn’t from the same origin as this resource.
  • fetchStart is the time this specific resource started being fetched, not including redirects
  • domainLookupStart and domainLookupEnd are the timestamps for DNS lookups
  • connectStart and connectEnd are timestamps for the TCP connection
  • secureConnectionStart is the start timestamp of the SSL handshake, if any. If the connection was over HTTP, or if the browser doesn’t support this timestamp (eg. Internet Explorer), it will be 0.
  • requestStart is the timestamp that the browser started to request the resource from the remote server
  • responseStart and responseEnd are the timestamps for the start of the response and when it finished downloading
  • duration is the overall time required to fetch the resource

duration will include time to fetch all redirected resources (if any) as well as the final resource. To track the overall time it took to fetch just the final resource, you may want to use (responseEndfetchStart).

initiatorType is the localName of the element that fetched the resource — in other words, the name of the associated HTML element. The most common values seen for this attribute are:

  • img
  • link
  • script
  • css: url(), @import
  • xmlhttprequest
  • iframe

Note that CSS may have an initiatorType of "link" or "css" because it can either be triggered via a <link> tag or via an @import in a CSS file.

The iframe initiator type is for <IFRAME>s on the page, and the duration will be how long it took to load that frame (e.g. how long it took for the frame’s onload event to fire). This means that the duration will include any statically-included sub-resources within that frame. This is good news, as it can give you insight into cross-origin <IFRAME>s (such as advertisements) that you normally can’t access from ResourceTiming.

ResourceTiming does not (yet) include attributes that expose the HTTP status code of the resource, the number of transferred bytes, or the final object size.

What Resources are included

All of the resources that your browser fetches to construct the page should be listed in the ResourceTiming data. This includes, but is not limited to images, scripts, css, fonts, videos, IFRAMEs and XHRs.

Some browsers (eg. Internet Explorer) may include other non-fetched resources, such as about:blank and javascript: URLs in the ResourceTiming data. This is likely a bug and may be fixed in upcoming versions, but you may want to filter out non-http: and https: protocols.

Additionally, some browser extensions may trigger downloads and thus you may see some of those downloads in your ResourceTiming data as well.

Not all resources will be fetched successfully. There might have been a networking error, due to a DNS, TCP or SSL/TLS negotiation failure. Or, the server might return a 4xx or 5xx response. How this information is surfaced in ResourceTiming depends on the browser:

  • DNS failure
    • Chrome: No ResourceTiming event
    • Internet Explorer: domainLookupStart through responseStart are 0. responseEnd and duration are non-zero.
    • Firefox: domainLookupStart through responseStart are 0. duration is 0 in some cases. responseEnd is non-zero.
  • TCP failure
    • Chrome: No ResourceTiming event
    • Internet Explorer: domainLookupStart through responseStart are 0. responseEnd and duration are non-zero.
    • Firefox: domainLookupStart through responseStart and duration are 0. responseEnd is non-zero.
  • 4xx/5xx response
    • Chrome: No ResourceTiming event
    • Internet Explorer: All timestamps are non-zero.
    • Firefox: All timestamps are non-zero (though duration is 0 in FF < 41)

The working group is attempting to get these behaviors more consistent across browsers. You can read this post for further details.

Note that the root page (your HTML) is not included in ResourceTiming. You can get all of that data from NavigationTiming.

Cached Resources

Cached resources will show up in ResourceTiming right along side resources that were fetched from the network.

There’s no direct indication on the cached resource that it was served from the cache. In practice resources with a very short duration (say under 10 milliseconds) are likely to have been served from the browser’s cache. They might take a few milliseconds due to disk latencies.

There’s no definitive indication that a resource was served from cache due to privacy concerns (have-you-been-to-X-before attack).

The ResourceTiming Buffer

There is a ResourceTiming buffer (per document / IFRAME) that stops filling after its limit is reached. By default, all modern browsers currently set this limit to 150 entries.

The reasoning behind limiting the number of entries is to ensure that, for the vast majority of websites that are not consuming ResourceTiming entries, the browser’s memory isn’t consumed indefinitely holding on to a lot of this information. In addition, for sites that periodically fetch new resources (such as XHR polling), we would’t want the ResourceTiming buffer to grow unbound.

Thus, if you will be consuming ResourceTiming data, you need to have awareness of the buffer. If your site only downloads a handful of resources for each page load (< 100), and does nothing afterwards, you probably won’t hit the limit.

However, if your site downloads over a hundred resources, or you want to be able to monitor for resources fetched on an ongoing basis, you can do one of two things.

First, you can listen for the onresourcetimingbufferfull event which gets fired on the document when the buffer is full. You can then use setResourceTimingBufferSize(n) or clearResourceTimings() to resize or clear the buffer.

As an example, to keep the buffer size at 150 yet continue tracking resources after the first 150 resources were added, you could do something like this;

if ("performance" in window) {
  function onBufferFull() {
    var latestEntries = performance.getEntriesByType("resource");
    performance.clearResourceTimings();

    // analyze or beacon latestEntries, etc
  }

  performance.onresourcetimingbufferfull = performance.onwebkitresourcetimingbufferfull = onBufferFull;
}

Note onresourcetimingbufferfull is not currently supported in Internet Explorer (10, 11 or Edge).

If your site is on the verge of 150 resources, and you don’t want to manage the buffer, you could also just safely increase the buffer size to something reasonable in your HTML header:

<html><head>
<script>
if ("performance" in window 
    && window.performance 
    && window.performance.setResourceTimingBufferSize) {
    performance.setResourceTimingBufferSize(300);
}
</script>
...
</head>...

Note: As of Chrome 42, these are still prefixed as webkitSetResourceTimingBufferSize and webkitClearResourceTimings.

Don’t just setResourceTimingBufferSize(99999999) as this could grow your visitors’s browser’s memory unnecessarily.

Compressing ResourceTiming Data

The HTTP Archive tells us there are about 100 HTTP resources on average, per page, with an average URL length of 85 bytes.

On average, each resource is ~ 500 bytes when JSON.stringify()‘d.

That means you could expect around 45 KB of ResourceTiming data per page load on the “average” site.

If you’re considering beaconing ResourceTiming data back to your own servers for analysis, you may want to consider compressing it first.

There’s a couple things you can do to compress the data, and I’ve written about these methods already. I’ve shared an open-source script that can compress ResourceTiming data that looks like this:

{
    "responseEnd":323.1100000002698,
    "responseStart":300.5000000000000,
    "requestStart":252.68599999981234,
    "secureConnectionStart":0,
    "connectEnd":0,
    "connectStart":0,
    "domainLookupEnd":0,
    "domainLookupStart":0,
    "fetchStart":252.68599999981234,
    "redirectEnd":0,
    "redirectStart":0,
    "duration":71.42400000045745,
    "startTime":252.68599999981234,
    "entryType":"resource",
    "initiatorType":"script",
    "name":"http://foo.com/js/foo.js"
}

To something much smaller, like this (which contains 3 resources):

{
    "http://": {
        "foo.com/": {
            "js/foo.js": "370,1z,1c",
            "css/foo.css": "48c,5k,14"
        },
        "moo.com/moo.gif": "312,34,56"
    }
}

Overall, we can compresses ResourceTiming data down to about 15% of its original size.

Example code to do this compression is available on github.

Timing-Allow-Origin

A cross-origin resource is any resource that doesn’t originate from the same domain as the page. For example, if your visitor is on http://foo.com/ and you’ve fetched resources from http://cdn.foo.com or http://mycdn.com, those resources will both be considered cross-origin.

By default, cross-origin resources expose timestamps for only the fetchStart and responseEnd attributes. This is to protect your privacy (so an attacker can’t load random URLs to see where you’ve been). This means that all of the following attributes will be 0 for cross-origin resources:

  • domainLookupStart
  • domainLookupEnd
  • connectStart
  • connectEnd
  • secureConnectionStart
  • requestStart
  • responseStart

Luckily, if you control the domains you’re fetching other resources from, you can overwrite this default precaution by sending down the Timing-Allow-Origin HTTP response header:

Timing-Allow-Origin = "Timing-Allow-Origin" ":" origin-list-or-null | "*"

In practice, most people that send the Timing-Allow-Origin HTTP header just send a wildcard origin:

Timing-Allow-Origin: *

So if you’re serving any of your content from another domain name, i.e. from a CDN, it is strongly recommended that you set the Timing-Allow-Origin header for those responses.

Thankfully, third-party libraries for widgets, ads, analytics, etc are starting to set the header on their content. Only about 5% currently do, but this is growing (according to the HTTP Archive). Notably, Google, Facebook, Disqus, and mPulse send this header for their scripts.

Blocking Time

Browsers will only open a limited number of connections to each unique origin (protocol/server name/port) when downloading resources.

If there are more resources than the # of connections, the later resources will be “blocking”, waiting for their turn to download.

Blocking time is generally seen as “missing periods” (non-zero durations) that occur between connectEnd and requestStart (when waiting on a Keep-Alive TCP connection to reuse), or between fetchStart and domainLookupStart (when waiting on things like the browser’s cache).

The duration attribute includes Blocking time. So in general, you may not want to use duration if you’re only interested in actual network timings. Unfortunately, duration, startTime and responseEnd are the only attributes you get with cross-origin resources, so you can’t easily subtract out Blocking time from cross-origin resources.

To calculate Blocking time, you would do something like this:

var blockingTime = 0;
if (res.connectEnd && res.connectEnd === res.fetchStart) {
    blockingTime = res.requestStart - res.connectEnd;
} else if (res.domainLookupStart) {
    blockingTime = res.domainLookupStart - res.fetchStart;
}

ServiceWorkers

If you are using ServiceWorkers in your app, there is an ongoing discussion to update ResourceTiming so you can get information about how long it took for the ServiceWorker to process each resource.

There is a new workerStart attribute that will be exposed. The difference between workerStart and fetchStart is the processing time of the ServiceWorker:

var workerProcessingTime = 0;
if (res.workerStart && res.fetchStart) {
    workerProcessingTime = res.fetchStart - res.workerStart;
}

This isn’t available in any browsers yet, but there are open bugs to add it.

Use Cases

Now that resource timing data is available in the browser in an accurate and reliable manner, there are a lot of things you can do with the information. Here are some ideas:

  • Send all resource timings to your backend analytics
  • Raise an analytics event if any resource takes over X seconds to download (and trend this data)
  • Watch specific resources (eg third-party ads or analytics) and complain if they are slow
  • Monitor the overall health of your DNS infrastructure by beaconing DNS resolve time per-domain
  • Look for production resource errors (eg 4xx/5xx) in browsers that add errors to the buffer it (IE/Firefox)
  • Use ResourceTiming to determine your site’s “visual complete” metric by looking at timings of all above-the-fold images

The possibilities are nearly endless. Please leave a comment with how you’re using ResourceTiming data.

DIY and Open-Source

Here are several interesting DIY / open-source solutions that utilize ResourceTiming data:

Andy Davies’ Waterfall.js shows a waterfall of any page’s resources via a bookmarklet: github.com/andydavies/waterfall

Andy Davies' Waterfall.js

Mark Zeman’s Heatmap bookmarklet / Chrome extension gives a heatmap of when images loaded on your page: github.com/zeman/perfmap

Mark Zeman's Heatmap bookmarklet and extension

Nurun’s Performance Bookmarklet breaks down your resources and creates a waterfall and some interesting charts: github.com/nurun/performance-bookmarklet

Nurun's Performance Bookmarklet

Boomerang also captures ResourceTiming data and beacons it back to your backend analytics server: github.com/lognormal/boomerang

Commercial Solutions

If you don’t want to build or manage a DIY / Open-Source solution to gather ResourceTiming data, there are many great commercial services available.

Disclaimer: I work at SOASTA, on mPulse and Boomerang

SOASTA mPulse captures 100% of your site’s traffic and gives you Waterfalls for each visit:

SOASTA mPulse Resource Timing

New Relic Browser:

New Relic Browser

App Dynamics Web EUEM:

App Dynamics Web EUEM

Availability

ResourceTiming is available in most modern browsers. According to caniuse.com, 67% of world-wide browser market share supports ResourceTiming (as of May 2015). This includes Internet Explorer 10+, Firefox 36+, Chrome 25+, Opera 15+, and Android Browser 4.4+.

CanIUse - ResourceTiming - May 2015

Again, notably absent, is Safari (both iOS and Mac). This means a significant share of the mobile browser market does not have performance timing data available, which is a shame. Let’s hope that changes soon.

There are no polyfills available for ResourceTiming, as the data is just simply not available if the browser doesn’t expose it.

Tips

Here are some additional (and re-iterated) tips for using ResourceTiming data:

  • For many sites, most of your content will not be same-origin, so ensure all of your CDNs and third-party libraries send the Timing-Allow-Origin HTTP response header.
  • ResourceTiming data does not include some aspects of the resource, such as it’s transfer size, content size or HTTP code for privacy concerns.
  • If you’re going to be managing the ResourceTiming buffer, make sure no other scripts are managing it as well (eg third-party analytics scripts). Otherwise, you may have two listeners for onresourcetimingbufferfull stomping on each other.
  • The duration attribute includes Blocking time (when a resource is blocked behind other resources on the same socket).
  • Each IFRAME will have its own ResourceTiming data, and those resources won’t be included in the parent FRAME/document. You’ll need to traverse the document frames to get all resources. See github.com/nicjansma/resourcetiming-compression.js.
  • about:blank and javascript: URLs may be in the ResourceTiming data for some browsers, and you may want to filter them out.
  • Browser extensions may show up in ResourceTiming data, if they initiate downloads. We’ve seen Skype and other extensions show up.

ResourceTiming – Coming Soon

All of the notes above document what current browsers support as of May 2015. There is ongoing work on the specification to add some additional attributes. Some of the new attributes may include:

  • nextHopProtocol: ALPN Protocol ID
  • transferSize: Bytes transferred for HTTP header and response
  • decodedBodySize: Size of the body after removing any applied content-codings
  • encodedBodySize: Size of the body after prior to removing any applied content-codings

In addition, there is an ongoing discussion for a more performant way to observe new ResourceTiming entries instead of polling getEntriesByType().

Conclusion

ResourceTiming exposes accurate performance metrics for all of the resources fetched on your page. You can use this data for a variety of scenarios, from investigating the performance of your third-party libraries to taking specific actions when resources aren’t performing according to your performance goals.

Next up: Using UserTiming data to expose custom metrics for your JavaScript apps in a standardized way.

Other articles in this series:

More resources:

Updates

  • 2016-01-03: Updated Firefox’s 404 and DNS behavior via Aaron Peters