Boomerang Performance Update

December 5th, 2019

Table Of Contents

  1. Introduction
  2. Boomerang Loader Snippet Improvements
  3. ResourceTiming Compression Optimization
  4. Debug Messages
  5. Minification
  6. Cookie Size
  7. Cookie Access
  8. MD5 plugin
  9. SPA plugin
  10. Brotli
  11. Performance Test Suite
  12. Next

Boomerang is an open-source JavaScript library that measures the page load experience of real users, commonly called RUM (Real User Measurement).

Boomerang is used by thousands of websites large and small, either via the open-source library or as part of Akamai’s mPulse RUM service. With Boomerang running on billions of page loads a day, the developers behind Boomerang take pride in building a reliable and performant third-party library that everyone can use without being concerned about its measurements affecting their site.

Two years ago, we performed an audit of Boomerang’s performance, to help communicate the “cost” of including Boomerang on a page as a third-party script. In doing the audit, we found several areas of code that we wanted to improve, and have been working steadily to make it better for our customers.

This article highlights some of the improvements we’ve made in the last two years, and what we still want to work on next!

Boomerang Loader Snippet Improvements

We recommend loading Boomerang via our custom loader snippet. The snippet ensures Boomerang loads asynchronously and non-blocking, so it won’t affect the Page Load time. Version 10 (v10) of this snippet utilizes an IFRAME to host Boomerang, which gets it out of the critical path.

When we reviewed the performance impact of the snippet we found that on modern devices and browsers the snippet itself should take less than 10ms of CPU, but on older or slower devices it could take 20-40ms.

A significant portion of this CPU cost was creating a dynamic IFRAME and document.write()‘ing into it.

Last year, our team developed an updated version of the loader snippet we call Version 12 (v12), which utilizes Preload on modern browsers (instead of an IFRAME) to load Boomerang asynchronously. This avoids the majority of the CPU cost we saw when profiling the snippet.
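The rough shape of the Preload approach is sketched below. This is an illustrative simplification, not the actual v12 snippet (the real snippet also handles timeouts and other edge cases), and the script URL is a placeholder:

// Illustrative sketch only -- not the actual v12 loader snippet.
// Uses <link rel="preload"> to fetch boomerang.js without blocking, then
// executes it with an async <script> once the preload completes.
(function(url) {
  var link = document.createElement("link");

  if (link.relList && link.relList.supports && link.relList.supports("preload") && ("as" in link)) {
    link.rel = "preload";
    link.as = "script";
    link.href = url;
    link.onload = function() {
      var script = document.createElement("script");
      script.async = true;
      script.src = url;
      document.head.appendChild(script);
    };
    document.head.appendChild(link);
  } else {
    // Older browsers: fall back to the v10-style IFRAME technique (omitted here)
  }
})("https://your-cdn.example.com/boomerang.js");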

As long as your browser supports Preload, the new loader snippet should have negligible CPU cost:

Device              OS                 Browser     Snippet v10 (ms)  Snippet v12 (ms)  Method
PC Desktop          Win 10             Chrome 73   7                 1                 Preload
PC Desktop          Win 10             Firefox 66  3                 3                 IFRAME
PC Desktop          Win 10             IE 10       12                12                IFRAME
PC Desktop          Win 10             IE 11       14                14                IFRAME
PC Desktop          Win 10             Edge 44     8                 1                 Preload
MacBook Pro (2017)  macOS High Sierra  Safari 12   3                 1                 Preload
Galaxy S4           Android 4          Chrome 56   37                1                 Preload
Galaxy S8           Android 8          Chrome 73   9                 1                 Preload
Galaxy S10          Android 9          Chrome 73   7                 1                 Preload
iPhone 4            iOS 7              Safari 7    19                19                IFRAME
iPhone 5s           iOS 11             Safari 11   9                 9                 IFRAME
iPhone 5s           iOS 12             Safari 12   9                 1                 Preload

Browsers that don’t support Preload (such as IE and Firefox) will use the IFRAME fallback, which should still take minimal time to execute.

In addition, the new loader snippet is CSP-compliant and brings some SEO improvements (creating an IFRAME in the <head>, as the old snippet did, can confuse some web crawlers).

You can review the difference between the two versions here.

ResourceTiming Compression Optimization

Boomerang can collect ResourceTiming data for all of the resources on the page. We compress the data to reduce its size, but we found this was one of the most expensive things we did.

Boomerang's ResourceTiming Compression in CPU Profiles

On some sites — especially those with hundreds of resources — our ResourceTiming compression could take 20-100ms or more. Most of the cost is in our optimizeTrie() function. Let’s take a look at what it does.

Say you have a list of ResourceTiming entries, i.e. resources fetched from a website:

http://site.com/
http://site.com/assets/js/main.js
http://site.com/assets/js/lib.js
http://site.com/assets/css/screen.css

We convert this list of URLs into an optimized Trie, a data structure that compresses common prefixes (i.e. http://site.com/ for all URLs above):

{"http://site.com/": {
    "|": "[data]",
    "assets/": {
        "js/": {
            "main.js": "[data]",
            "lib.js": "[data]"
        },
        "css/screen.css": "[data]"
    }
}}

Originally, we were compressing a perfectly-optimized Trie — we would evaluate every character of every URL to find whether other URLs share the same prefix. This character-by-character iteration would create call stacks N deep, where N is the length of the longest URL on the page. As you can imagine, this is pretty costly to execute!

We changed the Trie optimization to split each URL at every "/" instead of at every character. This leads to a slightly less optimized Trie (meaning, a few bytes larger), but it’s significantly faster: call stacks are now only as deep as the number of slashes in the URL.
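As a rough sketch of the idea (this is not Boomerang's actual optimizeTrie() code), building the Trie from "/"-delimited segments rather than individual characters might look something like this:

// Illustrative sketch -- not Boomerang's actual Trie code.
// Insert a URL into a Trie keyed by "/"-delimited segments instead of characters,
// so recursion depth is bounded by the number of slashes, not the URL length.
function addToTrie(trie, url, data) {
  var segments = url.split("/");  // e.g. ["http:", "", "site.com", "assets", "js", "main.js"]
  var node = trie;

  for (var i = 0; i < segments.length - 1; i++) {
    var key = segments[i] + "/";  // re-attach the separator so prefixes stay intact
    node = node[key] = node[key] || {};
  }

  node[segments[segments.length - 1]] = data;
  return trie;
}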

These optimizations are most significant on large sites (i.e. 100+ URLs): on sites where the ResourceTiming optimization was taking > 100ms (on desktop CPUs), changing to splitting at "/" reduced CPU time to ~25ms at only a 4% growth in data.

On less complex sites that used to take 25-35ms to compress this data, changing our algorithm reduced CPU time to just ~10ms at only 2-3% data growth.

Collecting ResourceTiming is still one of the more expensive operations that Boomerang does, but its costs are much less noticeable!

You can review our change here.

Debug Messages

Our community noticed that even in the production builds of Boomerang, the BOOMR.debug() debug messages were included in the minified JavaScript, even though they would never be echoed to the console.
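For illustration, a build step along these lines (a hypothetical simplification, not our actual grunt configuration; file names are placeholders) can strip single-line BOOMR.debug() calls before minification:

// Hypothetical build step -- not Boomerang's actual grunt task.
// Removes single-line BOOMR.debug(...) calls from the source before minification.
var fs = require("fs");

var src = fs.readFileSync("boomerang.js", "utf-8");
var stripped = src.replace(/^\s*BOOMR\.debug\(.*\);\s*$/gm, "");

fs.writeFileSync("boomerang-nodebug.js", stripped);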

Stripping these messages from the build saw immediate size improvements:

  • Original size (minified): 205,769 bytes
  • BOOMR.debug() and related messages removed: 196,125 bytes (-5%)

Gzipped:

  • Original size (minified, gzipped): 60,133 bytes
  • BOOMR.debug() and related messages removed (minified, gzipped): 57,123 bytes (-6%)

Removing dead code is always a win!

You can review our change here.

Minification

For production builds, we used UglifyJS-2 to minify Boomerang.

Minification is incredibly powerful — in our case, our “debug” builds are over 800 KB, and are reduced to 191 KB on minification:

  • Boomerang with all comments and debug code: 820,130 bytes
  • Boomerang minified: 196,125 bytes (76% savings)

When Boomerang was first created, it used the YUI Compressor for minification. We switched to UglifyJS-2 in 2015 as it offered the best minification rate at the time.

UglifyJS-3 is now out, so we wanted to investigate if it (or any other tool) offered better minification in 2019.

After some experimentation, we changed from UglifyJS-2 to UglifyJS-3, while also enabling the ie8:true option for compatibility and compress.keep_fnames option to improve Boomerang Error reporting.
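For reference, the options in question look roughly like this (a simplified sketch, not our exact build configuration; file names are placeholders):

// Simplified sketch of the UglifyJS-3 options described above (not our exact build config).
var fs = require("fs");
var UglifyJS = require("uglify-js");

var result = UglifyJS.minify(fs.readFileSync("boomerang-debug.js", "utf-8"), {
  ie8: true,                          // keep output safe for older IE
  compress: { keep_fnames: true },    // preserve function names for better Error reporting
  mangle: { keep_fnames: true }
});

fs.writeFileSync("boomerang.min.js", result.code);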

After all of these changes, Boomerang is 2,074 bytes larger uncompressed (because of ie8 and keep_fnames), though 723 bytes smaller once gzipped. Had we not enabled those compatibility options, Uglify-3 would’ve been 2,596 bytes smaller than Uglify-2. We decided the compatibility changes were worth it.

You can review this change here.

Our team also looked at the Closure compiler, and we estimate it would give us additional savings:

  • UglifyJS-3: 47 KB brotli, 55 KB gzip
  • Closure: 42 KB brotli, 47 KB gzip

Unfortunately, some of the Closure compiler’s optimizations complain about parts of the Boomerang source code. Boomerang today can collect performance metrics on IE6+, and switching to Closure might reduce our support for older browsers.

Cookie Size

mPulse uses a first-party cookie named RT to track mPulse sessions between pages, and to track Page Load time on browsers that don’t support NavigationTiming.

Here’s an example of what the cookie might look like in a modern browser. Our cookie consists of name-value pairs (~322 bytes):

dm=virtualglobetrotting.com
si=9563ee29-e809-4844-88df-d4e84697f475
ss=1537818004420
sl=1
tt=7556
obo=0
sh=1537818012102%3D1%3A0%3A7556
bcn=%2F%2F17d98a5a.akstat.io%2F
ld=1537818012103
nu=https%3A%2F%2Fvirtualglobetrotting.com%2Fmaps%2F
cl=1537818015199
r=https%3A%2F%2Fvirtualglobetrotting.com%2F
ul=1537818015211

We reviewed everything that we were storing in the cookie, and found a few areas for improvement.

The two biggest contributors to its size are the "nu" and "r" values, which are the URL of the next page the user is navigating to, and the referring page, respectively. This means the cookie will be at least as long as those two URLs.

We decided that we didn’t need to store the full URL in the cookie: we could use a hash instead, and compare the hashes on the next page. This can save a lot of bytes on longer-URL sites.

We were also able to reduce the size of all of the timestamps in the cookie by switching to Base36 encoding, and offsetting each timestamp by the start time ("ss"). We also removed some unused debugging data ("sh").
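As a sketch of the timestamp trick (illustrative only, not Boomerang's exact code, though the output matches the "ld" value in the example cookie below), each timestamp is stored as a Base36-encoded offset from the session start:

// Illustrative sketch of Base36 timestamp compression (not Boomerang's exact code).
function compressTimestamp(timestamp, sessionStart) {
  // Store the offset from the session start ("ss"), Base36-encoded
  return (timestamp - sessionStart).toString(36);
}

// e.g. the "ld" (load) timestamp from the original cookie above:
compressTimestamp(1537818012103, 1537818004420);  // "5xf"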

Example cookie after all of the above changes (~189 bytes, 41% smaller):

dm=virtualglobetrotting.com
si=9563ee29-e809-4844-88df-d4e84697f475
ss=jmgp4udg
sl=1
tt=5tw
obo=0
bcn=%2F%2F17d98a5a.akstat.io%2F
ld=5xf
nu=13f91b339af573feb2ad5f66c9d65fc7
cl=8bf
ul=8br
z=1

You can review our change here.

Cookie Access

As part of the 2017 Boomerang Performance Audit we found that Boomerang was reading/writing the Cookie multiple times during the page load, and the relevant functions were showing up frequently in CPU profiles:

Boomerang Cookie Access

Upon review, we found that Boomerang was reading the RT cookie up to 21 times during a page load and setting it a further 8 times. This seemed… excessive. Cookie accesses may show up in CPU profiles because the browser may need to access the disk to persist the data.

Through an audit of these reads/writes and some optimizations, we’ve been able to reduce this down to 2 reads and 4 writes, which is the minimum we think is required to stay reliable.
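The general pattern behind the reduction is caching. A minimal sketch (not Boomerang’s actual implementation):

// Minimal sketch (not Boomerang's actual code): read and parse document.cookie once,
// then serve subsequent lookups from memory instead of touching the cookie again.
var cachedRTCookie;

function getRTCookie() {
  if (cachedRTCookie === undefined) {
    var match = document.cookie.match(/(?:^|;\s*)RT=([^;]*)/);
    cachedRTCookie = match ? decodeURIComponent(match[1]) : "";
  }
  return cachedRTCookie;
}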

This change is in the process of being back-ported to the open-source repo, and should be available by the end of the year.

MD5 plugin

This improvement comes from our new Boomerang developer Tsvetan Stoychev. In his words:

“About 2 months ago we received a ticket in our GitHub repository about an issue developers experienced when they tried to build a Boomerang bundle that does not contain the Boomerang MD5 plugin. It was nothing critical but I started working on a small fix to make sure the issue doesn’t happen again in newer versions of Boomerang.

While working on the fix, I managed to understand the bigger picture and the use cases where we used the MD5 plugin. I saw an opportunity to replace MD5 implementation with an algorithm that doesn’t generate hashes as strong as MD5 but would still work in our use case.

The goal was to reduce the Boomerang bundle size and to keep things backward-compatible. I asked around in a group of professionals who do programming for IoT devices and received some good ideas about lightweight algorithms that could save a lot of bytes.

I looked at 2 candidates: CRC32 and FNV-1. The winner was a slightly modified version of FNV-1 and in terms of bytes it was 0.33 KB (uncompressed) compared to MD5, which was 8.17 KB (uncompressed). I also experimented with CRC32 but the JavaScript implementation was 2.55 KB (uncompressed) because the CRC32 source code contains a string of a CRC table that adds quite a few bytes. As a matter of fact, there is a way to use a smaller version that is 0.56 KB (uncompressed) where the CRC table is generated on the fly, but I found this version was adding more complexity and had a performance penalty when the CRC table was generated.

Let’s look at the data I gathered during my research where I compare performance, bytes and chances for collisions:

Algorithm         Collisions  Size (uncompressed)  Hash Length  Hashes / Sec
MD5               0           8.17 KB              32           35,397
CRC32             8           2.55 KB              9-10         253,680
FNV-1 (original)  3           0.33 KB              9-10         180,056
FNV-1 (modified)  0           0.34 KB              6-8          113,532
  • Collision testing was performed on 500,000 unique URLs.
  • Performance benchmark was performed with the help of jsPerf

FNV-1 modifications:

  • In order to achieve better entropy for the FNV-1 algorithm, we concatenate the length of our input string with the end of the original FNV-1 generated hash.
  • The FNV-1 output is actually a number that we convert to base 36 in the modified version in order to make the number’s string representation shorter.

Below is the final result in case you would like to test the modified version of FNV-1:

var fnv = function(string) {
    string = encodeURIComponent(string);

    // FNV offset basis
    var hval = 0x811c9dc5;

    for (var i = 0; i < string.length; i++) {
        hval = hval ^ string.charCodeAt(i);

        // Multiply by the FNV prime (16777619) using shifts
        hval += (hval << 1) + (hval << 4) + (hval << 7) + (hval << 8) + (hval << 24);
    }

    // Append the input length for better entropy, then Base36-encode to shorten it
    var hash = (hval >>> 0).toString() + string.length;

    return parseInt(hash).toString(36);
};

This change is in the process of being back-ported to the open-source repo, and should be available by the end of the year.

SPA Plugin

This improvement comes from our Boomerang developer Nigel Heron, who’s been working on a large refactor (and simplification) of our Single Page App monitoring code. In his words:

“Boomerang had 4 SPA plugins: Angular, Ember, Backbone and History. The first 3 are SPA-framework-specific and hook into special functions in their respective frameworks to detect route changes (e.g. Angular’s $routeChangeStart).

The History plugin could be used in 2 ways, either by hooking into the React Router framework’s history object or by hooking into the browser’s window.history object for use by any other framework that calls the window History API before issuing route changes.

As SPA frameworks have evolved over the years, calling into the History API before route changes has become standard. We determined that the difference between the timings observed by hooking History API calls and those observed by hooking framework-specific events was negligible.

We modified the History plugin to remove React specific code and removed the 3 framework specific plugins from Boomerang.

This has simplified Boomerang’s SPA logic, simplified the configuration of Boomerang on SPA sites and as a bonus, has dropped the size of Boomerang!

        Unminified  Minified  Gzipped
Before  796 KB      202 KB    57.27 KB
After   779 KB      200 KB    57.07 KB
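As a rough illustration of the History API approach Nigel describes (not the actual History plugin code), a monitoring library can wrap history.pushState and listen for popstate to detect route changes:

// Rough illustration -- not Boomerang's actual History plugin.
(function(history) {
  var origPushState = history.pushState;

  history.pushState = function(state, title, url) {
    // A RUM library would start its "soft navigation" timer here
    onRouteChangeStart(url);
    return origPushState.apply(this, arguments);
  };

  // Back/forward navigations fire popstate instead of pushState
  window.addEventListener("popstate", function() {
    onRouteChangeStart(document.location.href);
  });

  function onRouteChangeStart(url) {
    console.log("SPA route change starting:", url);
  }
})(window.history);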

Brotli

This is a fix that could be applied to any third-party library, but when we ran our audit two years ago, we had not yet enabled Brotli compression for boomerang.js delivery when using mPulse.

mPulse’s boomerang.js is delivered via the Akamai CDN, and we were able to utilize Akamai’s Resource Optimizer to automatically enable Brotli compression for our origin-gzip-compressed JavaScript URLs, with minimal effort. We should have done this sooner!

The over-the-wire byte savings of Brotli vs gzip are pretty significant. Taking the most recent build of Boomerang with all plugins enabled:

  • gzip: 54,803 bytes
  • brotli: 48,663 bytes (11.2% savings)

Brotli is now enabled for all boomerang.js downloads for mPulse customers.

(Over-the-wire byte size does not reduce the parse/compile/initialization time of Boomerang, and we’re still looking for additional ways of making Boomerang and its plugins smaller and more efficient.)

Performance Test Suite

After spending so much time improving the above issues, it would be a shame if we were working on unrelated changes and accidentally regressed our improvements!

To assist us in this, we built a lightweight performance test suite into the Boomerang build infrastructure that can be executed on-demand and the results can be compared to previous runs.

We built the performance tests on top of our existing End-to-End (E2E) tests, which are basic HTML pages we use to test Boomerang in various scenarios. Today we have over 550 E2E tests that run on every build.

The new Boomerang Performance Tests utilize a similar test infrastructure and also track metrics built into the debug builds of Boomerang, such as the scenario’s CPU time or how many times the cookie was set.

Example results of running one Boomerang Test scenario:

> grunt perf

{
  "00-basic": {
    "00-empty": {
      "page_load_time": 30,
      "boomerang_javascript_time": 29.5,
      "total_javascript_time": 43,
      "mark_startup_called": 1,
      "mark_check_doc_domain_called": 4,
  ...
}

If we re-run the same test later after making changes, we can directly compare the results:

> grunt perf:compare

Running "perf-compare" task
Results comparison to baseline:
00-basic.00-empty.page_load_time                                       :  30 -6 (-20%)

We still need to be attentive to our changes and to run these tests automatically to “hold the line”.

We also need to try to be mindful and put in instrumentation (metrics) when we’re making fixes to make sure we have something to track over time.

You can review our change here.

Next

With all of the above improvements, where are we at?

We’ve:

  • Reduced the overhead and increased the compatibility of the Loader Snippet
  • Reduced the CPU cost of gathering ResourceTiming data
  • Reduced the size of Boomerang by stripping debug messages, applying smarter minification, and enabling Brotli
  • Reduced the size and overhead of cookies for Session tracking
  • Reduced the size of Boomerang and complexity of hashing URLs by switching to FNV, and by simplifying our SPA plugins
  • Created a Performance Test Suite to track our improvements over time

All of these changes had to be balanced against the other work we do to improve the library itself. We still have a large list of other performance-related items we hope to tackle in 2020.

One of the main areas we continue to focus on is Boomerang’s size. As more features, reliability and compatibility fixes get added, Boomerang continues to increase in size. While we’ve done a lot of work to keep it under 50 KB (over the wire), we know there is more we can do.

One thing we’re looking into for our mPulse customers is to deliver builds of Boomerang with the minimal features necessary for that application’s configuration. For example, if Single Page App support is not needed, all of the related SPA plugins can be omitted. We’re exploring ways of quickly switching between different builds of Boomerang for mPulse customers, while still having a high browser cache hit rate and yet allowing customers to change their configuration and see it “live” within 5 minutes.

Note that open-source users of Boomerang can always build a custom build with only the plugins they care about. Depending on the features needed, open-source builds can be under 30 KB with the majority of the plugins included.

Thanks

Boomerang is built by the developers at Akamai and the broader performance community on GitHub. Many people have a hand in building and improving Boomerang, including the changes mentioned above by: Nigel Heron, Nic Jansma, Andreas Marschke, Avinash Shenoy, Tsvetan Stoychev, Philip Tellis, Tim Vereecke, Aleksey Zemlyanskiy, and all of the other open-source contributors.

Side Effects of Boomerang’s JavaScript Error Tracking

April 9th, 2019

Table Of Contents

  1. Introduction
  2. What is Boomerang Doing
  3. Fixing Script Error
  4. Workarounds for Third Parties that Aren’t Sending ACAO
  5. Disabling Wrapping
  6. Side Effects of Wrapping
    1. Overhead
    2. Console Logs
    3. Browser CPU Profiling
    4. Chrome Lighthouse and Page Speed Insights
    5. WebPagetest
  7. Summary

Introduction

TL;DR: If you don’t have time to read this article, head down to the summary. If you’re specifically interested in how Boomerang may affect Chrome Lighthouse, Page Speed Insights or WebPagetest, check those sections.

Boomerang is an open-source JavaScript library that measures the page load time experienced by real users, commonly called RUM (Real User Measurement). Boomerang is easily integrated into personal projects, enterprise websites, and also powers Akamai’s mPulse RUM monitoring. Boomerang optionally includes a JavaScript Error Tracking plugin, which monitors the page for JavaScript errors and includes those error messages on the beacon.

When JavaScript Error Tracking is enabled for Boomerang, it can provide real-time telemetry (analytics) on the health of your application. However, due to the way Boomerang gathers important details about each error (e.g. the full message and stack), it might be incorrectly blamed for causing errors, or as the culprit for high CPU usage.

This unintentional side-effect can be apparent in tools like browser developer tools’ consoles, WebPagetest, Chrome Lighthouse and other benchmarks that report on JavaScript CPU usage. For example, Lighthouse may blame Boomerang under Reduce JavaScript Execution Time due to how Boomerang “wraps” functions to gather better error details. This unfortunately “blames” Boomerang for more work than it is actually causing.

Let’s explore why, and how you can ensure those tools report on the correct source of errors and JavaScript CPU usage.

This article is applicable to both the open-source Boomerang as well as the Boomerang used with Akamai mPulse. Where appropriate, differences will be mentioned.

Read more…

3rdParty.io

August 1st, 2018

3rdParty.io is a new tool I’ve released to evangelize best-practices for third-party scripts and libraries:

3rdParty.io

3rdParty.io is a YSlow-like developer tool that’s designed to run best-practice checklists on third-party scripts and libraries. While YSlow and other great synthetic testing tools (such as Chrome’s Lighthouse) perform audits on entire websites, 3rdParty.io focuses on individual third-party JavaScript files, validating that they won’t slow down your site or cause compatibility or security concerns.

What exactly is a third party? Any JavaScript script, library, widget, snippet, tag, ad, analytics, CSS, font, or resource that you include in your site that comes from somewhere else.

Third-party scripts have the ability to break your site, cause performance problems, leave you vulnerable, and cost you money.

There are many great providers of third-party scripts out there who are following best practices to ensure that their content is secure and performant. However, not every third-party has followed best practices, and these mistakes can wreak havoc on your site.

3rdParty.io’s goals are to monitor popular third-party scripts, promote best practices, and to ensure that content providers can feel comfortable including third-party scripts and libraries in their site without worry.

There’s an existing library of third-parties that are being constantly validated (with many more being included soon).  You can click on any of them to get a breakdown of what they’re doing right, and how they can improve:

3rdParty.io mPulse

Each check (there are about 60 today) can be expanded to get details of why the check is important, and how it can be fixed:

3rdParty.io TAO check

You can run any third-party JavaScript library through the site via the home page.  Simply paste the JavaScript URL into the Check! bar:

3rdParty.io On Demand

Once you hit Check!, we will run an automated checkup within a few moments:

3rdParty.io On Demand Results

It’s still a work-in-progress, with a lot still to be done: adding more checks, making it more reliable, loading more third-parties into the database.  I would love your feedback!

Please check out the tool at 3rdParty.io!

When Third Parties Stop Being Polite… and Start Getting Real

July 23rd, 2018

When Third Parties Stop Being Polite... and Start Getting Real

At Fluent 2018, Charlie Vazac and I gave a talk titled When Third Parties Stop Being Polite… and Start Getting Real. Here’s the abstract:

Would you give the Amazon Prime delivery robot the key to your house, just because it stops by to deliver delicious packages every day? Even if you would, do you still have 100% confidence that it wouldn’t accidentally drag in some mud, let the neighbor in, steal your things, or burn your house down? Worst-case scenarios such as these are what you should be planning for when deciding whether or not to include third-party libraries and services on your website. While most libraries have good intentions, by including them on your site, you have given them complete control over the kingdom. Once on your site, they can provide all of the great services you want—or they can destroy everything you’ve worked so hard to build.

It’s prudent to be cautious: we’ve all heard stories about how third-party libraries have caused slowdowns, broken websites, and even led to downtime. But how do you evaluate the actual costs and potential risks of a third-party library so you can balance that against the service it provides? Every library requires nonzero overhead to provide the service it claims. In many cases, the overhead is minimal and justified, but we should quantify it to understand the real cost. In addition, libraries need to be carefully crafted so they can avoid causing additional pain when the stars don’t align and things go wrong.

Nic Jansma and Charles Vazac perform an honest audit of several popular third-party libraries to understand their true cost to your site, exploring loading patterns, SPOF avoidance, JavaScript parsing, long tasks, runtime overhead, polyfill headaches, security and privacy concerns, and more. From how the library is loaded, to the moment it phones home, you’ll see how third-parties can affect the host page and discover best practices you can follow to ensure they do the least potential harm.

With all of the great performance tools available to developers today, we’ve gained a lot of insight into just how much third-party libraries are impacting our websites. Nic and Charles detail tools to help you decide if a library’s risks and unseen costs are worth it. While you may not have the time to perform a deep dive into every third-party library you want to include on your site, you’ll leave with a checklist of the most important best practices third-parties should be following for you to have confidence in them.

You can watch the presentation on YouTube or catch the slides.

ResourceTiming Visibility: Third-Party Scripts, Ads and Page Weight

March 27th, 2018

Table Of Contents

  1. Introduction
  2. How to gather ResourceTiming data
  3. How cross-origins play a role
    3.1 Cross-Origin Resources
    3.2 Cross-Origin Frames
  4. Why does this matter?
    4.1 Third-Party Scripts
    4.2 Ads
    4.3 Page Weight
  5. Real-world data
  6. Workarounds
  7. Summary

1. Introduction

ResourceTiming is a specification developed by the W3C Web Performance working group, with the goal of exposing accurate performance metrics about all of the resources downloaded during the entire user experience, such as images, CSS and JavaScript.

For a detailed background on ResourceTiming, you can see my post on ResourceTiming in Practice.

One important caveat when working with ResourceTiming data is that in most cases, it will not give you the full picture of all of the resources fetched during a page load.

Why is that? Four reasons:

  1. Assets served from a different domain — known as cross-origin resources — have restrictions applied to them (on timing and size data), for privacy and security reasons, unless they have the Timing-Allow-Origin HTTP response header
  2. Each <iframe> on the page keeps track of the resources that it fetched, and the frame.performance API that exposes the ResourceTiming data is blocked in a cross-origin <iframe>
  3. Some asset types (e.g. <video> tags) may be affected by browser bugs where they don’t generate ResourceTiming entries
  4. If the number of resources loaded in a frame exceeds 150, and the page hasn’t called performance.setResourceTimingBufferSize() to increase the buffer size, ResourceTiming data will be limited to the first 150 resources

The combination of these restrictions — some in our control, some out of our control — leads to an unfortunate caveat when analyzing ResourceTiming data: In the wild, you will rarely be able to access the full picture of everything that was fetched on the page.

Given these blind-spots, you may have a harder time answering these important questions:

  • Are there any third-party JavaScript libraries that are significantly affecting my visitor’s page load experience?
  • How many resources are being fetched by my advertisements?
  • What is my Page Weight?

We’ll explore what ResourceTiming Visibility means for your page, and how you can work around these limitations. We will also sample ResourceTiming Visibility data across the web by looking at the Alexa Top 1000, to see how frequently resources get hidden from our view.

But first, some background:

2. How to gather ResourceTiming Data

ResourceTiming is a straightforward API to use: window.performance.getEntriesByType("resource") returns a list of all resources fetched in the current frame:

var resources = window.performance.getEntriesByType("resource");

/* eg:
[
    {
        name: "https://www.foo.com/foo.png",
        entryType: "resource",
        startTime: 566.357000003336,
        duration: 4.275999992387369,
        initiatorType: "img",
        redirectEnd: 0,
        redirectStart: 0,
        fetchStart: 566.357000003336,
        domainLookupStart: 566.357000003336,
        domainLookupEnd: 566.357000003336,
        connectStart: 566.357000003336,
        secureConnectionStart: 0,
        connectEnd: 566.357000003336,
        requestStart: 568.4959999925923,
        responseStart: 569.4220000004862,
        responseEnd: 570.6329999957234,
        transferSize: 10000,
        encodedBodySize: 10000,
        decodedBodySize: 10000,
    }, ...
]
*/

There is a bit of complexity added when you have frames on the site, e.g. from third-party content such as social widgets or advertising.

Since each <iframe> has its own buffer of ResourceTiming entries, you will need to crawl all of the frames on the page (and sub-frames, etc), and join their entries to the main window’s.

This gist shows a naive way of doing this. For a version that deals with all of the complexities of the crawl, such as adjusting resources in each frame to the correct startTime, you should check out Boomerang’s restiming.js plugin.
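A simplified sketch of that crawl (without the startTime adjustments mentioned above) looks like this:

// Simplified sketch -- see Boomerang's restiming.js for the full version.
// Gathers ResourceTiming entries from a window and any reachable (same-origin) frames.
function collectResources(win) {
  var entries = [];

  try {
    if (win.performance && win.performance.getEntriesByType) {
      entries = win.performance.getEntriesByType("resource");
    }

    for (var i = 0; i < win.frames.length; i++) {
      // Cross-origin frames will throw (or expose nothing), hiding their resources
      entries = entries.concat(collectResources(win.frames[i]));
    }
  } catch (e) {
    // Access denied: a cross-origin frame
  }

  return entries;
}

var allResources = collectResources(window);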

3. How cross-origins play a role

The main challenge with ResourceTiming data is that cross-origin resources and cross-origin frames will affect the data you have access to.

3.1 Cross-Origin Resources

For each window / frame on your page, a resource will either be considered “same-origin” or “cross-origin”:

  • same-origin resources share the same protocol, hostname and port of the page
  • cross-origin resources have a different protocol, hostname (or subdomain) or port from the page

ResourceTiming data includes all same-origin and cross-origin resources, but there are restrictions applied to cross-origin resources specifically:

  • Detailed timing information will always be 0:
    • redirectStart
    • redirectEnd
    • domainLookupStart
    • domainLookupEnd
    • connectStart
    • connectEnd
    • requestStart
    • responseStart
    • secureConnectionStart
    • That leaves you with just startTime and responseEnd containing timestamps
  • Size information will always be 0:
    • transferSize
    • encodedBodySize
    • decodedBodySize

For cross-origin assets that you control, i.e. images on your own CDN, you can add a special HTTP response header to force the browser to expose this information:

Timing-Allow-Origin: *

Unfortunately, for most websites, you will not be in control of all of the third-party (and cross-origin) assets being served to your visitors. Some third-party domains have been adding the Timing-Allow-Origin header to their responses, but not all — current usage is around 13%.

3.2 Cross-Origin Frames

In addition, every <iframe> you have on your page will have its own list of ResourceTiming entries. You will be able to crawl any same-origin and anonymous frames to gather their ResourceTiming entries, but cross-origin frames block access to the contentWindow (and thus the frame.performance) object — so any resources fetched in cross-origin frames will be invisible.

Note that adding Timing-Allow-Origin to a cross-origin <iframe> (HTML) response will give you full access to the ResourceTiming data for that HTML response, but will not allow you to access frame.performance, so all of the <iframe>‘s ResourceTimings remain unreachable.

To recap:

ResourceTiming Visibility Diagram

4. Why does this matter?

Why does this all matter?

When you, a developer, are looking at a browser’s networking panel (such as Chrome’s Network tab), you have full visibility into all of the page’s resources:

Chrome Network Panel

Unfortunately, if you were to use the ResourceTiming API at the same time, you may only be getting a subset of that data.

ResourceTiming isn’t giving you the full picture.

Without the same visibility into the resources being fetched on a page as a developer has using browser developer tools, if we are just looking at ResourceTiming data, we may be misleading ourselves about what is really happening in the wild.

Let’s look at a couple scenarios:

4.1 Third-Party Scripts

ResourceTiming has given us fantastic insight into the resources that our visitors are fetching when visiting our websites. One of the key benefits of ResourceTiming is that it can help you understand what third-party libraries are doing on your page when you’re not looking.

Third-party libraries cover the spectrum of:

  • JavaScript libraries like jQuery, Modernizr, Angular, and marketing/performance analytics libraries
  • CSS libraries such as Bootstrap, animate.css and normalize.css
  • Social widgets like the Facebook, Twitter and Google+ icons
  • Fonts like FontAwesome, Google Fonts
  • Ads (see next section)

Each of the above categories bring in different resources. A JavaScript library may come alone, or it may fetch more resources and send a beacon. A CSS library may trigger downloads of images or fonts. Social widgets will often create frames on the page that load even more scripts, images and CSS. Ads may bring in 100 megabytes of video just because.

All of these third-party resources can be loaded in a multitude of ways as well. Some may be embedded (bundled) directly into your site; some may be loaded from a CDN (such as cdnjs.com); some libraries may be loaded directly from a service (such as Google Fonts or Akamai mPulse).

The true cost of each library is hard to judge, but ResourceTiming can give us some insight into the content being loaded.

It’s also important to remember that just because a third-party library behaves one way on your development machine (when you have Developer Tools open) doesn’t mean that it’s not going to behave differently for other users, on different networks, with different cookies set, when it detects your Developer Tools aren’t open. It’s always good to keep an eye on the cost of your third-party libraries in the wild, making sure they are behaving as you expect.

So how can we use ResourceTiming data to understand third-party library behavior?

  • We can get detailed timing information: how long it takes to resolve DNS, establish a TCP connection, wait for the first byte, and download third-party libraries. All of this information can help you judge the cost of a third-party library, especially if it is a render-blocking critical resource like non-async JavaScript, CSS or Fonts.
  • We can understand the weight (byte cost) of the third party, and the dependencies it brings in, by looking at the transferSize. We can also see if it’s properly compressed by comparing encodedBodySize to decodedBodySize.

In order to get the detailed timing information and weight of third-party libraries, they need to be either served same-domain (bundled with your other assets), or with a Timing-Allow-Origin HTTP response header. Otherwise, all you get is the overall duration (without DNS, TCP, request and response times), and no size information.
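For example, a rough breakdown of a single entry into phases might look like this (only meaningful when the resource is same-origin or served with Timing-Allow-Origin; otherwise most of these fields are 0):

// Rough sketch: split one ResourceTiming entry into its network phases.
function getPhases(entry) {
  return {
    dns:      entry.domainLookupEnd - entry.domainLookupStart,
    tcp:      entry.connectEnd - entry.connectStart,
    request:  entry.responseStart - entry.requestStart,
    response: entry.responseEnd - entry.responseStart,
    total:    entry.duration,
    // Size fields are 0 for cross-origin resources without Timing-Allow-Origin
    transferBytes: entry.transferSize,
    compressed:    entry.encodedBodySize < entry.decodedBodySize
  };
}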

Take the below screenshot as an example. In it, there are third-party (cross-origin) images on vgtstatic.com that have Timing-Allow-Origin set, so we get a detailed breakdown of how long they took. We can determine Blocking, DNS, TCP, SSL, Request and Response phases of the requests. We see that many of these resources were Blocked (the grey phase) due to connection limits to *.vgtstatic.com. Once TCP (green) was established, the Request (yellow) was quick and the Response (purple) was nearly instant:

ResourceTiming Waterfall with Cross-Origin Resources

Contrast all of this information to a cross-origin request to http://www.google-analytics.com/collect. This resource does not have Timing-Allow-Origin set, so we only see its total Duration (deep blue). For some reason this beacon took 648 milliseconds, and it is impossible for us to know why. Was it delayed (Blocked) by other requests to the same domain? Did google-analytics.com take a long time to serve the first byte? Was it a huge download? Did it redirect from http:// to https://?

Since the URL is cross-origin without Timing-Allow-Origin, we also do not have any byte information about it, so we don’t know how big the response was, or if it was compressed.

The great news is that many of the common third-party domains on the internet are setting the Timing-Allow-Origin header, so you can get full details:

> GET /analytics.js HTTP/1.1
> Host: www.google-analytics.com
>
< HTTP/1.1 200 OK
< Strict-Transport-Security: max-age=10886400; includeSubDomains; preload
< Timing-Allow-Origin: *
< ...

However, there are still some very common scripts that show up without Timing-Allow-Origin (in the Alexa Top 1000), even when they’re setting TAO on other assets:

  • https://connect.facebook.net/en_US/fbevents.js (200 sites)
  • https://sb.scorecardresearch.com/beacon.js (133 sites)
  • https://d31qbv1cthcecs.cloudfront.net/atrk.js (104 sites)
  • https://www.google-analytics.com/plugins/ua/linkid.js (54 sites)

Also remember that any third-party library that loads a cross-origin frame will be effectively hiding all of its cost from ResourceTiming. See the next section for more details.

4.2 Ads

Ads are a special type of third-party that requires some additional attention.

Most advertising on the internet loads rich media such as images or videos. Most often, you’ll include an ad platform on your website by putting a small JavaScript snippet somewhere on your page, like the example below:

<script async src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script>
<ins class="adsbygoogle" style="display:block;width: 120px; height: 600px" data-ad-client="ca-pub-123"></ins>
<script>(adsbygoogle = window.adsbygoogle || []).push({});</script>

Looks innocent enough right?

Once the ad platform’s JavaScript is on your page, it’s going to bootstrap itself and try to fetch the media it wants to display to your visitors. Here’s the general lifecycle of an ad platform:

  • The platform is loaded by a single JavaScript file
  • Once that JavaScript loads, it will often load another library or load a “configuration file” that tells the library where to get the media for the ad
  • Many ad platforms then create an <iframe> on your page to load a HTML file from an advertisor
  • That <iframe> is often in control of a separate third-party advertisor, which can (and will) load whatever content it wants, including additional JavaScript files (for tracking viewing habits), media and CSS.

In the end, the original JavaScript ad library (e.g. adsbygoogle.js at 25 KB) might be responsible for loading a total of 250 KB of content, or more — a 10x increase.

How much of this content is visible from ResourceTiming? It all depends on a few things:

  1. How much content is loaded within your page’s top level window and has Timing-Allow-Origin set?
  2. How much content is loaded within same-origin (anonymous) frames and has Timing-Allow-Origin set?
  3. How much content is loaded from a cross-origin frame?

Let’s take a look at one example Google AdSense ad:

  • In the screenshot below, the green content was loaded into the top-level window. Thankfully, most of Google AdSense’s resources have Timing-Allow-Origin set, so we have complete visibility into how long they took and how many bytes were transferred. These 3 resources accounted for about 26 KB of data.
  • If we crawl into accessible same-origin frames, we’re able to find another 4 resources that accounted for 115 KB of data. Only one of these was missing Timing-Allow-Origin.
  • AdSense then loaded the advertiser’s content in a cross-origin frame. We’re unable to look at any of the resources in the cross-origin frame. There were 11 resources that accounted for 98 KB of data (40.8% of the total)

Google Ads layout

Most advertising platforms’ ads are loaded into cross-origin frames so they are less likely to affect the page itself. Unfortunately, as you’ve seen, this means it’s also easy for advertisers to unintentionally hide their true cost from ResourceTiming.

4.3 Page Weight

Another challenge is that it makes it really hard to measure Page Weight. Page Weight measures the total cost, in bytes, of loading a page and all of its resources.

Remember that the byte fields of ResourceTiming (transferSize, decodedBodySize and encodedBodySize) are hidden if the resource is cross-origin. In addition, any resources fetched from a cross-origin frame will be completely invisible to ResourceTiming.

So in order for ResourceTiming to expose the accurate Page Weight of a page, you need to ensure the following:

  • All of your resources are fetched from the same origin as the page
    • Unless: If any resources are fetched from other origins, they must have Timing-Allow-Origin
  • You must not have any cross-origin frames
    • Unless: If you are in control of the cross-origin frame’s HTML (or you can convince third-parties to add a snippet to their HTML), you might be able to “bubble up” ResourceTiming data from cross-origin frames

Anything less than the above, and you’re not seeing the full Page Weight.
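For illustration, the naive Page Weight calculation from ResourceTiming looks like this, and it silently under-counts whenever the conditions above aren’t met:

// Naive Page Weight estimate (sketch): sums transferSize across visible resources.
// Under-counts for cross-origin resources without Timing-Allow-Origin (transferSize
// is 0) and misses anything loaded inside cross-origin frames entirely.
var pageWeight = window.performance.getEntriesByType("resource")
  .reduce(function(total, entry) {
    return total + (entry.transferSize || 0);
  }, 0);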

5. Real World data

How many of your resources are fully visible to ResourceTiming? How many are visible, but don’t contain the detailed timing and size information due to cross-origin restrictions? How many are completely hidden from your view?

There’s been previous research and HTTP Archive stats on Timing-Allow-Origin usage, and I wanted to expand on that research by also seeing how significantly the cross-origin <iframe> issue affects visibility.

To test this, I’ve built a small testing tool that runs headless Chrome (via Puppeteer).

The tool loads the Alexa Top 1000 websites, monitoring all resources fetched in the page by the Chrome native networking APIs. It then executes a crawl of the ResourceTiming data, going into all of the frames it has access to, and compares what the browser fetched versus what is visible to ResourceTiming.

Resources are put into three visibility buckets:

  • Full: Same-origin resources, or cross-origin resources that have Timing-Allow-Origin set. These have all of the timing and size fields available.
  • Restricted: Cross-origin resources without Timing-Allow-Origin set. These only have startTime and responseEnd, and no size fields (so can’t be used for Page Weight calculations).
  • Missing: Any resource loaded in a cross-origin <iframe>. Completely invisible to ResourceTiming.

Overall

Across the Alexa Top 1000 sites, only 22.3% of resources have Full visibility from ResourceTiming.

A whopping 31.7% of resources are Missing from ResourceTiming, and the other 46% are Restricted, showing no detailed timing or size information:

ResourceTiming Visibility - Overall by Request Count
Overall ResourceTiming Visibility by Request Count

If you’re interested in measuring overall byte count for Page Weight, even more data is missing from ResourceTiming. Over 50% of all bytes transferred when loading the Alexa Top 1000 were Missing from ResourceTiming. In order to calculate Page Weight, you need Full Visibility, so if you used ResourceTiming to calculate Page Weight, you would only be (on average) including 18% of the actual bytes:

ResourceTiming Visibility - Overall by Byte Count
Overall ResourceTiming Visibility by Byte Count

By Site

Obviously each site is different, and due to how a site is structured, each site will have differing degrees of Visibility. You can see that Alexa Top 1000 websites vary across the Visibility spectrum:

ResourceTiming Visibility Missing Percentage by Site

There are some sites with 100% Full Visibility. They’re serving nearly all of the content from their primary domain, or, have Timing-Allow-Origin on all resources, and don’t contain any cross-origin frames. For example, a lot of simple landing pages such as search engines are like this:

  • google.com
  • sogou.com
  • telegram.org

Other sites have little or no Missing resources but are primarily Restricted Visibility. This often happens when you serve your homepage on one domain and all static assets on a secondary domain (CDN) without Timing-Allow-Origin:

  • stackexchange.com
  • chess.com
  • ask.fm
  • steamcommunity.com

Finally, there are some sites with over 50% Missing resources. After reviewing these sites, it appears the majority of this is due to advertising being loaded from cross-origin frames:

  • weather.com
  • time.com
  • wsj.com

By Content

It’s interesting how Visibility commonly differs by the type of content. For example, CSS and Fonts have a good chance of being Full or at least Restricted Visibility, while Tracking Pixels and Videos are often being loaded in cross-origin frames so are completely Missing.

Breaking down the requests by type, we can see that different types of content are available at different rates:

ResourceTiming Visibility by Type
ResourceTiming Visibility by Type

Let’s look at a couple of these types.

HTML Content

For the purposes of this analysis, HTML resources have an initiatorType of iframe or were a URL ending in .htm*. Note that the page itself is not being reported in this data.

From the previous chart, HTML content is:

  • 20% Full Visibility
  • 25% Restricted
  • 53% Missing

Remember that all top-level frames will show up in ResourceTiming, either as Full (same-origin) or Restricted (cross-origin without TAO). Thus, HTML content that is being reported as Missing must be an <iframe> loaded within a cross-origin <iframe>.

For example, https://pagead2.googlesyndication.com/pagead/s/cookie_push.html is often Missing because it’s loaded within container.html (a cross-origin frame):

<iframe src="https://tpc.googlesyndication.com/safeframe/1-0-17/html/container.html">
    <html>
        <iframe src="https://pagead2.googlesyndication.com/pagead/s/cookie_push.html">
    </html>
</iframe>

Here are the top 5 HTML URLs seen across the Alexa Top 1000 that are Missing in ResourceTiming data:

URL                                                                            Count
https://staticxx.facebook.com/connect/xd_arbiter/r/Ms1VZf1Vg1J.js?version=42   221
https://pagead2.googlesyndication.com/pagead/s/cookie_push.html                195
https://static.eu.criteo.net/empty.html                                        106
https://platform.twitter.com/jot.html                                          83
https://tpc.googlesyndication.com/sodar/6uQTKQJz.html                          62

All of these appear to be <iframe>s injected for cross-frame communication.

Video Content

Video content is:

  • 2% Full Visibility (0.8% by byte count)
  • 22% Restricted (1.9% by byte count)
  • 75% Missing (97.3% by byte count)

Missing video content appears to be pretty site-specific — there aren’t many shared video URLs across sites (which isn’t too surprising).

What is surprising is how often straight <video> tags don’t show up in ResourceTiming data. From my experimentation, this appears to be because of when ResourceTiming data surfaces for <video> content.

In the testing tool, I capture ResourceTiming data at the onload event — the point at which the page appears ready. In Chrome, Video content can start playing before onload, and it won’t delay the onload event while it loads the full video. So the user might be seeing the first frames of the video, but not the entire content of the video, by the onload event.

However, it looks like the ResourceTiming entry isn’t added until after the full video — which may be several megabytes — has been loaded from the network.

So unfortunately Video content is especially vulnerable to being Missing, simply because it hasn’t loaded all frames by the onload event (if that’s when you’re capturing all of the ResourceTiming data).

Also note that some browsers will (currently) never add ResourceTiming entries for <video> tags.

Most Missing Content

Across the Alexa Top 1000, there are some URLs that are frequently being loaded in cross-origin frames.

The top URLs appear to be a mixture of JavaScript, IFRAMEs used for cross-frame communication, and images:

URL                                                                                              Count  Total Bytes
tpc.googlesyndication.com/pagead/js/r20180312/r20110914/client/ext/m_window_focus_non_hydra.js  291    350,946
staticxx.facebook.com/connect/xd_arbiter/r/Ms1VZf1Vg1J.js?version=42                             221    3,178,502
tpc.googlesyndication.com/pagead/js/r20180312/r20110914/activeview/osd_listener.js              219    5,789,703
tpc.googlesyndication.com/pagead/js/r20180312/r20110914/abg.js                                   215    4,850,400
pagead2.googlesyndication.com/pagead/s/cookie_push.html                                          195    144,690
tpc.googlesyndication.com/safeframe/1-0-17/js/ext.js                                             178    1,036,316
pagead2.googlesyndication.com/bg/lSvH2GMDHdWiQ5txKk8DBwe8KHVpOosizyQXSe1BYYE.js                  178    886,084
tpc.googlesyndication.com/pagead/js/r20180312/r20110914/client/ext/m_js_controller.js            163    2,114,273
googleads.g.doubleclick.net/pagead/id                                                            146    8,246
www.google-analytics.com/analytics.js                                                            126    1,860,568
static.eu.criteo.net/empty.html                                                                  106    22,684
static.criteo.net/flash/icon/nai_small.png                                                       106    139,814
static.criteo.net/flash/icon/nai_big.png                                                         106    234,154
platform.twitter.com/jot.html                                                                    83     6,895

Most Restricted Origins

We can group Restricted requests by domain to see opportunities where getting a third-party to add the Timing-Allow-Origin header would have the most impact:

Domain                           Count  Total Bytes
images-eu.ssl-images-amazon.com  714    13,204,306
www.facebook.com                 653    31,286
www.google-analytics.com         609    757,088
www.google.com                   487    3,487,547
connect.facebook.net             437    6,353,078
images-na.ssl-images-amazon.com  436    8,001,256
tags.tiqcdn.com                  402    3,070,430
sb.scorecardresearch.com         365    140,232
www.googletagmanager.com         264    7,684,613
i.ebayimg.com                    260    5,489,475
http2.mlstatic.com               253    3,434,650
ib.adnxs.com                     243    1,313,041
cdn.doubleverify.com             239    4,950,712
mc.yandex.ru                     235    2,108,922

Some of these domains are a bit surprising because the companies behind them have been leaders in adding Timing-Allow-Origin to their third-party content, such as Google Analytics and the Facebook Like tag. Looking at some of the most popular URLs, it’s clear that they haven’t yet added TAO to a few popular resources that are often loaded cross-origin:

  • https://www.google.com/textinputassistant/tia.png
  • https://www.google-analytics.com/plugins/ua/linkid.js
  • https://www.google-analytics.com/collect
  • https://images-eu.ssl-images-amazon.com/images/G/01/ape/sf/desktop/xyz.html
  • https://www.facebook.com/fr/b.php

Let’s hope Timing-Allow-Origin usage continues to increase!

Buffer Sizes

By default, each frame will only collect up to 150 resources in the PerformanceTimeline. Once full, no new entries will be added.

How often do sites change the default ResourceTiming buffer size for the main frame?

Buffer Size  Number of Sites
150          854
200          2
250          2
300          15
400          5
500          5
1000         6
1500         1
100000       3
1000000      1

85.4% of sites don’t touch the default ResourceTiming buffer size. Of the sites that do change it, the most common choice is to double it (to 300), and around 1% of all sites pick values of 1,000 or more.

There are four sites that are even setting it over 10,000:

  • messenger.com: 100,000
  • fbcdn.net: 100,000
  • facebook.com: 100,000
  • lenovo.com: 1,000,000

(this isn’t recommended unless you’re actively clearing the buffer)

Should you increase the default buffer size? Obviously the need to do so varies by site.

In our crawl of the Alexa Top 1000, we find that 14.5% of sites exceed 150 resources in the main frame, while only 15% of those sites had called setResourceTimingBufferSize() to ensure all resources were captured.

6. Workarounds

Given the restrictions we have on cross-origin resources and frames, what can we do to increase the visibility of requests on our pages?

6.1 Timing-Allow-Origin

To move Restricted Visibility resources to Full Visibility, you need to ensure all cross-origin resources have the Timing-Allow-Origin header.

This should be straightforward for any content you provide (e.g. on a CDN), but it may take convincing third-parties to also add this HTTP response header.

Most people specify * as the allowed origin list, so any domain can read the timing data. All third party scripts should be doing this!

Timing-Allow-Origin: *

6.2 Ensuring a Proper ResourceTiming Buffer Size

By default, each frame will only collect up to 150 resources in the PerformanceTimeline. Once full, no new entries will be added.

Obviously this limitation will affect any site that loads more than 150 resources in the main frame (or more than 150 resources in an <iframe>).

You can change the default buffer size by calling setResourceTimingBufferSize():

(function(w){
  // Feature-check: older browsers don't expose setResourceTimingBufferSize()
  if (!w ||
    !("performance" in w) ||
    !w.performance ||
    !w.performance.setResourceTimingBufferSize) {
    return;
  }

  // Raise the ResourceTiming buffer limit from the default of 150 entries
  w.performance.setResourceTimingBufferSize(500);
})(window);

Alternatively, you can use a PerformanceObserver (in each frame) to ensure you’re not affected by the limit.
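A minimal sketch of the PerformanceObserver approach (it isn’t subject to the 150-entry buffer):

// Minimal sketch: a PerformanceObserver is not limited by the ResourceTiming buffer size.
if (window.PerformanceObserver) {
  var observedResources = [];

  new PerformanceObserver(function(list) {
    observedResources = observedResources.concat(list.getEntries());
  }).observe({ type: "resource", buffered: true });
}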

6.3 Bubbling ResourceTiming

It’s possible to “leak” ResourceTiming to parent frames by using window.postMessage() to talk between cross-origin frames.

Here’s some sample code that listens for new ResourceTiming entries coming in via a PerformanceObserver, then “bubbles” up the message to its parent frame. The top-level frame can then use these bubbled ResourceTiming entries and merge them with its own list from itself and same-origin frames.
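The core of that idea, sketched below (a simplification of the linked sample; the message type name here is made up for illustration):

// Inside the cross-origin frame: observe new entries and post them to the parent.
// ("ResourceTiming-bubble" is just an illustrative message type, not a standard.)
new PerformanceObserver(function(list) {
  window.parent.postMessage({
    type: "ResourceTiming-bubble",
    entries: list.getEntries().map(function(e) { return e.toJSON(); })
  }, "*");
}).observe({ type: "resource", buffered: true });

// In the top-level page: listen for bubbled entries and merge them with your own.
window.addEventListener("message", function(event) {
  if (event.data && event.data.type === "ResourceTiming-bubble") {
    // merge event.data.entries with the entries from this frame and same-origin frames
  }
});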

The challenge is obviously convincing all of your third-party / cross-origin frames to use the same bubbling code.

We are considering adding support for this to Boomerang.

6.4 Synthetic Monitoring

In the absence of perfect Visibility via the above methods, it makes sense to supplement your Real User Monitoring analytics with synthetic monitoring as well, which will have access to 100% of the resources.

You could also use synthetic tools to understand the average percentage of Missing resources, and use this to mentally adjust any RUM ResourceTiming data.

7. Summary

ResourceTiming is a fantastic API that has given us insight into previously invisible data. With ResourceTiming, we can get a good sense of how long critical resources are taking on real page loads and beyond.

Unfortunately, due to security and privacy restrictions, ResourceTiming data is limited or missing in many real-world site configurations. This makes it really hard to track your third-party libraries and ads, or to gather an accurate Page Weight.

My hope is that we get more third-party services to add Timing-Allow-Origin to all of the resources being commonly fetched. It’s not clear if there’s anything we can do to get more visibility into cross-origin frames, other than convincing them to always “bubble up” their resources.

For more information on how this data was gathered, you can look at the resourcetiming-visibility Github repo.

I’ve also put the raw data into a BigQuery dataset: wolverine-digital:resourcetiming_visibility.

Thanks to Charlie Vazac and the Boomerang team for feedback on this article.