
What makes Sonda accurate?

TL;DR

It is the source maps.

Would you use a measurement tool if you knew it would give you inaccurate results? Of course not. You want to know that the information you have is reliable and accurate.

That is why one of the most important features of Sonda is its accuracy. But what makes Sonda accurate? The answer is fairly simple, but first let's take a look at why we often got inaccurate results with other bundle analyzers in the past.

Why are (some) other bundle analyzers inaccurate?

Depending on which bundle analyzers you have used, you may have seen results that were far off. Usually, the reported bundle size was much larger than the size of the files actually written to disk. Why is that?

Web bundlers like webpack and Rollup have several hooks that allow plugins to get or modify data at different stages of the bundling process. It is essential for these plugins to run at the right time; otherwise they may break other plugins.

Graph showing the order of hooks in Rollup
The order of hooks in Rollup. Source: https://rollupjs.org

Let's take two plugins as an example: a plugin that allows you to use different language or syntax in your codebase (like TypeScript or JSX), and a plugin that minifies and optimizes your code. At what stages should these plugins run?

If the minification plugin runs before the code is transformed from TypeScript to JavaScript, it will throw an error because it doesn't understand TypeScript. Therefore, the transformation plugin should run before the minification plugin. Fortunately, thanks to the hook system, plugins can specify at which stage they should run, so you don't have to worry about this in most cases (although there are cases where two or more plugins need to run at the same stage, and the order of registration is important).
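To make the idea of stages concrete, here is a minimal sketch of a toy hook runner. The hook names mirror Rollup's `transform` and `renderChunk` hooks, but `ToyPlugin`, `runPipeline`, and both plugins are hypothetical and heavily simplified; real bundlers pass more arguments and run many more hooks.

```typescript
// Toy hook runner. Hook names mirror Rollup's `transform` and `renderChunk`,
// but everything else here is a hypothetical simplification.
interface ToyPlugin {
  name: string;
  transform?: (code: string) => string;   // early stage, per module
  renderChunk?: (code: string) => string; // late stage, per output chunk
}

function runPipeline(plugins: ToyPlugin[], code: string): string {
  // Every `transform` hook runs first...
  for (const plugin of plugins) {
    if (plugin.transform) code = plugin.transform(code);
  }
  // ...and only then the `renderChunk` hooks, whatever the registration order.
  for (const plugin of plugins) {
    if (plugin.renderChunk) code = plugin.renderChunk(code);
  }
  return code;
}

// Stand-in for a TypeScript plugin: strips `: number` annotations only.
const stripTypes: ToyPlugin = {
  name: "strip-types",
  transform: (code) => code.replace(/:\s*number/g, ""),
};

// Stand-in for a minifier that fails on anything that is not plain JavaScript.
const toyMinifier: ToyPlugin = {
  name: "toy-minifier",
  renderChunk: (code) => {
    if (/:\s*number/.test(code)) throw new Error("Unexpected type annotation");
    return code.replace(/\s+/g, " ").trim();
  },
};

// Registered "backwards", yet the stages still run in the right order.
const toyOutput = runPipeline([toyMinifier, stripTypes], "const total: number = 1 +  2;");
console.log(toyOutput);
```

Because each plugin attaches itself to a stage rather than to a position in the list, the minifier never sees the type annotation even though it was registered first.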

What does this have to do with the accuracy of the bundle analyzer?

The problem is that some analyzers read their data from the bundler through this hook system. However, because some steps in the bundling process, such as tree-shaking, minification, or obfuscation, must happen at the very end, the data these analyzers receive is not the final data that is written to disk. As a result, the reported bundle size is often larger than what is actually on disk.
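The effect is easy to reproduce in a toy model. The sketch below is not based on any real analyzer or bundler API: `toyMinify` is a hypothetical stand-in for the late optimization steps, and `measure` counts characters, which equals the byte size only for ASCII input.

```typescript
// Toy model only: none of these names are real bundler or analyzer APIs.

// Stand-in for late steps such as minification: drops line comments
// and collapses whitespace.
function toyMinify(code: string): string {
  return code
    .replace(/\/\/.*$/gm, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Counts characters; for ASCII-only code this equals the size in bytes.
function measure(code: string): number {
  return code.length;
}

const sampleModule = [
  "// Adds two numbers.",
  "export function add(a, b) {",
  "  return a + b;",
  "}",
].join("\n");

// An analyzer hooked into an early stage sees the code before optimization...
const sizeAtEarlyHook = measure(sampleModule);
// ...while the file on disk is the output of the very last stage.
const sizeOnDisk = measure(toyMinify(sampleModule));

console.log({ sizeAtEarlyHook, sizeOnDisk });
```

The early measurement is always the larger one here, which is the same kind of overestimate described above.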

How does Sonda solve this problem?

Since we know that the data from the bundler is not the final data, what else can we do? We can wait for the bundler to do its job, and when it's done, use the source maps it outputs to analyze the bundles.

Source maps are files that allow tracing parts of the final bundle back to the original code. They are generated by the bundler and used by the browser to show you the original code in the developer tools, even though the code that actually runs is minified.
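For readers who have never opened one, a source map is a JSON document. The sketch below shows a subset of the fields from the Source Map v3 format; the file names and the `mappings` value are made up for illustration.

```typescript
// Subset of the fields defined by the Source Map v3 format.
interface RawSourceMap {
  version: 3;
  file?: string;                        // name of the generated bundle
  sources: string[];                    // original files the bundle was built from
  sourcesContent?: (string | null)[];   // optional copy of each original file
  names: string[];                      // original identifiers, e.g. before renaming
  mappings: string;                     // Base64 VLQ data linking bundle positions to sources
}

// Made-up example: a one-line bundle built from two source files.
const exampleMap: RawSourceMap = {
  version: 3,
  file: "bundle.js",
  sources: ["src/a.js", "src/b.js"],
  names: [],
  mappings: "AAAA,GCAA",
};

// In `mappings`, ";" separates generated lines and "," separates
// segments within a line.
const generatedLines = exampleMap.mappings.split(";").length;
console.log(exampleMap.sources, generatedLines);
```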

Arrow pointing from a part of the minified code back to the original code
Tracing minified code back to the original. Source: https://evanw.github.io/source-map-visualization/

Because source maps tell us where the parts of the final bundle came from, we can use this information to calculate exactly how much each file contributes to the final bundle.
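As a rough illustration of that calculation, the sketch below decodes the Base64 VLQ `mappings` string of a Source Map v3 file and attributes every character of the bundle to the source named by the closest preceding segment. This is a simplified model and not necessarily how Sonda implements it: `decodeVlq` and `sizesBySource` are hypothetical names, sizes are counted in characters rather than bytes, and line breaks are ignored.

```typescript
const BASE64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Decodes one mapping segment into its list of signed integers.
function decodeVlq(segment: string): number[] {
  const values: number[] = [];
  let value = 0;
  let shift = 0;
  for (let i = 0; i < segment.length; i++) {
    const digit = BASE64.indexOf(segment.charAt(i));
    if (digit < 0) throw new Error("Invalid Base64 character in mappings");
    value += (digit & 31) << shift; // the low 5 bits carry data
    if (digit & 32) {
      shift += 5;                   // continuation bit set: more digits follow
    } else {
      values.push(value & 1 ? -(value >> 1) : value >> 1); // lowest bit is the sign
      value = 0;
      shift = 0;
    }
  }
  return values;
}

// Attributes each character of `code` to the source of the preceding segment.
function sizesBySource(code: string, sources: string[], mappings: string): Record<string, number> {
  const sizes: Record<string, number> = {};
  const mapLines = mappings.split(";");
  let sourceIndex = 0; // relative to the previous segment across the whole file

  code.split("\n").forEach((line, lineNo) => {
    let column = 0;    // the generated column restarts on every line
    let start = 0;
    let owner = "[unmapped]";
    const flush = (end: number) => {
      if (end > start) sizes[owner] = (sizes[owner] ?? 0) + (end - start);
    };
    const segments = (mapLines[lineNo] ?? "").split(",").filter((s) => s.length > 0);
    for (const segment of segments) {
      const fields = decodeVlq(segment);
      column += fields[0] ?? 0;
      flush(column);   // characters before this segment belong to the previous owner
      start = column;
      if (fields.length >= 4) {
        sourceIndex += fields[1] ?? 0;
        owner = sources[sourceIndex] ?? "[unmapped]";
      } else {
        owner = "[unmapped]"; // one-field segments have no original source
      }
    }
    flush(line.length);
  });
  return sizes;
}

// Made-up bundle: "a();" comes from src/a.js, "bb();" from src/b.js.
const contributions = sizesBySource("a();bb();", ["src/a.js", "src/b.js"], "AAAA,ICAA");
console.log(contributions);
```

For this made-up input, 4 characters are attributed to `src/a.js` and 5 to `src/b.js`, which together add up to the full 9 characters of the bundle.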

Does this mean that Sonda doesn't use the hook system at all? Actually, it does, because not everything can be read from the source maps.

Depending on the bundler, Sonda will read the list of source files the bundler parsed, their sizes, the module format type (ESM, CJS, etc.), and the list of files they import. While this data is not critical to the accuracy of the results, it does give you more context about the bundle, which can be useful when debugging the bundle size.

Positive side effect of relying on source maps

While working on the initial version of Sonda, I realized that using source maps had a positive side effect. By focusing primarily on source maps, Sonda minimizes its dependence on bundler-specific internals, so it can work with practically any bundler that generates source maps. Since source maps are a standard output of modern bundlers, they give us almost all the data we need.

This is why Sonda works with Vite, Rollup, esbuild, webpack, Rspack, and more bundlers coming soon.

Released under the MIT License.