Duplicated code from dependencies is possibly the most needlessly endemic problem confronting modern web application bundles, making them bigger and slower. And, diagnosing how the usual suspects, npm and webpack, conspire together to include more than you need can be incredibly difficult and painful.

In this post we will do a deep dive into various ways that dependencies can unnecessarily bloat bundles to better understand the problem space. Then, we will introduce the Inspectpack DuplicatesPlugin -- a power tool to help you identify nuanced, actionable information about wasted bytes from dependencies in your webpack bundles.

Dependencies, Bundles, and Duplicates

To begin to understand how duplicated dependencies hurt modern frontend bundles, we need to look at the npm and webpack projects, and how they work together to produce a rising share of the ecosystem's delivered web applications.

Dependencies

Modern frontend web applications are rarely written from scratch. Most JavaScript-based applications rely on the expansive world of open source libraries, in particular those published as packages to the npm registry. With just a few additions to your package.json file, you can instantly get any number of libraries to help your application with anything from date formatting to full-blown application frameworks like React!

Bundles

Transforming your application code and dependencies into a web application capable of running in a browser usually involves a bundling tool, the most popular of which is webpack. Webpack ingests your code and dependencies, and packages all of the necessary files into 1+ "bundle" files that together are downloaded to end-user browsers.

Duplicates

Unfortunately, this power and flexibility comes with costs and complexities. Dependencies often have dependencies of their own, which makes analysis difficult for humans trying to optimize bundles and bundling tools trying to efficiently stitch code together.

All too often bundles suffer from one or more of the following duplication situations:

  • Identical code sources from the same package: Your application bundle ends up with 2+ copies of a file that are byte-for-byte identical. E.g., you have the code from file lodash@4.2.3/get.js literally included twice in your final bundle.
  • Similar code files from different packages: Your application bundles two files with the same package name and file path that function similarly, but are not byte-for-byte identical. E.g., your bundle has code from the files lodash@3.1.0/get.js and lodash@4.2.3/get.js that is functionally the same, but not the actual identical code.
  • Identical code sources from different packages: This time you have different packages, but the specific file included in your bundle hasn't changed across the versions and is byte-for-byte identical. E.g., your bundle has code from the files lodash@3.1.0/get.js and lodash@4.2.3/get.js that is identical (although code in other unused files in the packages may differ).

In each of these scenarios your bundle ends up with more code than necessary because the same file (identical code or not) should be collapsed to a single version. To see how and why this can happen, it helps to review how npm and webpack work.

Understanding npm and webpack

Let's begin with a common frontend build situation wherein npm (or yarn) "gets" code dependencies from the Internet and webpack then "puts" code sources in an application bundle. (Note that npm and webpack have changed how they handle duplicated packages and code over the years, so we're going to cover both the older and the current mechanics of each.)

We will work through a simple hypothetical application with a lot of different npm and webpack scenarios. You can find all of the discussed examples in a GitHub repository, complete with node_modules and application bundles placed into git source for easier online review.

npm Dependencies Installation

The npm tool first reads dependencies from the package.json:dependencies (and devDependencies for development) field. Let's consider an application like this:

// package.json
{
  "name": "my-app",
  "dependencies": {
    "lodash": "^4.1.0",
    "one": "1.2.3",
    "two": "2.3.4"
  }
}

with the one and two packages that "resolve" to match those dependencies. npm then looks at the resolved packages and recursively resolves the dependencies of each, essentially creating a dependency tree (or even abstractly a graph if there are circular dependencies). Here, we have:

// node_modules/one/package.json
{
  "name": "one",
  "dependencies": {
    "lodash": "^4.0.0"
  }
}

// node_modules/two/package.json
{
  "name": "two",
  "dependencies": {
    "lodash": "^3.0.0"
  }
}

So, what does our dependency tree look like? It's a bit complicated because the first-level dependencies in the root package like "lodash": "^4.1.0" get resolved to a single package like lodash@4.2.3 and downloaded. Only after this are the resolved package's dependencies read and their semver ranges recursively resolved to actual packages. This means that our "abstract" dependency tree is really a snapshot: a mix of declared version ranges and the concrete versions they happen to resolve to at a given point in time.

Aside: lodash is a real package with a get method. We will use some fictional implementations in our examples. one and two are fabricated packages. We picked lodash as a fictionalized example because it is so popular that there is a decent chance that a bundle with duplicates has some lodash duplicates.

Let's also define some quick terms that we'll use throughout the rest of this post:

  • "resolved": We start with a package name (lodash) that has a declared version constraint (^4.1.0). During an npm/yarn install, one early step is to take that constraint and resolve it to a single version in the registry that can be downloaded (lodash@4.2.3).
  • "installed": Resolved packages are downloaded, but not necessarily immediately placed in their final installation paths on disk in node_modules. Both npm and yarn reserve the right to move things around, flatten installed packages to single on-disk versions, etc. Only when npm or yarn finish the installation command do we say a package is actually installed at a given location on disk (e.g., node_modules/lodash).
  • "depended": We use the term "depended" to describe a logical, unique path in the abstract dependency tree that causes a package to be included. In our example above, we have three packages depending on lodash (my-app, one, and two).

For the present example, our abstract depended tree (with resolved versions in comments on the right) is:

# Depended            # Resolved
# =================== # ========
- my-app:
  - lodash@^4.1.0     # 4.2.3
  - one@1.2.3:        # 1.2.3
    - lodash@^4.0.0   # 4.2.3
  - two@2.3.4:        # 2.3.4
    - lodash@^3.0.0   # 3.1.0
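To make the "resolved" column concrete, here is a minimal sketch of how a caret range could be matched against published versions. It is a simplification for illustration only: the registry contents are hypothetical, it assumes major versions of 1 or higher, and it ignores prerelease tags. It is not the real semver implementation that npm and yarn use.

```javascript
// Parse "4.2.3" into [4, 2, 3].
const parse = (version) => version.split(".").map(Number);

// Compare two parsed versions: negative, zero, or positive (sort-style).
const compare = (a, b) => a[0] - b[0] || a[1] - b[1] || a[2] - b[2];

// Simplified caret matching: same major version and at least the stated minimum.
// ASSUMPTION: major >= 1 and no prerelease tags (real semver handles both).
function satisfiesCaret(version, range) {
  const min = parse(range.replace(/^\^/, ""));
  const candidate = parse(version);
  return candidate[0] === min[0] && compare(candidate, min) >= 0;
}

// Pick the highest published version that satisfies the range, or null if none does.
function resolve(range, published) {
  const matches = published
    .filter((version) => satisfiesCaret(version, range))
    .sort((a, b) => compare(parse(a), parse(b)));
  return matches.length > 0 ? matches[matches.length - 1] : null;
}

// Hypothetical registry contents for our fictional lodash.
const LODASH_VERSIONS = ["3.0.0", "3.1.0", "4.0.0", "4.2.3"];

console.log(resolve("^4.1.0", LODASH_VERSIONS)); // my-app => 4.2.3
console.log(resolve("^4.0.0", LODASH_VERSIONS)); // one    => 4.2.3
console.log(resolve("^3.0.0", LODASH_VERSIONS)); // two    => 3.1.0
```

Two different ranges (^4.1.0 and ^4.0.0) land on the same resolved version, which is exactly what gives an installer the chance to share one on-disk copy.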

So, what happens when we install this on disk via npm install or yarn install? Well, the answer depends on which version of npm / yarn you use.

Old npm

Older versions of npm used to install dependencies exactly as would match the abstract dependency tree, namely:

# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3
  one                 # 1.2.3
    node_modules
      lodash          # 4.2.3
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0

Even though lodash@4.2.3 resolves to a single package version, it is installed twice.

Flattening with modern npm

In recognition of issues with wasted disk space and bloated frontend bundles, modern versions of npm and yarn implement a scheme of "flattening" the installed node_modules dependency tree. Following the Node.js require resolution algorithm, some of these dependencies can be collapsed to one package, within an acceptable semantic version range. (The actual mechanics of flattening and Node.js require resolution are complex and out of scope for this article -- we're just going to gloss over the subject.)

In the above example, resolved version 4.2.3 of lodash is compatible with both ~/lodash@^4.1.0 and ~/one@1.2.3/~/lodash@^4.0.0. Thus, by using a flattening installer, you could end up with an installed node_modules tree like this:

# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3 (for `my-app` _and_ for `one`)
  one                 # 1.2.3
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0

~ Note: Following a webpack convention, we use the ~ character to mean the node_modules directory. The two can be used interchangeably.

Un-flattenable dependencies in modern npm

But, let's not get our hopes up too quickly. Even with modern npm, identical packages may still not be able to be flattened.

If our abstract dependency tree changes to:

# Depended            # Resolved
# =================== # ========
- my-app:
  - lodash@^4.1.0     # 4.2.3
  - one@1.2.3:        # 1.2.3
    - lodash@^3.0.0   # 3.1.0 (CHANGED!)
  - two@2.3.4:        # 2.3.4
    - lodash@^3.0.0   # 3.1.0

Because the top-level ~/lodash slot is already taken by 4.2.3 for the root application, neither copy of 3.1.0 can be hoisted, and we end up with two identical nested 3.1.0 installs:

# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3
  one                 # 1.2.3
    node_modules
      lodash          # 3.1.0
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0
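The difference between the flattenable and un-flattenable trees can be sketched with a toy installer. This is deliberately not npm's or yarn's real algorithm: the install function and its input shape are hypothetical, it only handles one level of nested dependencies, and it simply gives root dependencies first claim on the top-level node_modules slot.

```javascript
// Toy hoisting sketch (NOT npm's real algorithm).
// `rootDeps` maps each root dependency to its resolved version plus the
// resolved versions of its own dependencies (one level deep only).
function install(rootDeps) {
  // Installed layout: path on disk -> resolved version.
  const installed = {};

  // Root dependencies always claim the top-level node_modules slot.
  for (const [name, { version }] of Object.entries(rootDeps)) {
    installed[`node_modules/${name}`] = version;
  }

  for (const [parent, { deps = {} }] of Object.entries(rootDeps)) {
    for (const [name, version] of Object.entries(deps)) {
      const topLevel = `node_modules/${name}`;
      if (!(topLevel in installed)) {
        installed[topLevel] = version; // Free slot: hoist.
      } else if (installed[topLevel] !== version) {
        // Slot taken by a different version: nest under the parent.
        installed[`node_modules/${parent}/${topLevel}`] = version;
      }
      // Otherwise the top-level copy already matches and is shared.
    }
  }

  return installed;
}

const flattenable = {
  lodash: { version: "4.2.3" },
  one: { version: "1.2.3", deps: { lodash: "4.2.3" } },
  two: { version: "2.3.4", deps: { lodash: "3.1.0" } }
};

// Same tree, except `one` now resolves lodash to 3.1.0.
const unflattenable = {
  ...flattenable,
  one: { version: "1.2.3", deps: { lodash: "3.1.0" } }
};

console.log(Object.keys(install(flattenable)).length);   // => 4
console.log(Object.keys(install(unflattenable)).length); // => 5
```

In the second tree, both nested copies are the same version, yet neither can move up because the single top-level location is occupied by 4.2.3.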

Together, these scenarios outline different ways that npm can place packages on disk in node_modules. Once there, how do depended-on code files end up in your frontend application?

The answer depends on your bundling tool. In this post, we'll focus on the widely-used webpack project.

Webpack bundling

A Webpack build starts at an entry point, which is typically your application code. It traverses all module imports (require or import) recursively to ingest code from your app or node_modules, process it, and concatenate everything into one or more bundles. (An oversimplification, but bear with us here.)

Let's say we have an application consisting of:

// my-app/index.js (entry point)
const { get } = require("lodash"); // lodash@^4.1.0
const { getOne } = require("one");
const { getTwo } = require("two");

const OBJ = { one: { two: "hi" } };

console.log("Get from lodash", get("one.two", OBJ)); // => `"hi"`
console.log("Get from one", getOne(OBJ)); // => `{ two: "hi" }`
console.log("Get from two", getTwo(OBJ)); // => `undefined`

// node_modules/one/index.js
const { get } = require("lodash"); // lodash@^4.0.0

module.exports = {
  getOne: (obj) => get("one", obj)
};

// node_modules/two/index.js
const { get } = require("lodash"); // lodash@^3.0.0

module.exports = {
  getTwo: (obj) => get("two", obj)
};

With this setup, we will definitely end up with a bundle that includes the files:

  • my-app/index.js
  • node_modules/one/index.js
  • node_modules/two/index.js

But the big question is: what files from lodash end up in our final webpack bundle?

And, of course, the answer is: it's complicated. It depends on how npm installed the packages into node_modules and how webpack behaves.
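As a rough illustration of why the installed layout matters, the sketch below walks imports from the entry point and resolves each bare package name by searching parent directories for a node_modules folder, in the spirit of Node.js resolution. The in-memory "disk", the resolveFrom and collectModules helpers, and the assumption that every package's main file is index.js are all hypothetical simplifications, not webpack's actual resolver.

```javascript
const path = require("path");

// Hypothetical in-memory "disk" for the flattened layout:
// file path -> bare package names that file requires.
const flattened = {
  "/my-app/index.js": ["lodash", "one", "two"],
  "/my-app/node_modules/lodash/index.js": [],
  "/my-app/node_modules/one/index.js": ["lodash"],
  "/my-app/node_modules/two/index.js": ["lodash"],
  "/my-app/node_modules/two/node_modules/lodash/index.js": []
};

// Resolve a bare package name by walking up parent directories.
// ASSUMPTION: every package's main file is `index.js`.
function resolveFrom(disk, fromFile, pkg) {
  let dir = path.posix.dirname(fromFile);
  while (true) {
    const candidate = path.posix.join(dir, "node_modules", pkg, "index.js");
    if (candidate in disk) { return candidate; }
    const parent = path.posix.dirname(dir);
    if (parent === dir) {
      throw new Error(`Cannot find module '${pkg}' from ${fromFile}`);
    }
    dir = parent;
  }
}

// Breadth-first traversal of all imports reachable from the entry point.
function collectModules(disk, entry) {
  const seen = new Set();
  const queue = [entry];
  while (queue.length > 0) {
    const file = queue.shift();
    if (seen.has(file)) { continue; }
    seen.add(file);
    for (const pkg of disk[file]) {
      queue.push(resolveFrom(disk, file, pkg));
    }
  }
  return [...seen];
}

const lodashFiles = collectModules(flattened, "/my-app/index.js")
  .filter((file) => file.includes("/lodash/"));

console.log(lodashFiles);
// => [ '/my-app/node_modules/lodash/index.js',
//      '/my-app/node_modules/two/node_modules/lodash/index.js' ]
```

Here one finds the shared root copy of lodash, while two finds its own nested copy first, so two distinct lodash files reach the bundle.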

Old webpack

The original version of webpack came with the webpack.optimize.DedupePlugin plugin. The plugin replaces subsequent identical sections of an original code source with a pointer reference. Configuring the plugin is as simple as:

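// webpack.config.js
const webpack = require("webpack");

module.exports = {
  plugins: [
    // Old webpack only; later versions no longer ship this plugin.
    new webpack.optimize.DedupePlugin()
  ]
};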

and a bundle will conveniently omit all extraneous identical code sources using any version of npm or yarn.
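The pointer-reference idea can be sketched in a few lines. This is only an illustration of the concept, not the DedupePlugin's actual implementation: the dedupe helper is hypothetical, the module sources are short stand-in strings, and the module order mirrors the old npm layout discussed above.

```javascript
const crypto = require("crypto");

// Content hash used to detect byte-for-byte identical sources.
const hash = (source) =>
  crypto.createHash("sha256").update(source).digest("hex");

// Replace each repeated source with the index of its first occurrence.
function dedupe(sources) {
  const firstIndexByHash = new Map();
  return sources.map((source, index) => {
    const key = hash(source);
    if (firstIndexByHash.has(key)) {
      return firstIndexByHash.get(key); // Pointer to the earlier module.
    }
    firstIndexByHash.set(key, index);
    return source;
  });
}

// Stand-in sources, in bundle order.
const modules = [
  "/* my-app entry */",   // 0: ./index.js
  "/* lodash reduce */",  // 1: ~/lodash/index.js (4.2.3)
  "/* one */",            // 2: ~/one/index.js
  "/* lodash reduce */",  // 3: ~/one/~/lodash/index.js (4.2.3)
  "/* two */",            // 4: ~/two/index.js
  "/* lodash forEach */"  // 5: ~/two/~/lodash/index.js (3.1.0)
];

const deduped = dedupe(modules);
console.log(deduped[3]); // => 1
console.log(deduped[5]); // => /* lodash forEach */
```

Only byte-for-byte identical sources collapse; the merely similar 3.1.0 file at index 5 is left untouched.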

New webpack

The DedupePlugin was removed in webpack@2 with an indication that modern npm flattening should be sufficient to collapse duplicates in the installed node_modules folder such that webpack would no longer have to do anything.

... but is that really the case?

Into the weeds with various duplication scenarios

Let's look at a few scenarios that might arise in bundles using old/new npm and old/new webpack. (See our examples repository for the full inputs and build outputs.)

  1. Old npm
  2. New npm flattened
  3. New npm unflattened
  4. New npm flattened with identical sources

These scenarios use the application discussed above that imports get() from lodash, getOne() from one, and getTwo() from two. The contrived getOne() and getTwo() methods use another import of lodash at differing versions. Although lodash has a real get() method, instead we will make up two hypothetical versions as follows:

lodash@3.1.0

module.exports = {
  // A very, very rough and naive object getter with forEach.
  get: (path, obj) => {
    let memo = obj;
    path.split(".").forEach((key) => {
      memo = memo === undefined ? memo : memo[key];
    });

    return memo;
  }
};

lodash@4.2.3

module.exports = {
  // A very, very rough and naive object getter with reduce.
  get: (path, obj) => path.split(".").reduce((memo, key) => {
    return memo === undefined ? memo : memo[key];
  }, obj)
};
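A quick check shows why we call these two files "similar" rather than "identical": they behave the same on our sample inputs, but their source text differs. The getV3 and getV4 names and the sample paths are ours, introduced only for this comparison.

```javascript
// Fictional lodash@3.1.0 implementation (forEach).
const getV3 = (path, obj) => {
  let memo = obj;
  path.split(".").forEach((key) => {
    memo = memo === undefined ? memo : memo[key];
  });
  return memo;
};

// Fictional lodash@4.2.3 implementation (reduce).
const getV4 = (path, obj) => path.split(".").reduce((memo, key) => {
  return memo === undefined ? memo : memo[key];
}, obj);

const OBJ = { one: { two: "hi" } };
const PATHS = ["one.two", "one", "two", "one.two.three"];

// Same results for every sample path...
const sameBehavior = PATHS.every((p) => getV3(p, OBJ) === getV4(p, OBJ));
// ...but not the same source text.
const sameSource = getV3.toString() === getV4.toString();

console.log(sameBehavior, sameSource); // => true false
```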

Scenario 1 - Old npm

Old npm installs the node_modules folder something like:

# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3
  one                 # 1.2.3
    node_modules
      lodash          # 4.2.3
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0

On disk, we have identical code sources at:

  • node_modules/lodash/index.js (4.2.3)
  • node_modules/one/node_modules/lodash/index.js (4.2.3)

and similar code files at:

  • node_modules/two/node_modules/lodash/index.js (3.1.0)

Let's look at how old and new webpack bundle these:

Scenario 1.a - Old npm + old webpack

Old webpack's DedupePlugin deduplicates the identical code sources so that only 1 code instance remains in our bundle. Looking at these lines of the bundle:

/* 3 */
/*!*****************************************!*\
  !*** ./old-npm/~/one/~/lodash/index.js ***!
  \*****************************************/
1,

instead of real code, there is an integer 1 which points to the full identical source at index 1 in the bundle which corresponds to node_modules/lodash/index.js.

Assessment: Examining our potential duplication problems, our bundle stacks up as follows:

  • Identical code sources from the same package: None. The plugin deduplicates.
  • Similar code files from different packages: Duplicates. The deduplicated file lodash@4.2.3/index.js is similar in functionality to the different file lodash@3.1.0/index.js. If only one of the two files were chosen we would save the other file's byte size.
  • Identical code sources from different packages: N/A. Our example doesn't have identical sources across different packages. But if it did, the plugin would deduplicate them.

Scenario 1.b - Old npm + new webpack

Modern webpack doesn't deduplicate, so our bundle contains the following full code sources:

  • node_modules/lodash/index.js (4.2.3)
  • node_modules/one/node_modules/lodash/index.js (4.2.3)
  • node_modules/two/node_modules/lodash/index.js (3.1.0)

Assessment:

  • Identical code sources from the same package: Duplicates. No webpack plugin deduplication.
  • Similar code files from different packages: Duplicates. Same as scenario 1.a.
  • Identical code sources from different packages: N/A. Our example doesn't have identical sources across different packages. But if it did, they would still remain as duplicates because modern webpack doesn't deduplicate identical sources.

Scenario 2 - New npm flattened

Let's upgrade to a modern npm version or any version of yarn. Both package managers now inspect the entire dependency tree and "flatten" dependencies to single packages higher up in node_modules whenever they can.

As mentioned above, this translates to an installed layout of:

# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3 (for root _and_ for `one`)
  one                 # 1.2.3
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0

collapsing the dependencies for lodash@4.2.3 to a single on-disk location.

Thus, we end up with no identical code sources and two similar code files:

  • node_modules/lodash/index.js (4.2.3)
  • node_modules/two/node_modules/lodash/index.js (3.1.0)

Let's turn to webpack and bundling this installation.

Scenario 2.a - New npm flattened + old webpack

Old webpack's DedupePlugin doesn't have anything to do this time, as there are no identical duplicate code sources.

Assessment: Our bundle has the following issues:

  • Identical code sources from the same package: None. npm was able to flatten away our identical sources by collapsing lodash@4.2.3.
  • Similar code files from different packages: Duplicates. The different files lodash@4.2.3/index.js and lodash@3.1.0/index.js waste bytes if a single file could be used instead.
  • Identical code sources from different packages: N/A. Same as scenario 1.a.

Scenario 2.b - New npm flattened + new webpack

As the old DedupePlugin never came into play in this scenario, modern webpack has pretty much exactly the same substantive bundle as in scenario 2.a with the same advantages and disadvantages.

Scenario 3 - New npm unflattened

Unfortunately, modern npm and yarn cannot flatten all semver-compatible packages because they are ultimately bound by the rules of the Node.js require resolution algorithm. Thus, if we have a slight change in dependencies (the one package now depends on lodash@^3.0.0, which resolves to 3.1.0), we end up with an installed on-disk layout of:

# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3
  one                 # 1.2.3
    node_modules
      lodash          # 3.1.0 (Cannot be collapsed)
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0 (Cannot be collapsed)

Thus, we have identical code sources:

  • node_modules/one/node_modules/lodash/index.js (3.1.0)
  • node_modules/two/node_modules/lodash/index.js (3.1.0)

and similar code files:

  • node_modules/lodash/index.js (4.2.3)

Scenario 3.a - New npm unflattened + old webpack

Old webpack's DedupePlugin is able to deduplicate the identical code sources across one and two's lodash@3.1.0. The code from two is collapsed to a reference integer 3 in these lines.

Assessment: Our bundle stacks up as follows:

  • Identical code sources from the same package: None. The plugin deduplicates.
  • Similar code files from different packages: Duplicates. The deduplicated file lodash@3.1.0/index.js is similar to lodash@4.2.3/index.js, but both remain in the bundle.
  • Identical code sources from different packages: N/A. Same as scenario 1.a.

Scenario 3.b - New npm unflattened + new webpack

Modern webpack doesn't deduplicate, so we end up with full code sources of:

  • node_modules/lodash/index.js (4.2.3)
  • node_modules/one/node_modules/lodash/index.js (3.1.0)
  • node_modules/two/node_modules/lodash/index.js (3.1.0)

in our bundle pretty much analogously to scenario 1.b. Ultimately, modern npm that cannot flatten is equivalent to old npm that didn't even try.

Scenario 4 - New npm flattened with identical sources

We return to our original package version setup, but introduce a different twist: now lodash@3.1.0/index.js and lodash@4.2.3/index.js are identical code sources. Both use the reduce version of our get function.

We end up with the same installed layout as scenario 2:

# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3 (for root _and_ for `one`)
  one                 # 1.2.3
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0

collapsing the dependencies for lodash@4.2.3 to a single on-disk install.

So now we end up with no similar code files and 2 identical code sources at:

  • node_modules/lodash/index.js (4.2.3)
  • node_modules/two/node_modules/lodash/index.js (3.1.0)

Let's see how different webpacks handle this final scenario:

Scenario 4.a - New npm flattened with identical sources + old webpack

Old webpack's DedupePlugin is able to deduplicate the identical code sources across the flattened lodash@4.2.3/index.js (for root and one) and the different package (but identical code source) of lodash@3.1.0/index.js from two. The identical code source from two's dependency is collapsed to a reference integer 1 in these lines.

Assessment: Our bundle has no duplicates anywhere!

  • Identical code sources from the same package: None. New npm / yarn takes care of this.
  • Similar code files from different packages: N/A. The scenario has no similar-but-not-identical code sources.
  • Identical code sources from different packages: None. The old webpack DedupePlugin is now able to collapse these, even across different packages.

Scenario 4.b - New npm flattened with identical sources + new webpack

Although modern npm / yarn take care of the flattened packages of lodash@4.2.3/index.js, there is a missed opportunity for the identical code source in lodash@3.1.0/index.js.

Assessment: Our bundle now has different types of duplicates from previous scenarios:

  • Identical code sources from the same package: None. npm was able to flatten away our identical sources by collapsing lodash@4.2.3.
  • Similar code files from different packages: N/A. The scenario has no similar-but-not-identical code sources.
  • Identical code sources from different packages: Duplicates. Without old webpack deduplication, we end up with identical code across the two lodash versions in our bundle.

Finding and fixing duplicates in real applications

It takes a surprisingly large amount of background to dig into even the most common cases of duplicate dependency creep in your bundle. We have reviewed how old and modern npm and webpack together transform source code into a full application bundle. We've investigated just a few of the many, many scenarios that demonstrate how unnecessary dependency duplicates can produce a larger bundle than you need. And, we've identified some of the things that influence duplicates, namely:

  • Where dependencies are coming from;
  • How npm has placed the dependencies on disk;
  • How webpack has included code from dependencies into the bundle.

The tough part here is that so far, we've only analyzed a truly trivial application via human inspection of the actual application bundle. Nearly all real applications will be significantly larger and more complex, all but foreclosing such manual analysis.

How do we apply this for reals?

Real-world development and production workflows all but require some baseline of programmatic tooling support for these issues.

On the positive side, there are a multitude of different webpack analysis tools that break down the various parts of the webpack compilation process. A good starting point is SurviveJS' dedicated page for build analysis with a subsection on duplicates analysis.

Nonetheless, while many of these projects can together provide a lot of information about every part of webpack compilation and your bundle, finding a dedicated report that is specifically actionable for reducing duplicates can remain challenging.

Introducing the Inspectpack DuplicatesPlugin

The Inspectpack project has been analyzing Webpack innards for quite some time. It's the information engine behind the popular webpack-dashboard, which provides a captive terminal display with a NASA-like control center. Inspectpack also provides a powerful CLI tool that consumes a Webpack stats object from disk to report on size, duplicates, and package version information in a variety of formats (text, TSV, JSON).

At its core, the Inspectpack library can efficiently discover code duplicates as well as inspect installed node_modules directories and infer how both npm and webpack impact a final production application bundle. We're very pleased to announce that we have taken this power and concentrated it into a new easy-to-use webpack plugin -- the Inspectpack DuplicatesPlugin.

Webpack integration

Following the online guide, first install the plugin and add it to your development dependencies:

$ npm install --save-dev inspectpack # OR
$ yarn add --dev inspectpack

Then, integrate the plugin into the plugins field of your webpack configuration file:

// webpack.config.js
const { DuplicatesPlugin } = require("inspectpack/plugin");

module.exports = {
  plugins: [
    new DuplicatesPlugin({
      // Emit compilation warning or error? (Default: `false`)
      emitErrors: false,
      // Display full duplicates information? (Default: `false`)
      verbose: false
    })
  ]
};

Our options are as follows:

  • emitErrors: By default (false), the DuplicatesPlugin emits duplicate issues to compilation.warnings. If this value is true, then the plugin will emit issues to compilation.errors which will fail typical modern webpack builds.
  • verbose: By default (false), the report only shows package information for packages that have files that end up duplicated somewhere in a given bundle. Setting this value to true additionally displays information for duplicated files, including file size and type (identical (I) or similar (S)).
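One possible way to use these two options is to keep local builds quiet and make automated builds strict. The sketch below assumes two environment variable names, CI and VERBOSE_DUPLICATES, which are purely hypothetical conventions of ours and not something the plugin reads on its own; only the emitErrors and verbose option names come from the plugin.

```javascript
// Build DuplicatesPlugin options from environment variables.
// ASSUMPTION: `CI` and `VERBOSE_DUPLICATES` are hypothetical variable names
// chosen for this sketch; the plugin itself does not look at them.
function duplicatesOptions(env) {
  return {
    // Fail the build on duplicates only in automated builds.
    emitErrors: env.CI === "true",
    // Opt in to per-file details when explicitly requested.
    verbose: env.VERBOSE_DUPLICATES === "true"
  };
}

console.log(duplicatesOptions({}));
// => { emitErrors: false, verbose: false }
console.log(duplicatesOptions({ CI: "true", VERBOSE_DUPLICATES: "true" }));
// => { emitErrors: true, verbose: true }

// In webpack.config.js this could be passed along as:
//   new DuplicatesPlugin(duplicatesOptions(process.env))
```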

And that's pretty much it. It's worth noting that the plugin supports every version of webpack (currently 1-4) and is fully tested on each version!

Discovering and understanding duplicates

Now that we have integrated the plugin, let's try it out!

The README documentation provides a guide to understanding plugin reports, but it's probably easier to just see it in practice. Picking an exemplary situation, we return to scenario 3.b, which uses modern npm with unflattened dependencies.

Default report

Running webpack with the default options (DuplicatesPlugin()) produces the following report:

[Screenshot: scenario 3.b default DuplicatesPlugin report]

Our report is a bit terse, but contains a summary of the information from our previous, manual analysis:

  • For duplicate files, we have 2 similar and 3 similar or identical code sources. We also get a summary of the total number of bytes at issue. For us here, that's 703 for 3 code sources, which we could presumably roughly cut to a third the size if we fix the duplicates.
  • For packages, we have 1 unique package (lodash) with 2 resolved versions (3.1.0, 4.2.3), 3 installed paths on disk (~/one/~/lodash, ~/two/~/lodash, ~/lodash), and 3 depended abstract paths (starting from root application, one, and two).

We then get a per-asset report (just bundle.js in our case) with a per-unique-package drill-down of the form:

{PACKAGE_NAME} (Found {NUM} resolved, {NUM} installed, {NUM} depended. Latest version {VERSION}.)
  {INSTALLED_PACKAGE_VERSION NO 1} {INSTALLED_PACKAGE_PATH NO 1}
    {DEPENDENCY PATH NO 1}
    {DEPENDENCY PATH NO 2}
    ...
  {INSTALLED_PACKAGE_VERSION NO 1} {INSTALLED_PACKAGE_PATH NO 2}
  ...
  {INSTALLED_PACKAGE_VERSION NO 2} {INSTALLED_PACKAGE_PATH NO 3}
  ...

Our report contains a package summary for lodash as it is the only duplicate-producing package in the scenario. Then we drill down into each installed version / path, and further drill down into each depended graph.

That's all of the package information we manually figured out before! But what if we want a bit more information about the duplicate code sources?

Verbose report

Enter the verbose report. Configuring the option DuplicatesPlugin({ verbose: true }) will additionally produce duplicate code source information:

[Screenshot: scenario 3.b default DuplicatesPlugin verbose report]

Now, in addition to our dependency graphs, we get a report of each duplicated code source path, a note indicating if it is identical (I) to some other file in the bundle or merely similar (S), and the file byte size (e.g., 249 and 205 bytes respectively). As discussed previously, we have:

  • Identical code sources from the same package: Duplicates. We have ~/one/~/lodash/index.js and ~/two/~/lodash/index.js from separate lodash@3.1.0 installations.
  • Similar code files from different packages: Duplicates. Both of these lodash@3.1.0 files are similar to ~/lodash/index.js from lodash@4.2.3 in the bundle.
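The identical / similar labels can be approximated with a short sketch. This is not Inspectpack's implementation: the classify helper is hypothetical, the source fields are short stand-ins for real file contents, and the byte sizes are taken from the report above, with 249 assigned to the two 3.1.0 copies because that is the only assignment that adds up to the reported 703 bytes.

```javascript
// Modules from scenario 3.b. `source` is a stand-in for real file contents;
// `bytes` are the sizes quoted in the verbose report.
const modules = [
  { file: "node_modules/lodash/index.js", source: "reduce", bytes: 205 },
  { file: "node_modules/one/node_modules/lodash/index.js", source: "forEach", bytes: 249 },
  { file: "node_modules/two/node_modules/lodash/index.js", source: "forEach", bytes: 249 }
];

// Package-relative path, e.g. "lodash/index.js" for any install location.
const MARKER = "node_modules/";
const packagePath = (file) =>
  file.slice(file.lastIndexOf(MARKER) + MARKER.length);

// Label each duplicated file: "I" if another file with the same package path
// has identical source, otherwise "S" (similar: same path, different source).
function classify(mods) {
  const groups = new Map();
  for (const mod of mods) {
    const key = packagePath(mod.file);
    groups.set(key, [...(groups.get(key) || []), mod]);
  }

  const report = [];
  for (const group of groups.values()) {
    if (group.length < 2) { continue; } // Not duplicated at all.
    for (const mod of group) {
      const identical = group.some(
        (other) => other !== mod && other.source === mod.source
      );
      report.push({ file: mod.file, type: identical ? "I" : "S", bytes: mod.bytes });
    }
  }
  return report;
}

const report = classify(modules);
const totalBytes = report.reduce((sum, { bytes }) => sum + bytes, 0);

console.log(report.map(({ type }) => type).join(" ")); // => S I I
console.log(totalBytes); // => 703
```

Dropping one of the two identical copies would save 249 bytes without touching behavior, which is why identical sources are usually the first place to look.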

Assessing duplicates

With DuplicatesPlugin added to our webpack configuration file, we now have a detailed assessment of how npm dependencies are installed and what webpack ends up placing in the bundle. More specifically, we can answer these important questions:

  • What package versions were resolved?
  • Where were the resolved packages installed on disk in node_modules?
  • What parent packages depended on a given package to cause it to be resolved and installed?
  • Which duplicate files from a package ended up being included in the final application bundle?

Note - DedupePlugin: The DuplicatesPlugin does not specifically detect that old webpack's DedupePlugin has programmatically collapsed a duplicate code source. Given that very few folks use a pre-webpack@2 version, we chose to omit a special case report (although we could add one if there were community demand).

Fixing duplicates

So, now that we can automatically report on duplicate dependencies, how do we fix them? How do we smash our duplicates?

Staying true to our running theme, the answer is: it's complicated.

The Inspectpack documentation has an introductory guide discussing how to fix bundle duplicates.

Summarizing these for convenience, we first look to meta-level tips on prioritization and focus:

  • Look first to identical code sources: When choosing what to do first, a good bet is to focus on literally identical code duplicated in your bundle and dependencies. Although not guaranteed to be collapsible (due to considerations like other depended-on code), it's a great first stop.
  • Change dependencies in your root package.json: Anything you can change directly in your application's package.json is likely to have the fewest unintended consequences.
  • Critically examine and scrutinize your dependencies: An easy win is always to just have less code. Look at your dependency tree. Do you really need all of the bundled packages? Keep a critical eye out for packages that have lots of transitive other dependencies, prioritize based on total size brought in by a package (using tools like the awesome webpack-bundle-analyzer), and see if you can live without the dependency or find a smaller, equivalent replacement.

Unfortunately, many times you won't be able to harmonize and collapse your abstract dependency graph / installed node_modules tree easily. The next step is to potentially force packages and code sources to collapse to single entities, even if the normal rules of npm/yarn/webpack would prevent it.

These are essentially sledgehammer approaches for otherwise intractable duplicate situations, but because they can violate the assumptions and rules of semantic versioning, your application is potentially at risk for breaking behavior and bugs.

All in all, like most web application optimization work, reducing duplicate dependencies is more of an art than science. The above tips are just starting points for an overall effort that digs deep into why your bundle is so large and how to most effectively leverage this information to reduce its size.

Conclusion

After our (rather lengthy) deep dive, we have now uncovered more of how and why duplicates can occur in your application bundles. Although we only discussed a few hundred wasted bytes to keep our build scenarios comprehensible, it is easy to extrapolate an impact of several orders of magnitude for more complex, real-world web applications.

We hope you find the new Inspectpack DuplicatesPlugin to be a useful, actionable tool in finding and squashing duplicate dependencies in your webpack builds. Do your application bundles a favor and give it a whirl today!