Finding and loading modules using require and import

1. Understanding File modules

The CommonJS and ES6 modules we’ve just looked at are what the Node.js documentation describes as file modules. Such a module is contained within a single file whose filename ends with .js, .cjs, .mjs, .json, or .node. Files ending in .node are compiled from C or C++ source code, or even other languages such as Rust, while the others are, of course, written in JavaScript or JSON.

The module identifier of a file module must start with ./ or ../. This signals Node.js that the module identifier refers to a local file. As should already be clear, this module identifier refers to a pathname relative to the currently executing module.

It is also possible to use an absolute pathname as the module identifier. In a CommonJS module, such an identifier might be /path/to/some/directory/my-module.js. In an ES6 module, since the module identifier is actually a URL, we must use a file:// URL like file:///path/to/some/directory/my-module.mjs. There are not many cases where we would use an absolute module identifier, but the capability does exist.
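The two absolute forms can be converted into each other with the url core module. The following is a minimal sketch, assuming the hypothetical pathname from the text; on Windows the resulting URL would also carry a drive letter.

```javascript
const { pathToFileURL, fileURLToPath } = require('url');

// Hypothetical absolute pathname, taken from the example in the text.
const absolutePath = '/path/to/some/directory/my-module.mjs';

// Convert the pathname into the file:// URL form that import expects.
const moduleUrl = pathToFileURL(absolutePath).href;
console.log(moduleUrl);

// Convert the URL back into a plain pathname, the form that require accepts.
const roundTrip = fileURLToPath(moduleUrl);
console.log(roundTrip);
```

This is mostly useful when a pathname is computed at runtime and then handed to import().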

One difference between CommonJS and ES6 modules is the ability to use extensionless module identifiers. The CommonJS module loader allows this, as in the following example, which you should save as extensionless.js:

const simple = require('./simple');
console.log(simple.hello());
console.log(`${simple.next()}`);
console.log(`${simple.next()}`);

This uses an extension-less module identifier to load a module we’ve already discussed, simple.js:

$ node ./extensionless
Hello, world!
1
2

And we can run it with the node command using an extension-less module identifier. But if we specify an extension-less identifier for an ES6 module:

$ node ./simpledemo2
internal/modules/cjs/loader.js:964
    throw err;
    ^

Error: Cannot find module '/home/david/Chapter03/simpledemo2'
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:961:17)
    at Function.Module._load (internal/modules/cjs/loader.js:854:27)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
    at internal/main/run_main_module.js:17:47 {
  code: 'MODULE_NOT_FOUND',
  requireStack: []
}

The error message makes it clear that Node.js could not resolve the file name. Similarly, in an ES6 module, the file name given to the import statement must include the file extension.
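To see which file an extensionless CommonJS identifier maps to, we can ask the loader to resolve it without loading it. This is a minimal sketch, assuming a throwaway directory and a stand-in simple.js created just for the demonstration; it relies on require.resolve, which performs the same lookup as require.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a throwaway directory holding a stand-in simple.js module.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'extensionless-'));
fs.writeFileSync(
  path.join(demoDir, 'simple.js'),
  'exports.hello = () => "Hello, world!";\n'
);

// require.resolve runs the CommonJS lookup without executing the module.
// The identifier has no extension; the loader tries .js, .json, and .node.
const resolved = require.resolve(path.join(demoDir, 'simple'));
console.log(path.basename(resolved));
```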

Next, let’s discuss another side effect of ES6 module identifiers being a URL.

2. The ES6 import statement takes a URL

The module identifier in the ES6 import statement is a URL. There are several important considerations.

Since Node.js does not load modules from http:// or https:// URLs by default, and local modules are referenced with file:// URLs, we’re not allowed to retrieve a module from a web server. There are obvious security implications, and the corporate security team would rightfully get anxious if modules could be loaded from http:// URLs.

Referencing a file with an absolute pathname must use the file:///path/to/file.ext syntax, as mentioned earlier. This is different from require, where we would use /path/to/file.ext instead.

Since ? and # have special significance in a URL, they also have special significance to the import statement, as in the following example:

import './module-name.mjs?query=1';

This loads the module named module-name.mjs with a query string containing query=1. By default, the Node.js module loader does not interpret the query string, although specifiers that differ only in their query or fragment are loaded as separate module instances. There is also an experimental loader hook feature by which you can do something with the module identifier URL.
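Because the identifier is a URL, its parts can be inspected with the standard URL class. The sketch below assumes a hypothetical parent module location and only illustrates how such a specifier breaks down; it does not call into the Node.js loader itself.

```javascript
// Hypothetical URL of the importing module; any file:// URL would do.
const parentUrl = 'file:///home/david/Chapter03/main.mjs';

// Resolve the relative specifier against the parent, as URL rules dictate.
const resolvedUrl = new URL('./module-name.mjs?query=1', parentUrl);

console.log(resolvedUrl.pathname);                  // the file part
console.log(resolvedUrl.search);                    // the raw query string
console.log(resolvedUrl.searchParams.get('query')); // one query parameter
```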

The next type of module to consider is those baked into Node.js, the core modules.

3. Understanding the Node.js core modules

Some modules are pre-compiled into the Node.js binary. These are the core Node.js modules documented on the Node.js website at https://nodejs.org/api/index.html.

They start out as source code within the Node.js build tree. The build process compiles them into the binary so that the modules are always available.

We’ve already seen how the core modules are used. In a CommonJS module, we might use the following:

const http = require('http');
const fs = require('fs').promises;

And the equivalent in an ES6 module would be as follows:

import http from 'http';
import { promises as fs } from 'fs';

In both cases, we’re loading the http and fs core modules that would then be used by other code in the module.
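If we are unsure whether a name refers to a core module, the module core module exposes the list of built-in names. The helper below, isCoreModule, is a hypothetical name introduced only for this sketch.

```javascript
const { builtinModules } = require('module');

// builtinModules is an array of the module names compiled into the binary.
const isCoreModule = (name) => builtinModules.includes(name);

console.log(isCoreModule('http'));
console.log(isCoreModule('fs'));
console.log(isCoreModule('some-library'));
```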

Moving on, we will next talk about more complex module structures.

4. Using a directory as a module

We commonly organize stuff into a directory structure. The stuff here is a technical term referring to internal file modules, data files, template files, documentation, tests, assets, and more. Node.js allows us to create an entry-point module into such a directory structure.

For example, when a module identifier like ./some-library refers to a directory, require looks in that directory for a file named index.js, index.json, or index.node. In such a case, the module loader loads the appropriate index module even though the module identifier did not reference a full pathname. The pathname is computed by appending the name of the file it finds to the directory name. Note that this directory lookup applies to require; the ES6 import statement needs the full path to the file, such as ./some-library/index.mjs.

One common use for this is that the index module provides an API for a library stored in the directory, while the other modules in the directory contain what’s meant to be private implementation details.
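The following sketch builds such a directory module in a temporary location and loads it through its directory name. The file names internal.js and greet are assumptions made up for the illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a throwaway directory named some-library.
const rootDir = fs.mkdtempSync(path.join(os.tmpdir(), 'dirmodule-'));
const libDir = path.join(rootDir, 'some-library');
fs.mkdirSync(libDir);

// A private implementation file that callers are not meant to load directly.
fs.writeFileSync(
  path.join(libDir, 'internal.js'),
  'exports.greet = (name) => `Hello, ${name}!`;\n'
);

// index.js is the entry point and re-exports the public API.
fs.writeFileSync(
  path.join(libDir, 'index.js'),
  'module.exports = { greet: require("./internal").greet };\n'
);

// The identifier names the directory; the loader finds index.js inside it.
const someLibrary = require(libDir);
const greeting = someLibrary.greet('world');
console.log(greeting);
```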

While overloading the word module this way might be a little confusing, it’s going to get even more so as we consider the packages we install from other sources.

5. Comparing installed packages and modules

Every programming platform supports the distribution of libraries or packages that are meant to be used in a wide array of applications. For example, where the Perl community has CPAN, the Node.js community has the npm registry. An installed Node.js package is much the same as the directory module we just described, in that the package format is simply a directory containing a package.json file along with the code and other files comprising the package.

The package.json file describes the package. A minimal set of fields is defined by Node.js, specifically as follows:

{
  "name": "some-library",
  "main": "./lib/some-library.js"
}

The name field gives the name of the package. If the main field is present, it names the JavaScript file to load, instead of index.js, when the package is loaded. Package manager applications like npm and Yarn support many more fields in package.json, which they use to manage dependencies, versions, and everything else.

If there is no package.json, then Node.js will look for either index.js or index.node. In such a case, require(‘some-library’) will load the file module in /path/to/some-library/index.js.
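The effect of the main field can be checked with a small experiment. This sketch, assuming a throwaway directory and made-up file contents, creates a package with both a main entry and an index.js, then loads it to see which file wins.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a throwaway package directory with a lib subdirectory.
const rootDir = fs.mkdtempSync(path.join(os.tmpdir(), 'pkgmain-'));
const pkgDir = path.join(rootDir, 'some-library');
fs.mkdirSync(path.join(pkgDir, 'lib'), { recursive: true });

// package.json points main at a file other than index.js.
fs.writeFileSync(
  path.join(pkgDir, 'package.json'),
  JSON.stringify({ name: 'some-library', main: './lib/some-library.js' })
);
fs.writeFileSync(
  path.join(pkgDir, 'lib', 'some-library.js'),
  'exports.loadedFrom = "lib/some-library.js";\n'
);

// A decoy index.js, present only to show that main takes precedence.
fs.writeFileSync(
  path.join(pkgDir, 'index.js'),
  'exports.loadedFrom = "index.js";\n'
);

const lib = require(pkgDir);
console.log(lib.loadedFrom);
```

Removing the package.json file from this directory would make the same require call fall back to index.js.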

Installed packages are kept in a directory named node_modules. When JavaScript source code has require('some-library') or import 'some-library', Node.js searches through one or more node_modules directories to find the named package.

Notice that the module identifier, in this case, is just the package name. This is different from the file and directory module identifiers we studied earlier since both those are pathnames. In this case, the module identifier is somewhat abstract, and that’s because Node.js has an algorithm for finding packages within the nested structure of the node_modules directories.

To understand how that works, we need a deeper dive into the algorithm.

6. Finding the installed package in the file system

One key to why the Node.js package system is so flexible is the algorithm used to search for packages.

For a given require, import(), or import statement, Node.js searches upward in the file system from the directory containing the statement. It is looking for a directory named node_modules containing a module satisfying the module identifier.

For example, with a source file named /home/david/projects/notes/foo.js and a require or import statement requesting the module identifier bar.js, Node.js proceeds as follows.

The search starts in the directory containing foo.js. In each node_modules directory it examines, Node.js looks either for a file module named bar.js or for a directory named bar.js containing a module, as described earlier in Using a directory as a module. Node.js checks the node_modules directory next to foo.js and then the node_modules directory in every directory above that file. It will not, however, descend into any directory such as express or express/node_modules. The traversal only moves upward in the file system, not downward.
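The upward traversal can be sketched in a few lines. The function nodeModulesCandidates below is a hypothetical helper written for this illustration, not part of Node.js; inside a real CommonJS module, the module.paths array holds the list that Node.js actually uses.

```javascript
const path = require('path');

// List the node_modules directories searched for a file in startDir,
// nearest first. path.posix keeps the output the same on every platform.
function nodeModulesCandidates(startDir) {
  const found = [];
  let dir = startDir;
  while (true) {
    // Node.js skips a directory that is itself named node_modules.
    if (path.posix.basename(dir) !== 'node_modules') {
      found.push(path.posix.join(dir, 'node_modules'));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return found;
}

const candidates = nodeModulesCandidates('/home/david/projects/notes');
console.log(candidates.join('\n'));
```

The first candidate is the node_modules directory next to foo.js, and the last is /node_modules at the root of the filesystem.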

While some third-party packages have a name ending in .js, the vast majority do not, so we will typically use require('bar'). Third-party packages are also typically delivered as a directory containing a package.json file and some JavaScript files. Therefore, in the typical case, the package module identifier would be bar, and Node.js will find a directory named bar in one of the node_modules directories and access the package from that directory.

This act of searching upward in the file system means Node.js supports the nested installation of packages. A Node.js package that in turn depends on other modules can have its own node_modules directory; for example, the bar package might depend on the fred package. The package manager application might then install fred as /home/david/projects/notes/node_modules/bar/node_modules/fred.

In such a case, when a JavaScript file in the bar package uses require(‘fred’) its search for modules starts in /home/david/projects/notes/node_modules/bar/node_modules, where it will find the fred package. But if the package manager detects that other packages used by notes also use the fred package, the package manager will install it as /home/david/projects/notes/node_modules/fred.

Because the search algorithm traverses the file system upwards, it will find fred in either location.

The last thing to note is that this nesting of node_modules directories can be arbitrarily deep. While the package manager applications try to install packages in a flat hierarchy, it may be necessary to nest them deeply.

One reason for doing so is to enable using two or more versions of the same package.

6.1. Handling multiple versions of the same installed package

The Node.js package identifier resolution algorithm allows us to install two or more versions of the same package. Returning to the hypothetical notes project, suppose the fred package is installed not just for the bar package but also for the express package.

Looking at the algorithm, we know that require(‘fred’) in the bar package, and in the express package, will be satisfied by the corresponding fred package installed locally to each.

Normally, the package manager applications will detect the two instances of the fred package and install only one. But, suppose the bar package required the fred version 1.2, while the express package required the fred version 2.1.

In such a case, the package manager application will detect the incompatibility and install two versions of the fred package as follows:

  • In /home/david/projects/notes/node_modules/bar/node_modules, it will install fred version 1.2.
  • In /home/david/projects/notes/node_modules/express/node_modules, it will install fred version 2.1.

When the express package executes require('fred') or import 'fred', it will be satisfied by the package in /home/david/projects/notes/node_modules/express/node_modules/fred. Likewise, the bar package will be satisfied by the package in /home/david/projects/notes/node_modules/bar/node_modules/fred. In both cases, the bar and express packages have the correct version of the fred package available. Neither is aware there is another version of fred installed.
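This behavior can be reproduced with stub packages. The sketch below assumes a throwaway directory standing in for the notes project, and the installStub helper and stub contents are invented for the illustration. It uses createRequire from the module core module to resolve fred as if the call came from inside bar or express.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { createRequire } = require('module');

// Recreate the hypothetical notes project layout in a throwaway directory.
const notesDir = fs.mkdtempSync(path.join(os.tmpdir(), 'notes-'));

// Write a stub package whose only export is its version string.
function installStub(pkgDir, name, version) {
  fs.mkdirSync(pkgDir, { recursive: true });
  fs.writeFileSync(
    path.join(pkgDir, 'package.json'),
    JSON.stringify({ name, version })
  );
  fs.writeFileSync(
    path.join(pkgDir, 'index.js'),
    `exports.version = ${JSON.stringify(version)};\n`
  );
}

const barDir = path.join(notesDir, 'node_modules', 'bar');
const expressDir = path.join(notesDir, 'node_modules', 'express');
installStub(path.join(barDir, 'node_modules', 'fred'), 'fred', '1.2.0');
installStub(path.join(expressDir, 'node_modules', 'fred'), 'fred', '2.1.0');

// createRequire returns a require function that resolves identifiers
// as if it were called from the given file.
const fredSeenByBar = createRequire(path.join(barDir, 'index.js'))('fred').version;
const fredSeenByExpress = createRequire(path.join(expressDir, 'index.js'))('fred').version;
console.log(fredSeenByBar, fredSeenByExpress);
```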

The node_modules directory is meant for packages required by an application. Node.js also supports installing packages in a global location so they can be used by multiple applications.

7. Searching for globally installed packages

We’ve already seen that with npm we can perform a global install of a package. For example, command-line tools like hexy or babel are convenient if installed globally. In such a case the package is installed in another folder outside of the project directory. Node.js has two strategies for finding globally installed packages.

Similar to the PATH variable, the NODE_PATH environment variable can be used to list additional directories in which to search for packages. On Unix-like operating systems, NODE_PATH is a colon-separated list of directories, and on Windows it is semicolon-separated. In both cases, Node.js searches each listed directory for installed modules.

There are three additional locations that can hold modules:

  • $HOME/.node_modules
  • $HOME/.node_libraries
  • $PREFIX/lib/node

In this case, $HOME is what you expect (the user’s home directory), and $PREFIX is the directory where Node.js is installed.
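To inspect how a NODE_PATH value breaks down into directories, it can be split on the platform's path delimiter. The parseNodePath helper and the directory names below are assumptions for this sketch; Node.js does its own parsing internally.

```javascript
const path = require('path');

// Split a NODE_PATH-style string into directories, dropping empty entries.
// path.delimiter is ':' on Unix-like systems and ';' on Windows.
function parseNodePath(value, delimiter = path.delimiter) {
  return (value || '').split(delimiter).filter(Boolean);
}

// The directories configured for the current process, if any.
console.log(parseNodePath(process.env.NODE_PATH));

// A hypothetical Unix-style value with two directories.
const unixDirs = parseNodePath(
  '/usr/local/lib/node_modules:/opt/shared/node_modules',
  ':'
);
console.log(unixDirs);
```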

Some recommend against using global packages. The rationale is the desire for repeatability and deployability. If you’ve tested an app and all its code is conveniently located within a directory tree, you can copy that tree for deployment to other machines. But, what if the app depended on some other file that was magically installed elsewhere on the system? Will you remember to deploy such files? The application author might write documentation saying to install this then install that and install something-else before running npm install, but will the users of the application correctly follow all those steps?

The best installation instructions are simply to run npm install or yarn install. For that to work, all dependencies must be listed in package.json.

Before moving forward, let’s review the different kinds of module identifiers.

8. Reviewing module identifiers and pathnames

That was a lot of details spread out over several sections. It’s useful, therefore, to quickly review how the module identifiers are interpreted when using the require, import(), or import statements:

  • Relative module identifiers: These begin with ./ or ../. The identifier follows POSIX filesystem path semantics, and the resultant pathname is interpreted relative to the location of the file being executed. That is, a module identifier beginning with ./ is looked for in the current directory, whereas one starting with ../ is looked for in the parent directory.
  • Absolute module identifiers: These begin with / (or file:// for ES6 modules) and are, of course, resolved from the root of the filesystem. This is not a recommended practice.
  • Top-level module identifiers: These do not begin with those strings and are just the module name. These must be stored in a node_modules directory, and the Node.js runtime has a nicely flexible algorithm for locating the correct node_modules directory.
  • Core modules: These are the same as the top-level module identifiers, in that there is no prefix, but the core modules are prebaked into the Node.js binary.

In all cases, except for the core modules, the module identifier resolves to a file that contains the actual module, and which is loaded by Node.js. Therefore, what Node.js does is to compute the mapping between the module identifier and the actual file name to load.
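The four categories can be summarized as a small classifier. The classifyIdentifier function is a hypothetical helper for review purposes only; it looks at the shape of the identifier and does not perform any resolution.

```javascript
const { builtinModules } = require('module');

// Classify a module identifier according to the four categories above.
function classifyIdentifier(id) {
  if (id.startsWith('./') || id.startsWith('../')) return 'relative';
  if (id.startsWith('/') || id.startsWith('file://')) return 'absolute';
  if (builtinModules.includes(id)) return 'core';
  // Anything else is looked up in the node_modules directories.
  return 'top-level';
}

for (const id of ['./simple', '../lib/util.js', '/path/to/file.js', 'fs', 'express']) {
  console.log(`${id} -> ${classifyIdentifier(id)}`);
}
```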

Some packages offer what we might call a sub-package included with the main package. Let’s see how to use them.

9. Using deep import module specifiers

In addition to a simple module identifier like require(‘bar’), Node.js lets us directly access modules contained within a package. A different module specifier is used that starts with the module name, adding what’s called a deep import path. For a concrete example, let’s look at the mime module (https://www.npmjs.com/package/mime), which handles mapping a file name to its corresponding MIME type.

In the normal case, you use require('mime') to use the package. However, the authors of this package developed a lite version of this package that leaves out a lot of vendor-specific MIME types. For that version, you use require('mime/lite') instead. And of course, in an ES6 module, you use import 'mime' and import 'mime/lite', as appropriate.

The specifier mime/lite is an example of a deep import module specifier.

With such a module identifier, Node.js first locates the node_modules directory containing the main package. In this case, that is the mime package. By default, the deep import module is simply a pathname relative to the package directory, for example, /path/to/node_modules/mime/lite. Going by the rules we’ve already examined, it will be satisfied by a file named lite.js or by a directory named lite containing a file named index.js.
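The default mapping can be demonstrated with a stub package. This sketch assumes a made-up package named demo-mime installed in a throwaway project directory; it is not the real mime package, and its contents are invented.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { createRequire } = require('module');

// A throwaway project with one stub package in node_modules.
const projectDir = fs.mkdtempSync(path.join(os.tmpdir(), 'deepimport-'));
const pkgDir = path.join(projectDir, 'node_modules', 'demo-mime');
fs.mkdirSync(pkgDir, { recursive: true });

fs.writeFileSync(
  path.join(pkgDir, 'package.json'),
  JSON.stringify({ name: 'demo-mime', version: '1.0.0' })
);
fs.writeFileSync(path.join(pkgDir, 'index.js'), 'exports.variant = "full";\n');
fs.writeFileSync(path.join(pkgDir, 'lite.js'), 'exports.variant = "lite";\n');

// Resolve identifiers as if from a file in the project directory.
const projectRequire = createRequire(path.join(projectDir, 'app.js'));

const full = projectRequire('demo-mime').variant;
// The deep import path "lite" maps to lite.js inside the package directory.
const lite = projectRequire('demo-mime/lite').variant;
console.log(full, lite);
```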

But it is possible to override the default behavior and have the deep import specifier refer to a different file within the module.

9.1. Overriding a deep import module identifier

The deep import module identifier used by code using the package does not have to be the pathname used within the package source. We can put declarations in package.json describing the actual pathname for each deep import identifier. For example, a package with interior modules named ./src/cjs-module.js and ./src/es6-module.mjs can be remapped with this declaration in package.json:

{
  "exports": {
    "./cjsmodule": "./src/cjs-module.js",
    "./es6module": "./src/es6-module.mjs"
  }
}

With this, code using such a package can load the inner module using require('module-name/cjsmodule') or import 'module-name/es6module'. Notice that the filenames do not have to match what’s exported.

In a package.json file using this exports feature, a request for an inner module not listed in exports will fail. Supposing the package has a ./src/hidden-module.js file, calling require('module-name/src/hidden-module.js') will fail.
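Both behaviors can be observed with a stub package. The sketch below assumes a throwaway project directory and a trimmed version of the exports declaration with only the CommonJS entry; the file contents are invented, and it assumes a Node.js version with unflagged exports support, such as Node.js 14.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { createRequire } = require('module');

// A throwaway project containing a stub package named module-name.
const projectDir = fs.mkdtempSync(path.join(os.tmpdir(), 'exports-'));
const pkgDir = path.join(projectDir, 'node_modules', 'module-name');
fs.mkdirSync(path.join(pkgDir, 'src'), { recursive: true });

fs.writeFileSync(
  path.join(pkgDir, 'package.json'),
  JSON.stringify({
    name: 'module-name',
    exports: { './cjsmodule': './src/cjs-module.js' }
  })
);
fs.writeFileSync(path.join(pkgDir, 'src', 'cjs-module.js'), 'exports.kind = "exported";\n');
fs.writeFileSync(path.join(pkgDir, 'src', 'hidden-module.js'), 'exports.kind = "hidden";\n');

const projectRequire = createRequire(path.join(projectDir, 'app.js'));

// The exported name is remapped to the file under src.
const exportedKind = projectRequire('module-name/cjsmodule').kind;
console.log(exportedKind);

// The hidden file exists on disk but is not listed in exports.
let hiddenErrorCode = null;
try {
  projectRequire('module-name/src/hidden-module.js');
} catch (err) {
  hiddenErrorCode = err.code;
}
console.log(hiddenErrorCode);
```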

All these modules and packages are meant to be used in the context of a Node.js project. Let’s take a brief look at a typical project.

10. Studying an example project directory structure

A typical Node.js project is a directory containing a package.json file declaring the characteristics of the package, especially its dependencies. That, of course, describes a directory module, meaning that each module is its own project. At the end of the day, we create applications, for example, an Express application, and these applications depend on one or more (possibly thousands of) packages that are to be installed:

This is an Express application (we’ll start using Express in Chapter 5, Your First Express Application) containing a few modules installed in the node_modules directory. A typical Express application uses app.js as the main module for the application, and has code and asset files distributed in the public, routes, and views directories. Of course, the project dependencies are installed in the node_modules directory.

But let’s focus on the content of the node_modules directory versus the actual project files. In this screenshot, we’ve selected the express package. Notice it has a package.json file and there is an index.js file. Between those two files, Node.js will recognize the express directory as a module, and calling require(‘express’) or import ‘express’ will be satisfied by this directory.

The express directory has its own node_modules directory, in which are installed two packages. The question is, why are those packages installed in express/node_modules rather than as a sibling of the express package?

Earlier we discussed what happens if two modules (modules A and B) list a dependency on different versions of the same module (C). In such a case, the package manager application will install two versions of C, one as A/node_modules/C and the other as B/node_modules/C. The two copies of C are thus located such that the module search algorithm will cause module A and module B to have the correct version of module C.

That’s the situation we see with express/node_modules/cookie. To verify this, we can use an npm command to query for all references to the module:

$ npm ls cookie
notes@0.0.0 /Users/David/chap05/notes
├─┬ cookie-parser@1.3.5
│ └── cookie@0.1.3
└─┬ express@4.13.4
  └── cookie@0.1.5

This says the cookie-parser module depends on version 0.1.3 of cookie, while Express depends on version 0.1.5.

Now that we can recognize what a module is and how they’re found in the file system, let’s discuss when we can use each of the methods to load modules.

Source: Herron David (2020), Node.js Web Development: Server-side web development made easy with Node 14 using practical examples, Packt Publishing.
