by Karan Raina

Scale Up Your Express.js App Using Clusters


Node.js runs in a single thread within a single process, so by default it does not fully leverage the resources of a multi-core CPU. This is where clustering helps: it spawns multiple independent child workers, each running its own instance of Node.js in a separate sub-process.

In this article, we will dive straight into code to see how clustering an Express app increases server throughput and lets us take full advantage of our CPU cores.

What is Cluster? 🤔

As the official Node.js docs put it, the cluster module allows easy creation of child processes that all share server ports. Workers are spawned with the child_process.fork() method, so they can communicate with the parent process via IPC and pass server handles back and forth.
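
To make this concrete, here is a minimal sketch (separate from the Express app we build below) of a primary process forking one worker and the two sides exchanging messages over the IPC channel. On Node.js versions older than v16, replace isPrimary with isMaster:

const cluster = require('cluster');

if (cluster.isPrimary) {
    // cluster.fork() uses child_process.fork() under the hood
    const worker = cluster.fork();

    worker.on('message', (msg) => console.log(`Primary received: ${msg}`));
    worker.send('hello from the primary');
} else {
    // Worker side of the IPC channel
    process.on('message', (msg) => {
        console.log(`Worker ${process.pid} received: ${msg}`);
        process.send('hello from a worker');
    });
}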

Why Do We Need Clustering? 🤔

As mentioned above, Node.js runs in a single thread within a single process, so by default it cannot fully leverage a multi-core CPU. Clustering is how we fix that.

Clustering spawns multiple independent child workers, each running its own instance of Node.js in a separate sub-process. If the event loop of one of those processes gets blocked, our Node application can still accept requests and delegate them to the other available workers.

Let's Get Our Hands Dirty 😎

Time to see how clustering our Express app plays out in practice, and to measure the effect on server throughput.

Ready Our Tool

Before we write any code, we need a tool to automate performance testing of our application. In this article, I will use an npm package called loadtest to load-test our app.

You can run the following command to install the loadtest package globally:

npm install -g loadtest

Develop a Simple Express Server App

Open an empty project directory in the terminal and run npm init. After a few basic prompts, it will create a package.json file in the directory. Next, create an index.js file in the project directory and add a start script to your package.json:

{
  "name": "express-cluster",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1",
    "start": "node index.js"
  },
  "keywords": [],
  "author": "",
  "license": "ISC"
}

Now we need to install Express to bootstrap a small server. Run npm i express in the terminal to add Express as a project dependency, then create an Express server in index.js:

const express = require('express');
const app = express();

app.get('/', (request, response) => {
    let count = 0;

    // Do some fake work
    for (let i = 0; i <= 5000000; i++) {
        count += i;
    }

    // send the response
    response.json({});
});

app.listen(3000, () => {
    console.log('Server is running on port 3000');
})

We have created a simple Express server that accepts GET requests and does some fake CPU-bound work to simulate a real workload. Now run the server with the npm start command and browse to http://localhost:3000/ to confirm that it is running.
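
You can also check from the terminal (assuming curl is installed); the route should respond with the empty JSON object {}:

curl http://localhost:3000/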

Testing Without Clusters

We can now use loadtest to test the performance of our app. In the terminal, run:

loadtest -n 1000 -c 10 http://localhost:3000

Where:

  • -n sets the total number of requests to send
  • -c sets how many requests run concurrently

Results:

  • ~94 requests per second
  • Mean latency: ~105 ms

As the results show, our server handles only ~94 requests per second (for a pretty simple app that does very little real work) with a mean latency of ~105 ms.

Invoke the Power of Clusters 🔥

First, we need to require (or import) the cluster and os modules. We do not need to install them, as they ship with Node.js. The os module lets us interact with the operating system; we can get the number of CPU cores on the machine using os.cpus().length:

const express = require('express');
// require cluster
const cluster = require('cluster');
// require os
const totalCPUs = require('os').cpus().length;

const app = express();

Now, the idea behind clustering is to have a primary process that spawns multiple worker processes. We will update our code so that, when the app runs as the primary process, it checks how many CPU cores are available and spawns an equal number of worker processes.

When the app runs in a worker process, it starts the Express server. The overall architecture of our application is therefore a primary process acting as a cluster manager, plus several worker processes all listening on the same server port.

const express = require('express');
// require cluster
const cluster = require('cluster');
// require os
const totalCPUs = require('os').cpus().length;

const app = express();

// Logic for the server to respond to requests
// will remain the same for all processes
app.get('/', (request, response) => {
    let count = 0;

    // Do some fake work
    for (let i = 0; i <= 5000000; i++) {
        count += i;
    }

    // send the response
    response.json({});
});


// Check if we are in the primary process
if (cluster.isPrimary) {
    console.log(`Number of CPUs is ${totalCPUs}`);
    console.log(`Master ${process.pid} is running`);

    // Fork workers.
    for (let i = 0; i < totalCPUs; i++) {
        cluster.fork();
    }

} else {
    // Listen on port 3000 in worker processes
    app.listen(3000, () => {
        console.log(`Server is running on port 3000 in worker ${process.pid}`);
    })
}

Note: cluster.isPrimary was added in Node.js v16.0.0. If you are running an older version of Node.js, use cluster.isMaster instead.
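
If you want the same file to run on both older and newer Node.js versions, one option (a small sketch; the ?? operator itself needs Node.js 14+) is to fall back to the legacy property name:

// Prefer isPrimary (Node.js 16+), fall back to isMaster on older versions
const isPrimary = cluster.isPrimary ?? cluster.isMaster;

if (isPrimary) {
    // ...fork workers as shown above
}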

Restart the server with npm start in the terminal and you should see output confirming that the server is now running in multiple worker processes:

Number of CPUs is 8
Master 12345 is running
Server is running on port 3000 in worker 12346
Server is running on port 3000 in worker 12347
Server is running on port 3000 in worker 12348
Server is running on port 3000 in worker 12349
Server is running on port 3000 in worker 12350
Server is running on port 3000 in worker 12351
Server is running on port 3000 in worker 12352
Server is running on port 3000 in worker 12353

Testing With Clusters

We are ready to test our new server with the loadtest tool. Run the same command again:

loadtest -n 1000 -c 10 http://localhost:3000

Results:

  • ~600 requests per second 🔥
  • Mean latency: ~16 ms

Woohoo!!! Our app now handles ~600 requests per second with a mean latency of ~16 ms. That's roughly a 6x improvement in throughput!

Conclusion

Using clusters significantly increases the throughput of an Express server and helps keep the application available. By leveraging all available CPU cores, we can:

  • Increase throughput by 5-6x or more
  • Reduce latency significantly
  • Improve reliability - if one worker crashes, the others keep serving requests (see the sketch after this list)
  • Maximize hardware utilization - use all available CPU cores
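
The reliability point is easy to extend: the primary process can listen for worker exits and fork a replacement, so the pool stays at full size. A minimal sketch (not part of the code above, but using the same cluster API):

if (cluster.isPrimary) {
    cluster.on('exit', (worker, code, signal) => {
        console.log(`Worker ${worker.process.pid} exited (${signal || code}). Forking a replacement...`);
        cluster.fork();
    });
}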

The cluster module is a powerful built-in feature of Node.js that every production Express application should consider implementing for optimal performance.