Keeping Your Express App Healthy: Monitoring Server Pressure in Production
Production outages are expensive. When your Express server goes down, it doesn't just affect uptime metrics—it impacts revenue, user trust, and your team's sleep schedule. The irony? Most Node.js crashes are preventable if you catch the warning signs early.
In this post, we'll explore the fundamental problems that plague Node.js web applications under load and how implementing intelligent health monitoring can transform your server from fragile to resilient.
The Problem: Node.js Under Pressure
Node.js runs on a single-threaded event loop. This design is brilliant for I/O-heavy workloads but creates unique challenges when your server starts struggling. Unlike multi-threaded systems that might slow down gradually, Node.js applications often exhibit binary behavior: they work perfectly until suddenly, they don't.
The Four Horsemen of Node.js Performance Degradation
1. Event Loop Delay
The event loop is the heartbeat of your Node.js application. Every callback, every promise resolution, every piece of asynchronous work goes through it. When the event loop gets blocked, everything stops.
// A seemingly innocent route that blocks the event loop
app.get('/process-data', (req, res) => {
  const data = [];

  // Synchronous heavy computation
  for (let i = 0; i < 10000000; i++) {
    data.push(Math.sqrt(i));
  }

  res.json({ processed: data.length });
});
The Symptom: Response times increase from milliseconds to seconds. New requests pile up in the queue. Eventually, load balancers mark your server as unhealthy, but by then, it's already drowning.
The Reality: A healthy event loop delay should be under 10-50ms. When it creeps past 100ms, you're in trouble. Past 1000ms? You're in crisis mode.
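If you want to see this number for your own process, Node's built-in perf_hooks module can sample it directly. Here's a minimal sketch (the sampling resolution and logging interval are just illustrative choices):

import { monitorEventLoopDelay } from 'node:perf_hooks';

// Sample event loop delay with 20ms resolution
const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

// Log the p99 delay every 5 seconds (the histogram reports nanoseconds)
setInterval(() => {
  const p99Ms = histogram.percentile(99) / 1e6;
  console.log(`event loop delay p99: ${p99Ms.toFixed(1)}ms`);
  histogram.reset();
}, 5000);

Run this alongside the blocking route above and you'll see the p99 jump from single digits to seconds the moment that loop runs.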
2. Memory Leaks (Heap Pressure)
JavaScript's garbage collector is good, but it's not magic. Memory leaks happen more often than we'd like to admit:
// A subtle memory leak in your middleware
const cache = new Map();

app.use((req, res, next) => {
  // Oops, never cleaning this up
  cache.set(req.session.id, {
    timestamp: Date.now(),
    data: req.body, // Could be large
  });
  next();
});
The Symptom: Heap usage grows steadily. Performance degrades over hours or days. Eventually, you hit the V8 memory limit and your process crashes with a fatal error: JavaScript heap out of memory.
The Reality: on older Node.js versions, V8's default heap limit was roughly 1.4GB on 64-bit systems; modern versions size it based on available memory, but the ceiling is still finite. Once you're using 80-90% of the available heap, garbage collection becomes aggressive, and performance craters well before the actual crash.
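You don't have to guess where that limit sits. The built-in v8 module reports the configured heap ceiling and current usage, so a quick check like this sketch (the 80% threshold is arbitrary, for illustration) tells you how close you are:

import v8 from 'node:v8';

// Compare current heap usage against V8's configured limit
const { heap_size_limit, used_heap_size } = v8.getHeapStatistics();
const usedPercent = (used_heap_size / heap_size_limit) * 100;

console.log(`heap limit: ${(heap_size_limit / 1024 / 1024).toFixed(0)}MB`);
console.log(`heap used:  ${usedPercent.toFixed(1)}%`);

if (usedPercent > 80) {
  console.warn('Heap pressure: GC is about to get aggressive');
}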
3. RSS Memory Bloat
Resident Set Size (RSS) represents the total memory your process is using, including heap, code, and native modules. While heap is managed by V8, RSS can grow due to:
- Native addon memory leaks
- Buffer allocations that aren't released
- Excessive caching in native modules
The Symptom: Your container's memory usage climbs until Kubernetes OOMKills your pod or your cloud provider throttles you.
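RSS is also visible from inside the process via process.memoryUsage(), which makes it easy to spot heap and RSS drifting apart — the usual sign of native or Buffer memory that V8 isn't tracking. A rough sketch:

// Periodically compare RSS to heap usage; a growing gap points at
// native memory (Buffers, addons) rather than JavaScript objects
setInterval(() => {
  const { rss, heapUsed } = process.memoryUsage();
  const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(1);
  console.log(`rss: ${toMB(rss)}MB, heapUsed: ${toMB(heapUsed)}MB`);
}, 10000);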
4. Event Loop Utilization
This metric (introduced in Node.js 14+) tells you what percentage of time the event loop is actively working versus idle. High utilization (>95%) means your server has no breathing room to handle new requests.
// Creating constant event loop pressure
setInterval(() => {
  // Expensive operation running constantly
  const result = heavyComputation();
  processResult(result);
}, 0); // No delay = constant pressure
The Symptom: Your server can technically respond but is always at capacity. Any spike in traffic causes immediate degradation.
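Reading this metric is straightforward with perf_hooks. The sketch below samples utilization over successive windows (the 0.95 threshold mirrors the figure above and is not a hard rule):

import { performance } from 'node:perf_hooks';

// Passing a previous sample to eventLoopUtilization() returns the delta,
// i.e. utilization over just that window
let last = performance.eventLoopUtilization();

setInterval(() => {
  const current = performance.eventLoopUtilization(last);
  if (current.utilization > 0.95) {
    console.warn(`event loop utilization: ${(current.utilization * 100).toFixed(1)}%`);
  }
  last = performance.eventLoopUtilization();
}, 5000);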
The Traditional (Broken) Approach
Most teams discover these issues too late. The typical progression:
- "It works on my machine" - Local testing with low load shows no issues
- Deploy to production - Everything seems fine initially
- Traffic increases - Response times start climbing
- Cascade failure - One slow instance causes load balancers to send more traffic to healthy ones, overwhelming them too
- Total outage - All instances crash or become unresponsive
- 3 AM page - Someone's phone rings
The problem isn't just that these issues happen—it's that you have no visibility until it's too late. By the time your monitoring alerts fire, you're already in firefighting mode.
The Solution: Proactive Health Monitoring
What if your server could detect when it's struggling and take action before it crashes? This is where intelligent health monitoring comes in.
The concept is simple but powerful:
- Monitor key health metrics continuously
- Detect when thresholds are breached
- Respond gracefully by rejecting new requests
- Recover by giving the server breathing room
This is the circuit breaker pattern applied to server health. Instead of accepting requests until you crash, you gracefully degrade when under pressure.
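Before reaching for a library, it helps to see how little code the core idea needs. Here's a stripped-down sketch using only event loop delay as the health signal (the 1000ms threshold and one-second check interval are arbitrary; a real implementation would also watch memory and utilization):

import express from 'express';
import { monitorEventLoopDelay } from 'node:perf_hooks';

const app = express();

const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

let underPressure = false;

// Re-evaluate health once per second
setInterval(() => {
  underPressure = histogram.mean / 1e6 > 1000; // mean delay over the last second, in ms
  histogram.reset();
}, 1000);

// Shed load while unhealthy instead of queueing requests until the process dies
app.use((req, res, next) => {
  if (underPressure) {
    res.setHeader('Retry-After', 30);
    return res.status(503).json({ error: 'Service Temporarily Unavailable' });
  }
  next();
});

This is essentially what express-under-pressure does, with more metrics, sensible defaults, and hooks for customization.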
Introducing express-under-pressure
Inspired by Fastify's excellent @fastify/under-pressure plugin, I built express-under-pressure to bring the same intelligent health monitoring to Express applications.
How It Works
Install the package:
npm install express-under-pressure
Add it to your Express app:
import express from 'express';
import { underPressure } from 'express-under-pressure';

const app = express();

// Configure health thresholds
underPressure(app, {
  maxEventLoopDelay: 1000, // Event loop delay in ms
  maxHeapUsedBytes: 200 * 1024 * 1024, // 200MB heap limit
  maxRssBytes: 300 * 1024 * 1024, // 300MB RSS limit
  maxEventLoopUtilization: 0.98, // 98% utilization max
  message: 'Service Temporarily Unavailable',
  retryAfter: 30, // Tell clients to retry after 30 seconds
});

app.get('/api/data', (req, res) => {
  // Your route logic
  res.json({ data: 'response' });
});

app.listen(3000);
Now your server continuously monitors these metrics. When any threshold is breached:
- New requests receive a 503 Service Unavailable response immediately
- A Retry-After header tells clients when to try again
- Your server gets breathing room to recover
- Existing requests can complete
Advanced: Custom Pressure Handlers
For more control, implement custom pressure handlers:
import {
  underPressure,
  pressureReason,
  pressureType
} from 'express-under-pressure';

function handlePressure(req, res) {
  const reason = res.locals[pressureReason];
  const type = res.locals[pressureType];

  // Log detailed metrics (logger and metrics are whatever your app already uses)
  logger.warn('Server under pressure', {
    reason,
    type,
    heapUsed: process.memoryUsage().heapUsed,
    rss: process.memoryUsage().rss,
  });

  // Increment monitoring metrics
  metrics.increment('server.pressure', { type });

  // Send custom response
  res.setHeader('Retry-After', 60);
  res.status(503).json({
    error: 'Service Temporarily Unavailable',
    reason: 'High server load',
    retryAfter: 60,
  });
}

underPressure(app, {
  maxEventLoopDelay: 1000,
  maxHeapUsedBytes: 200 * 1024 * 1024,
  pressureHandler: handlePressure,
});
Conditional Logic Based on Pressure
You can also check pressure status within your routes to make smarter decisions:
app.get('/api/expensive-operation', (req, res) => {
  if (res.locals.isUnderPressure()) {
    // Skip expensive operations when under pressure
    return res.json({ data: cachedResult });
  }

  // Perform expensive computation only when healthy
  const result = performExpensiveOperation();
  res.json({ data: result });
});
Real-World Impact
Let's look at what happens with and without pressure monitoring:
Without Pressure Monitoring:
Normal traffic: 200 req/s, 50ms p95 latency ✅
Traffic spike: 500 req/s
→ Event loop delay: 2000ms
→ Heap usage: 95%
→ Response time: 10s+
→ Cascading failures
→ Total outage
→ Recovery time: 5-10 minutes
With express-under-pressure:
Normal traffic: 200 req/s, 50ms p95 latency ✅
Traffic spike: 500 req/s
→ Event loop delay hits 1000ms
→ express-under-pressure activates
→ New requests get 503 immediately
→ Existing requests complete normally
→ Server recovers in 10-30 seconds
→ Gradual return to normal
→ No outage! 🎉
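If you want to reproduce a comparison like this yourself, a load-testing tool such as autocannon makes it easy to push a local instance past its thresholds (the connection count and duration below are arbitrary):

# Hammer the instrumented route with 200 concurrent connections for 30 seconds
npx autocannon -c 200 -d 30 http://localhost:3000/api/data

Watch the latency distribution and the share of 503 responses with and without the middleware enabled.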
Best Practices
1. Set Realistic Thresholds
Don't wait until you're at 100% to activate pressure mode. Set thresholds at 70-80% of your capacity:
underPressure(app, {
  maxEventLoopDelay: 1000, // Alert at 1s, not 10s
  maxHeapUsedBytes: 150 * 1024 * 1024, // 150MB if max is 200MB
  maxEventLoopUtilization: 0.85, // 85%, not 99%
});
2. Scope to Critical Routes
Apply more aggressive limits to expensive routes:
const apiRouter = express.Router();

underPressure(apiRouter, {
  maxEventLoopDelay: 500, // Stricter limits for API routes
  maxHeapUsedBytes: 100 * 1024 * 1024,
});

apiRouter.get('/heavy-operation', handler);
app.use('/api', apiRouter);
3. Combine with Load Balancers
Use pressure monitoring alongside your load balancer health checks:
underPressure(app, {
  maxEventLoopDelay: 1000,
  maxHeapUsedBytes: 200 * 1024 * 1024,
});

// Health check endpoint for load balancer
app.get('/health', (req, res) => {
  if (res.locals.isUnderPressure()) {
    return res.status(503).json({ status: 'unhealthy' });
  }
  res.json({ status: 'healthy' });
});
4. Monitor and Alert
Integrate with your monitoring stack:
// Track how long the server has been shedding load (a real implementation
// would reset this once pressure subsides)
let pressureStart = null;

function trackPressure(req, res) {
  const reason = res.locals[pressureReason];
  const type = res.locals[pressureType];

  // Send to Datadog, Prometheus, etc.
  metrics.increment('server.under_pressure', {
    type,
    reason,
    route: req.path,
  });

  // Alert if pressure lasts too long
  pressureStart = pressureStart ?? Date.now();
  if (Date.now() - pressureStart > 60000) {
    alerting.critical('Server under pressure for >1min');
  }

  res.status(503).json({ error: 'Service Unavailable' });
}
underPressure(app, {
  maxEventLoopDelay: 1000,
  pressureHandler: trackPressure,
});
Comparison: Express vs Fastify Under Pressure
If you're curious about Fastify's implementation (which inspired this package), here are the key similarities and differences:
Similarities:
- Both monitor the same four metrics (event loop delay, heap, RSS, ELU)
- Both use the circuit breaker pattern
- Both support custom pressure handlers
- Both can be scoped to specific routes/routers
Differences:
- express-under-pressure is middleware-based (Express paradigm)
- @fastify/under-pressure is a plugin (Fastify paradigm)
- Fastify's version can expose a built-in /status endpoint (opt-in)
- The Express version provides an isUnderPressure() helper in res.locals
Both are production-ready and follow the same core principles. Choose based on your framework.
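For reference, the equivalent Fastify setup looks something like this (option names follow the @fastify/under-pressure documentation; treat the exact values as illustrative):

import Fastify from 'fastify';
import underPressure from '@fastify/under-pressure';

const fastify = Fastify();

// Same thresholds as the Express example above
await fastify.register(underPressure, {
  maxEventLoopDelay: 1000,
  maxHeapUsedBytes: 200 * 1024 * 1024,
  maxRssBytes: 300 * 1024 * 1024,
  maxEventLoopUtilization: 0.98,
  message: 'Service Temporarily Unavailable',
  retryAfter: 30,
});

await fastify.listen({ port: 3000 });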
Conclusion
Node.js applications fail for predictable reasons: event loop blockage, memory pressure, and resource exhaustion. The key to building resilient systems isn't preventing these issues entirely—it's detecting them early and responding gracefully.
By implementing intelligent health monitoring with express-under-pressure, you transform your server from a binary system (working or crashed) into a resilient system that can detect, respond, and recover from pressure gracefully.
The next time your traffic spikes or a memory leak starts forming, your server will handle it like a pro: gracefully degrading performance instead of catastrophically failing.
Key Takeaways:
- Monitor continuously - Don't wait for crashes to tell you something's wrong
- Fail gracefully - Send 503s early, not after you're already drowning
- Set realistic thresholds - Activate pressure mode at 70-80% capacity, not 100%
- Combine strategies - Use alongside clustering, load balancers, and autoscaling
- Measure impact - Track metrics to prove the value of resilience
Your production servers will thank you. So will your on-call team.
Want to try it? Install express-under-pressure:
npm install express-under-pressure
GitHub Repository: github.com/karankraina/express-under-pressure
Inspired by: github.com/fastify/under-pressure
