Keeping Your Express App Healthy: Monitoring Server Pressure in Production
Production outages are expensive. When your Express server goes down, it doesn't just affect uptime metrics—it impacts revenue, user trust, and your team's sleep schedule. The irony? Most Node.js crashes are preventable if you catch the warning signs early.
In this post, we'll explore the fundamental problems that plague Node.js web applications under load and how implementing intelligent health monitoring can transform your server from fragile to resilient.
The Problem: Node.js Under Pressure
Node.js runs on a single-threaded event loop. This design is brilliant for I/O-heavy workloads but creates unique challenges when your server starts struggling. Unlike multi-threaded systems that might slow down gradually, Node.js applications often exhibit binary behavior: they work perfectly until suddenly, they don't.
The Four Horsemen of Node.js Performance Degradation
1. Event Loop Delay
The event loop is the heartbeat of your Node.js application. Every callback, every promise resolution, every piece of asynchronous work goes through it. When the event loop gets blocked, everything stops.
// A seemingly innocent route that blocks the event loop
app.get('/process-data', (req, res) => {
  const data = [];

  // Synchronous heavy computation
  for (let i = 0; i < 10000000; i++) {
    data.push(Math.sqrt(i));
  }

  res.json({ processed: data.length });
});
The Symptom: Response times increase from milliseconds to seconds. New requests pile up in the queue. Eventually, load balancers mark your server as unhealthy, but by then, it's already drowning.
The Reality: A healthy event loop delay should be under 10-50ms. When it creeps past 100ms, you're in trouble. Past 1000ms? You're in crisis mode.
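If you want to see this number for your own process, Node's built-in perf_hooks module can sample it directly. Here's a minimal sketch (the sampling resolution and logging interval are just illustrative choices):

import { monitorEventLoopDelay } from 'node:perf_hooks';

// Sample event loop delay with 20ms resolution
const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

// Log the p99 delay every 5 seconds (the histogram reports nanoseconds)
setInterval(() => {
  const p99Ms = histogram.percentile(99) / 1e6;
  console.log(`event loop delay p99: ${p99Ms.toFixed(1)}ms`);
  histogram.reset();
}, 5000);

Run this alongside the blocking route above and you'll see the p99 jump from single digits to seconds the moment that loop runs.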
2. Memory Leaks (Heap Pressure)
JavaScript's garbage collector is good, but it's not magic. Memory leaks happen more often than we'd like to admit:
// A subtle memory leak in your middleware
const cache = new Map();

app.use((req, res, next) => {
  // Oops, never cleaning this up
  cache.set(req.session.id, {
    timestamp: Date.now(),
    data: req.body, // Could be large
  });
  next();
});
The Symptom: Heap usage grows steadily. Performance degrades over hours or days. Eventually, you hit the V8 memory limit and your process crashes with a fatal error: JavaScript heap out of memory.
The Reality: on older Node.js versions, V8's default heap limit was roughly 1.4GB on 64-bit systems; modern versions size it based on available memory, but the ceiling is still finite. Once you're using 80-90% of the available heap, garbage collection becomes aggressive, and performance craters well before the actual crash.
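You don't have to guess where that limit sits. The built-in v8 module reports the configured heap ceiling and current usage, so a quick check like this sketch (the 80% threshold is arbitrary, for illustration) tells you how close you are:

import v8 from 'node:v8';

// Compare current heap usage against V8's configured limit
const { heap_size_limit, used_heap_size } = v8.getHeapStatistics();
const usedPercent = (used_heap_size / heap_size_limit) * 100;

console.log(`heap limit: ${(heap_size_limit / 1024 / 1024).toFixed(0)}MB`);
console.log(`heap used:  ${usedPercent.toFixed(1)}%`);

if (usedPercent > 80) {
  console.warn('Heap pressure: GC is about to get aggressive');
}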
3. RSS Memory Bloat
Resident Set Size (RSS) represents the total memory your process is using, including heap, code, and native modules. While heap is managed by V8, RSS can grow due to:
- Native addon memory leaks
- Buffer allocations that aren't released
- Excessive caching in native modules
The Symptom: Your container's memory usage climbs until Kubernetes OOMKills your pod or your cloud provider throttles you.
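RSS is also visible from inside the process via process.memoryUsage(), which makes it easy to spot heap and RSS drifting apart — the usual sign of native or Buffer memory that V8 isn't tracking. A rough sketch:

// Periodically compare RSS to heap usage; a growing gap points at
// native memory (Buffers, addons) rather than JavaScript objects
setInterval(() => {
  const { rss, heapUsed } = process.memoryUsage();
  const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(1);
  console.log(`rss: ${toMB(rss)}MB, heapUsed: ${toMB(heapUsed)}MB`);
}, 10000);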
4. Event Loop Utilization
This metric (introduced in Node.js 14+) tells you what percentage of time the event loop is actively working versus idle. High utilization (>95%) means your server has no breathing room to handle new requests.
// Creating constant event loop pressure
setInterval(() => {
  // Expensive operation running constantly
  const result = heavyComputation();
  processResult(result);
}, 0); // No delay = constant pressure
The Symptom: Your server can technically respond but is always at capacity. Any spike in traffic causes immediate degradation.
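Reading this metric is straightforward with perf_hooks. The sketch below samples utilization over successive windows (the 0.95 threshold mirrors the figure above and is not a hard rule):

import { performance } from 'node:perf_hooks';

// Passing a previous sample to eventLoopUtilization() returns the delta,
// i.e. utilization over just that window
let last = performance.eventLoopUtilization();

setInterval(() => {
  const current = performance.eventLoopUtilization(last);
  if (current.utilization > 0.95) {
    console.warn(`event loop utilization: ${(current.utilization * 100).toFixed(1)}%`);
  }
  last = performance.eventLoopUtilization();
}, 5000);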
The Traditional (Broken) Approach
Most teams discover these issues too late. The typical progression:
- "It works on my machine" - Local testing with low load shows no issues
- Deploy to production - Everything seems fine initially
- Traffic increases - Response times start climbing
- Cascade failure - One slow instance causes load balancers to send more traffic to healthy ones, overwhelming them too
- Total outage - All instances crash or become unresponsive
- 3 AM page - Someone's phone rings
The problem isn't just that these issues happen—it's that you have no visibility until it's too late. By the time your monitoring alerts fire, you're already in firefighting mode.
The Solution: Proactive Health Monitoring
What if your server could detect when it's struggling and take action before it crashes? This is where intelligent health monitoring comes in.
The concept is simple but powerful:
- Monitor key health metrics continuously
- Detect when thresholds are breached
- Respond gracefully by rejecting new requests
- Recover by giving the server breathing room
This is the circuit breaker pattern applied to server health. Instead of accepting requests until you crash, you gracefully degrade when under pressure.
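Before reaching for a library, it helps to see how little code the core idea needs. Here's a stripped-down sketch using only event loop delay as the health signal (the 1000ms threshold and one-second check interval are arbitrary; a real implementation would also watch memory and utilization):

import express from 'express';
import { monitorEventLoopDelay } from 'node:perf_hooks';

const app = express();

const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

let underPressure = false;

// Re-evaluate health once per second
setInterval(() => {
  underPressure = histogram.mean / 1e6 > 1000; // mean delay over the last second, in ms
  histogram.reset();
}, 1000);

// Shed load while unhealthy instead of queueing requests until the process dies
app.use((req, res, next) => {
  if (underPressure) {
    res.setHeader('Retry-After', 30);
    return res.status(503).json({ error: 'Service Temporarily Unavailable' });
  }
  next();
});

This is essentially what express-under-pressure does, with more metrics, sensible defaults, and hooks for customization.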
Introducing express-under-pressure
Inspired by Fastify's excellent @fastify/under-pressure plugin, I built express-under-pressure to bring the same intelligent health monitoring to Express applications.
How It Works
Install the package:
npm install express-under-pressure
Add it to your Express app:
import express from 'express';
import { underPressure } from 'express-under-pressure';

const app = express();

// Configure health thresholds
underPressure(app, {
  maxEventLoopDelay: 1000, // Event loop delay in ms
  maxHeapUsedBytes: 200 * 1024 * 1024, // 200MB heap limit
  maxRssBytes: 300 * 1024 * 1024, // 300MB RSS limit
  maxEventLoopUtilization: 0.98, // 98% utilization max
  message: 'Service Temporarily Unavailable',
  retryAfter: 30, // Tell clients to retry after 30 seconds
});

app.get('/api/data', (req, res) => {
  // Your route logic
  res.json({ data: 'response' });
});

app.listen(3000);
Now your server continuously monitors these metrics. When any threshold is breached:
- New requests receive a 503 Service Unavailable response immediately
- A Retry-After header tells clients when to try again
- Your server gets breathing room to recover
- Existing requests can complete
Advanced: Custom Pressure Handlers
For more control, implement custom pressure handlers:
import {
  underPressure,
  pressureReason,
  pressureType
} from 'express-under-pressure';

function handlePressure(req, res) {
  const reason = res.locals[pressureReason];
  const type = res.locals[pressureType];

  // Log detailed metrics (logger and metrics are whatever your app already uses)
  logger.warn('Server under pressure', {
    reason,
    type,
    heapUsed: process.memoryUsage().heapUsed,
    rss: process.memoryUsage().rss,
  });

  // Increment monitoring metrics
  metrics.increment('server.pressure', { type });

  // Send custom response
  res.setHeader('Retry-After', 60);
  res.status(503).json({
    error: 'Service Temporarily Unavailable',
    reason: 'High server load',
    retryAfter: 60,
  });
}

underPressure(app, {
  maxEventLoopDelay: 1000,
  maxHeapUsedBytes: 200 * 1024 * 1024,
  pressureHandler: handlePressure,
});
Conditional Logic Based on Pressure
You can also check pressure status within your routes to make smarter decisions:
app.get('/api/expensive-operation', (req, res) => {
  if (res.locals.isUnderPressure()) {
    // Skip expensive operations when under pressure
    return res.json({ data: cachedResult });
  }

  // Perform expensive computation only when healthy
  const result = performExpensiveOperation();
  res.json({ data: result });
});
Real-World Impact
Let's look at what happens with and without pressure monitoring:
Without Pressure Monitoring:
Normal traffic: 200 req/s, 50ms p95 latency ✅
Traffic spike: 500 req/s
→ Event loop delay: 2000ms
→ Heap usage: 95%
→ Response time: 10s+
→ Cascading failures
→ Total outage
→ Recovery time: 5-10 minutes
With express-under-pressure:
Normal traffic: 200 req/s, 50ms p95 latency ✅
Traffic spike: 500 req/s
→ Event loop delay hits 1000ms
→ express-under-pressure activates
→ New requests get 503 immediately
→ Existing requests complete normally
→ Server recovers in 10-30 seconds
→ Gradual return to normal
→ No outage! 🎉
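If you want to reproduce a comparison like this yourself, a load-testing tool such as autocannon makes it easy to push a local instance past its thresholds (the connection count and duration below are arbitrary):

# Hammer the instrumented route with 200 concurrent connections for 30 seconds
npx autocannon -c 200 -d 30 http://localhost:3000/api/data

Watch the latency distribution and the share of 503 responses with and without the middleware enabled.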
Best Practices
1. Set Realistic Thresholds
Don't wait until you're at 100% to activate pressure mode. Set thresholds at 70-80% of your capacity:
underPressure(app, {
  maxEventLoopDelay: 1000, // Alert at 1s, not 10s
  maxHeapUsedBytes: 150 * 1024 * 1024, // 150MB if max is 200MB
  maxEventLoopUtilization: 0.85, // 85%, not 99%
});
2. Scope to Critical Routes
Apply more aggressive limits to expensive routes:
const apiRouter = express.Router();

underPressure(apiRouter, {
  maxEventLoopDelay: 500, // Stricter limits for API routes
  maxHeapUsedBytes: 100 * 1024 * 1024,
});

apiRouter.get('/heavy-operation', handler);
app.use('/api', apiRouter);
3. Combine with Load Balancers
Use pressure monitoring alongside your load balancer health checks:
underPressure(app, {
  maxEventLoopDelay: 1000,
  maxHeapUsedBytes: 200 * 1024 * 1024,
});

// Health check endpoint for load balancer
app.get('/health', (req, res) => {
  if (res.locals.isUnderPressure()) {
    return res.status(503).json({ status: 'unhealthy' });
  }
  res.json({ status: 'healthy' });
});
4. Monitor and Alert
Integrate with your monitoring stack:
// Track how long the server has been shedding load (a real implementation
// would reset this once pressure subsides)
let pressureStart = null;

function trackPressure(req, res) {
  const reason = res.locals[pressureReason];
  const type = res.locals[pressureType];

  // Send to Datadog, Prometheus, etc.
  metrics.increment('server.under_pressure', {
    type,
    reason,
    route: req.path,
  });

  // Alert if pressure lasts too long
  pressureStart = pressureStart ?? Date.now();
  if (Date.now() - pressureStart > 60000) {
    alerting.critical('Server under pressure for >1min');
  }

  res.status(503).json({ error: 'Service Unavailable' });
}
underPressure(app, {
  maxEventLoopDelay: 1000,
  pressureHandler: trackPressure,
});
Comparison: Express vs Fastify Under Pressure
If you're curious about Fastify's implementation (which inspired this package), here are the key similarities and differences:
Similarities:
- Both monitor the same four metrics (event loop delay, heap, RSS, ELU)
- Both use the circuit breaker pattern
- Both support custom pressure handlers
- Both can be scoped to specific routes/routers
Differences:
- express-under-pressure is middleware-based (Express paradigm)
- @fastify/under-pressure is a plugin (Fastify paradigm)
- Fastify's version can expose a built-in /status endpoint (opt-in)
- The Express version provides an isUnderPressure() helper in res.locals
Both are production-ready and follow the same core principles. Choose based on your framework.
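For reference, the equivalent Fastify setup looks something like this (option names follow the @fastify/under-pressure documentation; treat the exact values as illustrative):

import Fastify from 'fastify';
import underPressure from '@fastify/under-pressure';

const fastify = Fastify();

// Same thresholds as the Express example above
await fastify.register(underPressure, {
  maxEventLoopDelay: 1000,
  maxHeapUsedBytes: 200 * 1024 * 1024,
  maxRssBytes: 300 * 1024 * 1024,
  maxEventLoopUtilization: 0.98,
  message: 'Service Temporarily Unavailable',
  retryAfter: 30,
});

await fastify.listen({ port: 3000 });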
Conclusion
Node.js applications fail for predictable reasons: event loop blockage, memory pressure, and resource exhaustion. The key to building resilient systems isn't preventing these issues entirely—it's detecting them early and responding gracefully.
By implementing intelligent health monitoring with express-under-pressure, you transform your server from a binary system (working or crashed) into a resilient system that can detect, respond, and recover from pressure gracefully.
The next time your traffic spikes or a memory leak starts forming, your server will handle it like a pro: gracefully degrading performance instead of catastrophically failing.
Key Takeaways:
- Monitor continuously - Don't wait for crashes to tell you something's wrong
- Fail gracefully - Send 503s early, not after you're already drowning
- Set realistic thresholds - Activate pressure mode at 70-80% capacity, not 100%
- Combine strategies - Use alongside clustering, load balancers, and autoscaling
- Measure impact - Track metrics to prove the value of resilience
Your production servers will thank you. So will your on-call team.
Want to try it? Install express-under-pressure:
npm install express-under-pressure
GitHub Repository: github.com/karankraina/express-under-pressure
Inspired by: github.com/fastify/under-pressure
