Load Balancers: Distributing Traffic Without Bottlenecks
What Is a Load Balancer?
A load balancer sits in front of your servers and distributes incoming requests across them. That's it. One IP address, many servers behind it.
Without one, every request hits the same machine. That machine becomes your ceiling — in capacity and in availability. When it goes down, your service goes down.
The Problem It Solves
Single servers have hard limits. You can only add so much CPU and RAM before vertical scaling becomes impractical or cost-prohibitive. At some point, you need to add more servers — and a load balancer is what makes them look like one.
It solves two things:
- Capacity — spread load across many servers so no one machine is saturated
- Availability — if one server dies, the load balancer routes around it
How It Works
The load balancer receives every request and picks a server to forward it to. Common algorithms:
- Round-robin — request 1 goes to server A, request 2 to B, request 3 to C, back to A. Simple and even.
- Least connections — route to whichever server is handling the fewest active requests. Better for long-lived connections.
- IP hash — same client always hits the same server. Used when you can't avoid sticky sessions.
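The three algorithms above can be sketched in a few lines of Python (the server addresses are hypothetical, and a real balancer would update the connection counts as requests start and finish):

```python
import hashlib
from itertools import cycle

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backend pool

# Round-robin: cycle through the pool in order.
rr = cycle(servers)
def round_robin():
    return next(rr)

# Least connections: pick the server with the fewest active requests.
active = {s: 0 for s in servers}  # updated as requests start/finish
def least_connections():
    return min(active, key=active.get)

# IP hash: the same client IP always maps to the same server.
def ip_hash(client_ip):
    h = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return servers[h % len(servers)]
```

Note the trade-off IP hash makes: it gives you stickiness for free, but load is only as even as the hash of your client population.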
It also runs health checks. Every few seconds it probes each server — a TCP connect or an HTTP request to a health endpoint — and if a server stops responding, it's removed from the pool automatically, then added back when it recovers.
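The health-check loop described above reduces to something like this sketch (a bare TCP-connect probe against hypothetical backends; a production balancer would run it on a timer and usually require several consecutive failures before evicting a server):

```python
import socket

pool = ["10.0.0.1", "10.0.0.2"]   # hypothetical backends
healthy = set(pool)               # servers currently receiving traffic

def check(server, port=8080, timeout=1.0):
    """One health probe: can we open a TCP connection to the server?"""
    try:
        with socket.create_connection((server, port), timeout=timeout):
            return True
    except OSError:
        return False

def health_check_pass():
    for server in pool:
        if check(server):
            healthy.add(server)       # recovered servers rejoin the pool
        else:
            healthy.discard(server)   # failed servers stop receiving traffic

# In a real balancer this runs on a timer, e.g. every 5 seconds.
```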
When to Add It
Add a load balancer when:
- Your single server is consistently hitting CPU or memory limits
- You want zero-downtime deploys (deploy to half the pool, shift traffic, deploy the other half)
- You need more than one server for redundancy
- Traffic is above ~2K sustained RPS on a single box
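The zero-downtime deploy in the list above follows a simple pattern: drain half the pool, upgrade it, shift traffic, upgrade the rest. A toy sketch, where `deploy` and `set_live` are stand-ins for your real deployment step and traffic-shifting call:

```python
def rolling_deploy(servers, deploy, set_live):
    """Deploy to half the pool at a time, keeping the other half live."""
    mid = len(servers) // 2
    first, second = servers[:mid], servers[mid:]

    set_live(second)        # traffic served only by the second half
    for s in first:
        deploy(s)           # upgrade the drained half

    set_live(first)         # shift traffic to the upgraded half
    for s in second:
        deploy(s)

    set_live(servers)       # whole pool back in service
```

At no point is the pool empty, which is the whole trick — capacity temporarily halves, so this only works if half your pool can carry current traffic.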
When NOT to Add It
- Early stage, single server, <1K RPS — it's unnecessary complexity
- When your bottleneck is the database, not the app server. This is the common mistake: scaling the app tier behind a load balancer when the database is what's saturated — more app servers just mean more connections hammering the same DB
Real World
AWS ALB, Google Cloud Load Balancing, Nginx, and HAProxy are the common choices. Nginx is free and handles hundreds of thousands of requests per second on modest hardware. AWS ALB handles SSL termination, health checks, and path-based routing out of the box.
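To make the Nginx option concrete, here's a minimal sketch of an upstream configuration (the backend addresses and port are hypothetical; `max_fails` and `fail_timeout` give passive health checks in open-source Nginx, and dropping `least_conn` falls back to round-robin):

```nginx
upstream app {
    least_conn;                                         # fewest active connections wins
    server 10.0.0.1:8080 max_fails=3 fail_timeout=10s;  # evict after 3 failures
    server 10.0.0.2:8080 max_fails=3 fail_timeout=10s;
}

server {
    listen 80;
    location / {
        proxy_pass http://app;   # forward to whichever backend is chosen
    }
}
```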
Netflix runs thousands of load balancers across regions. They treat them as commodity infrastructure — replaceable, stateless, instrumented.
Takeaways
- A load balancer distributes traffic and routes around failed servers
- It requires your app servers to be stateless — if they store session state locally, round-robin sends a user's next request to a server that doesn't have it, breaking logins and sessions
- Add one when you have >1 server or >2K RPS
- It solves the app tier bottleneck — not the database bottleneck
- Don't add one prematurely; it adds latency and a new component to operate