Application load balancers are hardware devices that sit between the servers in a data centre and the outside world (the clients requesting access to those servers). They balance the load (traffic, sessions) across multiple servers (applications, actually) and make those multiple servers (and applications) look like one big server to the outside world. In practice, the applications are replicated across multiple servers and incoming requests are routed to any one of them based on a number of parameters. This is required for multiple reasons, chief of them given below.
Why are Application Load Balancers required?
Scalability: Suppose a single server hosts a website or an application that is frequently used by a large number of users. There will come a point when the server's maximum capacity (concurrent connections, processor speed, bandwidth limit etc.) is reached and multiple servers are needed. An application load balancer decides which user goes to which server. In other words, it distributes and balances the load (incoming requests) across multiple servers and, more importantly, allows more servers to be added as demand grows.
High Availability: Suppose you are running a very critical application (like an e-commerce website) on a single server. There is always a possibility of an application failure or a server hardware failure. In those cases, it is better to run the same application on multiple servers – both for load balancing purposes and to avoid a complete disruption of services if a server/application fails. The load balancer can automatically detect that a server/application is down and stop routing connection requests to it until it is up and running once again.
Control: Load balancers not only determine whether a server is available, they also estimate the approximate usage level of each server/application. This is required to decide where to forward incoming requests – for example, it is usually better to forward them to the server that is least used at that point in time.
A little History about Application Load Balancers:
Before we go into the features of application load balancers, a little history will help us understand the current implementations better.
Initially, methods like DNS Round Robin were used to distribute the load across servers. It was a simple method: if three servers are present, the connections are handed out in rotation – the first goes to one server, the next to the second, the next to the third, and then the cycle repeats. While this method was good at distributing connections across servers, it was not actually load balancing. There was also no way of determining whether a server was down, so availability was not always 100% – it depended on manual methods to detect failures. There was another problem: clients tend to cache the server information (including the IP address) and go back to the same server they used before.
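The rotation described above is easy to picture in code. A minimal sketch (server addresses are illustrative assumptions) shows why this is distribution rather than balancing – the rotation is blind to each server's actual load or health:

```python
from itertools import cycle

# Hypothetical server pool; these addresses are illustrative only.
servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
rotation = cycle(servers)

def next_server():
    """Return the next server in strict rotation.
    Note: no awareness of server load or health - if 10.0.0.2 is down,
    every third request still goes to it."""
    return next(rotation)

# Six requests simply cycle through the pool in order, twice.
assigned = [next_server() for _ in range(6)]
print(assigned)
```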
Then load balancing was built into the application software itself. Here, all client requests go to a cluster IP first and are then distributed to the most suitable available physical IP address (of the server/application port). This solves both high availability and load balancing, as the application developers know the health of an application (whether it is down) and can measure connection density in real time to apply the load balancing algorithm that best fits the particular application. The problem is that load balancing is then entirely dependent on the application vendors (some of whom might not provide it at all), and it needs to be done separately for each application. It also becomes complex in a virtual server environment.
Network based Application Load Balancing Hardware:
Network based application load balancing hardware devices sit between the leased lines (users) and the host servers, and perform the following steps to load balance applications.
¤ When a user attempts to connect to the servers, the load balancer accepts the connection on behalf of the servers (through a virtual IP address), changes the destination IP address and port number to those of a physical server, and forwards the request to that server.
¤ The server accepts the request, processes it and sends its reply to the load balancer.
¤ The load balancer now forwards this reply to the user after changing the source address from the physical server's IP address back to the virtual IP address, so that the user thinks the reply has come directly from a single server.
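The address rewriting in the steps above can be sketched as follows. This is a simplified model, not an implementation: the addresses and the trivial "pick the first server" rule are assumptions for illustration, and real devices rewrite packet headers in hardware rather than dictionaries in Python:

```python
# Minimal sketch of the NAT-style rewriting a load balancer performs.
# All addresses and the selection rule are illustrative assumptions.

VIRTUAL_IP = "203.0.113.10"                 # the address the outside world sees
PHYSICAL_SERVERS = ["10.0.0.1", "10.0.0.2"]  # the real backends

def forward_request(packet, choose=lambda pool: pool[0]):
    """Step 1: rewrite the destination from the virtual IP to a real server."""
    assert packet["dst"] == VIRTUAL_IP
    server = choose(PHYSICAL_SERVERS)
    return {**packet, "dst": server}

def forward_reply(server_reply, client_ip):
    """Step 3: rewrite the source back to the virtual IP so the client
    believes it talked to one big server."""
    return {"src": VIRTUAL_IP, "dst": client_ip, "payload": server_reply["payload"]}

request = {"src": "198.51.100.7", "dst": VIRTUAL_IP, "payload": "GET /"}
to_server = forward_request(request)
# Step 2 happens on the server; it replies to the load balancer.
reply = forward_reply({"src": to_server["dst"], "payload": "200 OK"},
                      client_ip=request["src"])
print(to_server["dst"], reply["src"], reply["dst"])
```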
Application level load balancing: A load balancer can distinguish between a physical server and the application services running on it. It interacts with the applications individually instead of with the underlying hardware, which gives it the ability to balance load at the application level instead of the server level. This lets a single load balancer balance the load of multiple applications uniformly.
Health Monitoring for HA: Load balancers can individually verify whether a server is working. They do this by conducting multiple tests of increasing complexity on the servers, like pinging them, etc. Generally this is done at regular intervals, and before packets are sent to a server, in order to ensure HA.
Load balancing parameters: The decision to route a connection request to one server over the others is taken based on many real-time parameters that application load balancers measure – load, response times, usage and utilization statistics, current connection counts, host utilization monitors and more, depending on the vendor. They also enable dynamic load balancing – sending more traffic to bigger servers (having more processing power) than to smaller servers.
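One common way to combine those ideas is a weighted least-connections rule: each server gets a weight reflecting its capacity, and new requests go to whichever server has the fewest active connections per unit of capacity. A minimal sketch, with made-up server names and statistics (a real balancer measures these itself):

```python
# Hypothetical real-time stats; a real balancer measures these continuously.
servers = {
    "big-server":   {"weight": 4, "active_connections": 12},  # more capacity
    "small-server": {"weight": 1, "active_connections": 2},
}

def pick_server(pool):
    """Weighted least-connections: the lowest connections-per-unit-of-capacity
    ratio wins, so bigger servers absorb proportionally more traffic."""
    return min(pool, key=lambda name:
               pool[name]["active_connections"] / pool[name]["weight"])

choice = pick_server(servers)          # big: 12/4 = 3.0, small: 2/1 = 2.0
servers[choice]["active_connections"] += 1
print(choice)
```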
Connection persistence: After deciding to connect a particular user to a particular server, the load balancer still has to determine whether the traffic that follows from that user needs to be load balanced. If the session is one long-lived TCP connection (like FTP), it should not be load balanced. If the session consists of multiple short-lived TCP connections (like HTTP), it could be. But for certain HTTP sessions, like e-commerce applications, it is important that the user stays connected to the same server. In such cases, the application first needs to be identified, and the decision to load balance or not can be taken based on parameters like the user name, which are more permanent and can be read from the incoming packets (rather than the IP address, since proxy servers and NAT give the same IP address to multiple users).
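One simple way to get that persistence is to hash a stable per-user identifier and map it onto the server pool, so the same user always lands on the same server. A minimal sketch, assuming illustrative server names and a user name pulled from the request:

```python
import hashlib

# Hypothetical application server pool.
servers = ["app-1", "app-2", "app-3"]

def sticky_server(user_id):
    """Map a stable identifier (user name, session cookie) to a server.
    Hashing the user name rather than the client IP avoids the NAT/proxy
    problem where many users share one IP address."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

# The same user lands on the same server across separate connections.
first = sticky_server("alice@example.com")
again = sticky_server("alice@example.com")
print(first, first == again)
```

The trade-off is that a plain modulo mapping reshuffles users when the pool size changes, which is why some vendors use consistent hashing or per-session tables instead.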
You could stay up to date on the various computer networking technologies by subscribing to this blog with your email address in the sidebar box that says ‘Get email updates when new articles are published’.