Load balancing is a way to distribute a workload over two or more servers in order to achieve increased redundancy, scalability or security. The "balancing" can be performed in either software or dedicated hardware, but both result in a distribution of load across multiple machines. From a web server or web service perspective, clients conveniently access just one URL. User requests sent to this location are handled by the load balancer and transparently forwarded to one of the available machines behind the scenes, which perform the actual processing.
Load balancers vary widely, and some have features and capabilities that others do not. The following summarizes various qualities that can play an important role when deploying a load balancer into a GIS environment.
Physical Characteristics
Software or hardware? Software-based solutions can have the advantages of lower cost and the ability to run on existing servers. Hardware-based solutions, on the other hand, can offer more features and may have built-in accelerators for certain tasks, such as SSL offload engines.
Balancing Metrics
Load balancers have to decide how to spread the work through some type of metric or scheduler. Typically, this is done via balancing algorithms like round robin, server load, server computing power, least connection or random. Having the ability to customize this method can prove handy. If the backend servers are not all identical, adjusting the balancing method based on server computing power may provide a more even load, since slower machines are not overloaded.
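As a rough illustration of two of the algorithms named above, the following sketch shows a weighted round robin scheduler and a least connection chooser. The server names, weights and connection counts are made up for the example, and the simple slot expansion used here is only one of several ways to implement weighting; it is not taken from any particular load balancer product.

```python
from itertools import cycle

def weighted_round_robin(weights):
    """Return an endless iterator of server names.

    Each server appears in proportion to its weight, so a machine with
    weight 2 receives twice as many requests as one with weight 1.
    """
    # Expand each server into `weight` slots, then cycle through the slots.
    slots = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(slots)

def least_connections(active):
    """Return the server that currently has the fewest open connections."""
    return min(active, key=active.get)

# Hypothetical backends: "gis-a" is assumed to have twice the computing power.
capacity = {"gis-a": 2, "gis-b": 1, "gis-c": 1}
scheduler = weighted_round_robin(capacity)
first_eight = [next(scheduler) for _ in range(8)]

# Hypothetical snapshot of open connections per backend.
open_connections = {"gis-a": 12, "gis-b": 4, "gis-c": 9}
next_server = least_connections(open_connections)
```

In this sketch the stronger machine receives its extra requests back to back; a production scheduler would more likely interleave them.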
Security
Security is always important. Load balancers normally bring with them the added benefit of hiding the network and servers that are behind the scenes doing the actual work for the requested service. A map service requested from one URL (the load balancer) may actually be assembled by many different servers that sit behind a firewall. Since only the load balancer is allowed to communicate through the firewall, security is increased because there is only one point of entry.
Some load balancers also come equipped with some analysis capabilities to detect denial of service (DoS) attacks and take preventative measures.
Performance
Just having the fastest CPUs on the servers that are behind your load balancer does not always guarantee fast response times for requests. Offloading certain chores and duties of your web servers to the load balancer, like SSL acceleration, can free up capacity on those servers. Additionally, some load balancers have the ability to perform HTTP compression and HTTP caching.
HTTP compression can make certain responses that are sent back to the user smaller, saving precious network bandwidth. This, of course, requires decompression by the client's browser, which is normally transparent to the end user. HTTP caching can help eliminate requests to the backend servers altogether, as responses are immediately retrieved from "memory" on the load balancer itself. This can greatly decrease the response times of requests and increase overall throughput. It is important to note that the trade-off of caching map data is that it may not always be the most current, but the load balancer's settings for cache expiration and regeneration can provide some flexible options.
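The trade-off between cache freshness and backend load can be sketched with a small time-to-live (TTL) cache. This is only an illustration of the expiration idea, not how any specific load balancer implements caching; the class name `TtlCache`, the `fetch` callback and the URL are assumptions made for the example.

```python
import time

class TtlCache:
    """Cache responses by URL and refetch them once they are older than the TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the behavior can be tested
        self._store = {}        # url -> (expires_at, body)

    def get(self, url, fetch):
        """Return the cached body for `url`, calling `fetch(url)` on a miss."""
        now = self.clock()
        entry = self._store.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]     # still fresh: the backend is not contacted
        body = fetch(url)       # missing or expired: ask a backend server
        self._store[url] = (now + self.ttl, body)
        return body

# Demonstration with a fake clock and a stand-in for the backend map server.
fake_now = [0.0]
backend_calls = []

def fake_fetch(url):
    backend_calls.append(url)
    return "tile-v%d" % len(backend_calls)

cache = TtlCache(ttl_seconds=60, clock=lambda: fake_now[0])
first = cache.get("/map/tile/1", fake_fetch)    # miss, hits the backend
fake_now[0] = 30.0
second = cache.get("/map/tile/1", fake_fetch)   # hit, served from memory
fake_now[0] = 61.0
third = cache.get("/map/tile/1", fake_fetch)    # expired, regenerated
```

A shorter TTL keeps map data more current at the cost of more backend requests; a longer one does the opposite.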
Monitoring and Maintenance
Distributing the workload is important, but being able to automatically detect and handle the failure and the recovery of backend servers is crucial. Load balancers offer many different capabilities in this area. While some can only detect if the server is alive and on the network, others have very elaborate methods such as retrieving content from predetermined web pages. This can be advantageous in a GIS environment. Not only could you periodically check that your HTTP server is up and running, but also that your map service is returning the expected output. Checking a map service to verify that its contents are available through the whole software stack should be done carefully, however. Executing a check too frequently can spike resource consumption on backend servers, which can hurt performance. If recurring verification is needed, check against the lightest and smallest features possible.
Similar to automatic failure detection, having the option to manually mark a machine as failed is great for maintenance. Marking a server as "healthy", either manually or automatically, so that it can start serving requests again is also a powerful feature.
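The bookkeeping behind automatic failure detection and manual maintenance marking might look something like the sketch below. The class `BackendPool`, the failure threshold of three consecutive checks and the content check `check_map_service` are all hypothetical; a real load balancer would run the checks on a timer against the live servers.

```python
class BackendPool:
    """Track which backend servers are eligible to receive requests."""

    def __init__(self, servers, fail_threshold=3):
        self.fail_threshold = fail_threshold
        self.failures = {server: 0 for server in servers}  # consecutive failures
        self.in_maintenance = set()                         # manually marked down

    def record_check(self, server, passed):
        """Store the result of one health check; a pass resets the counter."""
        self.failures[server] = 0 if passed else self.failures[server] + 1

    def mark_down(self, server):
        """Manually take a server out of rotation, e.g. for maintenance."""
        self.in_maintenance.add(server)

    def mark_up(self, server):
        """Manually return a server to rotation as healthy."""
        self.in_maintenance.discard(server)
        self.failures[server] = 0

    def available(self):
        """List servers that are neither failed nor in maintenance."""
        return [s for s, count in self.failures.items()
                if count < self.fail_threshold and s not in self.in_maintenance]

def check_map_service(response_body, expected_marker):
    """Content check: the service is 'up' only if the expected text comes back."""
    return expected_marker in response_body

pool = BackendPool(["gis-a", "gis-b"])
for _ in range(3):
    # Assumed failure: the response lacks the expected marker three times in a row.
    pool.record_check("gis-b", check_map_service("HTTP 500", "mapName"))
after_failures = pool.available()

pool.record_check("gis-b", check_map_service('{"mapName": "Layers"}', "mapName"))
after_recovery = pool.available()

pool.mark_down("gis-a")
during_maintenance = pool.available()
```

Requiring several consecutive failures before removing a server avoids taking it out of rotation because of a single slow response.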
Scalability
Scalability is one of the key benefits and uses of a load balancer. By distributing requests over two or more servers, the load balancer helps keep response times low. If a server is overloaded with too much work, queuing starts and response times go up. With all the backend servers sharing the work evenly, you can bring more users into your environment and achieve scalability.
Session State
Some applications carry with them a "session state". When state is involved, more care needs to be taken in balancing the load. Since the state resides in the application server space by default, requests cannot go to just any of the other servers. The load balancer can accommodate this by using stickiness or persistence of the session, which keeps that user going to the originating server. An alternative is to store the sessions in a shared resource like a database. Of course, this database should then also have its own load balancing mechanism. Depending on their size, copying session states to another tier in the environment can have performance impacts, which should be taken into consideration.
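One simple way to picture stickiness is to derive the backend from the session identifier, so that the same session always maps to the same server. This is a minimal sketch under that assumption; the function name `sticky_backend` and the session IDs are invented for the example, and many load balancers instead use cookies or a persistence table.

```python
import hashlib

def sticky_backend(session_id, backends):
    """Map a session ID to one backend, always the same one for the same ID."""
    # Hash the ID so sessions spread roughly evenly across the backends.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

backends = ["gis-a", "gis-b", "gis-c"]          # hypothetical server names
first_request = sticky_backend("session-1234", backends)
later_request = sticky_backend("session-1234", backends)
```

A limitation of this hashing approach is that adding or removing a backend changes the mapping for many existing sessions, which is one reason a shared session store can be attractive.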
Redundancy and Failover
Web server redundancy can be achieved with the machines behind a load balancer, but that one load balancer is still a single point of failure. A solution to this problem is to logically chain multiple load balancers together. This group forms a failover cluster for the load balancing tier and removes the single point of failure that a lone load balancer represents.
This clustering commonly comes in two forms of deployment, active/active and active/passive. With active/active, both (or all) of the load balancers are up and capable of servicing requests. With active/passive, one node (the primary) is up servicing requests, and the secondary node comes online only when the primary fails.
In both cases, a link or heartbeat should be established between the load balancers to determine if one of the nodes in the high availability cluster has become unfit to fulfill its duties.
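The heartbeat idea for an active/passive pair can be sketched as a timeout on the last heartbeat received from the primary. The class `ActivePassivePair` and the five-second timeout are assumptions for illustration; real high availability clusters add safeguards (for example against both nodes becoming active at once) that are omitted here.

```python
import time

class ActivePassivePair:
    """Decide which node should serve, based on heartbeats from the primary."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock                  # injectable so the logic can be tested
        self.last_heartbeat = clock()

    def heartbeat_received(self):
        """Called whenever the secondary hears from the primary."""
        self.last_heartbeat = self.clock()

    def active_node(self):
        """The secondary takes over once the primary has been silent too long."""
        if self.clock() - self.last_heartbeat > self.timeout:
            return "secondary"
        return "primary"

# Demonstration with a fake clock.
fake_now = [0.0]
pair = ActivePassivePair(timeout_seconds=5, clock=lambda: fake_now[0])

fake_now[0] = 3.0
while_healthy = pair.active_node()      # last heartbeat 3 s ago: within timeout

fake_now[0] = 9.0
after_silence = pair.active_node()      # 9 s of silence: failover

pair.heartbeat_received()               # primary is heard from again
after_return = pair.active_node()
```

In this sketch the primary resumes service as soon as its heartbeat returns; whether to fail back automatically or wait for an operator is a deployment decision.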
Another level of redundancy is obtained with ArcGIS Server by having the Server Object Manager (SOM) load balance between multiple Server Object Container (SOC) machines. The SOM can be configured to adjust the workload sent to the SOCs based on their machines' physical capacity.