Various Clustering Solutions for Commercial Web Related Environments Explored
Contact: sales@atlasindia.com
General Requirements for fault tolerant, high availability systems with respect to the web in a mission critical application scenario:
1. Constant throughput, where the variance of the throughput about its mean is as low as possible.
2. Request fulfillment times that are approximately the same as under average usage/load.
3. No breakdown of service due to excessive or unforecasted load.
4. System resource integrity even under high load situations.
5. Under situations where degradation of service is inevitable, a planned and systematic degradation response is possible.
6. System reliability and scalability.
7. Some load balancing mechanism.
8. Admission control.
Term Definitions:
1. Load Balancing:
This is a term that refers to systematic redistribution of processing jobs and can be widely applied to any mechanism that distributes usage of a particular resource on a computing system. This can be hard disk usage, processor usage or any resource on a system.
In a web server scenario, it applies to the technique of routing user requests over a certain number of networked computers, so as to keep the average usage of each system's resources approximately the same within the network that acts as a functional unit.
2. Scalability
If it is determined that the systems in a network are close to their peak capacities and that more system resources are required, a scalable system architecture provides the facility to add more machines to the network and update certain configuration files, after which the additional system resources are added to the capacity of the network.
3. Reliability
Let's say there is a web server environment that consists of a certain number of frontend webservers, some backend servers and two DNS servers (primary and secondary).
The failure of any one of the frontend servers is detected by the other frontend servers and another frontend server assumes the responsibility for servicing all domains associated with the frontend server that failed.
Failure of the primary DNS server is handled by falling back to the secondary DNS server.
4. Admission Control
Web servers often accept jobs that are way beyond their capacity to handle in a reasonable amount of time. This leads to a disruption in the Quality of Service. Admission control aims to put in place an intelligent system of managing requests, which ensures that all jobs accepted for processing are completed in an acceptable period of time. When available resources are running low, new jobs are throttled. This is done by actively monitoring server resource usage. The main idea is to complete the jobs that have been accepted in a satisfactory period of time, assuring a consistent Quality of Service.
5. High Availability
A high availability situation is when all of a network's resources are available for the maximum amount of time.
Theoretically, the availability percentage can never be 100%, but the endeavour is to bring this percentage of time as close to 100% as possible.
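The availability percentage can be made concrete with a little arithmetic. The following is a minimal Python sketch, not taken from any product discussed here; the function names and the sample targets are assumptions chosen for illustration. It converts an availability target into the downtime it permits per year, and computes availability from observed uptime and downtime.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1.0 - availability_percent / 100.0)


def availability_percent(uptime_hours: float, downtime_hours: float) -> float:
    """Availability = uptime / (uptime + downtime), expressed as a percentage."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return 100.0 * uptime_hours / total


if __name__ == "__main__":
    # Illustrative targets only; they are not claims about any vendor.
    for target in (99.0, 99.9, 99.99):
        hours = downtime_hours_per_year(target)
        print(f"{target}% available -> {hours:.2f} hours of downtime per year")
```

A target of 99.9% still allows roughly 8.76 hours of downtime per year, which is why each additional "nine" is progressively harder to achieve.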
Causes of downtime:
1. Physical failures in system hardware and components.
2. Design flaws in the hardware or software.
3. Human error during network administration and operations.
4. Environmental problems such as power failures and natural disasters.
5. Planned maintenance shutdowns.
"High Availability is an implied commitment that every organization, large or small, makes when making its first appearance on the Web. Anything less will mean lost customers and prospects."
-Steve Bourgeois, Blue Hills Technology Corporation, Oracle Magazine March/April 1999
Current Problems:
1. Internet traffic and Internet network bandwidth have both increased substantially.
2. Web server technology, set up for general situations, now has to cope with an increase in traffic and an increase in system resource demands due to diverse application usage.
3. Web server reliability has not increased at the same rate as web usage rates.
4. The existing web server software dominating the market is not adequately geared up to address issues that result in a degradation of service.
"Eddie" from Ericsson (www.ericsson.com):
Eddie is an Ericsson sponsored Open Source project. The aim of the project is to make a multiplatform, commercial grade web server system. Eddie is written in the functional programming language Erlang. Eddie works on Solaris, Linux and FreeBSD, and a Windows NT version is coming soon. It works with a range of web server software including Apache. The software can support web sites distributed across multiple servers, in geographically distributed locations.
Eddie Implementation:
Eddie has two main software packages.
a) An Intelligent HTTP Gateway
b) An Enhanced DNS Server
At each site, both these servers are installed. The Enhanced DNS server software replaces any existing DNS server.
Features of each of these software components of Eddie.
The Intelligent HTTP Gateway (IHG) server serves as the Front End, and the existing HTTP servers become the backend at each site where Eddie is implemented.
The IHG continually gets load information from each of the backend servers. Requests are distributed evenly throughout the backend servers by the IHG. The IHG ensures that the fraction of resource utilization of each backend web server, as compared to its full capacity, is constant throughout the network. This system of load balancing is dynamic load balancing and is more effective than static load balancing schemes.
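The idea of keeping every backend at the same fraction of its own capacity can be sketched as follows. This is an illustrative Python sketch only, not Eddie's actual (Erlang) implementation; the `Backend` class, the capacity figures and the `pick_backend` and `dispatch` functions are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backend:
    name: str
    capacity: float      # hypothetical: the most load this server can carry
    load: float = 0.0    # hypothetical: load most recently reported by the server

    @property
    def utilization(self) -> float:
        """Fraction of this server's own capacity that is currently in use."""
        return self.load / self.capacity


def pick_backend(backends: List[Backend]) -> Backend:
    """Choose the backend whose utilization fraction is currently lowest."""
    return min(backends, key=lambda b: b.utilization)


def dispatch(backends: List[Backend], cost: float = 1.0) -> Backend:
    """Route one request and account for its cost on the chosen backend."""
    chosen = pick_backend(backends)
    chosen.load += cost
    return chosen


if __name__ == "__main__":
    # Two deliberately unequal machines: the larger one should receive more work.
    pool = [Backend("small", capacity=10), Backend("large", capacity=30)]
    for _ in range(20):
        dispatch(pool)
    for b in pool:
        print(b.name, b.load, round(b.utilization, 2))
```

Because the decision uses the utilization fraction rather than the raw request count, servers of different sizes end up equally busy relative to what each can handle, which is the property the paragraph above describes.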
The design of Eddie ensures that nothing needs to be known about a server's brand, model or processor speed. This makes it possible for servers to be configured differently, to run different operating systems, and to avoid reconfiguration after hardware upgrades. This protects the interests of the service provider using Eddie.
Scalability:
Eddie allows new servers to be added to each site by simply changing the Eddie configuration. The new server's capacity is available throughout the distributed network through the Enhanced DNS server package. This makes it simple for service providers to upgrade their network capacities at each of their locations.
Performance Optimization and Service Protection:
The IHG allows specialization of function among backend servers. Servers among the backend could be CGI Machines, Database servers etc.
The IHG parses all incoming requests and splits multiple HTTP 1.1 persistent connections into single requests.
It also schedules HTTP 1.0 and 1.1 requests to Backend servers assigned to such requests.
Quality of Service:
Eddie handles Quality of Service in a unique manner. It ensures that if a single page is served, then the user will be guaranteed delivery of all the other pages in a highly satisfactory time. The IHG monitors several factors for each of the backend webservers, including CPU load, memory usage, disk delays, page faults etc. The system administrator decides the threshold value for critical resources, which is used by the Admission Control function to decide whether the particular system is overloaded.
If there exists an overload state, the user is shown a page that states that the system is busy and that the user should wait. If the user retries, he/she is given the status of the request, which has already been placed in a queue. This facility saves system resources because the user does not keep hitting the same link button again and again. The resources saved go towards servicing all 'accepted' viewers, which results in more successful and satisfied users.
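The threshold and queue behaviour described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Eddie's implementation: the `AdmissionController` class, the resource names and the threshold values are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Optional


class AdmissionController:
    """Accept requests while resources are under threshold; queue them otherwise."""

    def __init__(self, thresholds: Dict[str, float]) -> None:
        # Hypothetical administrator-set limits, e.g. {"cpu": 0.85, "memory": 0.90}.
        self.thresholds = thresholds
        # Request ids parked while the system is overloaded, oldest first.
        self.waiting: Deque[str] = deque()

    def overloaded(self, usage: Dict[str, float]) -> bool:
        """True if any monitored resource exceeds its threshold."""
        return any(usage.get(name, 0.0) > limit
                   for name, limit in self.thresholds.items())

    def admit(self, request_id: str, usage: Dict[str, float]) -> str:
        """Return 'accepted', or a busy message carrying the queue position."""
        if request_id in self.waiting:
            # A retry does not create a second job; it only reports status.
            return f"busy: position {self.waiting.index(request_id) + 1}"
        if self.overloaded(usage):
            self.waiting.append(request_id)
            return f"busy: position {len(self.waiting)}"
        return "accepted"

    def release_next(self) -> Optional[str]:
        """When load drops, hand the oldest queued request on for processing."""
        return self.waiting.popleft() if self.waiting else None


if __name__ == "__main__":
    ctl = AdmissionController({"cpu": 0.85})
    print(ctl.admit("req-1", {"cpu": 0.40}))   # under threshold
    print(ctl.admit("req-2", {"cpu": 0.95}))   # over threshold, queued
    print(ctl.admit("req-2", {"cpu": 0.95}))   # retry reports the same position
```

The point of the retry branch is that repeated clicks from an impatient user cost only a status lookup instead of generating more work for an already overloaded system.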
Reliability:
In Eddie's case this could be termed as 'seamless continuity'. When a backend or a frontend system fails, the IHG automatically detects the same and redirects requests to the IP address of the failed resource to another server with a copy of the same resource. The technique used here is IP address migration. When the failed server is brought online again the IP address is migrated back to it again.
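The bookkeeping behind IP address migration can be sketched as a table that records which server currently answers for each address. The Python sketch below is only a model of that idea: the `AddressTable` class is hypothetical, the addresses come from the 192.0.2.0/24 documentation range, and a real system would also have to reconfigure network interfaces, which is not shown.

```python
from typing import Dict, Set


class AddressTable:
    """Tracks which server currently answers for each service IP address."""

    def __init__(self, home: Dict[str, str], replicas: Dict[str, Set[str]]) -> None:
        self.home = dict(home)        # ip -> server that normally owns it
        self.owner = dict(home)       # ip -> server answering for it right now
        self.replicas = replicas      # ip -> servers holding a copy of the content
        self.alive: Set[str] = set(home.values())
        for servers in replicas.values():
            self.alive |= servers

    def fail(self, server: str) -> None:
        """Mark a server as failed and move its addresses to live replicas."""
        self.alive.discard(server)
        for ip, current in list(self.owner.items()):
            if current == server:
                candidates = sorted(self.replicas.get(ip, set()) & self.alive)
                if candidates:
                    self.owner[ip] = candidates[0]

    def recover(self, server: str) -> None:
        """Bring a server back online and migrate its home addresses back to it."""
        self.alive.add(server)
        for ip, home_server in self.home.items():
            if home_server == server:
                self.owner[ip] = server


if __name__ == "__main__":
    table = AddressTable(
        home={"192.0.2.10": "web1", "192.0.2.11": "web2"},
        replicas={"192.0.2.10": {"web2"}, "192.0.2.11": {"web1"}},
    )
    table.fail("web1")
    print(table.owner)     # web2 now answers for both addresses
    table.recover("web1")
    print(table.owner)     # 192.0.2.10 has migrated back to web1
```

Clients keep using the same IP address throughout, which is what makes the failover appear seamless from the outside.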
The Enhanced DNS Server (EDS):
The EDS monitors load statistics from each of the front end servers, along with summary information from each of the front end and back end servers. Eddie is able to route traffic away from failed servers. In cases where information cannot be exchanged between EDS instances, each EDS independently makes an inference regarding the availability of each site. DNS entries are given a short Time to Live so the user accessing the same site on different days gets the latest information.
Ideal Candidates for Eddie Usage:
a. Medium to Large ISPs.
b. Companies with a significant web presence requiring high availability and reliability.
Netware Cluster Services (NCS):
Netware Cluster Services (NCS) enables a collection of individual network servers to work together to provide users highly available access to their critical network resources including data, applications and other services. If one network server (node) happens to fail, another node in the cluster will automatically take over responsibility for the resources and services previously provided by the failed node, resulting in high availability of all clustered resources.
NCS Features:
1. High Availability:
If a network resource fails, NCS transparently restarts the failed node's applications on a surviving node in the cluster.
2. Underlying Network:
NCS runs on top of Netware 5.
3. Single System Image:
NDS presents a cluster of servers to the user as a single manageable object through the Console One Interface. Users and network administrators see the cluster resources as being provided by a single system.
4. Multi Node Distributed Failover:
On a node failure, applications and services are distributed over multiple surviving servers to prevent an overload of any single node.
5. Transparent Client Reconnect:
When a node fails, a surviving node in the cluster takes over responsibility for all of the failed node's services. Transparent client reconnects preserve users' drive mappings when their volumes are mounted on a surviving server. Win95/98 clients that use open files and file locks are also supported.
6. Off the Shelf Hardware:
NCS requires no special hardware.
7. Remote, Automated Installation:
NCS installation runs from a client workstation that deploys the software over to all network servers targeted to become nodes on the cluster.
8. Automatic Trustee Migration:
When a node fails the NDS trustee rights are automatically migrated to the surviving node. The user has continuous access to the data.
9. Split Brain Detection:
NCS avoids data corruption in conditions where multiple servers on the network try to mount the same volume of a failed node. This condition arises when normal inter node communication does not occur and separate nodes each think that they are the only surviving members of a cluster. NCS has a mechanism called SPLIT BRAIN DETECTOR that avoids this potential for data corruption.
10. Single Point Administration:
Netware Cluster Services leverages the power of Novell Directory Services, taking advantage of its single point administration capabilities.
Some clustering feature similarities between NCS and Eddie:
Each server node is in constant communication with other nodes. Slave and Master nodes exchange "heartbeat" signals. Similar to the Eddie solution, IP addresses and resources are dynamically transferable.
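Heartbeat based failure detection, and one common guard against the split brain condition, can be sketched as below. This is a generic Python illustration and does not describe how NCS or Eddie work internally: the `HeartbeatMonitor` class, the timeout value and the majority rule in `has_quorum` are assumptions introduced here. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Records the last heartbeat seen from each node and reports silent ones."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout                 # seconds of silence tolerated
        self.last_seen: Dict[str, float] = {}  # node name -> time of last heartbeat

    def beat(self, node: str, now: float) -> None:
        """Record a heartbeat received from a node at time 'now'."""
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> List[str]:
        """Nodes whose last heartbeat is older than the timeout."""
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.timeout)


def has_quorum(visible_nodes: int, cluster_size: int) -> bool:
    """Majority rule (an assumption, not the NCS mechanism): a group of nodes
    may take over shared resources only if it can see more than half the cluster."""
    return 2 * visible_nodes > cluster_size


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout=3.0)
    monitor.beat("node1", now=0.0)
    monitor.beat("node2", now=0.0)
    monitor.beat("node1", now=4.0)        # node2 has gone silent
    print(monitor.failed_nodes(now=5.0))
    print(has_quorum(visible_nodes=2, cluster_size=3))
```

A missed heartbeat alone cannot distinguish a dead node from a broken link, which is why a second check of some kind is needed before a surviving node mounts the failed node's volumes.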
"FLEX" from HP:
Traditional load balancing solutions try to distribute requests uniformly across all nodes regardless of the content. This interferes with efficient use of RAM in the cluster. The popular files tend to occupy RAM space on all the nodes. This redundant replication lowers available RAM. Another clustering variant would be where content is partitioned among the machines, thus avoiding document replication in the RAMs. But static partitioning does not accommodate varying access patterns. This has led to the design of "locality aware" balancing strategies, which aim to avoid unnecessary document replication across the RAMs of the nodes to improve system performance. FLEX is a new "locality aware" solution for load balancing and management of an efficient Web Hosting Service.
Traditional load balancing techniques can be categorized into two major groups:
DNS based Approach
Round Robin DNS: this is a functionality built into the newer versions of DNS. Round Robin DNS looks at a list of IP addresses within a cluster that can serve up the requested content. Each time the same content is accessed it places the next IP address higher up in the list. So different clients are mapped onto different nodes in the cluster automatically.
Advantages:
Simple and easy to set up, and reasonable load balancing is achieved. Since it uses the existing DNS infrastructure, there is no additional cost.
Disadvantages:
The Round Robin technique has no way of knowing whether the next node on the IP Address list is active or not or whether it is already overloaded or not. If the node is unavailable, a request still gets passed onto it. To resolve this problem additional commercial grade software is required.
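The rotation performed by Round Robin DNS, and the disadvantage noted above, can be seen in a small Python model. This is a simulation only and does not use any real DNS server's API; the `RoundRobinResolver` class is hypothetical and the addresses are placeholders from the 192.0.2.0/24 documentation range.

```python
from collections import deque
from typing import Deque, List


class RoundRobinResolver:
    """Returns the same address list on every lookup, rotated by one each time."""

    def __init__(self, addresses: List[str]) -> None:
        self.addresses: Deque[str] = deque(addresses)

    def resolve(self) -> List[str]:
        """Answer one lookup, then rotate so the next client sees a new first entry."""
        answer = list(self.addresses)
        self.addresses.rotate(-1)   # the first address moves to the back
        return answer


if __name__ == "__main__":
    resolver = RoundRobinResolver(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
    # Clients typically connect to the first address in the answer.
    for client in range(4):
        print(client, resolver.resolve()[0])
    # Nothing in the resolver knows whether 192.0.2.2 is down or overloaded,
    # so it keeps handing that address out in its turn.
```

The model contains no health or load information at all, which is exactly the limitation that additional monitoring software has to make up for.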
IP/TCP/HTTP based Approach
Hardware Load Balancers
In this type of load balancing solution, load balancing servers are positioned between a router (connected to the Internet) and a LAN switch which fans traffic out to the Web Servers. Load Balancing Servers use their own proprietary algorithms to route traffic to servers in a cluster. The Load Balancer uses a virtual IP address to communicate with the router, masking the IP addresses of the individual servers. So only the virtual address is visible to the Internet community.
The servers' IP addresses are never sent back to the browser, because if a browser were to start communicating directly with a server then the clustering mechanism would be defeated.
Software Load Balancers
In this case the server responds directly to the browser once the TCP session is handed off to it. Vendors claim performance improvements due to the fact that responses don't have to be rerouted through the balancing server, and time is also saved because there is no IP address translation involved.
"Locality Aware" Balancing Strategies:
In this kind of clustering implementation servers are divided into Front End and Back End Servers. Front Ends act in the same manner as smart routers or switches and route requests to the appropriate node in the cluster.
The FLEX approach:
FLEX depends on partitioning of content based on sites rather than files, and uses the existing DNS infrastructure without the need for TCP connection handoffs. FLEX is highly suited for virtual hosting environments where there are multiple sites and a number of servers. The FLEX approach advocates partitioning content by site in such a way that each server handles approximately the same bandwidth and uses approximately the same memory and processing resources.
The solution is completely self monitoring.
Flex performs best when each site is allotted to exactly one server so that there is no wastage of RAM by replication.
Flex does not have a centralized component or a front end switch, which could become a bottleneck on large cluster implementations.
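Assigning whole sites to servers so that each server carries a similar load is a partitioning problem. The Python sketch below uses a simple greedy heuristic (heaviest site first, always onto the least loaded server) as a stand-in; it is not the algorithm from the FLEX report, and the site names, load figures and the `partition_sites` function are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def partition_sites(site_load: Dict[str, float],
                    servers: List[str]) -> Dict[str, List[str]]:
    """Assign each site to exactly one server, heaviest sites first,
    always giving the next site to the least loaded server so far."""
    heap: List[Tuple[float, str]] = [(0.0, server) for server in servers]
    heapq.heapify(heap)
    assignment: Dict[str, List[str]] = {server: [] for server in servers}
    # Sort by descending load; ties are broken by site name for determinism.
    for site, load in sorted(site_load.items(), key=lambda kv: (-kv[1], kv[0])):
        total, server = heapq.heappop(heap)
        assignment[server].append(site)
        heapq.heappush(heap, (total + load, server))
    return assignment


if __name__ == "__main__":
    # Hypothetical measured load per hosted site, in arbitrary units.
    measured = {"a.example": 50, "b.example": 30, "c.example": 20, "d.example": 10}
    for server, sites in partition_sites(measured, ["s1", "s2"]).items():
        print(server, sites)
```

Once such an assignment exists, ordinary DNS records can point each site name at its one server, so no front end switch sits in the request path. Re-running the partitioning on fresh measurements is one way to follow changing access patterns.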
Polyserve Server UNDERSTUDY Clustering and Load Balancing Software:
Polyserve Understudy is a software only server clustering solution for high availability and load balancing. The primary market for this product is end users and service providers who are into e-business.
Understudy manages servers as virtual hosts, directing traffic to only those hosts that are correctly servicing end user requests. A virtual host can be configured to service any number of hosts in a cluster, either for load sharing or primary/backup support of servers. Understudy supports multiple virtual hosts per server, making it ideal for ISPs.
An Understudy cluster is a set of two host systems configured by Understudy as a cluster. All clusters run the Understudy daemon. A host may run any OS. Load balancing is achieved using the "round robin" technique. If a server in a pool fails, the rest of the servers in the pool take over. An Understudy cluster can also function without load balancing, where a backup server takes over if a primary server fails. Each host in an Understudy cluster lets the other hosts know that it is up and available for client requests. This is the Understudy "heartbeat", which validates that a server is alive and running.
Understudy Features:
1. Automatic IP Failover in case of failed server.
2. Automatic recovery and reintegration of failed servers into cluster.
3. Support for Linux, NT, FreeBSD and SunOS.
4. Primary/Backup Cluster Option.
5. Application Service Monitoring: HTTP, FTP, SMTP, TCP services are actively monitored, and failover will occur even if the server responds but these services do not.
6. Multiple Virtual Hosts per server cluster.
7. Interoperation with DNS round robin load balancing (BIND 4.9+)
8. Distributed Operation: No Single point of failure in cluster system.
9. Java Based GUI.
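Application service monitoring, as listed in feature 5, means probing the service itself rather than just the machine. A minimal Python sketch of such a probe follows; it is a generic illustration and not Understudy's monitoring code, and the function name and timeout are assumptions.

```python
import socket


def tcp_service_alive(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened in time.

    A machine may still answer pings while its web or mail service is down,
    so the probe targets the service port itself.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical check of a local web server on the standard HTTP port.
    print(tcp_service_alive("127.0.0.1", 80))
```

A monitor would typically run such a probe at intervals and trigger failover only after several consecutive failures, to avoid reacting to a single dropped packet.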
Sun Clusters:
Sun has been delivering cluster solutions for nearly four years. The Solstice HA (High Availability) software product and the Sun Enterprise Cluster Server (with parallel database support) deliver availability and performance scaling respectively.
Solstice HA is a software fault management product used in the Ultra Enterprise Cluster HA Server. The platform provides a high availability system that can automatically recover from any single point of failure.
A typical HA cluster has two nodes connected via Ethernet or FDDI. A "heartbeat" is passed periodically between nodes to monitor node status. Storage arrays on the network are redundantly connected to the servers. Although only one node owns a logical diskset, the other node can take over in the case of failover.
When a node in an HA cluster fails, HA services are automatically transferred from the failed node to a backup node using a logical ID. Configurations within this two node cluster can be symmetric or asymmetric. In a symmetric configuration applications on either node can fail over to the other node.
Sun Enterprise Cluster:
The Sun Enterprise Cluster runs the two largest commercial parallel databases - Oracle OPS and Informix XPS.
This system has been designed for commercial enterprise computing and supports a high availability, large capacity architecture that also serves as a parallel database platform. This system is typically applied to OLTP, data warehousing and decision support applications.
With the most recent release, Sun Cluster 2.0, highly available applications that use data service modules can run on the same cluster as a parallel database.
High Performance Computing thru the compute cluster:
The Sun Ultra HPC Server creates an environment that shares parallel execution of applications across a number of networked machines. This product is sometimes referred to as the compute cluster. As in the case of any clustering solution, this product includes a complete set of load balancing and monitoring tools, batch facilities, remote job execution capabilities, and libraries for the development of message passing environments: Load Sharing Facility (LSF), MPI and PVM.
Scalability within a Node:
This is basically clustering within a single machine.
The Ultra Enterprise 10000 System provides scalability up to 64 processors within a single node. With the Dynamic Reconfiguration feature the same system can run several instances of the operating system, each running its own application. So even application failover within the same node is supported by Sun. Currently the Sun Clusters run on SPARC, but Intel based ports are coming soon.
The Sun Series of Cluster ready operating systems offer:
a. Powerful networked computing environment.
b. Full multi threaded support.
c. Hot Plugging of disks, CPUs and devices.
d. Java Computing.
e. Security.
f. Strong Relationship with thousands of ISVs.
g. Scaling of up to 64 CPUs within a single machine.
h. Automatic Load Balancing.
IBM HACMP (High Availability Cluster Multi Processing) for AIX
This solution has been around since September 1992.
Features of this clustering solution:
1. Installation and customization from a single console.
2. User can select the network adapters/disk subsystems/RS5500 Processor.
3. Concurrent and Parallel Data Access.
This particular solution does not appear to be geared towards the web.
General Views on the various clustering schemes.
Various clustering schemes are available today that can be applied to a web related clustering environment. With respect to the Web, and purely in terms of the features of the clustering arrangement, Eddie offers more features and facilities than the other clustering products. HP's FLEX offers the advantage of using no single central Front End server, which Eddie does not support. But the fact that Eddie is free and readily implementable across a wide variety of platforms makes it a good bet for a high availability web environment. Additionally, Eddie is Open Source, so it can be tailored to suit individual needs.
Resources Used for this study:
1) http://www.eddieware.org
2) http://www.hpl.hp.com/techreports/1999/HPL-1999-52R1.html
3) http://www.sun.com/clusters/wp.html
4) http://www.polyserve.com
5) http://www.microsoft.com/ntserver/support/faqs/clustering_faq.asp
6) http://www.turbolinux.com
7) http://www.cobalt.com