High Availability Web Service
Fresh-Faced Recruit
Join Date: May 2001
Location: USA
I'm trying to design a truly redundant web service. I've done some research, but maybe all you Mac gurus can tell me if I'm way off on this one:
Coyote Point Load Balancing Server and Hot Spare ($7000)
2 XServes ($3000/ea)
So, I'm up to $13,000 already. Now either I need redundant storage that both computers access or some sort of replication. So let's say...
XRaid ($5000?)
A grand total of $18,000
Any better (read: cheaper) ideas?
Moderator Emeritus 
Join Date: Dec 2000
Location: College Park, MD
I thought OS X Server could replicate another box and (automatically?) take over for it if the master goes down.
You might want to look into that. It'd save you $7000, and you wouldn't need to buy an XRaid or anything like that.
Fresh-Faced Recruit
Join Date: May 2001
Location: USA
You're right. And I thought SquirrelMail was the only cool thing in Jaguar server...
Mac Elite
Join Date: May 1999
Location: San Jose, CA
Originally posted by Scotttheking:
I thought OS X Server could replicate another box and (automatically?) take over for it if the master goes down.
On what are you basing this statement? Got a link?
I'm not saying it isn't true, just that I haven't seen any discussion of this feature or how it's controlled. It would be cool if it were true.
Gods don't kill people - people with Gods kill people.
Fresh-Faced Recruit
Join Date: May 2001
Location: USA
It says it right at the top of the "Jaguar Server: Tech Specs" page on Apple's site. It's also in the Admin Manual for Jaguar Server.
Mac Elite
Join Date: May 1999
Location: San Jose, CA
Originally posted by Chiznibitz:
It says it right at the top of the "Jaguar Server: Tech Specs" page on Apple's site. It's also in the Admin Manual for Jaguar Server.
You're right. It doesn't look too hard, and although the documentation is a little lightweight, it should provide basic failover capability.
The advantage that you'd get with a dedicated load balancer is that you'd be able to have both servers active at the same time, as well as support more than two servers (which is all that Mac OS X Server's failover supports). While useful for large sites, these features might not be necessary for smaller users, and skipping the load balancer saves a lot of cash.
Moderator Emeritus 
Join Date: Dec 2000
Location: College Park, MD
Sorry for not posting a link, I didn't have one. I just remembered that Apple had said it somewhere, at some point. I had tried looking for one at the time, but didn't find one. Figured I'd post anyway.
If you do it, post how it works. I'm curious.
Dedicated MacNNer
Join Date: Dec 1999
Location: Canton, OH
Wow... didn't know that Xserve would do this with built-in GUI tools.
We just set up a new web cluster with 6 machines (Dell 1650s with dual 1.4 GHz PIIIs) for a total of about $12,000.00. All servers are running RH 7.3.
2 machines are controllers. Only 1 is live and the other is the failover.
3 machines are web servers that are all live and share load. We will be able to serve up ~60,000,000 monthly page views with this setup without seeing any degradation in performance. Besides, we can easily add more web servers in our setup (up to 24 or so).
1 machine is attached to a 240G RAID array and it acts as the staging machine and shared logging machine.
The concept of the Xserve setup is very simple in that the backup Xserve would just need to ping the primary Xserve every 5 seconds or so to see if everything is running OK. If not, it would then take over. I will be interested to know if OS X uses the network to do the ping or if you could use a USB or FireWire cable. (We are using serial cables.)
What I would like to know is if the Xserve would actually check services as well. The probability of losing a NIC, power supply, hard drive, motherboard, etc. is low compared to httpd or the database having problems and going down while all the hardware is still functional.
PS - I'll second the request for more information once you find it out. If the Xserve does have GUI utilities to set this up then it may be some competition for RH Advanced Server.
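To make the "check services, not just hardware" idea concrete, here is a minimal Python sketch of a service-level probe. It is not how Mac OS X Server's failover works; the function names, the port list, and the timeout are all hypothetical, and a TCP connect only proves that something is listening, not that httpd or the database is answering correctly.

```python
import socket
from typing import Iterable


def service_is_up(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the service as down.
        return False


def should_take_over(host: str, ports: Iterable[int]) -> bool:
    """Hypothetical policy: take over if any monitored port stops answering."""
    return not all(service_is_up(host, port) for port in ports)
```

A backup machine could call something like should_take_over("primary.example.com", [80, 3306]) every few seconds, so that a dead httpd triggers the same action as a dead NIC.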
Mac Elite
Join Date: May 1999
Location: San Jose, CA
According to the Mac OS X Server Admin Guide, it uses a network ping to check whether the first machine is still up, so it will only fail over on a network-level failure. That said, with the recent IP-over-FireWire tools from Apple, you should be able to use a FireWire connection for this.
Beyond that, the tools that come with XServe, namely InterMapper, can start and stop applications based on layer 7 services. So InterMapper can check if the remote web server is running, and start up a local web server if needed.
However, that doesn't resolve the load balancing issue. Is that what the controller does? There's no load balancing element in XServe's failover, so you'd still need some way of load balancing amongst the active web servers.
Speaking of which, are you sure you can serve 60m pages off those three servers? That's a peak run rate of 700 pages per minute and I don't think you're going to get that off a dual PIII.
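As a rough sanity check on the 60 million figure, here is a small Python sketch that turns a monthly page count into an average rate. It assumes a 30-day month and perfectly even traffic, neither of which comes from the thread, and real peaks will sit well above the average. On those assumptions the flat average comes to roughly 1,389 pages per minute (about 23 per second) across the whole cluster, which is higher than the 700 quoted above.

```python
from typing import Tuple


def average_rate(pages_per_month: int, days_in_month: int = 30) -> Tuple[float, float]:
    """Return (pages per minute, pages per second), averaged over the month."""
    minutes = days_in_month * 24 * 60
    per_minute = pages_per_month / minutes
    return per_minute, per_minute / 60


per_minute, per_second = average_rate(60_000_000)
# prints: 1389 pages/min, 23.1 pages/sec
print(f"{per_minute:.0f} pages/min, {per_second:.1f} pages/sec")
```

Split evenly across three web servers, that is about 463 pages per minute each before allowing for any daily peak.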
Dedicated MacNNer
Join Date: Dec 1999
Location: Canton, OH
Originally posted by Camelot:
According to the Mac OS X Server Admin Guide, it uses a network ping to check whether the first machine is still up, so it will only fail over on a network-level failure. That said, with the recent IP-over-FireWire tools from Apple, you should be able to use a FireWire connection for this.
Beyond that, the tools that come with XServe, namely InterMapper, can start and stop applications based on layer 7 services. So InterMapper can check if the remote web server is running, and start up a local web server if needed.
However, that doesn't resolve the load balancing issue. Is that what the controller does? There's no load balancing element in XServe's failover, so you'd still need some way of load balancing amongst the active web servers.
Speaking of which, are you sure you can serve 60m pages off those three servers? That's a peak run rate of 700 pages per minute and I don't think you're going to get that off a dual PIII.
Our current setup is a single PIII 933, 1.5G RAM (PC133 Interleaved), 36G 10K (Dell PowerApp Web 120). This webserver can handle ~10,000,000 page views monthly by itself. Well, actually, it is starting to choke at 10,000,000 monthly page views. We are running our own build of RH 7.3 on the server. Since we built the kernel ourselves, we built it for speed and reliability. Basically, if we didn't have to do updates we would never need to reboot the server.
Now, we aren't running the database on the same machine. Also, everything we have is HEAVY PHP/MySQL. Our MySQL database server sees just under 80,000,000 queries per month. This machine is a dual Xeon 2.4 with HyperThreading enabled. Also, this kernel is built with only the MySQL tools for the most part (for speed).
The CPUs on all machines aren't under too much load really. The biggest bottleneck is I/O, both network and disk. When we deploy the new cluster we are probably going to do port trunking to the database server over 2-4 Gbit connections. We are also running 2 gigabit networks, one for the back end and one for the front end, to help with network load.
By comparison, the system used to be all ASP with an MS SQL back end running on Windows 2000 Server/IIS5. The web server was a dual Xeon 500 and it couldn't handle 1/3 the load of our current single machine setup. The database server was running SQL 7 with dual Xeon 550s (I believe) and it couldn't handle the database at all.
We did test Windows 2000 Advanced Server about 1 year ago when we were deciding whether to rewrite our ASP system in PHP. Windows 2000 Advanced Server has WLBS (Windows Load Balancing Services) right out of the box. We found it to be unreliable at best in our setup. This might be a good setup for low to moderate traffic (or hosting a single web site) but not for high volume traffic. (At least from our testing.) WLBS relies on the fact that you are using a "stupid" switch or a hub. Whenever we tested our setup with a layer 3 or higher switch it wasn't always reliable. The general concept in this setup is that the server would masquerade the MAC address of its own NIC. The drawback to this setup is... what if one of the machines fails during a user session? From our testing the user's browser would sometimes still want to connect to the down server.
In the setup we are going with we have a single live controller (with a hot spare) that directs all requests to any one of the servers in our cluster. The method we have chosen (can't think of the name) sends all requests for web pages through the controller during a user session. Now, we are using php sessions for every single visitor and we could do one of two things:
1. Leave our setup the way it is, where the session information is stored on the server's file system. This requires that user sessions keep their affinity with the same server in the cluster during the session. The only downside to this is that if a server is lost during the middle of the day, so are the 300 or so user sessions on that server. (Not very good.) However, we expect very good uptime for each machine and we can gracefully take them out of the cluster for updates at any time. (We chose this.)
2. Store the user session information in a database. This would be ideal as user requests could then hit any server in the cluster and the server would retrieve and store user session information in a database. This was very concerning to us however because of the network overhead and the quality of hardware we would need for the database server.
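The trade-off in option 1 can be sketched in a few lines of Python. This is a toy model, not the controller software described here: the class name, the round-robin assignment of new sessions, and the server names are all assumptions made for illustration.

```python
from typing import Dict, List


class StickyController:
    """Toy controller that pins each session to one web server (option 1)."""

    def __init__(self, servers: List[str]) -> None:
        self.servers = list(servers)
        self.affinity: Dict[str, str] = {}  # session id -> server name
        self._next = 0  # counter used to spread new sessions round-robin

    def route(self, session_id: str) -> str:
        """Return the server for this session, assigning one on first sight."""
        server = self.affinity.get(session_id)
        if server is None or server not in self.servers:
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            self.affinity[session_id] = server
        return server

    def remove_server(self, name: str) -> List[str]:
        """Drop a server and return the sessions whose local state is lost."""
        self.servers.remove(name)
        lost = [sid for sid, srv in self.affinity.items() if srv == name]
        for sid in lost:
            del self.affinity[sid]
        return lost
```

With session files on each server's disk, every session id returned by remove_server has to start over on another machine. With option 2 the affinity table would not be needed at all, because any server could load the session from the shared database.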
What I like most about having controllers and individual servers is the ability to easily add a server to the cluster. We are doing net install and we can actually have a new server up and running in about 10 minutes ;-) So, six months from now, if we notice that we are serving up 50,000,000 pages/month and the servers aren't keeping up we just need to purchase a couple $2,000.00 servers and we would have them both up and running in about 10 minutes!
What are you looking to do exactly with your setup? Are you running a single web site off the servers? Does site content change very often? etc...
Moderator Emeritus 
Join Date: Dec 2000
Location: College Park, MD
Chiznibitz, what exactly are you trying to do?
How many hits do you forecast? Is the content static or dynamic, is there a SQL backend, etc.?
That would decide whether 2 servers are enough or if you need a cluster.
Also, how experienced are you? Could you manage and operate a cluster?
Let us know.
--Scott
Dedicated MacNNer
Join Date: Dec 1999
Location: Canton, OH
I'll second Scott's questions! I just checked out the documentation on failover services for OS X Server and this may be all you need with no controllers.
Fresh-Faced Recruit
Join Date: May 2001
Location: USA
So now I have 2 XServes. I'm playing around with IPFailover. I'm thinking of using "round robin" DNS to load balance and then setting the 2 servers up to fail over onto each other (yes, I am insane). I may add app-level checking later.
To clarify for everyone, the primary server broadcasts UDP packets every second via "heartbeatd" to no one in particular. The secondary, via "failoverd," listens for these broadcasts and (theoretically) takes a specified IP address when the broadcasts stop. So far, it's not happening.
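The listening side of that heartbeat scheme can be sketched roughly as follows in Python. This is not Apple's failoverd: the port number passed to watch(), the 5-second timeout, and the class and function names are all made up for illustration, and the real packet format is not described in this thread.

```python
import socket
import time
from typing import Optional


class HeartbeatWatcher:
    """Tracks when the primary was last heard and decides when to give up on it."""

    def __init__(self, timeout: float = 5.0) -> None:
        self.timeout = timeout  # illustrative value, not Apple's
        self.last_seen: Optional[float] = None

    def beat(self, now: Optional[float] = None) -> None:
        """Record that a heartbeat packet just arrived."""
        self.last_seen = time.monotonic() if now is None else now

    def primary_is_dead(self, now: Optional[float] = None) -> bool:
        """True once no heartbeat has been seen for longer than the timeout."""
        if self.last_seen is None:
            return False  # never heard the primary, so do not grab its address
        now = time.monotonic() if now is None else now
        return now - self.last_seen > self.timeout


def watch(port: int, watcher: HeartbeatWatcher) -> None:
    """Listen for UDP heartbeats; return when they stop so the caller can take over."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", port))
        sock.settimeout(1.0)  # wake up regularly to re-check the deadline
        while not watcher.primary_is_dead():
            try:
                sock.recvfrom(1024)
                watcher.beat()
            except socket.timeout:
                pass
```

One design point worth noting: this watcher only declares the primary dead after it has heard at least one heartbeat, so a secondary that boots onto a silent network waits instead of claiming the address.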
Moderator Emeritus 
Join Date: Dec 2000
Location: College Park, MD
Originally posted by Chiznibitz:
So now I have 2 XServes. I'm playing around with IPFailover. I'm thinking of using "round robin" DNS to load balance and then setting the 2 servers up to fail over onto each other (yes, I am insane). I may add app-level checking later.
To clarify for everyone, the primary server broadcasts UDP packets every second via "heartbeatd" to no one in particular. The secondary, via "failoverd," listens for these broadcasts and (theoretically) takes a specified IP address when the broadcasts stop. So far, it's not happening.
You don't want round robin DNS. If one server goes down, you have 50% of hits sent to a down box. High availability = spare equipment. Have you tested the failover? I'm curious. What happens if you, say, pull the Ethernet cable out of the primary? Let us know.
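The 50% figure is easy to reproduce with a small Python sketch. It models strict rotation between addresses and ignores resolver caching and TTLs, so it is an idealized picture; the addresses are documentation placeholders and the function name is hypothetical.

```python
from itertools import cycle
from typing import Iterable, Set


def failed_fraction(addresses: Iterable[str], down: Set[str], requests: int) -> float:
    """Fraction of requests that strict round-robin hands to an unreachable address."""
    rotation = cycle(list(addresses))
    misses = sum(1 for _ in range(requests) if next(rotation) in down)
    return misses / requests


pool = ["192.0.2.10", "192.0.2.11"]
print(failed_fraction(pool, {"192.0.2.11"}, 1000))  # 0.5
print(failed_fraction(pool, set(), 1000))           # 0.0
```

The second call is the case where nothing in the DNS rotation is unreachable, for example because a surviving machine has taken over the failed address.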
Fresh-Faced Recruit
Join Date: May 2001
Location: USA
Round robin WITH IP failover. Won't that work? I was thinking if one server goes down, and the other picks up its IP address then I'll avoid the traditional round-robin problem.
I can't do any fun things like pull out ethernet cables (I'm about 100 miles away from the servers). I do know that you'd have to pull BOTH ethernet cables out for anything to happen (the crossover and the public interface).
So far I can't get real-time failover to happen. If I break the primary on purpose, I have to reboot the secondary before it realizes it should fail over. It's strange. And, of course, sendmail is broken, so I don't even get the notification emails.
What's strange is that when it does fail over it sets a weird netmask. Normally, the netmask on the public network is 255.255.255.224. When it adopts the primary's IP it uses 255.255.255.255. If anyone can explain why it does that and how it's supposed to be accessible, I'd like to know.
So even when I force it to fail over, I still don't get to see the secondary's webpages.
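On the netmask question, one possible explanation that nobody in this thread confirms: BSD-derived network stacks commonly give an alias address that shares a subnet with the interface's existing address an all-ones netmask, so 255.255.255.255 on the adopted IP may be expected rather than a misconfiguration. The Python sketch below only shows what the two masks mean; the addresses are documentation placeholders, not the real ones.

```python
import ipaddress

# The public subnet mask from the post: 255.255.255.224 is a /27 (32 addresses).
public = ipaddress.ip_network("192.0.2.32/255.255.255.224")

# The mask seen on the adopted address: 255.255.255.255 is a /32 (one host).
alias = ipaddress.ip_network("192.0.2.40/255.255.255.255")

print(public.prefixlen, public.num_addresses)  # 27 32
print(alias.prefixlen, alias.num_addresses)    # 32 1
print(alias.subnet_of(public))                 # True
```

If that explanation applies, the /32 alias still falls inside the /27 that the interface already routes, so the odd mask by itself would not explain the pages being unreachable.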