[SAH] Massive Outage Coming?
Dedicated MacNNer
Join Date: Jun 2006
Location: Chicago
According to a Technical News article posted on 8/10 (reproduced below), a "massive outage" may be coming.
UCB claims "without much disruption to normal user activity," but we know what could happen. Might be time to add a couple of days to your "Connect to Network Every N Days" setting.
Of course, no info as to when this will take place...
August 10, 2006 - 14:00 UTC
How about an update?
During the regular weekly outage the web servers were also taken down to replace batteries in a failing UPS. Something fishy happened during this procedure and a wire shorted out, causing a massive spark and one of the new battery connectors to vaporize to nothing. It was exciting and luckily nobody got hurt. The servers in question are on a spare UPS until the replacement batteries get replaced.
The new data recorder is still unable to collect data because of inexplicable tape drive errors. While we research that, we are attempting to use disk drives to collect data in the meantime. These are hot-swappable SATA II drives in trays that will be shipped back and forth between Berkeley and Arecibo.
Speaking of tape drive errors, the last of our original set of DLT IV tape drives finally bit the dust. Sure, these drives were old, but we still have a lot of SETI@home data on these tapes to read! We've been using a Super DLT drive to read tapes in the meantime, and a replacement drive is on order.
The new Sun server (see July 26 news item for more information) is almost done being configured. Why is it taking so long? There was some confusion about how Linux recognized the boot drives. Long story short, in a 24-drive configuration, the boot drive is /dev/sdm, and the secondary boot drive is /dev/sdo. Of course, the Fedora Core 5 installer does everything in its power to install on /dev/sda. Once the OS was installed, the creation of several large (>2TB) Linux volume-managed filesystems sitting on top of software RAID simply took a lot of time - the initial RAID syncs took hours, as did putting filesystems on the new RAID devices.
The plan is for this server to become the sole science database server, replacing an E3500 system with flaky fibre channels and failing disk drives. We will have to unload all the data to files and reload it into the new database, meaning a massive outage, but most of this will happen "behind the scenes" without much disruption to normal user activity - only the splitters and assimilators will be offline during this procedure.
QS
Administrator 
Join Date: May 2000
Location: California
They think it won't disrupt the users much. However, they do have a lot in the air at the moment. They may be one surprise away from a major outage.
BOINC covers this with all the alternative projects.