 |
 |
Quality Data Storage & Protection for OSX Server
|
 |
|
 |
|
Forum Regular
Join Date: Jun 2000
Location: Indianapolis, Indiana
Status:
Offline
|
|
I will be getting a server for an academic laboratory. The lab is small now, but will grow over time. I need a good data repository, but it needn't be too expensive. What is a good, economical way to store a decent amount of data, have it safely and redundantly backed up, and have it interface well with OS X Server? Probably 1 TB of actual space is enough for the moment. I don't want to skimp, but neither do I think I really need an entire industrial RAID array setup.
Are two mirrored external 1 TB drives good enough? (Can OS X set up RAID 1 across external drives?)
Is Time Machine good enough for that kind of thing?
Is it worth the expense to get a hardware mirror setup?
|
|
|
| |
|
|
|
 |
|
 |
|
Administrator 
Join Date: May 2000
Location: California
Status:
Offline
|
|
You could use Time Machine to a pair of externals. One at a time, with the other rotated off site. That would give reasonable protection against fire, flood, or ninja raids by the competition.
|
|
|
| |
|
|
|
 |
|
 |
|
Clinically Insane
Join Date: Mar 2001
Location: yes
Status:
Offline
|
|
I personally wouldn't use Time Machine. In my experience it isn't particularly trustworthy, and you can reproduce its functionality (minus the GUI) in ways that give you more direct control and transparency. If all you need is 1 TB, a mirrored drive plus an offsite backup ought to be fine. A hardware RAID would be better suited for when you need a lot of I/O for live data, but for snapshots of 1 TB you probably don't need to go that far.
I'm a big fan of rsync for backups, but I'm also thoroughly entrenched in Unix stuff and don't really care about OS X specific metadata. I think you ought to think about exactly how you would want your backups to work and find a solution based on your needs. There are so many different ways to back stuff up and so many different features and options that it's hard to suggest something without knowing what your needs are and what you are comfortable with.
|
|
|
| |
|
|
|
 |
|
 |
|
Moderator 
Join Date: May 2001
Location: Hilbert space
Status:
Offline
|
|
Before I would give you any sort of recommendation, I'd like to know a bit more:
(1) How much data needs to be backed up now? What is the projected growth?
(2) You mention your lab is very likely to grow: how many members and how much data do you reckon you need to backup within, say, the next three years?
(3) What kind of redundancy are you looking for?
(4) Does your employer/university have any sort of backup service? (My university offers a free managed backup service via Tivoli Storage Manager.)
Keep in mind that in a professional situation you always want to have at least 15-20% free space; dropping below that is usually the point at which you upgrade your hardware.
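A minimal sketch of how that free-space rule of thumb could be checked automatically, using only Python's standard library. The function names and the 15% default are illustrative choices, not part of any existing tool, and the path passed in would be wherever the data volume is mounted.

```python
import shutil


def free_fraction(path):
    """Return the fraction (0.0 to 1.0) of the volume holding `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def needs_upgrade(path, min_free=0.15):
    """True when free space has dropped below the chosen threshold.

    min_free=0.15 mirrors the lower end of the 15-20% rule of thumb;
    it is an assumption you should adjust to taste.
    """
    return free_fraction(path) < min_free


if __name__ == "__main__":
    volume = "."  # hypothetical: replace with the mount point of the data disk
    print(f"{free_fraction(volume):.0%} free, upgrade due: {needs_upgrade(volume)}")
```

Run from a nightly job, something like this gives an early warning well before the disk is actually full.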
rsync alone is not a backup tool, but there are quite a few Unix backup tools built around the same approach, rdiff-backup for instance (it uses librsync, a library implementation of the rsync algorithm). I used it on the FreeBSD server I had back in the day. However, it sounds as if you prefer something with a GUI. Perhaps you should also have a look at Synk. In any case, it won't be a matter of simply picking the right software.
In a professional environment it can also make a lot of sense to get a tape drive in addition to backups on hard drives. Offsite backups also make a lot of sense, especially if you can put the hard drive storage somewhere far away and still have Gbit Ethernet access to it (or in my case 100 Mbit/s).
(Last edited by OreoCookie; Aug 21, 2009 at 06:46 AM.)
|
|
I don't suffer from insanity, I enjoy every minute of it.
|
| |
|
|
|
 |
|
 |
|
Posting Junkie
Join Date: Nov 2000
Location: in front of my Mac
Status:
Offline
|
|
Both TM and rsync are fine tools. If you're a geek, rsync (and asr) can do all you want, but obviously they require some effort. And my guess is that if you were familiar and comfortable with those tools, you'd have gone ahead already.
I'll therefore assume you'd prefer a simple solution. In that case TM and cloning are things you can look into. You could set up TM to back up to an external drive. That will give you version control and the option of simple rollbacks to pretty much any snapshot you like. For added protection I suggest you create a clone of the TM backup at regular intervals and then store that off site. If the lab burns down you'll still have a TM backup with almost your entire version history. Now a TM backup isn't everything either, and as besson correctly points out, you shouldn't just blindly trust that one method. So I would also consider cloning your main disk on a regular basis (in Disk Utility > Restore if you want a GUI, 'asr' if you're fine with the CLI). You can have nightly clones done automatically. Of course less frequent is OK too if it suits your needs. For added protection you could think of rotating these clones with an off-site clone. Maybe take the clone home every once in a while, again in case the lab burns down, gets flooded, whatever.
The software cost is zero since it's all included with every OS X installation. The hardware cost is minimal. You need a good TM disk, a clone disk for that TM disk, and maybe another two disks if you want to rotate off-site clones of your main disk too. It would become more expensive if you wanted to consider RAID, for example for the TM backup. RAID is a good thing, but you should note it is not a backup measure. You would want to have a RAID TM disk if, for example, you knew that restoring your TM disk from a clone after it breaks down would take too long or would miss too many recent additions. RAID 1/5 per se has much more to do with availability than with backing up. My guess is you don't need the kind of availability RAID is geared at. Buying regular external disks is going to be cheaper than buying RAID enclosures. Of course if the added cost is low, a RAID 1 or RAID 5 TM disk is certainly not a bad thing.
Finally, let me say this may all sound complicated and like overkill. But it's basically just a summary of things I think could be reasonable in your environment. You don't need all of this to be 'safe'. Not trusting just one method alone is a good thing. Keeping something off site is a good thing. Keep that in mind, choose something you think is reasonable and stick to that. A very elaborate scheme that turns out to be too tedious to actually follow is definitely worse than a simpler scheme that is carried out strictly.
|
|
|
| |
|
|
|
 |
|
 |
|
Clinically Insane
Join Date: Mar 2001
Location: yes
Status:
Offline
|
|
If it were me I wouldn't bother with a complete clone. Making complete clones in the form of disk images is a very wasteful use of resources on a system where you know which files and directories change on a regular basis, and it results in you backing up the same static stuff over and over again whether it needs it or not. For my servers it is preferable to keep the time it takes to run a backup to a minimum.
If this is also the case with you, focus on backups of your content that you cannot restore without your backup. Keeping a copy of your build in the form of a disk image, or even a complete copy of your system that you sync against is fine, but if you decide to do a complete clone of your system each night I would definitely time this beforehand and make sure that this time doesn't cause problems (such as creeping into your next work day). Cloning an entire TB can be very time consuming especially since this clone of your system will include thousands of very small files.
|
|
|
| |
|
|
|
 |
|
 |
|
Forum Regular
Join Date: Jun 2000
Location: Indianapolis, Indiana
Status:
Offline
|
|
1. Some of our data are small (pure numerical data); many are 15 MB images. I would expect a single TB to be enough for the next 2 to 3 years, though I figure it will have to be upgraded after that. By then there will be more resources and more money anyway, and it should be easy to upgrade.
2. At the moment I know of no university backup policy. I did a lot of tech work for other schools and universities, and now that I'm here, I find it lacking in a global policy. Good point though, I'll check officially.
3. I think it's smart to have redundancy such that a single disk failure doesn't affect data (mirror?), and an offsite 'backup' would be great. Backup of the OS/permissions is less of a big deal, and is pretty easy.
I figure if the OS HD is kept sterile, I can clone it even just once a month. You make good points that a mirror isn't a backup. Also, a network backup over the in-house LAN might be a good idea.
Would two 1 TB externals, one for data and one for an rsync incremental backup, be robust/scalable? I would assume drives from different manufacturers are a good idea. I've never used a NAS; is that worth checking into for a kind of pseudo-offsite backup?
|
|
|
| |
|
|
|
 |
|
 |
|
Posting Junkie
Join Date: Nov 2000
Location: in front of my Mac
Status:
Offline
|
|
Both TM and rsync approaches will be scalable. You can clone the backups to new larger disks if you want at any time. You can also exchange the main/source disk at any time. TM will require a bit more work when you swap disks than rsync, but rsync will require a bit more work to get set up initially. Roughly it's a wash.
NAS is not a bad idea, but there are caveats. Consumer NAS is usually slow (really slow, ~<3 MB/s) and faster pro NAS is usually very expensive. Since you were thinking of NAS for off-site backups, you'll probably be connecting the NAS to another switch. I have never seen a NAS below $1000 get more than 5MB/s even when connected to the same switch through Gigabit. If you now think of remote switches, cabling, etc. you see where you are going in terms of performance. If you use it for incrementals during the night you'll be fine. Anything else and it will probably be too slow.
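To put those throughput figures in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a perfectly sustained rate, which real transfers won't achieve; the 2 GB nightly change set is a made-up figure purely for illustration.

```python
def transfer_hours(size_bytes, rate_mb_per_s):
    """Hours needed to move size_bytes at a sustained rate given in MB/s."""
    seconds = size_bytes / (rate_mb_per_s * 1_000_000)
    return seconds / 3600


ONE_TB = 10**12
NIGHTLY_CHANGES = 2 * 10**9  # hypothetical 2 GB of new data per night

for rate in (3, 5):  # MB/s, the two NAS figures quoted in this post
    full = transfer_hours(ONE_TB, rate)
    nightly_minutes = transfer_hours(NIGHTLY_CHANGES, rate) * 60
    print(f"{rate} MB/s: full 1 TB copy {full:.1f} h, "
          f"nightly incremental {nightly_minutes:.1f} min")
```

At 3 MB/s a full 1 TB copy works out to roughly 93 hours and at 5 MB/s to roughly 56 hours, while a small nightly incremental finishes in minutes. That is the reason incrementals are workable on a slow NAS and full copies are not.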
Your idea with the two 1 TB external drives is fine. But it sounds like you're relying on one disk and one method for the entire backup. Personally, I'd try to use a second disk and a second method to be on the safe side. TM and rsync are both excellent tools, but when it comes to backing up your experimental data IMHO you can never be too cautious. Especially when disks are so cheap (fast 1 TB drives sell for as little as $80 shipped) and the software side is relatively easy to deal with.
(Last edited by Simon; Aug 23, 2009 at 01:20 AM. Reason: typo)
|
|
|
| |
|
|
|
 |
|
 |
|
Moderator 
Join Date: May 2001
Location: Hilbert space
Status:
Offline
|
|
rsync only syncs data. If you want incremental backups, you need to use something like rdiff-backup, which is built on the rsync algorithm (via librsync). However, if you're not so comfortable with the command line, I reckon it'd be easier to get a tool like Synk.
From the sound of things, Time Machine seems like a good start for your primary backup drive.
If your university has a server room, you could put a second hard drive with network access or a Pogoplug-type device there (you could also use some old server, set it up to serve a network volume and then copy your files onto that). These won't be as fast, but if your infrastructure is based on 100 Mbit/s Ethernet, then they'll be as fast as the network connection. And from the looks of it, your backups won't move really huge amounts of data.
If you are creative, you can perhaps scrounge for old servers. They often have very reliable hardware components, and they may give you an edge in terms of reliability and speed compared to off-the-shelf hard drives with Ethernet ports. Since I assume you may have extra (old) hardware lying around, this won't add to the bill. You could even get two hard drives and mirror them in order to protect you against hardware failure.
Keep in mind that these are consumer-grade solutions -- not necessarily/just because of software, but especially because of the hardware you use. You should be fine, but periodically remind the brass that you should be upgraded to pro-grade backup solutions. People often have a sort of `as long as it still works, why spend more?' attitude.
|
|
I don't suffer from insanity, I enjoy every minute of it.
|
| |
|
|
|
 |
|
 |
|
Clinically Insane
Join Date: Mar 2001
Location: yes
Status:
Offline
|
|
I still disagree with using Time Machine. OreoCookie is right, it's a consumer-level solution. I wouldn't trust it for a minute with important data, though, unless I can access logs of what it is actually doing, and until I can feel more comfortable about it actually being reliable.
rsync, on the other hand, while accurately described as more difficult to set up and learn, *is* a professional solution. At my old job we backed up terabytes of valuable data across a SAN nightly, and we weren't the only enterprise relying on it heavily - it has earned the trust of many sys admins.
It can be set up to do incremental backups, you can do all sorts of stuff with it if you are willing to put in the time, and it is free.
My point is if putting in the time is justified for an enterprise class sort of solution despite the inconveniences, rsync is your best bet. Using it and cpio you can actually reproduce the Time Machine backup structure of relying on hard links across different incremental backup snapshots.
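For readers unfamiliar with the hard-link trick, here is a minimal sketch of the idea in plain Python. To be clear, this is not rsync or cpio and not how Time Machine itself is implemented; it only illustrates the technique (rsync offers it through its `--link-dest` option). The function name and arguments are made up for this example, and it deliberately ignores symlinks, permissions on directories, and OS X specific metadata.

```python
import filecmp
import os
import shutil


def snapshot(source, dest, previous=None):
    """Create a new snapshot of `source` in the (not yet existing) directory `dest`.

    Files whose contents are identical to the copy in `previous` (an earlier
    snapshot) are hard-linked instead of copied, so they take no extra space.
    `previous` must live on the same volume as `dest` for hard links to work.
    """
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the data with the old snapshot
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
```

Each snapshot directory then looks like a complete copy of the source, yet only changed files consume additional disk space. Deleting an old snapshot is safe because the data stays on disk until the last hard link to it is gone.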
|
|
|
| |
|
|
|
 |
|
 |
|
Moderator 
Join Date: May 2001
Location: Hilbert space
Status:
Offline
|
|
@besson
I don't think it's balanced to criticize the use of Time Machine. Time Machine is a perfectly adequate solution given the hardware side of the setup. I haven't had a problem with Time Machine yet; I have found it to be reliable and unobtrusive. It's easy to implement and use. For the first while, at least, it'll be fine.
Once there is a `serious' budget for a backup solution, more sophisticated software should then be used in conjunction with more sophisticated hardware. rsync, for instance, is only useful if the admin can actually use it. Otherwise, it may be a horrible backup solution, not because of lack of capabilities or reliability, but rather because of complexity. Not a fault of the software, to be sure.
|
|
I don't suffer from insanity, I enjoy every minute of it.
|
| |
|
|
|
 |
|
 |
|
Clinically Insane
Join Date: Mar 2001
Location: yes
Status:
Offline
|
|
Oreo: will TM report I/O errors? Files that it could not copy? Is there a log of each backup reporting on what was backed up? Can it email this information or somehow make this available remotely?
I've had family members who thought they were backing stuff up and weren't, for whatever reason (I never had the opportunity to verify this or look into it), and my big problem with it is these silent failures. My other problem with it was the amount of time it would take to run, and the fact that at the time it didn't support non-Time Capsule network backups. Surely some of this has gotten better; I've never had much of an excuse to give it another look, but at any rate this is the basis for my low opinion of it for any real data of value beyond just family pictures and songs and junk.
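One way to guard against silent failures, whatever tool made the backup, is to verify the result independently. Below is a minimal sketch using only Python's standard library. The function names are invented for this example, it assumes the backup is a plain directory tree mirroring the source, and it only checks regular file contents (not permissions or OS X metadata).

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source, backup):
    """Return a sorted list of (relative_path, problem) for files that are
    missing from `backup` or whose contents differ from `source`."""
    problems = []
    for root, _dirs, files in os.walk(source):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, source)
            dst = os.path.join(backup, rel)
            if not os.path.isfile(dst):
                problems.append((rel, "missing"))
            elif file_digest(src) != file_digest(dst):
                problems.append((rel, "differs"))
    return sorted(problems)
```

An empty list means every source file has an identical copy in the backup. Files that legitimately changed since the backup ran will of course show up as "differs", so this is best run right after a backup completes.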
|
|
|
| |
|
|
|
 |
|
 |
|
Clinically Insane
Join Date: Jun 2001
Location: planning a comeback !
Status:
Offline
|
|
I actually have to agree with Besson, for mission critical stuff, I wouldn't trust TM alone. It's too new and unpredictable.
The problem with rsync is that you better know what you're doing.
I would say that rsync paired with limited understanding / knowledge is worse than TM by itself.
Therefore, it's really important to assess what the level of knowledge is of the person that is setting up the system and doing the backups.
-t
|
|
|
| |
|
|
|
 |
|
 |
|
Posting Junkie
Join Date: Nov 2000
Location: in front of my Mac
Status:
Offline
|
|
It is true, TM doesn't supply a lot of information through its GUI. However, you can find out what it's doing. I also claim that 95% of the people who complain about it being slow or 'stalling' or whatever else do so simply because they look at the GUI, see it's not progressing, and then draw premature conclusions.
All of the TM progress information, including any errors it encounters, is logged to /private/var/log/system.log. If you use Console you can filter for 'backupd' to see only TM messages.
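For anyone who would rather script this than use Console, here is a minimal Python sketch of the same filter. The log path and the 'backupd' marker come from the paragraph above; the function names are made up for this example, and nothing here assumes any particular format for the log messages themselves.

```python
def backupd_lines(lines, marker="backupd"):
    """Return only the lines that mention the Time Machine daemon."""
    return [line.rstrip("\n") for line in lines if marker in line]


def read_backupd_log(path="/private/var/log/system.log"):
    """Read the system log and return its backupd entries.

    errors="replace" keeps one odd byte in the log from aborting the read.
    """
    with open(path, errors="replace") as handle:
        return backupd_lines(handle)
```

Calling `read_backupd_log()` on the server and mailing the result to yourself from a scheduled job would be one way to keep an eye on backups remotely.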
What I actually do is use the free MkConsole to display system.log (among others) on my desktop. When TM is running I always see what's going on. Stalls are caused, among other things, by deep traversal (you'll get this after forced reboots or some system updates). Long post-backup activity is usually related to deleting many outdated backups, etc. You get all kinds of information there.
|
|
|
| |
|
|
|
 |
|
 |
|
Posting Junkie
Join Date: Nov 2000
Location: in front of my Mac
Status:
Offline
|
|
Also, besson, I suggest you just post a script here that uses rsync/cpio to mimic TM functionality.
We can critique it in this thread. If it delivers what you promise, it will indeed be just as easy as TM, since the work will already have been done for others to use. And if not, we'll at least know how far rsync can go and for which functionality you'll need to switch to the real TM. Sounds fair enough, no?
(Last edited by Simon; Aug 23, 2009 at 01:41 AM.)
|
|
|
| |
|
|
|
 |
|
 |
|
Moderator 
Join Date: May 2001
Location: Hilbert space
Status:
Offline
|
|
Originally Posted by besson3c
Oreo: will TM report I/O errors? Files that it could not copy? Is there a log of each backup reporting on what was backed up? Can it email this information or somehow make this available remotely?
CharlesS has written a small app that shows you what Time Machine has backed up. If I remember correctly, he claimed it was rather easy. system.log contains messages from backupd. Also, if you're really fancy, you can use Instruments (an app that comes with Apple's developer tools) to monitor backupd.
If you wanted to check all this with command line tools, you'd also need to check these things regularly. And you'd have to be able to understand them. Not terribly difficult if you like the command line.
Originally Posted by besson3c
I've had family members who thought they were backing stuff up and weren't, for whatever reason (I never had the opportunity to verify this or look into it), and my big problem with it is these silent failures. My other problem with it was the amount of time it would take to run, and the fact that at the time it didn't support non-Time Capsule network backups. Surely some of this has gotten better; I've never had much of an excuse to give it another look, but at any rate this is the basis for my low opinion of it for any real data of value beyond just family pictures and songs and junk.
But it sounds to me like your low opinion is based on prejudice. Nobody here is suggesting it is a professional backup solution. But professional backup software that the end user cannot use is useless as well.
|
|
I don't suffer from insanity, I enjoy every minute of it.
|
| |
|