
Fusion Drive
Salty
Professional Poster
Join Date: Jul 2005
Location: Winnipeg, MB
Status: Offline
Reply With Quote
Oct 24, 2012, 09:27 PM
 
Hey guys, how long do we think it'll be before somebody figures out a way to make Fusion Drive available for other Macs? I have a 128 GB SSD in my MacBook, and I took out the optical drive and put in a second HDD. It'd be killer to have it set up as a Fusion Drive.
     
Waragainstsleep
Posting Junkie
Join Date: Mar 2004
Location: UK
Status: Offline
Reply With Quote
Oct 24, 2012, 11:52 PM
 
That's basically what I have. My OS and apps live on the SSD, my user folder is on the HDD. It's nearly the same.
I have plenty of more important things to do, if only I could bring myself to do them....
     
Spheric Harlot
Clinically Insane
Join Date: Nov 1999
Location: 888500128, C3, 2nd soft.
Status: Offline
Reply With Quote
Oct 25, 2012, 01:12 AM
 
The point is that it's completely dynamic and transparent.

Often-used stuff gets automatically moved to the SSD; less-used stuff to the hard drive.
     
Waragainstsleep
Posting Junkie
Join Date: Mar 2004
Location: UK
Status: Offline
Reply With Quote
Oct 25, 2012, 02:27 AM
 
I'm just saying that a good approximation is easy. What if you hack the real thing and then every single update breaks the hack? Not going to be fun at all.
I have plenty of more important things to do, if only I could bring myself to do them....
     
mattyb
Addicted to MacNN
Join Date: Feb 2008
Location: Standing on the shoulders of giants
Status: Offline
Reply With Quote
Oct 25, 2012, 02:31 AM
 
For us lazy users who prefer to ask the learned MacNN Forums dwellers: is this a combination of hardware and software on the newer machines only?

How will it work for those who install Windows alongside OS X?
     
P
Moderator
Join Date: Apr 2000
Location: Gothenburg, Sweden
Status: Offline
Reply With Quote
Oct 25, 2012, 04:31 AM
 
The fog has cleared a little on this point, and Apple engineers are speaking about it.

Fusion Drive support is actually present in Mountain Lion 10.8.2. There is no hardware component required beyond the two drives. The only thing missing is the special version of Disk Utility required to enable it. In effect, it does two things. The first is that it creates one volume that spans two drives (JBOD). This is not actually new - one can do that today in Disk Utility - but the news is that the OS keeps track of which files are used and moves recently used files to the SSD and other files to the HDD. It operates on a file level, not a block level, but it should work well enough for the common case. The second is that it has a 4 GB cache to which all files are written first. These are then written to the HDD in bigger batches - basically this is the ZFS feature ZIL.
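To make the tiering part concrete, here is a rough sketch in Python of what "recently used files go to the SSD, the rest to the HDD" amounts to. The mount points, the SSD budget, and the use of access time as the "recently used" signal are all assumptions for illustration - Apple's real implementation does this transparently underneath the filesystem, not by shuffling files between two mount points.

Code:
import os
import shutil

SSD = "/Volumes/ssd_tier"    # made-up mount points standing in for the two tiers
HDD = "/Volumes/hdd_tier"
SSD_BUDGET = 100 * 1024**3   # keep roughly the hottest 100 GB on the SSD

def walk(root):
    # Yield (path, size, last_access_time) for every file under root.
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            yield path, st.st_size, st.st_atime

def rebalance():
    # Sort every file by how recently it was used, fill the SSD with the
    # hottest ones, and demote the rest to the HDD.
    files = sorted(list(walk(SSD)) + list(walk(HDD)),
                   key=lambda f: f[2], reverse=True)
    used = 0
    for path, size, _ in files:
        hot = used + size <= SSD_BUDGET
        if hot:
            used += size
        target = SSD if hot else HDD
        source = SSD if path.startswith(SSD) else HDD
        if source != target:
            dest = os.path.join(target, os.path.relpath(path, source))
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            shutil.move(path, dest)   # promote to the SSD or demote to the HDD

if __name__ == "__main__":
    rebalance()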

If you use Boot Camp to install Windows, it will just shrink the HFS+ partition and tuck in the NTFS partition at the end of the HDD. You cannot shrink the SSD partition.

The question that remains is whether the special Disk Utility version will ever be released to us common plebs with older machines, and exactly how it works with the SSD - e.g. whether the SSD can still be partitioned as long as the Fusion partition is bigger than the 128 GB Apple ships Macs with.
The new Mac Pro has up to 30 MB of cache inside the processor itself. That's more than the HD in my first Mac. Somehow I'm still running out of space.
     
Waragainstsleep
Posting Junkie
Join Date: Mar 2004
Location: UK
Status: Offline
Reply With Quote
Oct 25, 2012, 09:18 AM
 
I wonder if that version of DU ships with iMacs or Minis that aren't configured to use the Fusion drive.
I have plenty of more important things to do, if only I could bring myself to do them....
     
Salty  (op)
Professional Poster
Join Date: Jul 2005
Location: Winnipeg, MB
Status: Offline
Reply With Quote
Nov 8, 2012, 08:10 PM
 
I wonder when I can get this into my MacBook Pro!

Also, is it just me, or would it be awesome if they came out with a MacBook Pro that was as thick as the current gen - basically the exact same, but with a Fusion Drive and a real GPU in the 13-inch, a bigger battery in the 15-inch, and obviously Retina displays? I'd much sooner buy one of those than the incredibly thin 13-inch.
     
besson3c
Clinically Insane
Join Date: Mar 2001
Location: yes
Status: Offline
Reply With Quote
Nov 8, 2012, 09:15 PM
 
I still think making HFS+ write far more often than it would otherwise have to is a recipe for losing important files.
     
Waragainstsleep
Posting Junkie
Join Date: Mar 2004
Location: UK
Status: Offline
Reply With Quote
Nov 9, 2012, 12:06 AM
 
Yeah, having read the details of that FD hack, my thinking now is that Apple's priority should be a new filesystem. Couldn't they try for ZFS again? Didn't it change hands since the last mess?
I have plenty of more important things to do, if only I could bring myself to do them....
     
besson3c
Clinically Insane
Join Date: Mar 2001
Location: yes
Status: Offline
Reply With Quote
Nov 9, 2012, 01:51 AM
 
Originally Posted by Waragainstsleep View Post
Yeah, having read the details of that FD hack, my thinking now is that Apple's priority should be a new filesystem. Couldn't they try for ZFS again? Didn't it change hands since the last mess?
It is owned by Oracle now, although Oracle may be putting its weight behind BTRFS, I dunno.

Unfortunately, ZFS may be great technology that ultimately dies out for non-technological reasons. If Oracle's future is in BTRFS/Linux, maybe they would be inclined to sell ZFS to Apple at some point?
     
P
Moderator
Join Date: Apr 2000
Location: Gothenburg, Sweden
Status: Offline
Reply With Quote
Nov 9, 2012, 03:30 AM
 
Actually, ZFS changing hands is part of the problem. As long as Sun owned ZFS, Apple could probably make a deal - Sun was all about having its technologies find market share by any means possible. Now that it's Oracle, Oracle would want some licensing money to let Apple use it (the open source variant is under GPLv2, which Apple can't use without giving away all of iOS, and anyway is not the latest version).

With FD, I have read up a bit on Core Storage. It seems to me to be the platform Apple wants to use to add features that HFS+ doesn't support. They have now added a tiering feature, and basically got something like ZIL for free. They have full disk encryption and per-file compression. What more does Apple want?

RAID cannot be a priority for a company that mostly ships a single drive with its computers, and it could be done in Core Storage anyway. Deduplication would be awesome, but again, you could put it in Core Storage. ZFS's headline feature is data integrity by checksumming each block - another thing you could do in Core Storage. The question that remains is what you would do about said corruption - basically the only option is to restore it from Time Machine, which may not be present - but that issue would be there with ZFS as well. Maybe something like the ZFS scrub command could be run when doing the backups? Less demanding would be to just do the checksumming on the journal and the Catalog File. Snapshots are generally done in the LVM (Core Storage is an LVM), so that is no problem if Apple ever wants to do Time Machine 2.

I've been going on about the failures of HFS+ for years, and no one would be happier than me if we could get rid of it, but in Core Storage I at least see a way forward. Remember that making new advanced file systems from scratch is HARD - MS has failed twice now - so from a perspective of risk management, what Apple is doing makes sense.
The new Mac Pro has up to 30 MB of cache inside the processor itself. That's more than the HD in my first Mac. Somehow I'm still running out of space.
     
besson3c
Clinically Insane
Join Date: Mar 2001
Location: yes
Status: Offline
Reply With Quote
Nov 9, 2012, 06:01 AM
 
Originally Posted by P View Post
Actually ZFS changing hands is part of the problem. As long as Sun owned ZFS, Apple could probably make a deal - Sun was all about having its technologies finding market share by any means possible. Now that it's Oracle, Oracle would want some licensing money to let Apple use it (the open source variant is under GPLv2, which Apple can't use without giving away all of iOS, and anyway is not the latest version.
With FD, I have read up a bit on Core Storage. It seems to me to be the platform Apple wants to use to add features that HFS+ doesn't support. They have now added a tiering feature, and basically got something like ZIL for free. They have full disk encryption, and per file compression. What more do Apple want? RAID cannot be a priority for a company that mostly ships a single drive with their computers, and it could be done in Core Storage anyway. Deduplication would be awesome, but again, you can put it in Core Storage. ZFS headliner feature is data integrity by checksumming each block - another thing you could do in Core Storage. The question remains what you would do about said corruption - basically the only option is to restore it from Time Machine, which may not be present - but that issue would be there with ZFS as well. Maybe something like the ZFS scrub command could be run when doing the backups? Less demanding may be to just do the checksumming on the journal and the Catalog File. Snapshots are generally done in the LVM (Core Storage is an LVM), so that is no problem if Apple ever wants to do Time Machine 2.
I've been going on about the failures of HFS+ for years, and noone would be happier than me if we could get rid of it, but in Core Storage I at least see a way forward. Remember that making new advanced file systems from scratch is HARD - MS has failed twice now - so from a perspective of risk management, what Apple is doing makes sense.
Snapshots. That is worth the price of investing in ZFS alone. In addition to this, you'd be getting rid of the HFS+ file locking stuff and replacing it (if I'm understanding things correctly) with copy-on-write.

The checksumming features do need multiple disks to carry out repair operations, but couldn't this be a partition or a small additional SSD or something?
     
P
Moderator
Join Date: Apr 2000
Location: Gothenburg, Sweden
Status: Offline
Reply With Quote
Nov 9, 2012, 06:46 AM
 
I guess one line got hidden in that block of text:
Snapshots are generally done in the LVM (Core Storage is an LVM), so that is no problem if Apple ever wants to do Time Machine 2.
The Linux LVM supports snapshots, for instance. ZFS is innovative partially because it includes features that are usually found in an LVM. Apple seems to have just decided to make an LVM instead. Snapshots would be great for a Time Machine 2 implementation, and are actually a completely plausible goal for 10.9 (Sabretooth?).

In addition to this, you'd be getting rid of the HFS+ file locking stuff and replacing it (if I'm understanding things correctly) with copy-on-write.
Atomic operations on file data are nice, but there are other ways to achieve that (basically, just run a database and send transactions to it - Apple even includes the framework for one in Core Data). The copy-on-write stuff that ZFS does for everything is more for the checksumming. The real problem with HFS+ is operations on file metadata (adding or removing files, creating directories, renaming, etc.) - anything that modifies the Catalog File. HFS+ uses a single Catalog File for the entire volume, which means that only a single process can make such operations at a time - for the entire volume. That is a terrible performance handicap. Most filesystems only serialize metadata operations per directory, while ZFS even supports multiple processes modifying a single directory at once. This is one of the big problems with HFS+, and no, Core Storage doesn't solve it. Unless Apple starts hacking things to the point where each directory is an HFS+ filesystem that grows and shrinks as it likes, all handled by Core Storage, we aren't getting away from this. But we are potentially getting away from a lot of other problems - and the scope of what "HFS 3" would have to do is reduced from a behemoth decade-long research project to something one might fit into the next OS X version.
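If you want to convince yourself of how much that serialization hurts, a crude experiment is to hammer the volume with metadata operations from several threads at once and compare against a single thread. A Python sketch - the file counts are arbitrary and Python's GIL muddies the numbers a bit, so treat it as an illustration rather than a rigorous benchmark:

Code:
import os
import tempfile
import threading
import time

FILES_PER_THREAD = 2000

def churn(directory, tag):
    # Create and delete a pile of empty files: pure metadata traffic,
    # i.e. nothing but Catalog File updates on HFS+.
    for i in range(FILES_PER_THREAD):
        path = os.path.join(directory, "f-%s-%d" % (tag, i))
        open(path, "w").close()
        os.remove(path)

def run(n_threads):
    with tempfile.TemporaryDirectory() as root:
        dirs = []
        for i in range(n_threads):
            d = os.path.join(root, "dir%d" % i)
            os.mkdir(d)
            dirs.append(d)
        threads = [threading.Thread(target=churn, args=(d, i))
                   for i, d in enumerate(dirs)]
        start = time.time()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return time.time() - start

if __name__ == "__main__":
    # If metadata updates are serialized volume-wide, four threads working in
    # four separate directories take roughly four times as long as one thread,
    # instead of finishing in about the same wall time.
    print("1 thread :", round(run(1), 2), "s")
    print("4 threads:", round(run(4), 2), "s")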

The ZFS implementation of checksumming relies on multiple copies of the same file everywhere to be relevant (without it, it can warn without fixing), but one could conceivably make a copy of a few important files (the Catalog File and the journal, for instance) and use checksums to verify their integrity.
The new Mac Pro has up to 30 MB of cache inside the processor itself. That's more than the HD in my first Mac. Somehow I'm still running out of space.
     
besson3c
Clinically Insane
Join Date: Mar 2001
Location: yes
Status: Offline
Reply With Quote
Nov 9, 2012, 08:17 AM
 
Originally Posted by P View Post
I guess one line got hidden in that block of text:
The Linux LVM supports snapshots, for instance. ZFS is innovative partially because it includes features that are usually found in an LVM. Apple seems to have just decided to make an LVM instead. Snapshots would be great for a Time Machine 2 implementation, and are actually a completely plausible goal for 10.9 (Sabretooth?).
Atomic operations on file data is nice, but there are other ways to achieve that (basically, just run a database and send transactions to it. Apple even includes the framework for one in Core Data). The copy-on-write stuff that ZFS does for everything is more for the checksumming. The problem with HFS+ is operations on file metadata (adding or removing files, creating directories, renaming, etc) - modifying the Catalog File. HFS+ uses a single Catalog File for the entire volume, which means that only a single process can make such operations at a time - for the entire volume. That is a terrible performance handicap. Most filesystems have one process per file directory, while ZFS even supports multiple processes in a single directory. This is one of the big problems with HFS+, and no, Core Storage doesn't solve it. Unless Apple starts hacking things to the point where each directory is an HFS+ filesystem that grows and shrinks as it likes, all handled by Core Storage, we aren't getting away from this. But we are potentially getting away from a lot of other problems - and the scope of what "HFS 3" would have to do is reduced from a behemoth decade-long research project to something one might fit into the next OS X version.
The ZFS implementation of checksumming relies on multiple copies of the same file everywhere to be relevant (without it, it can warn without fixing), but one could conceivably make a copy of a few important files (the Catalog File and the journal, for instance) and use checksums to verify their integrity.
They aren't block-level snapshots; they're just deltas where, if a file changed, a complete copy of that file is retained, right?

Regarding the checksumming, sorry, what I wrote above was very cryptic... Couldn't a small secondary SSD be used to help verify writes by keeping a temporary copy there for comparison? For example, say I write an entry to a database... that block would be written to both drives, and if there was a checksum mismatch, I guess the journal/intent log would help trigger a second attempt? Once the checksums matched, that file would be deleted off the small SSD. This approach (if possible) could be how you would really build a cool fusion drive, although I suspect the fusion drive is mainly a stopgap measure until SSDs are cheap enough to completely replace spinning SATA drives.
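Something like this, maybe - a Python sketch of that write-then-compare scheme, with made-up paths and the temporary copy parked on the small SSD. Just the idea, not a claim that any OS actually does it this way:

Code:
import hashlib
import os

MAIN = "/Volumes/Main/db/records.bin"          # made-up paths
SCRATCH = "/Volumes/SmallSSD/records.verify"

def write_and_hash(path, data, offset):
    # Write the block, force it to disk, then read it back and hash it.
    mode = "r+b" if os.path.exists(path) else "wb"
    with open(path, mode) as f:
        f.seek(offset)
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    with open(path, "rb") as f:
        f.seek(offset)
        return hashlib.sha256(f.read(len(data))).hexdigest()

def verified_write(data, offset, retries=3):
    # Keep a temporary copy on the small SSD, compare checksums, and retry
    # the main write until they agree; then drop the temporary copy.
    expected = write_and_hash(SCRATCH, data, 0)
    for _ in range(retries):
        if write_and_hash(MAIN, data, offset) == expected:
            os.remove(SCRATCH)
            return True
    return False   # persistent mismatch: time to warn the user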

Moreover, even leaving aside the whole self-healing thing (and I'm not fully confident the approach I've written above is viable), simply being able to trigger warnings is huge. As Siracusa laments, and I'll use his same example, it would suck if some priceless family picture silently corrupted itself and we had absolutely no way of knowing until some years later when we want to get at that picture. If I saw a warning when it occurred, I could at least try saving the photo off my camera again. The OS being aware of these problems just seems like a sensible thing.

When I think about the state of HFS+ this way, it makes me feel like the current fusion drive approach, and the fact that HFS+ even still exists, is insane. When we write stuff, we have no way of knowing whether that write was successful, and of course we have no way of correcting it. We have teased Windows users about the OS rot that makes them reinstall Windows to regain performance after a while, but we have file rot.

Now, you or somebody else brought up the comparison to other file systems as far as file rot goes, saying that other file systems do not self-heal the way ZFS does when multiple drives are available. This is true, but at least with Linux the file system is abstracted enough that there is a choice of file system (although maybe Apple's brief stint with UFS indicates the same?), and when better options become available it will presumably not be as painful to migrate data. Moreover, I can't substantiate this right now, but I'm pretty certain there are measures taken in other Linux/Unix file systems to reduce the possibility of file rot. I was reading something about file barriers in ext4 (although I don't really understand this), and as mentioned, ext4 does metadata checksumming too, so it looks like on the Linux side they are either already ahead of the game in file integrity, or at least well on their way.

With OS X, we can only guess as to when Apple will revise HFS+, but it looks like HFS+ hasn't been touched in a long time. I appreciate what you have been saying about Core Storage - Siracusa has been saying the same sorts of things - but to me CS is a viable way to gain new storage-related *features*; it isn't a viable way to rid HFS+ of its inherent weaknesses, performance and otherwise, as you have stated in your post.

I'm wondering if Apple sees SSDs and their far better inherent reliability as the solution to the file rot problem. This bugs me a bit, because it seems like they are sweeping the problem under the rug; I like stuff that is designed as well as it can be. If nothing else, a better file system would yield better performance, but I suspect the jaw-dropping performance we enjoy from any SSD, compared to what we have been used to for so many years, may be another opportunity for Apple to sweep this problem under the rug.

These sorts of issues are far more important in the server space, I'd imagine. I guess our only hope is that some day something will trickle down to the Mac, and Apple will capture this trickle.
     
cgc
Professional Poster
Join Date: Mar 2003
Location: Down by the river
Status: Offline
Reply With Quote
Nov 9, 2012, 10:30 AM
 
Originally Posted by Salty View Post
Hey guys how long do we think it'll be before somebody figures out a way to make fusion drive available for other Macs? I have a 128 gig SSD in my MacBook and I took out the optical drive and put in a second HDD It'd be killer to have it setup as a fusion drive.
I think all you have to do is plug a Fusion Drive in and it works if you have OS X 10.8.2 or higher, but 10.7 already has the necessary underlying code (i.e. Core Storage), which makes a home-brew Fusion drive doable via the Terminal (not Disk Utility yet). BareFeats "made" their own Fusion-like drive and tested it vs. actual Fusion drives and other HDDs. This guy says it's best to make your own Fusion-like drive for numerous reasons.
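For anyone tempted: the home-brew route those write-ups describe boils down to two diskutil coreStorage commands. A Python sketch that shells out to them - the disk identifiers are made up, the exact wording of diskutil's output may differ from my regex, and this erases both drives, so treat it as an outline rather than a recipe:

Code:
import re
import subprocess

SSD = "disk0"   # made-up identifiers: check "diskutil list" for your own
HDD = "disk1"

def run(*cmd):
    print("$", " ".join(cmd))
    return subprocess.check_output(cmd, universal_newlines=True)

# 1. Pool both drives into one Core Storage logical volume group.
#    This erases everything on both disks.
out = run("diskutil", "coreStorage", "create", "Fused", SSD, HDD)

# 2. Pull the new group's UUID out of the output (adjust the pattern to what
#    your diskutil actually prints) and carve a single journaled HFS+ volume
#    spanning the whole group.
match = re.search(r"UUID:\s*([0-9A-Fa-f-]{36})", out)
run("diskutil", "coreStorage", "createVolume", match.group(1), "jhfs+", "Fused", "100%")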
     
besson3c
Clinically Insane
Join Date: Mar 2001
Location: yes
Status: Offline
Reply With Quote
Nov 9, 2012, 10:45 AM
 
Originally Posted by cgc View Post

I think all you have to do is plug a Fusion drive in and it works if you have OSX 10.8.2 or higher but 10.7 has the necessary underlying code (e.g. "CoreStorage") which makes a home-brew Fusion drive doable via the Terminal (not DiskUtility yet). BareFeats "made" their own Fusion-like drive and tested it vs. actual Fusion drives and other HDDs. This guy says it's best to make your own Fusion-like drive for numerous reasons.
Apple's fusion drives are scary enough, let alone non-condoned fusion drives. It will be interesting to see if we hear any horror stories. I'll let somebody else be the guinea pig.
     
cgc
Professional Poster
Join Date: Mar 2003
Location: Down by the river
Status: Offline
Reply With Quote
Nov 9, 2012, 02:30 PM
 
Originally Posted by besson3c View Post
Apple's fusion drives are scary enough, let alone non-condoned fusion drives. It will be interesting to see if we hear any horror stories. I'll let somebody else be the guinea pig.
I think a "non-condoned" fusion drive is MUCH less problematic because I know I can swap one or the other drive (e.g. SSD or HDD) if it fails. With a Fusion drive I'd probably need to but Apple branded parts but I don't think there's much to it other than the software side so it's prolly a wash. Performance is higher with the BareFeats fusion-like drive combo and I bet it costs less as well. It's all in the driver (e.g. CoreStorage).
     
besson3c
Clinically Insane
Join Date: Mar 2001
Location: yes
Status: Offline
Reply With Quote
Nov 9, 2012, 02:38 PM
 
Originally Posted by cgc View Post

I think a "non-condoned" fusion drive is MUCH less problematic because I know I can swap one or the other drive (e.g. SSD or HDD) if it fails. With a Fusion drive I'd probably need to but Apple branded parts but I don't think there's much to it other than the software side so it's prolly a wash. Performance is higher with the BareFeats fusion-like drive combo and I bet it costs less as well. It's all in the driver (e.g. CoreStorage).
It's a very cool idea; it just sits on top of an absolutely horrible file system. It was considered horrible many years ago, and it still is, unfortunately...
     
cgc
Professional Poster
Join Date: Mar 2003
Location: Down by the river
Status: Offline
Reply With Quote
Nov 10, 2012, 03:25 AM
 
Originally Posted by besson3c View Post
It's a very cool idea, it just sits on top of an absolutely horrible file system. It was considered horrible many years ago, and still is so, unfortunately...
Yup. HFS+ is adequate but not modern by any means... it gets the job done, though. Are you hoping for a ZFS-like filesystem in OS X 10.9, or something different? I'd suggest Btrfs or something else that isn't Apple-proprietary. A filesystem should be open to make it bulletproof.
     
P
Moderator
Join Date: Apr 2000
Location: Gothenburg, Sweden
Status: Offline
Reply With Quote
Nov 11, 2012, 08:04 AM
 
Originally Posted by besson3c View Post
They aren't block level snapshots, they are just deltas where if a file changed, a complete copy of that file is retained, right?
In the Linux LVM? I think they would have to be block level. An LVM does not understand files - everything it does is at the block level.

Originally Posted by besson3c View Post
Regarding the checksumming, sorry, what I wrote above was very cryptic... Couldn't a small secondary SSD drive be used to help facilitate writes by keeping a temporary copy there for comparison? For example, say I write an entry to a database... That block would be written to both drives, and if there was a checksum mismatch, I guess the journal/intent log would help trigger a second attempt? Once the checksums were in parity, that file would be deleted off the small SSD. This approach (if possible) could be how you would really build a cool fusion drive, although I suspect the main impetus behind the fusion drive was a stopgap measure until SSDs are cheap enough to completely replace SATA drives.
That sounds like full data journalling with the journal pinned to an SSD. That would be nice in the sense that all writes would be atomic, but I don't see how it would fix silent file corruption - SFC usually (statistically) strikes a while after it was written.

Originally Posted by besson3c View Post
Moreover, even leaving aside the whole self-healing thing (and I'm not fully confident the approach I've written above is viable), simply being able to trigger warnings is huge. As Siracusa laments, and I'll use his same example, it would suck if some priceless family picture just silently corrupts itself while we have absolutely no way of knowing it is corrupted until some years later when we want to get at that picture. If I saw a warning, I at least could try saving the photo off my camera again when it occurred. The OS being aware of these problems just seems like a sensible thing.
If you have Time Machine at a block level, you could do that: when you trigger the backup, checksum each and every block on both the backup and the live drive, and if any block is off, restore it from the other copy. Obviously it would break for older files that are no longer on the live drive, but if you at least know that, you could take it into consideration when thinning. Without a backup it makes less sense for the average user, but if the mechanism is there, of course one could use it as an early warning system.
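In Python, that backup-time scrub is not much more than this sketch. The raw device paths are invented, reading them needs root, and with only two copies and no stored checksum you have to decide which side to trust (here, naively, the backup):

Code:
import hashlib

LIVE = "/dev/rdisk2"      # made-up raw devices; reading them needs root
BACKUP = "/dev/rdisk3"    # assumes the backup is a block-level clone
BLOCK = 1024 * 1024

def scrub(repair=True):
    # Compare the two drives block by block; on a mismatch, copy the backup's
    # block over the live one (naively trusting the backup - with a stored
    # checksum per block you would know which side actually went bad).
    with open(LIVE, "r+b") as live, open(BACKUP, "rb") as backup:
        offset = 0
        while True:
            a, b = live.read(BLOCK), backup.read(BLOCK)
            if not a or not b:
                break
            if hashlib.sha256(a).digest() != hashlib.sha256(b).digest():
                print("mismatch at byte offset", offset)
                if repair:
                    live.seek(offset)
                    live.write(b)
            offset += BLOCK

if __name__ == "__main__":
    scrub(repair=False)   # report only; pass repair=True to restore blocks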

Originally Posted by besson3c View Post
When I think about the state of HFS+ this way, this is what makes me think like the current fusion drive approach and the fact that HFS+ even still exists is insane. When we write stuff, we have no way of knowing whether that write was successful, and of course we have no way of correcting this. We have teased Windows users about their OS bit rot in needing to reinstall Windows to regain performance after a while, but we have file rot.
So do they. This is not new - what is new is that ZFS actually solves the problem.

Originally Posted by besson3c View Post
Now, you or somebody else brought up the comparison to other file systems as far as file rot goes in saying that other file systems do not self-heal like ZFS does if multiple drives are available. This is true, but at least with Linux the file system is abstracted enough that there is a choice of file system (although maybe Apple's brief stint with UFS indicates the same?), and that when there are better options available it will presumably not be as painful to migrate data. Moreover, I can't substantiate this right now, but I'm pretty certain that there are measures taken in other Linux/Unix based file systems to reduce file rot possibilities. I was reading something about file barriers in ext4 (although I don't really understand this), and as mentioned ext4 does the metadata checksumming too, so it looks like on the Linux side they are either already ahead of the game in file integrity, or at least well on their way to being so.
Apple has pretty good file system abstraction for everything except booting (which is only available for UFS or HFS+) - witness the ZEVO ZFS driver and the existing third-party r/w drivers for ext3 and NTFS. Since there now is a ZFS driver for the Mac, that is not the problem - the problem is that we expect Apple to deliver something that just works out of the box, so we don't have to fiddle with these things.

I think that what you call file barriers are what the ext4 docs call write barriers. A write barrier limits how the disk can reorder writes, to make sure that a power failure doesn't leave the data in a corrupted state when the disk cache loses power. A good idea, trading write performance for some additional safety, but it seems irrelevant on a laptop.

Originally Posted by besson3c View Post
With OS X, we can only guess as to when Apple revises HFS+ I guess, but it looks like HFS+ hasn't been touched in a long time. I appreciate what you have been saying about Core Storage, Siracusa has been saying the same sorts of things, but to me CS is a viable way to gain new storage related *features*, but it isn't a viable way to rid HFS+ of its inherent weaknesses, as you have stated in your post, both performance and otherwise.
You are mostly correct that HFS+ hasn't been touched. Apple has removed the old HFS wrapper, but other than that, they have added exactly one big feature: journalling. The big filesystem-related features that have been added lately have been added in other parts of the system - Core Storage, Time Machine and Spotlight (which is, quite frankly, the best search feature in an OS ever).

Originally Posted by besson3c View Post
I'm wondering if Apple sees SSDs and their inherent far improved reliability as their solution to the file rot problem. This kind of bugs me a bit, because it seems like they are sweeping the problem under the rug, I like stuff that is designed as best as it can be. If nothing more a better file system would yield us better performance, but I suspect the jaw dropping performance we enjoy from any SSD in comparison to what we have been used to for so many years may be another opportunity for Apple to sweep this problem under the rug.

These sorts of issues are far more important in the server space, I'd imagine. I guess our only hope is that some day something will trickle down to the Mac, and Apple will capture this trickle.
Do we know that SSDs have better reliability? It seems likely, but I don't know if I have seen any long-term studies on that.

The massively improved random access performance of an SSD hides a lot of problems, true. It could also be that the kernel has other bottlenecks that limit the amount of threading you can get out of file system operations, so that fixing HFS+ wouldn't help all that much.
The new Mac Pro has up to 30 MB of cache inside the processor itself. That's more than the HD in my first Mac. Somehow I'm still running out of space.
     
besson3c
Clinically Insane
Join Date: Mar 2001
Location: yes
Status: Offline
Reply With Quote
Nov 11, 2012, 02:14 PM
 
Originally Posted by cgc View Post

Yup. HFS+ is adequate but not modern by any means...gets the job done though. Are you hoping for a ZFS-like filesystem in OSX 10.9 or something different? I'd suggest Btfs or something else not Apple-proprietary. A filesystem should be open to make it bullet proof.
I would say it is inadequate.

I'm hoping for any sort of file system designed for the modern era of computing. I honestly don't know what to expect or hope for from Apple at this time, and I mean that literally. I can see them building on Core Storage, but I don't know what it will sit on top of.
     
besson3c
Clinically Insane
Join Date: Mar 2001
Location: yes
Status: Offline
Reply With Quote
Nov 11, 2012, 03:16 PM
 
Quote:
In the Linux LVM? I think they would have to be block level. An LVM does not understand files - all it does is at the block level.
You're right. I think the difference between a Linux LVM snapshot and a ZFS one is that the initial ZFS snapshot consumes no space, and it contains references to changed data that are updated constantly as the live file system changes. I don't think you can restore from a ZFS snapshot if the original data were to vaporize, whereas I think you can with a Linux LVM snapshot. I might be wrong about this, though. It sort of makes sense, since Solaris assumes that if you are running ZFS you are running it in some sort of RAID configuration that provides redundancy, and therefore the snapshots are just a way to backpedal, Time Machine style.



Quote:
That sounds like full data journalling with the journal pinned to an SSD. That would be nice in the sense that all writes would be atomic, but I don't see how it would fix silent file corruption - SFC usually (statistically) strikes a while after it was written.
Well, I guess that would be because of sectors going bad in SATA drives, or filesystem hierarchy corruption. You're right that it wouldn't prevent SFC, but it would still be nice to have some assurance that a written file was written correctly to begin with. Writing files seems to be sort of like sending a UDP packet: you just sort of do it and hope for the best.



Quote:
If you have Time Machine at a block level, you could do that: when you trigger the backup, checksum each and every block on both the backup and the live drive and if any block is off, restore from the other copy. Obviously it would break for older files that are no longer on the live, but if you at least know that, you could take that in to consideration when thinning. Without a backup, it makes less sense for the average user, but if the mechanism is there, of course one could use it for an early warning system.
I keep on neglecting to think about how an external Time Machine drive could be used to provide this sort of redundancy, this is very interesting thinking!

To further that thinking, Amazon has released an interesting new storage service called Amazon Glacier, which is designed to provide crazy cheap long-term storage in exchange for a tradeoff where recovering the files is much slower than with traditional backup means. IOW, Glacier is for your museum files - stuff you just want to keep archived but don't actually need for work.

Apple has this new iCloud system... They could partner with Amazon and do all sorts of interesting things. I believe I read that Apple is already using Amazon S3 for storage? How about something like the fusion drive concept, where the OS offers to archive files you very rarely touch into Glacier?

Cloud backup could be very cool. It's probably not terribly practical now because most home broadband upload speeds are ass, but what if it uploaded bits and pieces of your data in the background over time, in small chunks or something? In keeping with your idea of using Time Machine as a means for doing checksum comparison, what about some sort of cloud system? Do you think that maybe a device like the Time Capsule is just a stopgap backup system? After all, it is only available to you when you are on the same network - what if you need to restore something while you're on the road?



Quote:
So do they. This is not new - what is new is that ZFS actually solves the problem.
When I wrote this I was thinking of design features such as CoW and write barriers. Copy on Write seems like a safer way to handle writes, if I'm understanding things correctly, especially in file systems that support multiple locks? I might be misinterpreting some of this though:

http://en.wikipedia.org/wiki/ZFS#Copy-on-write_transactional_model



Quote:
Apple has pretty good file system abstraction for everything except booting (which is only available for UFS or HFS+) - witness the ZEVO ZFS driver and the existing third-party r/w drivers for e3fs and NTFS. Since there now is a ZFS driver for Mac, that is not the problem - the problem is that we expect Apple to deliver something that just works out of the box so we don't have to fiddle with these things.
I don't really understand what it is about booting that makes it such tricky business. I don't mean this as criticism of Apple; operating systems like FreeBSD still (last I checked) require booting from UFS, despite their ZFS support. Do you understand why this is?



Quote:
I think that what you call file barriers are what the ext4 docs call write barriers. Write barriers means limiting how the disk can reorder writes to make sure that a power failure doesn't leave the disk data in a corrupted state when the disk cache loses power. A good idea, trading write performance for some additional safety, but it seems irrelevant on a laptop.
It is definitely less relevant, but I guess there could be data loss when the battery dies... You're right, though; I'm sure the impetus behind this was protecting ext4 file systems from power outages on servers where battery backup is lacking.



Quote:
You are mostly correct that HFS+ hasn't been touched. Apple has removed the old HFS wrapper, but other than that, they have added exactly one big feature: journalling. The big filesystem-related features that have been added lately have been added in other parts of the system - Core Storage, Time Machine and Spotlight (which is, quite frankly, the best search feature in an OS ever).
Did you try Google Desktop and Quicksilver? It looks like Google has discontinued Desktop because they see the future in cloud storage. As far as Google's own strategy goes, this seems to make sense.

I still think Apple is behind in the whole cloud thing. iCloud could be far more than just a glorified device syncer. Would it make sense to store the Spotlight index in the cloud so that you can search across all of your devices from any device?


Quote:
Do we know that SSDs have better reliability? It seems likely, but I don't know if I have seen any long-term studies on that.
The massively improved random access performance of an SSD hides a lot of problems, true. It could also be that the kernel has other bottlenecks that limit the amount of threading you can get out of file system operations, so that fixing HFS+ wouldn't help all that much.
*Shrug* to both (good) questions/points here...
     
P
Moderator
Join Date: Apr 2000
Location: Gothenburg, Sweden
Status: Offline
Reply With Quote
Nov 12, 2012, 01:01 AM
 
Originally Posted by besson3c View Post
You're right. I think the difference between a Linux LVM snapshot and a ZFS one is that in a ZFS snapshot the initial snapshot consumes no data, and it contains references to changed data that changes constantly as the live file system changes. I don't think you can restore from a ZFS snapshot if the original data were to vaporize, whereas I think you can with an Linux LVM snapshot. I might be wrong about this though. I guess this sort of makes sense though, since Solaris assumes that if you are running ZFS you are running it in some sort of RAID configuration that provides redundancy, and therefore the snapshots are just a way to backpedal, Time Machine style.
The actual snapshot works just the same as in ZFS - no writes when you take it, but it is not a complete backup. It can however be used to make a backup: take a snapshot and then make the backup from the snapshot. Apple now supports local TM backups, where the local drive in a laptop is used for backups that are then copied to the TM drive. I'm thinking about basing a "TM 2" on that feature: snapshots every hour on the hour, and then sending these snapshots to the external drive as it becomes available. ZFS can at least send snapshots - not sure if the Linux LVM supports that, but it seems doable.
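A Python sketch of that loop, leaning on ZFS's existing snapshot/send tools (assuming a ZFS port like ZEVO) since nothing in Core Storage exposes this today - the dataset name and mount point are invented:

Code:
import datetime
import os
import subprocess

DATASET = "tank/Users"        # made-up ZFS dataset
EXTERNAL = "/Volumes/TM2"     # external backup drive, when it happens to be plugged in

def take_snapshot():
    # Hourly snapshot: instant, and needs no external drive to be present.
    stamp = datetime.datetime.now().strftime("%Y-%m-%dT%H:00")
    name = "%s@%s" % (DATASET, stamp)
    subprocess.check_call(["zfs", "snapshot", name])
    return name

def ship(previous, current):
    # Once the external drive shows up, stream the incremental delta to it.
    if not os.path.ismount(EXTERNAL):
        return False
    target = os.path.join(EXTERNAL, current.replace("/", "_") + ".zfsstream")
    with open(target, "wb") as f:
        subprocess.check_call(["zfs", "send", "-i", previous, current], stdout=f)
    return True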

Originally Posted by besson3c View Post
Well, I guess that would be because of sectors going bad in SATA drives, or filesystem hierarchy corruption. You're right it wouldn't prevent SFC, but it would still be nice to provide some assurance that a written file was written correctly to begin with. Writing files seems to be sort of like sending a UDP packet, you just sort of do it and hope for the best. 
Didn't the OS use to verify all copies back in the olden days? Maybe it's just me, but I think that back in the era of floppies, we at least verified copies. Did we just get so impatient that we skipped that, or did HDD mechanisms evolve better error correction? Maybe there is a good reason for all this - maybe writes just don't fail silently, so all corruption happens over time - but I have a sneaking suspicion that before journalling, we were all so focused on the file directory going bad that file corruption became a secondary concern.

Originally Posted by besson3c View Post
I keep on neglecting to think about how an external Time Machine drive could be used to provide this sort of redundancy, this is very interesting thinking!

To further that thinking, Amazon has released an interesting new storage system called Amazon Glacier, which is designed to provide crazy cheap long term storage of files in exchange for a tradeoff where recovering said files is much slower than traditional backup means. IOW, Glacier is sort of like for your museum files that you just want to keep archived, but you don't actually need for work.

Apple has this new iCloud system... They could partner would Amazon and do all sorts of interesting things. I believe I read how Apple is already using Amazon S3 for storage? How about something like the fusion drive concept where the OS offers to archive files you very rarely touch into Glacier?

Cloud backup could be very cool. It's probably not terribly practical now because most home broadband uploads speeds are ass, but if it uploaded bits and pieces of your data in the background over time in small chunks or something? In keeping with your idea of using Time Machine as a means for doing checksum comparison, what about some sort of cloud system? Do you think that maybe a device like the Time Capsule is just a stop gap backup system? After all, it is only available to you when you are on the same network - what if you need to restore something while you're on the road?
I think Apple's idea is to make your Time Capsule available over the net from wherever you are, but right now upload speeds are too low.

I will have to look into what Glacier is. I have a friend who is looking for a good long-term backup solution. Cloud backup truly seems like the way to go, so your files are safe from fire and theft, not just data corruption.

I think that Apple is using Microsoft Azure rather than S3 (for iCloud), but I may be wrong. Never followed the details in that business.

Originally Posted by besson3c View Post
When I wrote this I was thinking of design features such as CoW and write barriers. Copy on Write seems like a safer way to handle writes, if I'm understanding things correctly, especially in file systems that support multiple locks? I might be misinterpreting some of this though:

http://en.wikipedia.org/wiki/ZFS#Cop...actional_model
This is one of the things that I always liked about ZFS. It is basically full data journalling (i.e., all writes are atomic, so in case of a power failure or system crash you have either completed the write or not started it), but implemented in such a way as not to kill write performance. Photoshop has done this for all saves since the effing eighties, except (obviously) in the userspace layer, so all saves take twice as long.

Originally Posted by besson3c View Post
I don't really understand what it is about booting that makes it trickier business. I don't mean this as criticism of Apple, operating systems like FreeBSD still (last I checked) require booting from UFS, despite its ZFS support. Do you understand why this is?
Yes. Booting is a chain of things all handing over to the next (slightly more complicated) piece of software. When turned on, the CPU gets a reset signal (which at that point is a single electrical signal on one dedicated pin - about as low-level as communication can be) which causes it to start reading a program at a certain address. This is where the firmware (EFI in Macs, usually BIOS in Wintel PCs, although they're moving to UEFI) is executed - first figuring out where the important hardware is and how to access it, then finding the files needed for booting (the bootloader first of all), loading them into RAM, and starting the execution of them. The trick about file systems is that to find those files, you basically have to implement a r/o filesystem driver in the firmware. HFS+ and UFS were both designed in antiquity with low memory requirements as a fundamental design point - they can be implemented in firmware. ZFS has many strong points, but it is complex and it is a memory hog. Sun had serious problems getting their own machines to boot from it.

Originally Posted by besson3c View Post
Did you try Google Desktop and Quicksilver? It looks like Google has discontinued Desktop because they see the future in cloud storage. As far as Google's personal strategy, this seems to make sense.
I use Google Desktop on my work laptop. It works well enough, but indexing over the network takes forever and seems so wasteful. Mac servers do the indexing once and for all and then let the clients search as they will. It's good, but Spotlight is so much better.

Originally Posted by besson3c View Post
I still think Apple is behind in the whole cloud thing. iCloud could be far more than just a glorified device syncer. Would it make sense to store the Spotlight index in the cloud so that you can search across all of your devices from any device?
Interesting point. It wouldn't be live-updating the way the current one is, but say a backup sent there every now and then, so one can search all one's devices?
The new Mac Pro has up to 30 MB of cache inside the processor itself. That's more than the HD in my first Mac. Somehow I'm still running out of space.
     
besson3c
Clinically Insane
Join Date: Mar 2001
Location: yes
Status: Offline
Reply With Quote
Nov 12, 2012, 01:22 AM
 
Quote:
The actual snapshot is just the same as ZFS - no writes when you take it, but it is not a complete backup. It can however be used to make a backup: take a snapshot and then make the backup from the snapshot. Apple now supports local TM backups, where the local drive in a laptop is used for backups that are then copied to the TM drive. I'm thinking about basing a "TM 2" on that feature: snapshots every hour on the hour, and then sending these snapshots to the external drive as it becomes available. ZFS can at least send snapshots - not sure if the Linux LVM supports that, but it seems doable.
Yeah, ZFS send and receive is *very* slick. This brings up another interesting possibility: sending snapshots to the cloud à la zfs send. I'm pretty sure LVM doesn't support that now. Redirecting output should be easy, but there are a few considerations here, such as ensuring that there aren't compatibility issues with the file system receiving the data.



Quote:
Didn't the OS used to verify all copies back in the olden days? Maybe it's just me, but I think that back in the era of floppies, we verified copies at least. Did we just get so impatient that we skipped that, or did HDD mechanisms evolve better error correction? Maybe there is a good reason for all this - maybe writes just don't fail silently, so all corruption is over time - but I have a sneaking suspicion that before journalling, we were all so focused on the file directory going bad that file corruption became a secondary concern.
You could be right that hard drives include error checking that makes this concern moot. I don't understand how error checking would work on a mechanically failing drive, but...



Quote:
I think Apple's idea is to make your Time Capsule available over the net from wherever you are, but right now upload speeds are too low.
Does this exist now?



Quote:
Will have to look in to what Glacier is. I have a friend who is looking for a good long-term backup solution. Cloud backup truly seems like the way to go, so your files are safe from fire and theft, not just data corruption.
There are a crapload of cloud backup products out there, but I would imagine that Glacier is the safest, since it is more of an enterprise-type solution. To be as cheap as a Backblaze or a CrashPlan, I imagine you'd have to really do this on the cheap.



Quote:
Yes. Booting is a chain of things all handing over to the next (slightly more complicated) piece of software. When turned on, the CPU gets a reset signal (which at that point is a single electrical signal on one dedicated pin - about as low-level as communication can be) which causes it to start reading a program at a certain address. This is where the firmware (EFI in Macs, usually BIOS in Wintel PCs, although they're moving to UEFI) is executed - first figuring out where the important hardware is and how to access it, then finding the files needed for booting (the bootloader first of all), loading them into RAM, and starting the execution of them. The trick about file systems is that to find those files, you basically have to implement a r/o filesystem driver in the firmware. HFS+ and UFS were both designed in antiquity with low memory requirements as a fundamental design point - they can be implemented in firmware. ZFS has many strong points, but it is complex and it is a memory hog. Sun had serious problems getting their own machines to boot from it.
Thanks, makes sense!


Quote:
Interesting point. It wouldn't be live-updating the way the current one is, but say a backup sent there every now and then, so one can search all one's devices?
Yeah, and perhaps a single index for all of your devices. Perhaps it would make sense to partition the results per device, and/or by files that haven't been touched in a gazillion years. I'm sure Spotlight already considers last-modification dates in coming up with search results, but it might also be a good idea to simply not bother searching really old stuff unless you tell it specifically to do so, in order to save time? If we go for a single-index-for-all-devices approach, this sort of search optimization might become necessary, for some people.
     
P
Moderator
Join Date: Apr 2000
Location: Gothenburg, Sweden
Status: Offline
Reply With Quote
Nov 12, 2012, 03:30 AM
 
Originally Posted by besson3c View Post
Yeah, ZFS send and receive is *very* slick. This brings up another interesting possibility: sending snapshots to the cloud ala zfs send. I'm pretty sure LVM doesn't support it now. Redirecting output should be easy, but there are a few considerations here such as insuring that there aren't compatibility issues with the file system receiving the data.
As long as you can send it over the network at all, you should be able to send it to a cloud server. Obviously you need to make sure that you have all your ducks in a row with regard to having enough data to recreate things in a restore, but certainly something like that should be possible. Deduping and some basic compression could go a long way towards making cloud backups a reality (e.g. if you have a full Office installation and a few thousand other people do as well, you don't need to send the blocks for that installation - you just increment a counter by one, so that those blocks are not deleted).
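That counter idea is just content-addressed storage with reference counting. A toy Python sketch of the bookkeeping (the "cloud" is a dict here, obviously):

Code:
import hashlib

class DedupStore:
    # Toy content-addressed block store: identical blocks (your Office install,
    # mine, and a few thousand other people's) are stored once and refcounted.

    def __init__(self):
        self.blocks = {}     # sha256 -> block data ("the cloud")
        self.refs = {}       # sha256 -> how many backups still need this block

    def put(self, block):
        key = hashlib.sha256(block).hexdigest()
        if key in self.blocks:
            self.refs[key] += 1          # already stored: just bump the counter
        else:
            self.blocks[key] = block     # genuinely new data: upload it
            self.refs[key] = 1
        return key

    def release(self, key):
        self.refs[key] -= 1
        if self.refs[key] == 0:          # nobody references it any more: delete
            del self.blocks[key], self.refs[key]

# A 1 MB block shared by two "users" is stored once but referenced twice.
store = DedupStore()
k1 = store.put(b"x" * 1024 * 1024)
k2 = store.put(b"x" * 1024 * 1024)
assert k1 == k2 and store.refs[k1] == 2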

Originally Posted by besson3c View Post
Does this exist now?
It's just an AFP drive with systematically organized data on it. Should be as accessible as any AFP drive is - which is to say, it will work but not very well once the latency starts to creep up.

Originally Posted by besson3c View Post
Yeah, and perhaps a single index for all of your devices. Perhaps it would make sense to sort of partition results to separate results per device, and/or perhaps files that haven't been touched in a gazillion years. I'm sure Spotlight already considers last modification dates in coming up with search results, but it might also be a good idea to simply not bother searching old ass stuff unless you tell it specifically to do so, in order to save time? If we go for a single index for all devices approach, this sort of search optimization might become necessary, for some people.
As soon as you have a single index, you can refine your search as to what to include - just tag each entry with a source tag and filter on that tag (or not) as required.
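In Python, the tag-and-filter idea is about this small (entries and device names invented):

Code:
# Toy merged index with a per-device source tag, as described above.
index = [
    {"term": "invoice", "path": "~/Documents/invoice.pdf", "source": "MacBook"},
    {"term": "invoice", "path": "/notes/invoice-draft.txt", "source": "iPhone"},
    {"term": "vacation", "path": "~/Pictures/vacation.jpg", "source": "MacBook"},
]

def search(term, sources=None):
    # Search the merged index; pass sources={"MacBook"} to filter, None for all devices.
    return [e for e in index
            if e["term"] == term and (sources is None or e["source"] in sources)]

print(search("invoice"))                       # hits from every device
print(search("invoice", sources={"iPhone"}))   # filtered to one device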
The new Mac Pro has up to 30 MB of cache inside the processor itself. That's more than the HD in my first Mac. Somehow I'm still running out of space.
     
mduell
Posting Junkie
Join Date: Oct 2005
Location: Houston, TX
Status: Offline
Reply With Quote
Nov 14, 2012, 10:30 AM
 
Originally Posted by P View Post
Actually ZFS changing hands is part of the problem. As long as Sun owned ZFS, Apple could probably make a deal - Sun was all about having its technologies finding market share by any means possible. Now that it's Oracle, Oracle would want some licensing money to let Apple use it (the open source variant is under GPLv2, which Apple can't use without giving away all of iOS, and anyway is not the latest version.
ZFS is open-sourced under the CDDL, not GPLv2. In fact the CDDL is GPL-incompatible, leading to problems adopting ZFS on Linux and to workarounds like FUSE.
     
   
 