Welcome to the MacNN Forums.


Spotlight - Why use indexing instead of a database
MPMoriarty
Dedicated MacNNer
Join Date: Oct 2003
Location: Saint Louis, MO
Jul 7, 2004, 05:29 PM
 
It appears that Apple chose to make Spotlight use an indexing method for collecting and managing metadata about your files instead of using a relational database, as Microsoft is supposed to do with WinFS in Longhorn.

I wonder why Apple chose to go this route? Is an indexing engine possibly better in the long run?

I mean, the guy who designed BFS had a very big hand in Spotlight (that's for certain). So he must have chosen indexing over a database for a reason.

Or...

Could this possibly be the first step for Spotlight, just to get people used to the idea of searching for their files rather than manually hunting for them? Then maybe once the technology has been out for a while, Apple will build Spotlight into a more robust solution.

I'm sure Apple will make this technology shine one way or another.

What do you guys think?

Mike
     
Horsepoo!!!
Banned
Join Date: Jun 2003
Jul 7, 2004, 05:42 PM
 
I think it's a transition issue. Moving to a database filesystem would be too big a change for most people, and in many cases it would require reformatting the hard drive. Other problems then arise with removable media that may not use a database filesystem. These storage devices would not work with the search engine and would most likely have to rely on indexing.
     
Brass
Professional Poster
Join Date: Nov 2000
Location: Tasmania, Australia
Jul 7, 2004, 07:47 PM
 
Firstly, a clarification: Indexing and relational databases are not mutually exclusive. In fact a large relational database would be so slow as to be nearly useless if it didn't include indexing.
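
To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `files`, its columns, and the index name `idx_files_author` are made up for illustration and have nothing to do with Spotlight or WinFS internals. It only shows that the same relational query switches from a full table scan to an index search once an index exists.

```python
import sqlite3

# Hypothetical table of file metadata, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [(f"/docs/file{i}.txt", f"author{i % 100}") for i in range(10_000)],
)

def plan(sql):
    """Return SQLite's query-plan description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " ".join(row[-1] for row in rows)

query = "SELECT path FROM files WHERE author = 'author42'"

before = plan(query)  # no index yet: SQLite has to scan the whole table
conn.execute("CREATE INDEX idx_files_author ON files (author)")
after = plan(query)   # same query, now answered through the index

print(before)
print(after)
```

The exact wording of the plan text varies between SQLite versions, so treat the printed lines as indicative only.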

Secondly, another clarification: Apple IS using a database. They are NOT using a database-based file system, however. They are using a database which resides on the file system, not a file system that resides on a database.

There are some exceptionally good reasons for doing it the way they've done it:

With a database file system, you get an excellent way of searching and managing the files within the file system. Great. But that's all.

With the extensible database that Apple is using, it can be used for a LOT more than just files. For example, your email messages are not all stored in separate files, but Spotlight can search them! This is because the database is extensible to cope with a variety of datatypes, not just files. So any application developer can utilise Spotlight to make their content searchable at the file level, but also at other levels (sub-components of the file, such as email messages).
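
A rough sketch of that idea, again with Python's sqlite3 module. This is NOT Spotlight's actual schema or API; the `items` and `attrs` tables, the `add_item` and `find` helpers, and the `#1`/`#2` message locators are all assumptions invented to show how one store can hold both whole files and items inside a file.

```python
import sqlite3

# Hypothetical metadata store: an item can be a file OR something inside a file.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, kind TEXT, location TEXT);
    CREATE TABLE attrs (item_id INTEGER, key TEXT, value TEXT);
    CREATE INDEX idx_attrs ON attrs (key, value);
""")

def add_item(kind, location, **attrs):
    """Register one searchable item with arbitrary key/value attributes."""
    cur = db.execute(
        "INSERT INTO items (kind, location) VALUES (?, ?)", (kind, location)
    )
    item_id = cur.lastrowid
    db.executemany(
        "INSERT INTO attrs VALUES (?, ?, ?)",
        [(item_id, key, value) for key, value in attrs.items()],
    )
    return item_id

# One ordinary file, plus two messages that live inside a single mailbox file.
add_item("file", "/Users/mike/report.pdf", author="Mike")
add_item("email", "/Users/mike/Mail/inbox.mbox#1", author="Brass", subject="Spotlight")
add_item("email", "/Users/mike/Mail/inbox.mbox#2", author="Mike", subject="WinFS")

def find(key, value):
    """Return (kind, location) for every item whose attribute matches."""
    return db.execute(
        "SELECT items.kind, items.location FROM items "
        "JOIN attrs ON attrs.item_id = items.id "
        "WHERE attrs.key = ? AND attrs.value = ? ORDER BY items.id",
        (key, value),
    ).fetchall()

results = find("author", "Mike")
print(results)
```

The search returns the PDF and one of the two messages, even though the messages share a single file on disk.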

I hope this makes sense. In any case, it's a very cool idea.

I guess the other good reason is that we don't need to re-format our hard disks. You just upgrade to Tiger, and it just works! Files are indexed (I guess) any time they are written or modified.
     
MPMoriarty  (op)
Dedicated MacNNer
Join Date: Oct 2003
Location: Saint Louis, MO
Jul 7, 2004, 09:54 PM
 
I'm sorry I confused indexing and relational databases.

I thought that WinFS was really just a large relational database on top of the actual file system. Not an actual file system itself.
     
hmurchison2001
Senior User
Join Date: Jan 2001
Location: Seattle
Jul 7, 2004, 09:56 PM
 

Originally posted by MPMoriarty:
Could this possibly be the first step for Spotlight, just to get people used to the idea of searching for their files rather than manually hunting for them? Then maybe once the technology has been out for a while, Apple will build Spotlight into a more robust solution.
I'm thinking the same thing. It looks like iTunes and now Spotlight will be our "training wheels," so to speak. iTunes is preparing us to search metadata in an appropriate manner and to create "Smart" folders. I'm sure a db-infused fs would be superior in some ways, and Apple may well introduce a new fs in Lion(?) 10.5. This makes me think that Core Data will be something Apple pushes toward developers, because it is a natural fit for a db. Once that data is in a db file, the final transition to a true db-based fs should be easy for the developer, since db logic is supposed to be removed from the client (the concept behind EOF and, hopefully, Core Data). This means less muss 'n' fuss if a new fs is announced.

Plus, if a new fs is coming, I'm sure Apple will make sure all parties have the appropriate tools to prepare for the transition.
     
Chris Grande
Senior User
Join Date: Mar 2002
Location: CT
Jul 7, 2004, 10:00 PM
 
In one of these posts I read that Mail 2.0 converts your mail into separate files instead of clumping them into the mbox format. Is this true?
     
Millennium
Clinically Insane
Join Date: Nov 1999
Jul 7, 2004, 10:15 PM
 
I don't know if it's necessarily true, but it is certainly possible. There's already a standard way of doing this, known as maildir, which they might have decided to use. As you describe, this format stores messages in separate files. These would, at least theoretically, be much easier for Spotlight to index. However, they do come at a price: the user does not get to determine the filenames inside a maildir; if you change any names you mess things up. Apple could scare most people away from doing this, however, by turning a maildir into a Package, so that it would look and feel like a single file in the Finder. Once you've done that, the only problem left is to hide the Big Scary Filenames from Spotlight.
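
If you want to see what a maildir looks like on disk, Python's standard mailbox module can create one. This is just a sketch of the format and says nothing about what Mail 2.0 actually does; the "Inbox" folder name and the message contents are made up.

```python
import mailbox
import os
import tempfile

# Create a throwaway maildir in a temporary directory.
root = tempfile.mkdtemp()
box = mailbox.Maildir(os.path.join(root, "Inbox"), create=True)

# Each added message becomes its own file; the library picks the filename.
for subject in ("Spotlight", "WinFS"):
    msg = mailbox.MaildirMessage()
    msg["Subject"] = subject
    msg.set_payload("body text\n")
    box.add(msg)
box.flush()

# New, unread messages land in the "new" subdirectory of the maildir.
new_dir = os.path.join(root, "Inbox", "new")
filenames = sorted(os.listdir(new_dir))
print(filenames)  # machine-generated names, one file per message

subjects = sorted(m["Subject"] for m in box)
print(subjects)
```

The generated names are built from things like a timestamp, process ID and hostname, which is why renaming them by hand is a bad idea.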

Ahem. Back to the original question: why use indexing? Because indexing is faster, sometimes by orders of magnitude. This is why indexing is popular even in relational databases, and why a large part of being a good DBA involves knowing what data needs to be indexed and what doesn't. Index too much and your database balloons in size; index too little (or the wrong stuff) and your database becomes too slow. Finding the right balance isn't always easy.
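
Here is a toy illustration of the trade-off in plain Python. It is only a sketch of a generic inverted index, not how Spotlight or any particular database is implemented, and the file names and contents are invented. The index answers a word search with a single dictionary lookup instead of reading every document, but it has to store an entry for every distinct word.

```python
from collections import defaultdict

# Made-up documents standing in for file contents.
documents = {
    "notes.txt": "spotlight indexes metadata",
    "todo.txt": "upgrade to tiger",
    "mail.txt": "spotlight can search mail",
}

def scan(word):
    """No index: read every document on every search."""
    return sorted(name for name, text in documents.items() if word in text.split())

# Build the inverted index once: word -> set of documents containing it.
index = defaultdict(set)
for name, text in documents.items():
    for word in text.split():
        index[word].add(name)

def lookup(word):
    """With the index: a single dictionary lookup."""
    return sorted(index.get(word, ()))

hits = lookup("spotlight")
print(hits)
print(len(index))  # the cost: one stored entry per distinct word
```

Both functions return the same answer; the difference is how much work each search takes and how much extra storage the index needs.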

Besides which, an all-relational model would be a usability nightmare, because it's not navigable. The best systems will always be a hybrid of the two, in order to get a relational filesystem's awesome search capabilities while getting a hierarchical filesystem's superior usability in every other circumstance.
You are in Soviet Russia. It is dark. Grue is likely to be eaten by YOU!
     
MPMoriarty  (op)
Dedicated MacNNer
Join Date: Oct 2003
Location: Saint Louis, MO
Jul 7, 2004, 10:44 PM
 
Originally posted by hmurchison2001:
I'm thinking the same thing. It looks like iTunes and now Spotlight will be our "training wheels," so to speak. iTunes is preparing us to search metadata in an appropriate manner and to create "Smart" folders. I'm sure a db-infused fs would be superior in some ways, and Apple may well introduce a new fs in Lion(?) 10.5. This makes me think that Core Data will be something Apple pushes toward developers, because it is a natural fit for a db. Once that data is in a db file, the final transition to a true db-based fs should be easy for the developer, since db logic is supposed to be removed from the client (the concept behind EOF and, hopefully, Core Data). This means less muss 'n' fuss if a new fs is announced.

Plus, if a new fs is coming, I'm sure Apple will make sure all parties have the appropriate tools to prepare for the transition.
What is Core Data?
     
hmurchison2001
Senior User
Join Date: Jan 2001
Location: Seattle
Jul 7, 2004, 11:42 PM
 
Core Data is a new API that Apple announced during WWDC. It sounds a lot like the Enterprise Objects Framework (EOF) that WebObjects uses, and thus what the Apple Store and iTunes Music Store use. I'm no expert and I'm still trying to wrap my head around this, but here's the gist of what I understand; perhaps someone with far more experience will shed some light.

Data files that reside in database form conform to a certain schema. So say you have a SQL db with your data: the client apps that access this data have to incorporate the "logic" (rules) to access the db. According to a WebObjects video I watched on Apple's site, that means changes made to the db have to be reflected in the client app. They went on to say that with WebObjects and EOF you describe the "logic" not in the db or the client but between the two (they have a nice graphic that makes sense). So this seems to provide some sort of abstraction between the db and the client. Thus, modifications to the db do not require client modifications; only the "logic" section needs to be kept in sync. And since your client no longer needs to contain all this extra code, it becomes a natural for accessing multiple dbs.
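
To show what I mean by putting the "logic" between the db and the client, here is a rough sketch in Python with sqlite3. To be clear, this is NOT the Core Data or EOF API; `SONG_MODEL`, `fetch`, and the table and column names are all hypothetical names I made up. The point is only that the client asks for "title" and "artist", and the mapping in the middle is the one place that knows the real column names.

```python
import sqlite3

# Hypothetical mapping layer: client-facing attribute name -> database column.
SONG_MODEL = {
    "table": "tracks",
    "columns": {"title": "trk_name", "artist": "trk_artist"},
}

def fetch(db, model, **criteria):
    """Fetch rows as dicts keyed by attribute names, never by column names."""
    cols = model["columns"]
    select = ", ".join(f"{col} AS {attr}" for attr, col in cols.items())
    where = " AND ".join(f"{cols[attr]} = ?" for attr in criteria) or "1"
    sql = f"SELECT {select} FROM {model['table']} WHERE {where}"
    return [dict(row) for row in db.execute(sql, tuple(criteria.values()))]

db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row  # lets each row be converted to a dict
db.execute("CREATE TABLE tracks (trk_name TEXT, trk_artist TEXT)")
db.execute("INSERT INTO tracks VALUES ('Help!', 'The Beatles')")

# Client code only ever mentions "artist" and "title".
songs = fetch(db, SONG_MODEL, artist="The Beatles")
print(songs)
```

If a column were renamed in the db, only the `SONG_MODEL` mapping would need to change; the client call on the last lines would stay the same.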

Man, I can't find the video on Apple's page anymore. I'll keep looking and post it. Core Data seems to be in the mold of EOF. Where WebObjects is your application server and development platform, Core Data seems to be stripped down to the basics. It would seem to make sense that you would want certain data objects to remain "persistent" and have access to them via the client. However, no data is "that" persistent, and being able to change that data without requiring a bunch of changes in the client apps would be nice.

Apple has to be using this already in a way, because the latest iLife apps can share data without the other app being open. My iMovie can find iTunes files and use them when it needs to. I think this technology might ease the transition to a new fs as well.

I'm just learning this stuff, so please, anyone with experience, chime in and make us all smarter. Plenty of developers seem jazzed about the idea of Core Data, provided it has enough features to make it worthwhile. Here is Apple's WWDC blurb on it, emphasis mine.

Introducing Core Data: This session provides an overview of the new Core Data framework in Cocoa. It will focus on the new functionality provided for managing and persisting model objects, which includes automatic undo/redo, input validation, and saving to various types of "persistent stores" (SQL and XML).

Advanced Core Data: Learn about the more advanced features of the new Core Data framework, including how to work with multiple persistent stores at the same time, how to use predefined fetch requests and predicates to find your objects, how to get more out of your validation rules, and how to manipulate schemas at runtime.
and Xcode

Xcode Modeling and Design: Want to take your software design skills to the next level? Learn about Xcode's new design tools for object design and persistent object modeling. With these new tools you can view and edit a visual model of your object-oriented code in C++, Objective-C, or Java, and use the model to navigate your source base. Then, create an object graph of your application's object model, and automatically generate a schema for Cocoa's new Persistence Framework.
Obviously Apple is quite serious about this. Perhaps not all developers can take advantage but for some it's probably a godsend.
     
MPMoriarty  (op)
Dedicated MacNNer
Join Date: Oct 2003
Location: Saint Louis, MO
Jul 8, 2004, 12:49 AM
 
Sounds interesting.
     
   
 
All contents of these forums © 1995-2017 MacNN. All rights reserved.