Welcome to the MacNN Forums.

Spotlight : Does it scan inside of PDFs?
Mac Elite
Join Date: May 2002
Location: Los Angeles, CA.
Jul 21, 2005, 03:13 AM
 
I recently saved a couple of web pages [with a lot of text on them] as PDFs (Print -> Save as PDF) ....


.... when I search the folder I saved them in using Spotlight and a couple of keywords, it doesn't list the PDFs.

Does anyone have any ideas as to why this doesn't happen?

Maybe it's my own computer? Maybe I need a plugin?

any clues?

     
Senior User
Join Date: Feb 2003
Location: USA
Jul 21, 2005, 03:58 AM
 
Hmm, PDF searches work fine here. Check Spotlight preferences, make sure "PDF Documents" is selected under Search Results. Also make sure you didn't set any Privacy options.
MacBook 2.0 160/2GB/SuperDrive
Lots of older Macs
     
Posting Junkie
Join Date: Mar 2004
Location: MacNN database error. Please refresh your browser.
Jul 21, 2005, 04:01 AM
 
Ditto on above post.

This is a computer-generated message and needs no signature.
     
Grizzled Veteran
Join Date: Sep 2000
Location: London, UK
Jul 21, 2005, 09:19 AM
 
Spotlight does search PDFs. What I have found is that it doesn't always index your new files in a timely manner. I have searched for text in emails and PDFs that I know is there and the file doesn't get found. If I go and find it manually, then the next time it is searched, it shows up. Maybe the process of 'touching' the file (by finding or opening it oneself) kicks off the metadata indexing process. But it doesn't always seem to happen automatically when a new file is created.

It would be good to have some admin tools for spotlight to force indexing, by kind, etc. I feel some Apple feedback coming on…
     
Moderator
Join Date: May 2001
Location: Hilbert space
Jul 21, 2005, 09:51 AM
 
Works fine for me.
I don't suffer from insanity, I enjoy every minute of it.
     
Mac Elite
Join Date: Nov 2001
Jul 21, 2005, 10:06 AM
 
Originally Posted by wulf
Spotlight does search PDFs. What I have found is that it doesn't always index your new files in a timely manner. I have searched for text in emails and PDFs that I know is there and the file doesn't get found. If I go and find it manually, then the next time it is searched, it shows up. Maybe the process of 'touching' the file (by finding or opening it oneself) kicks off the metadata indexing process. But it doesn't always seem to happen automatically when a new file is created.
I've had the same experience.

Note it *is* possible to force Spotlight to re-index particular types of files. I don't recall exactly how it's done -- I think you have to run 'mdimport' with the appropriate importer. I did it once for Circus Ponies Notebook files, and I know you can do it for any file type with the right importer (there's one for each file type that Spotlight can index).
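
For anyone who wants to try this, here is a minimal Python 3 sketch of the mdimport call described above. It assumes the -r option (re-import all files claimed by an importer plug-in) and assumes Apple's PDF importer lives at /System/Library/Spotlight/PDF.mdimporter; check both against 'man mdimport' and your own system before relying on it. The function name is made up for illustration.

```python
import platform
import subprocess

# Assumed location of the bundled PDF importer. Third-party importers
# usually sit in /Library/Spotlight or ~/Library/Spotlight instead.
PDF_IMPORTER = "/System/Library/Spotlight/PDF.mdimporter"


def reimport_command(importer_path):
    """Build the mdimport call that asks Spotlight to re-import every
    file whose type is claimed by the given importer plug-in."""
    return ["mdimport", "-r", importer_path]


if __name__ == "__main__":
    cmd = reimport_command(PDF_IMPORTER)
    print(" ".join(cmd))
    # Only run the command on a Mac; elsewhere just show it.
    if platform.system() == "Darwin":
        subprocess.run(cmd, check=False)
```

The same call with a different .mdimporter path should cover other file types, such as the Notebook files mentioned above.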
     
Senior User
Join Date: Feb 2003
Location: USA
Jul 21, 2005, 06:20 PM
 
I did a test...I saved to PDF a web page that had a keyword I knew would not exist anywhere else on this machine, immediately did a Spotlight search for that keyword, and it found it right away. And this is on a lowly G3 400.
Very strange, will be curious to see how this unfolds...
     
Mac Elite
Join Date: Dec 2000
Location: Northern California
Jul 21, 2005, 08:03 PM
 
I just tried printing this very page to PDF and searching for the word "unfolds," which returned no results before and returned the PDF file immediately after. It must be your particular machine; it's certainly supposed to happen immediately. Does Spotlight /ever/ return results from those files, even days after?
Mac OS X 10.5.0, Mac Pro 2.66GHz/2 GB RAM/X1900 XT, 23" ACD
esdesign
     
Mac Enthusiast
Join Date: Jul 2002
Location: Sydney, Australia
Jul 21, 2005, 08:57 PM
 
How do you FORCE Spotlight to re-index, or to index a particular folder / location? Is such a thing possible?

Hmm... what are the chances you could set up a script or something that makes Spotlight index whenever the screensaver is enabled, or something similar?
     
Posting Junkie
Join Date: Mar 2004
Location: MacNN database error. Please refresh your browser.
Jul 21, 2005, 09:09 PM
 
You can re-index via Terminal.
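
The post doesn't say which command is meant. One likely candidate is mdutil, so here is a hedged Python 3 sketch built on that assumption: -s reports the indexing status of a volume and -E erases its index so that Spotlight rebuilds it. The function names are invented for this example, and the erase step needs administrator rights, so the sketch only prints it.

```python
import platform
import subprocess


def status_command(volume="/"):
    """mdutil call that reports whether indexing is enabled on a volume."""
    return ["mdutil", "-s", volume]


def erase_and_rebuild_command(volume="/"):
    """mdutil call that erases the Spotlight index of a volume so it is
    rebuilt from scratch. Wrapped in sudo because it needs root."""
    return ["sudo", "mdutil", "-E", volume]


if __name__ == "__main__":
    # Show both commands; only the harmless status check is executed.
    print(" ".join(status_command()))
    print(" ".join(erase_and_rebuild_command()))
    if platform.system() == "Darwin":
        subprocess.run(status_command(), check=False)
```

A full rebuild can take a long time on a big disk, so it is worth checking the status first.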
     
badtz  (op)
Mac Elite
Join Date: May 2002
Location: Los Angeles, CA.
Jul 22, 2005, 05:49 AM
 
http://s27.yousendit.com/d.aspx?id=1...40M94UYK6JQZSF

for instance .... this is an example PDF file ...

I search for the word "suffrage" [without quotes] .... inside Preview, and it finds it.

But when I search via Spotlight, it doesn't.


Can anyone else test this?

[this file was created a while ago]

slightly OT: should Spotlight be indexing MS Office files also? [particularly PowerPoint?]
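
The same test can be repeated from Terminal, which takes the Spotlight menu out of the picture. A rough Python 3 sketch, assuming the mdfind tool and its -onlyin option; the folder path and the helper names are placeholders, and the file name is the one mentioned later in this thread.

```python
import platform
import subprocess


def search_command(keyword, folder):
    """mdfind query for a keyword, limited to one folder."""
    return ["mdfind", "-onlyin", folder, keyword]


def hits_named(mdfind_output, filename):
    """Return the result paths (one per output line) that end in filename."""
    return [line for line in mdfind_output.splitlines()
            if line.strip().endswith("/" + filename)]


if __name__ == "__main__":
    folder = "/Users/example/Documents"  # hypothetical folder holding the PDF
    cmd = search_command("suffrage", folder)
    print(" ".join(cmd))
    # Only query the index on a Mac; elsewhere just show the command.
    if platform.system() == "Darwin":
        out = subprocess.run(cmd, capture_output=True, text=True).stdout
        print(hits_named(out, "Ch6Summary.pdf") or "not in the index yet")
```

If mdfind comes back empty while Preview finds the word, the file's text has most likely not been indexed yet.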
     
Grizzled Veteran
Join Date: Sep 2000
Location: London, UK
Jul 22, 2005, 06:07 AM
 
Originally Posted by badtz
slightly OT: should spotlight be indexing MS Office files also? [particularly powerpoint?]
Spotlight should index Word, Excel and PowerPoint files (and does on my machines) but doesn't yet work with Entourage, as its database is incompatible.
     
Grizzled Veteran
Join Date: Sep 2000
Location: London, UK
Jul 22, 2005, 06:16 AM
 
Originally Posted by badtz
I search for the word "suffrage" [without quotes] .... inside of the Preview, and it finds it.

But when I search via Spotlight, it doesn't.
I just dl'd your file and put "suffrage" into Spotlight, it came up almost immediately with that PDF (among others).

If it's still not working for you I'd try forcing a re-index of your files, see if that helps.
     
Mac Elite
Join Date: Jul 2002
Jul 22, 2005, 06:20 AM
 
You can import single files or directories by using the mdimport command in Terminal.
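
As a small illustration of that, here is a Python 3 sketch that builds an mdimport call for one or more files or folders. The -d1 debug option (print a summary of what was imported) is an assumption taken from the mdimport man page, and the function name and example path are made up.

```python
import os
import platform
import subprocess


def import_command(paths, debug=False):
    """Build an mdimport call that (re)imports the given files or folders.

    With debug=True the assumed -d1 option is added so that mdimport
    prints a short summary of each import.
    """
    cmd = ["mdimport"]
    if debug:
        cmd.append("-d1")
    # Expand a leading ~ so paths like ~/Documents work as expected.
    cmd.extend(os.path.expanduser(p) for p in paths)
    return cmd


if __name__ == "__main__":
    cmd = import_command(["~/Documents/Ch6Summary.pdf"], debug=True)  # example path
    print(" ".join(cmd))
    # Only run the command on a Mac; elsewhere just show it.
    if platform.system() == "Darwin":
        subprocess.run(cmd, check=False)
```

Pointing it at a folder instead of a single file should import everything inside that folder.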
     
Posting Junkie
Join Date: Mar 2004
Location: MacNN database error. Please refresh your browser.
Jul 22, 2005, 06:37 AM
 
It showed "suffrage" in the pdf as soon as I downloaded it. I didn't even have to open the file.
     
Moderator
Join Date: May 2001
Location: Hilbert space
Jul 22, 2005, 06:38 AM
 
Originally Posted by badtz
slightly OT: should spotlight be indexing MS Office files also? [particularly powerpoint?]
If there is an importer for that format, the answer is yes. It takes some time, though, until the files are completely indexed; this is particularly true of mobile Macs, because they have less CPU horsepower and less idle time.
     
badtz  (op)
Mac Elite
Join Date: May 2002
Location: Los Angeles, CA.
Jul 22, 2005, 07:26 AM
 
odd!!!!

I read a tip somewhere [I think macosxhints] that an easy way to reindex is to drag your HD into the list of places not to index in the Spotlight preferences, then remove it.

That forces Spotlight to re-index [which it did].

I've tried that and it still doesn't work.
     
Mac Enthusiast
Join Date: Jul 2005
Jul 22, 2005, 08:32 AM
 
Try doing a 'man mdimport' in the Terminal and see if that'll help you.
     
Mac Elite
Join Date: Mar 2001
Location: Minneapolis, MN
Jul 22, 2005, 08:57 AM
 
repair permissions?
     
Mac Elite
Join Date: Nov 2001
Jul 22, 2005, 09:28 AM
 
Originally Posted by awaspaas
repair permissions?
NO!
     
badtz  (op)
Mac Elite
Join Date: May 2002
Location: Los Angeles, CA.
Jul 22, 2005, 07:13 PM
 
Thanks for all of the help everyone! I think I found the solution!

http://www.macosxhints.com/article.p...potlight+index

I used that hint to identify folders that Spotlight was having trouble indexing.

Then I did the following:

I excluded those directories in the Spotlight privacy list.

Then I added my HD to that list and removed it again (this forces Spotlight to reindex the entire disk).


After it reindexed the drive, it didn't try to index the directories it had trouble with.

In my case... it was ~/Library/Preferences and ~/Library/Safari

Now when I do a Spotlight search for "suffrage", the Ch6Summary.pdf shows up!!!

When I get home I'll test this more thoroughly, but I'm hoping this solved it.
     
   