Welcome to the MacNN Forums.

grep bug with multiple files?
Zim
Senior User
Join Date: Apr 2001
Location: Cary, NC
Feb 16, 2003, 12:03 PM
 
I was trying to find a phrase from an old email. I use Unix-style mail files and store them in ~/Mail. Anyway, I cd'd to Mail and ran

grep keyword *

as in: search all files for keyword

I get a ton of false positives...

i.e., grep will spit out

file1: blah blah blah (no keyword on line)

if I do

grep -c keyword file1

I get 0 as the result.

What gives? Is this a grep bug, and/or a Mac OS X grep bug?

Thanks,
Mike
     
Moonray
Mac Elite
Join Date: May 2001
Feb 16, 2003, 11:46 PM
 
Works all fine for me. Do you have it aliased maybe? What does "which grep" show?

-
     
Zim  (op)
Senior User
Join Date: Apr 2001
Location: Cary, NC
Feb 17, 2003, 08:48 AM
 
Originally posted by Moonray:
Works all fine for me. Do you have it aliased maybe? What does "which grep" show?

-
>which grep
/usr/bin/grep
     
Moonray
Mac Elite
Join Date: May 2001
Feb 17, 2003, 02:35 PM
 
Originally posted by Zim:
>which grep
/usr/bin/grep
That's okay.

I now tend to believe that your files contain lines like

blah blah blah (no keyword on line)

and grep finds them looking for lines containing the keyword "keyword" showing you:

file1: blah blah blah (no keyword on line)

-
     
Zim  (op)
Senior User
Join Date: Apr 2001
Location: Cary, NC
Feb 17, 2003, 02:57 PM
 
Originally posted by Moonray:
That's okay.

I now tend to believe that your files contain lines like

blah blah blah (no keyword on line)

and grep finds them looking for lines containing the keyword "keyword" showing you:

file1: blah blah blah (no keyword on line)

-
Um, was there an explanation in there of why grep is finding these NON-matching lines, or are you just saying you believe I am seeing it?

Cheers,
Mike
     
Paul McCann
Mac Enthusiast
Join Date: Nov 2001
Location: Adelaide, South Australia
Feb 17, 2003, 11:17 PM
 
How about an example? That is,
a line where grep matches and the
command you're executing to try
and find those matches. Not that
I *want* to read your old email
messages!!

[[Offhand remark: watch out for the size of the lines involved. Maybe they're longer than grep is printing out, and the match is further along?]]

Cheers,
Paul
     
Zim  (op)
Senior User
Join Date: Apr 2001
Location: Cary, NC
Feb 18, 2003, 09:27 AM
 
Originally posted by Paul McCann:
How about an example? That is,
a line where grep matches and the
command you're executing to try
and find those matches. Not that
I *want* to read your old email
messages!!

[[Offhand remark: watch out for the size of the lines involved. Maybe they're longer than grep is printing out, and the match is further along?]]

Cheers,
Paul
Fair enough...

example.... (this is real)

> grep -c zzzzzzzxxxxzzzyy *

(I figure the odds of that string appearing are pretty low, eh?)

several results (much truncated, I have 100's of email files)

summaries:29
tholt576:1
waterfall406:9

Now

>grep -c zzzzzzzxxxxzzzyy summaries
0
>grep -c zzzzzzzxxxxzzzyy tholt576
0
>grep -c zzzzzzzxxxxzzzyy waterfall406
0

Further, I open the file (waterfall406) in vi and do a search for even one zz pair...

"Pattern not found"

If I check just these three files...

>grep -c zzzzzzzxxxxzzzyy waterfall406 tholt576 summaries
summaries:0
tholt576:0
waterfall406:0


I'm wondering if it is some kind of overloading of the search. Piping ls -1 into a file, I find that I have 476 files in my Mail directory.

Doing a

grep -c zzzzzzzxxxxzzzyy * | grep -v 0 > tout

to find all non-zero matches, claims that 143 (of 476) files have this string (obviously they do not).
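
A side note on the filter itself, as a small sketch: "grep -v 0" drops every line that contains a 0 anywhere, including in the file name (waterfall406) or in a count such as 10, so if anything it undercounts. Anchoring on ":0" at the end of the line is closer to "non-zero matches". The sample lines below reuse file names from this post with illustrative counts; "other" is a made-up name.

```shell
#!/bin/sh
# Sample "grep -c" output. File names are from the post, counts are
# illustrative, and "other" is a hypothetical file name.
counts='summaries:29
tholt576:0
waterfall406:9
other:10'

# Loose filter: removes any line with a 0 anywhere in it.
loose=$(printf '%s\n' "$counts" | grep -v 0)

# Stricter filter: removes only lines whose count is exactly 0.
strict=$(printf '%s\n' "$counts" | grep -v ':0$')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

The stricter form still assumes no file name itself ends in ":0".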

I have not done as much testing, but think I am not seeing this in directories with fewer files.

Experiment #2...

I copy the same 3 (of many) offending files into their own directory (~/tmp2)

>grep -c zzzzzzzxxxxzzzyy *
summaries:0
tholt576:0
waterfall406:0

So, next I tried adding files to the directory...

perl -e 'foreach $file (101..200) { `touch "file$file"` }'

and doing

>grep -c zzzzzzzxxxxzzzyy * | grep -v 0

but I'm coming up empty, so maybe it's content in one of the real Mail files, or the name of one of them, that is setting this off (I'm up to 500 fake files now, about the same as the real Mail directory).

So I'm at a loss...

Cheers,
Mike
     
Zim  (op)
Senior User
Join Date: Apr 2001
Location: Cary, NC
Feb 18, 2003, 10:24 AM
 
I went back and looked at all of the file names...

I had inadvertently saved a file (first file in the ls listing) as "-web".

So I'd expect that instead of

grep keyword filelist

it was seeing

grep keyword -web rest-of-filelist

Removed the file and now grep returns results as expected (only real matches).
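
For anyone who cannot simply delete such a file, two common guards are sketched here in a scratch directory. The file "mail1" and its contents are invented for the demo; only the "-web" name comes from this thread.

```shell
#!/bin/sh
# Scratch directory so nothing in a real ~/Mail is touched.
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'a b c\n' > mail1            # hypothetical mail file, no "keyword" in it
printf 'nothing here\n' > ./-web    # the troublesome file name

# Guard 1: "--" ends option parsing, so -web is read as a file name.
# (grep exits non-zero when nothing matches, hence the "|| true".)
guard1=$(grep -c keyword -- *) || true

# Guard 2: with a ./ prefix no expanded name can start with "-".
guard2=$(grep -c keyword ./*) || true

printf '%s\n' "$guard1" "$guard2"
```

Either form should report a count of 0 for both files, since neither contains the keyword.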

Heh, life is never boring.

Mike
     
Paul McCann
Mac Enthusiast
Join Date: Nov 2001
Location: Adelaide, South Australia
Feb 18, 2003, 10:37 PM
 
Thanks for the update.

I had a quick peck around, trying to duplicate your results (via a -web file), and I think what's happening is that grep is interpreting the -w as "complete word", the "e" as "-e" (as in "using the pattern that follows") and the "b" as the pattern to check.

That is,

grep -c xxxxyyyxxxzz *

gives, for every file in your directory bar the -web file, the count of lines containing a standalone "b" (i.e. "b" as a separate word). But you should also have seen an error about there being "No such file as xxxxyyyxxxzz" (or whatever).

Almost perfectly consistent with your initial report; when you home in on a single file there's no "-web" option-set for grep, and the count of the long string in each such file is zero.
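
A small sketch of that reading, assuming a grep that accepts bundled short options after the pattern (GNU grep does). The files mail1 and mail2 are made up for the demo, and "-web" is passed literally rather than coming from a file name.

```shell
#!/bin/sh
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'a b c\n' > mail1    # one line with a standalone "b"
printf 'abc\n'   > mail2    # "b" only inside a longer word

# What grep receives once * has expanded to include -web:
# grep exits non-zero because "keyword" is then taken as a missing file,
# hence the "|| true"; that error message is sent to /dev/null here.
bundled=$(grep -c keyword -web mail1 mail2 2>/dev/null) || true

# The same call with the bundle spelled out: -w, then -e with pattern "b".
spelled=$(grep -c -w -e b keyword mail1 mail2 2>/dev/null) || true

printf 'bundled:\n%s\nspelled out:\n%s\n' "$bundled" "$spelled"
```

If the two outputs agree, the long keyword was never used as the pattern at all.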

Cheers,
Paul
     
Zim  (op)
Senior User
Join Date: Apr 2001
Location: Cary, NC
Feb 19, 2003, 07:59 AM
 
Yeah, I looked at the man page to try to figure out what it "thought" it was trying to do. I am a little surprised that it would not report a syntax error at that point, since only files should come after PATTERN.

SYNOPSIS
grep [options] PATTERN [FILE...]
grep [options] [-e PATTERN | -f FILE] [FILE...]

I sent it in to the gnu-bugs addr on the man page as a bug/feature so at least they will know about it (not sure if it truly is a bug, tho I would consider it one if that was my command-line parser).

I also tried to duplicate this (only 2 minutes of effort) under Solaris 2.8 at work, but no dice. Shrug.

Ah well, for the one other poor soul that encounters this, hopefully the thread will live on in the archives.

Cheers,
Mike
     
   
All times are GMT -5.