grep bug with multiple files?
Senior User
Join Date: Apr 2001
Location: Cary, NC
I was trying to find a phrase from an old email... I use unix style emails, and store them in ~/Mail.... anyway, I cd'd to Mail, and did a
grep keyword *
as in: search all files for keyword
I get a ton of false positives...
ie. grep will spit out
file1: blah blah blah (no keyword on line)
if I do
grep -c keyword file1
I get 0 as the result.
What gives? Is this a grep bug, and/or a Mac OS X grep bug?
Thanks,
Mike
Mac Elite
Join Date: May 2001
Works all fine for me. Do you have it aliased maybe? What does "which grep" show?
-
Senior User
Join Date: Apr 2001
Location: Cary, NC
Originally posted by Moonray:
Works all fine for me. Do you have it aliased maybe? What does "which grep" show?
-
>which grep
/usr/bin/grep
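A side note on the alias question: depending on the shell, `which` may be an external program that only searches PATH, in which case it cannot see an alias or shell function. A minimal sketch of asking the shell itself, assuming a POSIX-style shell (the variable name `resolved` is only for illustration):

```shell
#!/bin/sh
# Ask the shell how it would resolve "grep".
# command -v is specified by POSIX: it prints the alias definition if grep
# is aliased in this shell, or the full path if it is an ordinary executable.
resolved=$(command -v grep)
printf 'grep resolves to: %s\n' "$resolved"
```

Run interactively, this shows an alias if one is defined; in a plain script it normally prints a path such as the /usr/bin/grep reported above.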
Mac Elite
Join Date: May 2001
Originally posted by Zim:
>which grep
/usr/bin/grep
That's okay.
I now tend to believe that your files contain lines like
blah blah blah (no keyword on line)
and grep finds them looking for lines containing the keyword "keyword" showing you:
file1: blah blah blah (no keyword on line)
-
Senior User
Join Date: Apr 2001
Location: Cary, NC
Originally posted by Moonray:
That's okay.
I now tend to believe that your files contain lines like
blah blah blah (no keyword on line)
and grep finds them looking for lines containing the keyword "keyword" showing you:
file1: blah blah blah (no keyword on line)
-
Um, was there an explanation in there of why grep is finding these NON-matching lines, or are you just saying you believe I am seeing it??
Cheers,
Mike
Mac Enthusiast
Join Date: Nov 2001
Location: Adelaide, South Australia
How about an example? That is,
a line where grep matches and the
command you're executing to try
and find those matches. Not that
I *want* to read your old email
messages!!
[[Offhand remark: watch out for the size of the lines involved. Maybe they're longer than grep is printing out, and the match is further along?]]
Cheers,
Paul
Senior User
Join Date: Apr 2001
Location: Cary, NC
Originally posted by Paul McCann:
How about an example? That is,
a line where grep matches and the
command you're executing to try
and find those matches. Not that
I *want* to read your old email
messages!!
[[Offhand remark: watch out for the size of the lines involved. Maybe they're longer than grep is printing out, and the match is further along?]]
Cheers,
Paul
Fair enough...
example.... (this is real)
> grep -c zzzzzzzxxxxzzzyy *
(I figure the odds of that string appearing are pretty low, eh?)
several results (much truncated, I have 100's of email files)
summaries:29
tholt576:1
waterfall406:9
Now
>grep -c zzzzzzzxxxxzzzyy summaries
0
>grep -c zzzzzzzxxxxzzzyy tholt576
0
>grep -c zzzzzzzxxxxzzzyy waterfall406
0
Further, I open the file (waterfall406) in vi and do a search for even one zz pair...
"Pattern not found"
If I check just these three files...
>grep -c zzzzzzzxxxxzzzyy waterfall406 tholt576 summaries
summaries:0
tholt576:0
waterfall406:0
I'm wondering if it is some kind of overloading on the search. Piping ls -1 into a file, I find that I have 476 files in my Mail directory.
Doing a
grep -c zzzzzzzxxxxzzzyy * | grep -v 0 > tout
to find all non-zero matches, claims that 143 (of 476) files have this string (obviously they do not).
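One caveat about the `grep -v 0` filter: it drops every line that contains a 0 anywhere, including in the file name (waterfall406, for instance), so it can hide real non-zero counts. A hedged sketch of a tighter filter, using made-up file names (mail10, mail2) and a made-up search string in a scratch directory:

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'needle in here\n' > mail10   # one matching line, and a 0 in the name
printf 'nothing to see\n' > mail2    # no matching line

# Naive filter: removes any line with a 0 in it, so mail10:1 is lost too.
# (The pipeline exits non-zero when nothing survives, hence "|| true".)
naive=$(grep -c -- needle ./* | grep -v 0) || true

# Tighter filter: only drop lines ending in ":0", i.e. a count of exactly zero.
precise=$(grep -c -- needle ./* | grep -v ':0$')

# Alternative: let grep list only the names of files that contain a match.
listed=$(grep -l -- needle ./*)

printf 'naive: [%s]\nprecise: [%s]\nlisted: [%s]\n' "$naive" "$precise" "$listed"
```

The `grep -l` form is usually the simplest when only the file names are wanted rather than the counts.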
I have not done as much testing, but think I am not seeing this in directories with fewer files.
Experiment #2...
I copy the same 3 (of many) offending files into their own directory (~/tmp2)
>grep -c zzzzzzzxxxxzzzyy *
summaries:0
tholt576:0
waterfall406:0
So, next I tried adding files to the directory...
perl -e 'foreach $file (101..200) { `touch "file$file"` }'
and doing
>grep -c zzzzzzzxxxxzzzyy * | grep -v 0
but I'm coming up empty.. so maybe it's content in one of the real Mail files, or the name of one of them that is setting this off (I'm up to 500 fake files now, the same as the real Mail directory).
So I'm at a loss...
Cheers,
Mike
Senior User
Join Date: Apr 2001
Location: Cary, NC
I went back and looked at all of the file names...
I had inadvertently saved a file (first file in the ls listing) as "-web".
So I'd expect that instead of
grep keyword filelist
it was seeing
grep keyword -web rest-of-filelist
Removed the file and now grep returns results as expected (only real matches).
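For anyone who hits the same thing and cannot simply remove or rename the file, there are two common ways to keep grep from reading a file name as options. A minimal sketch in a scratch directory; the file names and contents are invented for illustration:

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'a b c\nkeyword appears here\n' > file1
: > ./-web        # empty file whose name starts with a dash

# "--" marks the end of options, so -web is treated as a file name.
with_dashes=$(grep -c -- keyword * | LC_ALL=C sort)

# Prefixing the glob with ./ means no expanded name can begin with "-".
with_prefix=$(grep -c keyword ./* | LC_ALL=C sort)

printf '%s\n' "$with_dashes" "$with_prefix"
```

Both forms report 0 for the -web file and 1 for file1, i.e. only the real match. The `./*` form has the advantage of protecting every command, not just those that understand `--`.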
Heh, life is never boring.
Mike
Mac Enthusiast
Join Date: Nov 2001
Location: Adelaide, South Australia
Thanks for the update.
I had a quick peck around, trying to duplicate your results (via a -web file), and I think what's happening is that grep is interpreting the -w as "complete word", the "e" as "-e" (as in "using the pattern that follows") and the "b" as the pattern to check.
That is,
grep -c xxxxyyyxxxzz *
gives the count of lines containing a standalone "b" (i.e. "b" as a separate word) in each of the files in your directory bar the -web file. (But you should also have seen an error about there being "No such file as xxxxyyyxxxzz" or whatever.)
Almost perfectly consistent with your initial report; when you home in on a single file there's no "-web" option-set for grep, and the count of the long string in each such file is zero.
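To make that reading concrete, here is a small sketch (file contents invented) that runs the spelled-out form of what grep appears to be doing with `-web`, namely `-w -e b`. The second command is the accidental form from this thread, shown for comparison; it only behaves the same way on a grep that accepts options after the first operand, as GNU grep does by default.

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'a b c\nkeyword appears here\nweb page\nplan b\n' > file1

# Spelled out: -w (whole words only), -e b (the pattern is "b"). The word
# "keyword" is then just another file name, which does not exist here, so
# grep complains on stderr (discarded) and exits non-zero (hence "|| true").
explicit=$(grep -c -w -e b keyword file1 2>/dev/null) || true

# The accidental form: "-web" arrives as if it were a file name from the glob.
implicit=$(grep -c keyword -web file1 2>/dev/null) || true

printf 'explicit: %s\nimplicit: %s\n' "$explicit" "$implicit"
```

The explicit form counts the two lines with a standalone "b" ("a b c" and "plan b") and ignores both the "keyword" line and the "b" inside "web".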
Cheers,
Paul
Senior User
Join Date: Apr 2001
Location: Cary, NC
Yeah, I looked at the man page to try to figure out what it "thought" it was trying to do.. I am a little surprised that it would not give a syntax error at that point, since only files should come after PATTERN.
SYNOPSIS
grep [options] PATTERN [FILE...]
grep [options] [-e PATTERN | -f FILE] [FILE...]
I sent it in to the gnu-bugs addr on the man page as a bug/feature so at least they will know about it (not sure if it truly is a bug, tho I would consider it one if that was my command-line parser).
I also tried to duplicate this (only 2 minutes of effort) under Solaris 2.8 at work, but no dice. Shrug.
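On the Solaris non-result: one plausible explanation (an assumption, not something verified on Solaris 2.8) is that its grep stops looking for options at the first non-option argument, as POSIX specifies. GNU grep documents the POSIXLY_CORRECT environment variable as switching it to that behaviour, so a rough way to see the stricter parsing is the sketch below (file names and contents invented, GNU grep assumed):

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'a b c\nkeyword appears here\nplan b\n' > file1
: > ./-web        # empty file whose name starts with a dash

# With POSIXLY_CORRECT set, GNU grep treats everything after the pattern as
# a file name, so -web is opened as an (empty) file instead of being parsed
# as the options -w -e b.
strict=$(POSIXLY_CORRECT=1 grep -c keyword -web file1)

printf '%s\n' "$strict"
```

Under that setting the count for file1 is 1, the real number of lines containing "keyword".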
Ah well, for the one other poor soul that encounters this, hopefully the thread will live in the archives.
Cheers,
Mike