Welcome to the MacNN Forums.


extracting IP from a file
Moderator
Join Date: Sep 2000
Location: Irvine, CA
Jan 3, 2003, 09:14 AM
 
Hello,

I want to extract all IPs starting with 200, for example, from my Apache access_log, without the rest of each log line that comes with them, such as the 404 status code. Here is what I have so far:

grep "200.*" access_log | sort | uniq > log.txt

for example. How can I get rid of the extra stuff so I am just left with the IPs by themselves? Thanks.
{{{ mindwaves }}}
     
Mac Elite
Join Date: Sep 2000
Location: Edmond, OK USA
Jan 3, 2003, 09:56 AM
 
This isn't the typical Unix'y answer (I can hear the perl cranking), but this will work:

Code:
import java.io.*;

public class IP {
    public static void main(String[] argv) throws Exception {
        // Optional literal prefix; with no argument every IP is printed.
        String prefix = "";
        if (argv != null && argv.length == 1) {
            prefix = argv[0];
        }
        BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
        String line = reader.readLine();
        while (line != null) {
            // The client IP is the first space-delimited field of the line.
            int index = line.indexOf(" ");
            // Skip blank or malformed lines that contain no space.
            if (index > 0) {
                String ip = line.substring(0, index);
                if (ip.startsWith(prefix)) {
                    System.out.println(ip);
                }
            }
            line = reader.readLine();
        }
    }
}
Just save it to a file named "IP.java" and compile it with this:

Code:
javac IP.java
And run thusly:
Code:
cat /var/log/httpd/access_log | java IP
To limit the IPs to one subnet, supply as much of the leading part of the address as you want. It will be matched literally, so don't use "200.*", use "200.".

Code:
cat /var/log/httpd/access_log | java IP 192.168.1
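This shows every address that begins with that text, for example 192.168.123.5, 192.168.1.4, etc. To get only the 192.168.1.x subnet, include the trailing dot: 192.168.1.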
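The literal prefix matching described above can also be sketched in the shell, which makes it easy to see what the trailing dot changes. This is only an illustration: the sample log lines and the function names filter_prefix and sample_log are made up for the example, and awk's index() is used because it compares plain text rather than a regular expression.

```shell
# Hypothetical helper: print the first field of each input line when that
# field starts with the literal text passed as the first argument.
filter_prefix() {
    awk -v p="$1" 'index($1, p) == 1 { print $1 }'
}

# Made-up sample lines in the usual access_log layout.
sample_log() {
    cat <<'EOF'
192.168.1.4 - - [03/Jan/2003:09:00:01 -0600] "GET / HTTP/1.0" 200 512
192.168.123.5 - - [03/Jan/2003:09:00:02 -0600] "GET /a HTTP/1.0" 404 210
200.10.3.7 - - [03/Jan/2003:09:00:03 -0600] "GET /b HTTP/1.0" 200 99
EOF
}

# Without the trailing dot, 192.168.1.4 and 192.168.123.5 both match.
sample_log | filter_prefix 192.168.1

# With the trailing dot, only 192.168.1.4 matches.
sample_log | filter_prefix 192.168.1.
```

On a real log, the sample_log call would be replaced by something like cat /var/log/httpd/access_log.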
     
Mac Elite
Join Date: Sep 2000
Location: Edmond, OK USA
Jan 3, 2003, 09:59 AM
 
I just ran that check on my machine and I got 890 IP's for 57696 entries! It's amazing how much junk is going on out there. Maybe broadband for everyone wasn't a great idea.
     
Mac Elite
Join Date: May 1999
Location: San Jose, CA
Jan 3, 2003, 12:48 PM
 
Originally posted by absmiths:
I just ran that check on my machine and I got 890 IP's for 57696 entries! It's amazing how much junk is going on out there. Maybe broadband for everyone wasn't a great idea.
What makes you think 200.*.*.* = broadband usage?

There are LOTS of broadband users outside of that range. AFAIK, there's no definite way to ascertain the type of connection a specific IP address is behind.

Also, a simpler solution to the original question would be:

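awk '/^200\./ {print $1}' access_log | sort -u

This uses awk to print the first field ({print $1}) of any line that begins with "200." (/^200\./). The dot is escaped so that it matches a literal period rather than any character. Use sort -u rather than sort -un here: a numeric sort compares only the leading number on each line, so with -u it would treat addresses such as 200.1.2.3 and 200.1.5.6 as duplicates and drop one of them.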

There's nothing easier than awk for simple text manipulation like this.
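As a rough sketch of how the same awk filter can be extended, the pipeline below counts requests per address and lists the busiest first. The sample data and the function names sample_log and count_hits are invented for the example; on a real machine you would feed it your own access_log. Note that uniq -c only merges adjacent lines, which is why sort runs before it.

```shell
# Made-up sample lines standing in for access_log.
sample_log() {
    cat <<'EOF'
200.10.3.7 - - [03/Jan/2003:09:00:01 -0600] "GET / HTTP/1.0" 200 512
200.1.5.6 - - [03/Jan/2003:09:00:02 -0600] "GET /a HTTP/1.0" 404 210
200.10.3.7 - - [03/Jan/2003:09:00:03 -0600] "GET /b HTTP/1.0" 200 99
2001.example.net - - [03/Jan/2003:09:00:04 -0600] "GET /c HTTP/1.0" 200 99
64.12.0.9 - - [03/Jan/2003:09:00:05 -0600] "GET /d HTTP/1.0" 404 210
EOF
}

# Hypothetical helper: count hits per client whose address starts with "200.".
# The escaped dot keeps hosts such as 2001.example.net from matching.
count_hits() {
    awk '/^200\./ { print $1 }' | sort | uniq -c | sort -rn
}

sample_log | count_hits
```

Each output line is a hit count followed by the address, so the first line is the most frequent visitor in that range.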
Gods don't kill people - people with Gods kill people.
     
Mac Elite
Join Date: Sep 2000
Location: Edmond, OK USA
Jan 3, 2003, 12:55 PM
 
Originally posted by Camelot:
What makes you think 200.*.*.* = broadband usage?

There's LOTS of broadband users outside of that range. AFAIK, there's no definite way to ascertain the type of connection a specific IP address is behind.
I was talking about all IP's - I have no interest in the 200.* network. AFAIK most broadband users are NOT in the 200+ IP range - they tend to be 64.* or 23.* or 12.*, etc.

My comment about broadband was regarding the huge number of script kiddies with nothing better to do than do ping sweeps to find vulnerable machines to either steal from or use as a base for illegal activities.

Originally posted by Camelot:
Also, a simpler solution to the original question would be:
Simplicity is of course a subjective thing. That Java program took about 30 seconds to write and about 0 thought, i.e., no reading of man pages or unix manuals required.

BTW, as I mentioned originally, it wasn't a particularly unixy solution but was perfectly functional. I actually prefer solving problems that way because it gives me a lot more flexibility with the data once I harvest it.
     
Moderator
Join Date: Sep 2000
Location: Irvine, CA
Jan 3, 2003, 03:36 PM
 
Hey, thanks to everyone for the replies. I really appreciate it.
{{{ mindwaves }}}
     
   
All times are GMT -5. The time now is 09:55 AM.
All contents of these forums © 1995-2011 MacNN. All rights reserved.