awk question
C.J. Moof
Mac Elite
Join Date: Aug 2001
Location: Madison, WI
Mar 19, 2004, 06:25 PM
 
I'm pretty sure this is the right tool for the job, I'm just not quite sure how to use it. My intent is to get a running output of the processor load on my linux server. I've already crontabed an uptime >> /var/log/loadlog , to get a running sample of load averages, every 2 minutes.

I'd like to trim that output down from:

4:06pm up 9 days, 8:21, 1 user, load average: 1.49, 0.73, 0.79

to

4:06pm 1.49, 0.73, 0.79

awk looks like the right tool to modify each line, but I'm not getting a handle on what the commands would look like. One complication is that if spaces are used as the delimiter, the positions don't remain constant: Once it rolls over to '10 days, 1 min, 1 user' we now have an extra column- time is in 2 parts instead of 1.

Or, am I just trying to do things the hard way, and there's a better tool for logging my processor this way?

Thanks for any help!
OS X: Where software installation doesn't require wizards with shields.
     
Moonray
Mac Elite
Join Date: May 2001
Mar 19, 2004, 08:12 PM
 
Just pipe it through sed -e 's/ up.*://'
Code:
uptime | sed -e 's/ up.*://' >> /var/log/loadlog
You could use awk too but sed is simpler.

-
     
Arkham_c
Mac Elite
Join Date: Dec 2001
Location: Atlanta, GA, USA
Mar 19, 2004, 11:25 PM
 
uptime | awk '{print $1,$10",",$11",",$12}'
Mac Pro 2x 2.66 GHz Dual core, Apple TV 160GB, two Windows XP PCs
     
C.J. Moof  (op)
Mac Elite
Join Date: Aug 2001
Location: Madison, WI
Mar 19, 2004, 11:51 PM
 
The sed line works very nicely. Now if you don't mind, I'd like to understand why.... do uptime, pipe thru sed, use the e flag to execute a command.
s/ I'm not clear on.
Actually, the whole expression is cryptic to me. I'd greatly appreciate an explanation of why it works.
OS X: Where software installation doesn't require wizards with shields.
     
C.J. Moof  (op)
Mac Elite
Join Date: Aug 2001
Location: Madison, WI
Mar 19, 2004, 11:55 PM
 
Originally posted by Arkham_c:
uptime | awk '{print $1,$10",",$11",",$12}'
But the uptime output format changes depending on how long it's been up.
Code:
7:44am up 8 days, 23:59, 2 users, load average: 0.62, 0.27, 0.19
7:46am up 9 days, 1 min, 2 users, load average: 0.36, 0.29, 0.20
Wouldn't a statement that's valid for the first line not be valid for the 2nd- a time measured in 23:59 vs 1 min?
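One way around the shifting columns, as a rough sketch: count fields from the end of the line instead of from the start. awk's built-in NF holds the number of fields, so the three load averages are always $(NF-2), $(NF-1) and $NF, however the uptime part is worded. The function name trim_uptime is made up for illustration, and the sample lines are the two quoted in this thread.

```shell
# trim_uptime: hypothetical helper. Reads uptime-style lines on stdin and
# prints the clock time followed by the last three fields (the load averages).
trim_uptime() {
    awk '{ print $1, $(NF-2), $(NF-1), $NF }'
}

# Both layouts from this thread produce the same shape of output.
echo '7:44am up 8 days, 23:59, 2 users, load average: 0.62, 0.27, 0.19' | trim_uptime
echo '7:46am up 9 days, 1 min, 2 users, load average: 0.36, 0.29, 0.20' | trim_uptime
```

In the cron job this would presumably become uptime piped into the same awk program, with >> /var/log/loadlog on the end.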
OS X: Where software installation doesn't require wizards with shields.
     
Moonray
Mac Elite
Join Date: May 2001
Mar 20, 2004, 01:23 AM
 
Originally posted by C.J. Moof:
The sed line works very nicely. Now if you don't mind, I'd like to understand why.... do uptime, pipe thru sed, use the e flag to execute a command.
s/ I'm not clear on.
Actually, the whole expression is cryptic to me. I'd greatly appreciate an explanation of why it works.
Well here is how it works:
Code:
sed -e 's/ up.*://'
As you correctly found out, the -e introduces the editing command (the "expression") that sed should execute; in this simple example the -e could be omitted. The command is enclosed in single quotes so the shell does not try to interpret special characters in it, like the "*".

The actual command is "substitute" (hence the "s"). This is the most used sed command and replaces portions of text in its input lines with something else. (Although sed basically operates on lines delimited by the Unix newline character, it can also join lines and then operate on the result.)

The substitute command takes two strings as arguments, separated in the example above by three slashes. You can use almost any other character as the delimiter, but "/" is the most common. If the delimiter character has to appear inside one of the strings, precede it with a "\" to escape it.
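A small illustration of the delimiter point, with a made-up path as input. Both commands do the same substitution; the second one picks "|" as the delimiter so the slashes in the strings need no escaping.

```shell
# With "/" as the delimiter, every slash inside the strings must be
# written as "\/".
echo '/usr/local/bin/awk' | sed 's/\/usr\/local/\/opt/'

# The same substitution with "|" as the delimiter needs no escaping.
echo '/usr/local/bin/awk' | sed 's|/usr/local|/opt|'
```

Both lines print /opt/bin/awk.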

The first of the two strings is a "regular expression" (that is like pattern matching with wildcards, but much more extended and powerful). Sed searches each input line for this regular expression and replaces the matching text with the second string, which is empty here, so the match just gets deleted.

The regular expression above has two special characters. The "." means "match any character here" (like the "?" in "ls ?ouse", which matches house, mouse, and the like). The "*" that follows it means that any number of the previous "atom", including none, will be matched. So ".*" together means "match any number of any characters". The whole expression " up.*:" then means: match any string that starts with " up", continues with any other characters, and ends with ":".

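Sed finds the " up 9 days, 8:21, 1 user, load average:" in your uptime output because it matches this regular expression, and replaces it with an empty string, i.e. deletes it. Note that ".*" matches as much as it can, which is why the match runs to the last ":" on the line and does not stop at the one in "8:21".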
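To see the match in action without waiting for cron, the sample line from the first post can be fed to sed by hand. The second command is only there for comparison: it swaps ".*" for "[^:]*", which cannot cross a colon, to show where a non-greedy-looking match would stop.

```shell
line='4:06pm up 9 days, 8:21, 1 user, load average: 1.49, 0.73, 0.79'

# ".*" stretches to the LAST ":" on the line, so everything from " up"
# through "load average:" is deleted.
echo "$line" | sed 's/ up.*://'

# "[^:]*" cannot cross a colon, so this match ends at the ":" inside
# "8:21" and leaves most of the line behind.
echo "$line" | sed 's/ up[^:]*://'
```

The first command prints the wanted "4:06pm 1.49, 0.73, 0.79"; the second leaves "21, 1 user, load average: ..." glued onto the clock time.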

Recommended reading:
-
     
C.J. Moof  (op)
Mac Elite
Join Date: Aug 2001
Location: Madison, WI
Mar 20, 2004, 10:07 AM
 
Originally posted by Moonray:
Well here is how it works:
Code:
sed -e 's/ up.*://'
As you found out correctly the -e means to execute the command that follows (the -e could be omitted in this simple example). The command follows and is enclosed in single quotes so the shell does not start to interprete some special characters like "*" in it.

The actual command is "Substitute" (hence the "s"). This is the most used sed command and will replace portions of text in lines from its input with something else (note that although sed basically operates on lines delimited by the Unix newline character it has as well the ability to combine lines and perform then operations on them).
The substitute command takes two strings as arguments that are delimited by three slashes in the above example (you can use any other character as delimiter, but a "/" is the most common. However if you want your delimiter character to appear in your strings you need to precede it with a "\" to escape it).

OK, that makes sense, a / is the standard delimiter that brackets the /from/to/ group.

The first of the two strings is a "regular expression" (that is like pattern matching with wildcards but in a very extended and powerful way). This regular expression is searched in the input lines and then the matching text replaced with the second string which is here empty so it just gets deleted.

Sure, no problem there. I figured that was a regular expression in there... it's high time I sit down and really understand how to read those.

The regular expression above has two special characters, a "." that means "match any character here" (like a "?" in "ls ?ouse" to match house, mouse, and alike). The other is a "*" that follows it and means that any count of the previous "atom" will be matched. So ".*" together mean "match any number of any characters". The whole expression " up.*:" means then match any string that starts with " up", then has any other characters, and ends with ":".

That's a little different than I'm used to thinking of things.... so a * doesn't indicate 'match everything' (as in a DOS copy A*.* Cfloppy\), but 'keep matching the character to my left'. Is that right?
And if I really wanted to find a . (not match any character here), I'd refer to it as \. ?

Sed finds the " up 9 days, 8:21, 1 user, load average:" in your uptime output because it matches this regular expression and replaces it with an empty string = deletes it.

Recommended reading:
-
man sed only got me to define the e... I'll look at the other references. I've found it so much easier to discuss unix things to learn than to just read. It's a language all its own, and I need to converse in it to really get it. Reading helps, but things like this open the doors.

Thanks for your time.
OS X: Where software installation doesn't require wizards with shields.
     
Moonray
Mac Elite
Join Date: May 2001
Mar 20, 2004, 05:51 PM
 
Originally posted by C.J. Moof:
That's a little different than I'm used to thinking of things.... so a * doesn't indicate 'match everything' (as in a DOS copy A*.* Cfloppy\), but 'keep matching the character to my left'. Is that right?
In DOS the "*" means something like "from here on don't care and match everything until the end". DOS 8.3 filenames have two parts (and two ends) that are handled separately, so you need this "*.*" with the dot to match proper DOS filenames. The "*" is a wildcard that matches the whole rest of the filename part or of the extension part:
Code:
*.*         matches all files
*.exe       matches all files with an exe extension
a*.*        matches all files starting with "a" and any extension
john*.txt   matches john.txt, johnny.txt, johnson.txt, johnboy.txt etc.
john*y.txt  is not an allowed pattern; the y would get ignored and the results would be the same as in the line above
The "?" in DOS is a wildcard for exactly one character:
Code:
c?t.doc matches cat.doc, cut.doc, c!t.doc etc, but not cult.doc
In Unix shell pattern matching (also known as globbing) "*" and "?" have similar meanings, except that you are allowed to put more characters (and wildcards) after a "*", and of course you don't need to work around an "extension dot":
Code:
*           matches all files
*.exe       matches all files ending with .exe
a*.*        matches all files starting with "a" and containing a dot anywhere
a*          matches all files starting with "a"
john*.txt   matches john.txt, johnny.txt, john...txt, johnboy.txt etc.
john*y.txt  matches johnny.txt, johnboy.txt etc. (everything that starts with "john" and ends with "y.txt")
john*y_w*   matches johnboy_walker, johnny_wilmers.doc and the like
Examples for the use of "?" would be the same as for DOS.

In addition to that, Unix shells know [] to match one character in the brackets:
Code:
c[au]t matches exactly "cat" and "cut", nothing else
You may define ranges using a hyphen in [], like:
Code:
[0-9]     the same as [0123456789]
[a-zA-Z]  all upper- and lowercase ASCII letters
If the first character in such a bracket expression is a "^", it means "anything but the following characters":
Code:
[^0-9] anything but a numerical digit
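The shell patterns above can be tried out safely in a throwaway directory. This is a minimal sketch with made-up file names; it uses "[!0-9]", which is the POSIX spelling of the negated bracket, while the "[^0-9]" form shown above is accepted by bash and many other shells.

```shell
# Scratch directory so the patterns only see the files created here.
dir=$(mktemp -d)
(
    cd "$dir" || exit 1
    touch cat cut cot c9t cult john.txt johnny.txt johnboy.txt

    echo c?t          # one arbitrary character between "c" and "t"
    echo c[au]t       # only "a" or "u" in the middle
    echo c[!0-9]t     # one character that is not a digit
    echo john*y.txt   # text after the "*" still has to match
)
rm -rf "$dir"
```

Using echo instead of ls shows exactly what the shell expanded each pattern to before any command ran.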
Finally there is the (not widely known) possibility to give alternatives:
Code:
*.{htm,html}         match any file ending with either .htm or .html
*.{rar,r[0-9][0-9]}  match *.rar, *.r00, *.r01, *.r02 etc.
But this often needs some additional filtering of the results.

Now regular expressions. The "." works the same way the "?" did before, and [] works about the same as above. Think of the "*" as a multiplier that allows any number of occurrences of the preceding "atom" (a normal character, a meta-character like the ".", etc.). So ".*" has the meaning a single "*" had in the examples before (any number of anything), but "a*" means "any number of a's", "[0-9]*" means "any number of digits (or none)", and so on.

There are old and modern regular expressions with different syntax, and many dialects of both, which does not make things easy: some need a "\" in places where others don't. That is all described in man 7 re_format and in many places on the web, so here is only a rough overview to give you an idea:

There are more multipliers than "*":
Code:
*      match 0 or more of the previous atom (old and modern)
+      match 1 or more of the previous atom (modern)
?      match 0 or 1 of the previous atom (modern)
{2,4}  match the previous atom 2 to 4 times (modern; written \{2,4\} in old)
A "^" matches the beginning of a line, a "$" a line end.
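The multipliers and the two anchors can be checked against a handful of test words. This sketch uses grep -E, which selects the "modern" (extended) syntax; the word list is made up.

```shell
# "^" and "$" pin the pattern to the whole line, so only the number of
# "a" characters between "c" and "t" decides whether a word matches.
printf '%s\n' ct cat caat caaaaat | grep -E '^ca*t$'       # all four words
printf '%s\n' ct cat caat caaaaat | grep -E '^ca+t$'       # all but "ct"
printf '%s\n' ct cat caat caaaaat | grep -E '^ca?t$'       # "ct" and "cat"
printf '%s\n' ct cat caat caaaaat | grep -E '^ca{2,4}t$'   # only "caat"
```

Dropping the anchors changes the results, because the pattern may then match anywhere inside a longer word.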

You can use () or \(\) to remember parts of a matched expression and place them in the replacement string using \1, \2, ...:
Code:
sed 's/\([0-9][0-9]*\)\$/$\1/g'
This finds any number followed by a "$" and replaces it with a "$" followed by the same number. (If you use [0-9.,] instead of [0-9] you'll catch "5,000.00$" too.)
You have the idea now.
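A quick way to watch the remembered part at work is to run the dollar-sign substitution on a test string. This is a sketch in the old (basic) syntax with made-up input text; the pattern is quoted so the shell leaves the backslashes and the "$" alone.

```shell
# \( \) remembers the digits, \$ is a literal dollar sign, and \1 puts
# the remembered digits back after the "$" in the replacement.
echo 'Total: 500$' | sed 's/\([0-9][0-9]*\)\$/$\1/g'

# Allowing "." and "," inside the number also handles "5,000.00$".
echo 'Price: 5,000.00$' | sed 's/\([0-9.,][0-9.,]*\)\$/$\1/g'
```

The first command prints "Total: $500" and the second "Price: $5,000.00".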

Originally posted by C.J. Moof:
And if I really wanted to find a . (not match any character here), I'd refer to it as \. ?
Very right.
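For a side-by-side comparison of "." and "\.", using a made-up test string:

```shell
echo '1.5 125 1x5' | sed 's/1.5/X/g'    # "." matches any character
echo '1.5 125 1x5' | sed 's/1\.5/X/g'   # "\." matches only a literal dot
```

The first command prints "X X X", the second "X 125 1x5".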

Originally posted by C.J. Moof:
man sed only got me to define the e... I'll look at the other references. I've found it so much easier to discuss unix things to learn than to just read. It's a language all it's own, and I need to converse in it to really get it. Reading helps, but things like this open the doors.
You know you can scroll with the arrow keys, <space> and <B>? There is "man man" too. For me man sed is about 14 80x25 pages in the terminal. However you find all man pages on the web too: http://www.hmug.org/man/1/sed.html.
Originally posted by C.J. Moof:
Thanks for your time.
You're welcome, sometimes it's worth it.

-
     
utidjian
Senior User
Join Date: Jan 2001
Location: Mahwah, NJ USA
Mar 20, 2004, 09:00 PM
 
Looks like you are trying to re-invent a tool that already exists.

On my RedHat Linux servers (versions 7.2, 7.3, and 9) I use a tool called "sar" (see man sar). The manpage has a HUGE amount of detail on how to use it. See the examples near the bottom to get a feel for it. Using sar would be far more efficient than using a cronjob running uptime... mainly because all the logging is already done. All you need to do is query the database to get the stats you want. You can get stats on far more than what uptime provides.

Try "sar -u | less" as an example.

Good you learned lots about sed and awk though.
-DU-...etc...
     
JNI
Forum Regular
Join Date: Oct 2002
Location: Left Coast
Mar 21, 2004, 01:29 AM
 
Originally posted by utidjian:
Looks like you are trying to re-invent a tool that already exists.

On my RedHat Linux servers (versions 7.2, 7.3, and 9) I use a tool called "sar" (see man sar).
FYI, the sar command is in OS X too. At least in Panther - don't know about Jaguar.

Yet another cool *nix tool I never knew about, until now. Thanks.

edit: I can't get your example to do anything. Is it possible it is not enabled in the OS?
( Last edited by JNI; Mar 21, 2004 at 01:34 AM. )
     
utidjian
Senior User
Join Date: Jan 2001
Location: Mahwah, NJ USA
Mar 21, 2004, 11:21 AM
 
Originally posted by JNI:
FYI, the sar command is in OS X too. At least in Panther - don't know about Jaguar.

Yet another cool *nix tool I never knew about, until now. Thanks.

edit: I can't get your example to do anything. Is it possible it is not enabled in the OS?
So it is! I don't have any Jaguar systems up anymore to check. Seems that the Linux version is a bit different in implementation and scope. You can see the Linux manpage for sar at:
http://perso.wanadoo.fr/sebastien.godard/use_sar.html
and lots of other info on it at:
http://perso.wanadoo.fr/sebastien.godard/

I think the reason my example command didn't work is because there is no cronjob creating entries in /var/log/sa/.

I got this command to work though:
Code:
core:~ utidjian$ sar -u 1 5
09:59:22  %usr  %sys  %idle
09:59:24     0     2     98
09:59:25     0     3     97
09:59:26     1     6     93
09:59:27     2     4     94
09:59:28    19     3     78
Average:     4     3     92
core:~ utidjian$
Which grabs CPU usage every 1 second 5 times.

On one of my Linux servers I have an /etc/cron.d/sysstat file that contains:
Code:
# run system activity accounting tool every 10 minutes
*/10 * * * * root /usr/lib/sa/sa1 1 1
# generate a daily summary of process accounting at 23:53
53 23 * * * root /usr/lib/sa/sa2 -A
You can adjust accordingly for your system. There are some more example cron scripts at:
http://perso.wanadoo.fr/sebastien.godard/use_en.html

If there are some logs in /var/log/sa/ then the original command I posted should have worked on your system.

I much prefer sar to a tool like uptime. The "load averages" in uptime are exactly that... a collection of a lot of averages. They are not very informative as to exactly what is loading the system.

Another way to get some stats on system load is to run top like this:
Code:
core:~ utidjian$ top -l 1 -n 0
Processes: 47 total, 2 running, 45 sleeping... 107 threads 10:15:54
Load Avg: 0.14, 0.03, 0.01 CPU usage: 28.6% user, 71.4% sys, 0.0% idle
SharedLibs: num = 103, resident = 22.4M code, 2.54M data, 6.68M LinkEdit
MemRegions: num = 3392, resident = 30.4M + 8.54M private, 57.0M shared
PhysMem: 48.5M wired, 67.0M active, 118M inactive, 234M used, 277M free
VM: 2.27G + 71.1M 15119(0) pageins, 6(0) pageouts
core:~ utidjian$
Unfortunately top can also load the system quite a bit.
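If only the load figures from that top output are wanted, the same sed tricks discussed earlier in the thread apply. A rough sketch, assuming the "Load Avg:" layout shown in the output above; the function name load_from_top is made up, and other top versions may word this line differently.

```shell
# load_from_top: hypothetical helper. Reads "top -l 1 -n 0" style output
# on stdin and prints only the three numbers after "Load Avg:".
# -n plus the p flag prints just the line where the substitution succeeded.
load_from_top() {
    sed -n 's/.*Load Avg: *\([0-9.]*\), *\([0-9.]*\), *\([0-9.]*\).*/\1, \2, \3/p'
}

# Sample line taken from the output above.
echo 'Load Avg: 0.14, 0.03, 0.01 CPU usage: 28.6% user, 71.4% sys, 0.0% idle' | load_from_top
```

On a system with that top, the real thing would be top -l 1 -n 0 piped into load_from_top.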
-DU-...etc...
     
   
 
All contents of these forums © 1995-2017 MacNN. All rights reserved.