text file modification help
Professional Poster
Join Date: Sep 2000
Location: San Francisco
Status:
Offline
I have a tab-delimited text file that I need to modify. I can use the command line or TextWrangler, whichever would be easier. Each record has three fields separated by tabs and ends with a hard return. I'd like to prefix each record with a ">" character and put a hard return after the second field. So it currently looks like:
xxxx xxxx xxxxxxxxxxx
And I would like it to look like:
>xxxx xxxx
xxxxxxxxxxxxx
What would be the easiest way to do this? I know there is some simple command line tool that could handle it.
thanks,
kman
Clinically Insane
Join Date: Mar 2001
Location: yes
Status:
Offline
*erased command, didn't work as I originally thought*
where "test" is the name of my file I want to read
(Last edited by besson3c; Nov 12, 2007 at 01:27 PM.)
Clinically Insane
Join Date: Mar 2001
Location: yes
Status:
Offline
and if you want to output this to a new file:
*erased command, didn't work as I originally thought* > test2
where test2 is the name of the new file, test is the name of the orig file
(Last edited by besson3c; Nov 12, 2007 at 01:27 PM.)
Clinically Insane
Join Date: Mar 2001
Location: yes
Status:
Offline
Ahh... my command is not quite there. It only works with files containing one line.
Clinically Insane
Join Date: Mar 2001
Location: yes
Status:
Offline
Here you go:
awk -F'\t' '{ print ">" $1 "\t" $2 "\n" $3 }' test
$1 = the first column, $2 = the second column, etc., and -F'\t' makes awk split on tabs only. You may have to change these values if there is stuff before or after the columns you are interested in.
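For anyone following along, here is a small self-contained sketch of that approach. The file names sample.tab and sample.out and the two records are made up for illustration, and it assumes the fields are separated by real tabs and the lines end in Unix newlines.

```shell
# Build a two-record, tab-delimited sample file (hypothetical data).
printf '225\tnamehere\txxxxxxxx\n444\tname2here\tcccccccc\n' > sample.tab

# Prefix each record with ">", keep fields 1 and 2 on one line
# (tab-separated), and move field 3 to its own line.
awk -F'\t' '{ print ">" $1 "\t" $2 "\n" $3 }' sample.tab > sample.out

cat sample.out
```

Each input record should come out as two lines: the ">" line with the first two fields, then the third field by itself.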
Professional Poster
Join Date: Sep 2000
Location: San Francisco
Status:
Offline
E for effort. Any other unix jockeys out there that can help me out?
Clinically Insane
Join Date: Mar 2001
Location: yes
Status:
Offline
My last command works, kman. What is the problem with it? What output are you getting?
Professional Poster
Join Date: Sep 2000
Location: San Francisco
Status:
Offline
Sorry, I started my post before yours appeared. Thanks.
Professional Poster
Join Date: Sep 2000
Location: San Francisco
Status:
Offline
Hmm...still not working for me. I get the following:
Code:
$ awk {'print ">$1,$2 "\n" $3} tbd0allcopy.tab
>
The > symbols continue every time I press return.
Professional Poster
Join Date: Sep 2000
Location: San Francisco
Status:
Offline
When I do a more command on tbd0allcopy.tab I get the following:
225 namehere xxxxxxxxxxxxxxxxxxxxxxxx^M444 name2here cccccccccccccccccccccccccc^M
etc. The fields obviously contain:
225
namehere
xxxxxxxxxxxxxxxxxxxxxxxxx
separated by tabs and a carriage return (^M) at the end of the line.
kman
Clinically Insane
Join Date: Mar 2001
Location: yes
Status:
Offline
It looks like the file you are using has Windows line endings (carriage return plus line feed) rather than Unix line endings (line feed only), assuming that the ^M is appearing where the line breaks should be. You'll first need to convert the file from Windows to Unix line endings...
To do this:
awk '{ sub("\r$", ""); print }' winfile.txt > unixfile.txt
Lots of other ways to do this though:
How do I convert between Unix and Windows text files? - Knowledge Base
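One caveat worth checking first: in the more output quoted earlier, the records all run together on one line with ^M between them. That pattern usually points to bare carriage returns (old Mac-style line endings) rather than Windows CR+LF pairs, in which case stripping a trailing \r will not split the records. Here is a rough sketch of how to check and convert; the file names cr_sample.tab and unix_sample.tab and the data are made up.

```shell
# Hypothetical sample with bare-CR line endings (no line feeds at all).
printf '225\tnamehere\txxxxxxxx\r444\tname2here\tcccccccc\r' > cr_sample.tab

# Count each kind of line-ending byte to see what the file contains.
cr_count=$(tr -cd '\r' < cr_sample.tab | wc -c | tr -d '[:space:]')
lf_count=$(tr -cd '\n' < cr_sample.tab | wc -c | tr -d '[:space:]')
echo "CR: $cr_count  LF: $lf_count"

# CRs but no LFs: translate every CR into a LF.
# (For a CR+LF file you would delete the CRs instead: tr -d '\r')
tr '\r' '\n' < cr_sample.tab > unix_sample.tab
```

If the counts show roughly as many LFs as CRs, the file is more likely CR+LF and the awk command above is the right tool.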
Clinically Insane
Join Date: Mar 2001
Location: yes
Status:
Offline
Originally Posted by kman42
Hmm...still not working for me. I get the following:
Code:
$ awk {'print ">$1,$2 "\n" $3} tbd0allcopy.tab
>
The > symbols continue every time I press return.
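This is because you are missing the closing ' after the $3, so the shell thinks the quoted string is still open and keeps prompting with > for the rest of it. The closing " after the > inside the print statement is missing too.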
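With the quotes closed, and allowing for the carriage returns in that file, the whole job can probably be done in one pipeline. This is only a sketch: tbd_sample.tab and out.txt are made-up names, and the sample data is invented so the commands run as written. tr turns every CR into a newline first, and the NF test skips any empty lines that leaves behind if the file actually had CR+LF endings.

```shell
# Small stand-in for the real file, with CR line endings (hypothetical data).
printf '225\tnamehere\txxxxxxxx\r444\tname2here\tcccccccc\r' > tbd_sample.tab

# 1. tr: convert carriage returns to Unix newlines.
# 2. awk: split on tabs, skip empty lines (NF is 0 for them),
#    prefix ">" and put the third field on its own line.
tr '\r' '\n' < tbd_sample.tab |
  awk -F'\t' 'NF { print ">" $1 "\t" $2 "\n" $3 }' > out.txt

cat out.txt
```

To run it on the real file, swap tbd_sample.tab for tbd0allcopy.tab and skip the printf line.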