Text editing help.
Senior User
Join Date: Jul 2001
Status:
Offline
Hi, I need help cleaning up text EDL files.
They are output like this:
0016 BLK V D 060 00:00:00:00 00:00:02:00 01:06:45;26 01:06:47;26
0017 AUX V C 00:00:00:03 00:00:08:25 01:09:35;29 01:09:44;21
0018 AUX V C 00:00:00:00 00:00:10:17 01:09:47;11 01:09:57;28
0019 AUX V C 00:00:00:00 00:00:10:08 01:09:59;24 01:10:10;02
0020 6408 V C 00:26:26;28 00:26:29;17 01:10:41;07 01:10:57;13
PEG A 016 00:00:00 6408
* REPAIR: FROM SOURCE TRUE SPEED IS 4.869000 FPS
0021 6408 V C 00:26:26;28 00:26:27;29 01:11:00;17 01:11:07;01
PEG A 016 00:00:00 6408
* REPAIR: FROM SOURCE TRUE SPEED IS 4.869000 FPS
0022 6408 V C 00:26:28;00 00:26:28;00 01:11:07;01 01:11:07;01
0022 BLK V D 060 00:00:00:00 00:00:02:00 01:11:07;01 01:11:09;01
And I need them to be cleaned up like this:
0008 AUX 00:00:00:00 00:00:05:28 01:04:40;00 01:04:45;28
0009 AUX 00:00:05:28 00:00:05:28 01:04:45;28 01:04:45;28
0009 AUX 00:00:00:22 00:00:06:18 01:04:45;28 01:04:51;24
0010 BLK 00:00:00:00 00:00:00:00 01:05:09;19 01:05:09;19
I have used Tex-Edit Plus to do half of the work to clean it up this far. But these files are HUGE, and it takes too long to do this by hand.
Are there any links on text parsing and similar topics?
In the clean file I put a tab in place of each space so I can then import it into FileMaker Pro. So I know that it does work; I just have to automate the text clean-up part somehow.
Senior User
Join Date: Jul 2001
Status:
Offline
0008 AUX 00:00:00:00 00:00:05:28 01:04:40;00 01:04:45;28
0009 AUX 00:00:05:28 00:00:05:28 01:04:45;28 01:04:45;28
0009 AUX 00:00:00:22 00:00:06:18 01:04:45;28 01:04:51;24
0010 BLK 00:00:00:00 00:00:00:00 01:05:09;19 01:05:09;19
Mac Elite
Join Date: Dec 2001
Location: Atlanta, GA, USA
Status:
Offline
So you just want to delete lines that begin with "*"?
Mac Pro 2x 2.66 GHz Dual core, Apple TV 160GB, two Windows XP PCs
Senior User
Join Date: Jul 2001
Status:
Offline
Originally posted by Arkham_c:
So you just want to delete lines that begin with "*"?
Yes, that is one of them.
Also stuff like this:
0016 BLK V D 060 00:00:00:00 00:00:02:00 01:06:45;26 01:06:47;26
0017 AUX V C 00:00:00:03 00:00:08:25 01:09:35;29 01:09:44;21
PEG A 016 00:00:00 6408
Look at this line:
0016 BLK V D 060 00:00:00:00 00:00:02:00 01:06:45;26 01:06:47;26
The "060" before the 00:00:00:00 group needs to be deleted too, but it changes from 060 to 027 and other seemingly random numbers.
Then there is the " V D " or " V C " (and other letters), as in:
0017 AUX V C 00:00:00:03 00:00:08:25 01:09:35;29 01:09:44;21
Those two things have to be removed for it to import cleanly into FileMaker Pro.
This is what a fresh, untouched EDL file looks like:
0001 6408 V D 020 00:26:26;28 00:26:34;00 01:01:42;24 01:01:56;28
PEG A 050 00:00:00 6510
PEG B 016 00:00:00 6408
* REPAIR: TO SOURCE TRUE SPEED IS 4.869000 FPS
0002 6408 V C 00:26:29;07 00:26:29;07 01:01:56;28 01:01:56;28
0002 BLK V D 020 00:00:00:00 00:00:00:20 01:01:56;28 01:01:57;18
PEG A 016 00:00:00 6408
* REPAIR: FROM SOURCE TRUE SPEED IS 4.869000 FPS
0003 BLK V C 00:00:00:00 00:00:00:00 01:02:16;11 01:02:16;11
0003 AUX V D 075 00:00:00:00 00:00:07:27 01:02:16;11 01:02:24;08
0004 BLK V C 00:00:00:00 00:00:00:00 01:02:46;02 01:02:46;02
0004 HISNTS V D 020 01:03:20;28 01:03:25;28 01:02:46;02 01:02:51;02
0005 HISNTS V C 01:03:25;28 01:03:25;28 01:02:51;02 01:02:51;02
0005 BLK V D 020 00:00:00:00 00:00:00:20 01:02:51;02 01:02:51;22
0006 BLK V C 00:00:00:00 00:00:00:00 01:03:01;25 01:03:01;25
0006 HISNTS V D 020 01:03:20;28 01:03:25;22 01:03:01;25 01:03:06;19
0007 BLK V C 00:00:00:00 00:00:00:00 01:04:15;17 01:04:15;17
0007 HISNTS V D 020 01:03:20;28 01:03:25;22 01:04:15;17 01:04:20;11
Mac Elite
Join Date: Nov 2001
Location: Trafalmadore
Status:
Offline
I am sure you could parse the thing easily with Perl, given that you know the parsing rules. You could even do it in Basic if you don't know Perl, assuming you know all the variations that need to be changed. If this is a one-time thing, then it would be easier to do by hand.
Looking at the examples you gave, it seems that none of the data from the first 00:00... timecode onward needs to be changed, just the data preceding it.
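To illustrate that observation, here is a minimal Python sketch (not Perl or Basic, and not something posted in this thread). It assumes every event line starts with a numeric event number and a reel name, and that the first full timecode marks the start of the part to keep unchanged. The names FIRST_TIMECODE and trim_prefix are made up for this example.

```python
import re

# A full timecode looks like 00:00:02:00 or 01:06:45;26 (';' in drop-frame).
FIRST_TIMECODE = re.compile(r"\d{2}:\d{2}:\d{2}[:;]\d{2}")


def trim_prefix(line):
    """Keep the first two prefix fields plus the untouched timecode tail.

    Returns None for lines that are not events (for example "* REPAIR: ..."
    or "PEG A ..." lines, which contain no full four-part timecode).
    """
    found = FIRST_TIMECODE.search(line)
    if found is None:
        return None
    prefix = line[:found.start()].split()
    tail = line[found.start():].strip()
    # Assumption: a real event begins with a numeric event number and a reel.
    if len(prefix) < 2 or not prefix[0].isdigit():
        return None
    return " ".join(prefix[:2]) + " " + tail


if __name__ == "__main__":
    print(trim_prefix(
        "0016 BLK V D 060 00:00:00:00 00:00:02:00 01:06:45;26 01:06:47;26"
    ))
```

Because the tail is copied as-is, nothing after the first timecode can be altered by mistake; only the fields in front of it are reduced.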
Senior User
Join Date: Jul 2001
Status:
Offline
Originally posted by SMacTech:
I am sure you could parse the thing easily with Perl, given that you know the parsing rules. You could even do it in Basic if you don't know Perl, assuming you know all the variations that need to be changed. If this is a one-time thing, then it would be easier to do by hand.
Looking at the examples you gave, it seems that none of the data from the first 00:00... timecode onward needs to be changed, just the data preceding it.
Sounds good. Any help with this? Links? Tutorials?
This will be part of a work pipeline: it puts iMovie EDL data files into a searchable FileMaker database.
I am just looking for a faster method than logging it all by hand.
Mac Elite
Join Date: Dec 2001
Location: Atlanta, GA, USA
Status:
Offline
Here's a Python script that will do what you want. Put it in a file called "clean.py" and then run it via "python3 clean.py filename".
Code:
#!/usr/bin/env python3
import sys


def clean_file(path):
    """Write a tab-separated copy of an EDL, keeping only event lines."""
    out_name = path + ".out"
    # Text mode with default newline handling also splits on Mac "\r" endings.
    with open(path, "r") as inf, open(out_name, "w") as outf:
        for line in inf:
            fields = line.split()
            # Skip blank lines, "* REPAIR" comments and "PEG" lines:
            # real events start with a numeric event number.
            if len(fields) < 6 or not fields[0].isdigit():
                continue
            # Keep event number, reel name and the last four timecodes;
            # this drops "V C" / "V D" and the optional number (e.g. 060).
            outf.write("\t".join(fields[:2] + fields[-4:]) + "\n")
    print("output saved to " + out_name)


if __name__ == "__main__":
    if len(sys.argv) == 2:
        clean_file(sys.argv[1])
    else:
        print("Usage: %s <input_file>" % sys.argv[0])
        raise SystemExit(1)
I'm sure someone will come up with a one-liner in Perl to do this.
(Last edited by Arkham_c; Mar 12, 2004 at 08:26 AM.)
Mac Elite
Join Date: May 2001
Status:
Offline
Or you could use a single line with sed in the terminal:
Code:
sed -e '/^[0-9]\{4,\}/!d;s/\([^ ]* [^ ]*\)[A-Z0-9 ]* \(.*\)/\1 \2/' inputfile > outputfile
-
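For readers who find the sed expression hard to read, the sketch below restates its two steps with Python's re module. It is only an approximation: Python regexes backtrack rather than follow POSIX leftmost-longest rules, so results may differ from sed on input with runs of several spaces. The names EVENT_START, STRIP_MIDDLE and sed_like are invented for this example.

```python
import re

# Step 1 of the sed script: /^[0-9]\{4,\}/!d
# Delete every line that does not start with at least four digits.
EVENT_START = re.compile(r"^[0-9]{4,}")

# Step 2: s/\([^ ]* [^ ]*\)[A-Z0-9 ]* \(.*\)/\1 \2/
# Keep the first two space-separated words, skip the following run of
# capitals, digits and spaces, and keep everything after it.
STRIP_MIDDLE = re.compile(r"([^ ]* [^ ]*)[A-Z0-9 ]* (.*)")


def sed_like(lines):
    """Yield cleaned event lines, mimicking the sed one-liner."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not EVENT_START.match(line):
            continue
        # count=1 mirrors sed's default of one substitution per line.
        yield STRIP_MIDDLE.sub(r"\1 \2", line, count=1)


if __name__ == "__main__":
    sample = [
        "* REPAIR: TO SOURCE TRUE SPEED IS 4.869000 FPS",
        "0004 AUX V D 020 00:00:18:03 00:00:23:03 01:02:46;02 01:02:51;02",
        "PEG A 050 00:00:00 6510",
    ]
    for cleaned in sed_like(sample):
        print(cleaned)
```

The middle character class stops at the first timecode because ":" is not in [A-Z0-9 ], which is what lets the pattern cope with both "V C" and "V D 020".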
Senior User
Join Date: Jul 2001
Status:
Offline
Progress report
Sure!
I have almost figured out the method to use.
With BBEdit or TextWrangler, I found that I can select text vertically! WOO HOO! That will help so much with other stuff too.
The messy text looks like this:
* REPAIR: TO SOURCE TRUE SPEED IS 4.869000 FPS
0004 BLK V C 00:00:00:00 00:00:00:00 01:02:46;02 01:02:46;02
0004 AUX V D 020 00:00:18:03 00:00:23:03 01:02:46;02 01:02:51;02
0005 AUX V C 00:00:23:03 00:00:23:03 01:02:51;02 01:02:51;02
0005 BLK V D 020 00:00:00:00 00:00:00:20 01:02:51;02 01:02:51;22
0006 BLK V C 00:00:00:00 00:00:00:00 01:03:01;25 01:03:01;25
0006 AUX V D 020 00:00:18:03 00:00:22:27 01:03:01;25 01:03:06;19
0007 BLK V C 00:00:00:00 00:00:00:00 01:04:15;17 01:04:15;17
0007 AUX V D 020 00:00:18:03 00:00:22:27 01:04:15;17 01:04:20;11
0008 BLK V C 00:00:00:00 00:00:00:00 01:05:09;19 01:05:09;19
0008 AUX V D 020 00:00:18:03 00:00:22:27 01:05:09;19 01:05:14;13
0009 6510 V C 01:26:19;15 01:26:27;18 01:10:41;07 01:10:57;13
PEG A 050 00:00:00 6510
0010 6510 V C 01:26:19;15 01:26:22;22 01:11:00;17 01:11:07;01
PEG A 050 00:00:00 6510
0011 6510 V C 01:26:22;22 01:26:22;22 01:11:07;01 01:11:07;01
0011 BLK V D 060 00:00:00:00 00:00:02:00 01:11:07;01 01:11:09;01
And studying that text, I have found that the "V", "C" and "D" in the center are always in the same columns, and so are the three-digit numbers, like this:
V D 060
V D 060
V D 060
V D 060
V D 060
V D 060
They also always sit in character columns 13-27 of the line.
So if I can remove characters 13-27 from all those lines, that will get half of it done.
Then I would need to remove all lines that begin with
strings that I choose as not useful,
like "* REPAIR:".
So if I could wildcard all lines that begin with something like "* REPAIR:" and "PEG A",
that would do the complete clean-up! Yahoo! No more copy and paste till I'm blue in the face.
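The column idea above could be sketched in Python roughly as follows. This assumes the raw file really is fixed-width, with the unwanted fields in character columns 13-27 (counted from 1), and that unwanted lines can be recognised by their first characters. The names UNWANTED_PREFIXES, cut_columns and clean are hypothetical, and the sample lines are built by hand to match that assumed layout.

```python
# Lines starting with any of these are dropped entirely
# (for example "* REPAIR: ..." and "PEG A ...").
UNWANTED_PREFIXES = ("*", "PEG")


def cut_columns(line, first=13, last=27):
    """Remove 1-based character columns first..last (inclusive)."""
    return line[:first - 1] + line[last:]


def clean(lines):
    """Yield tab-separated event lines with the middle columns removed."""
    for line in lines:
        if not line.strip() or line.lstrip().startswith(UNWANTED_PREFIXES):
            continue
        # Collapse the remaining runs of spaces into single tabs.
        yield "\t".join(cut_columns(line).split())


if __name__ == "__main__":
    # Hypothetical fixed-width sample: 12 columns, then 15, then timecodes.
    timecodes = "00:00:00:00 00:00:02:00 01:06:45;26 01:06:47;26"
    sample = [
        "* REPAIR: TO SOURCE TRUE SPEED IS 4.869000 FPS",
        "0016 BLK".ljust(12) + "V     D    060".ljust(15) + timecodes,
        "PEG A 016 00:00:00 6408",
    ]
    for row in clean(sample):
        print(row)
```

If the columns turn out not to line up in every file, splitting on whitespace and keeping fields by position is the safer approach.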
Mac Elite
Join Date: May 2001
Status:
Offline
OK, sorry. If you enter my line directly in tcsh, you need one more backslash, because tcsh treats an unescaped ! as history expansion even inside single quotes:
Code:
sed -e '/^[0-9]\{4,\}/\!d;s/\([^ ]* [^ ]*\)[A-Z0-9 ]* \(.*\)/\1 \2/' infile > outfile
I tried it with your above data:
Code:
* REPAIR: TO SOURCE TRUE SPEED IS 4.869000 FPS
0004 BLK V C 00:00:00:00 00:00:00:00 01:02:46;02 01:02:46;02
0004 AUX V D 020 00:00:18:03 00:00:23:03 01:02:46;02 01:02:51;02
0005 AUX V C 00:00:23:03 00:00:23:03 01:02:51;02 01:02:51;02
0005 BLK V D 020 00:00:00:00 00:00:00:20 01:02:51;02 01:02:51;22
0006 BLK V C 00:00:00:00 00:00:00:00 01:03:01;25 01:03:01;25
0006 AUX V D 020 00:00:18:03 00:00:22:27 01:03:01;25 01:03:06;19
0007 BLK V C 00:00:00:00 00:00:00:00 01:04:15;17 01:04:15;17
0007 AUX V D 020 00:00:18:03 00:00:22:27 01:04:15;17 01:04:20;11
0008 BLK V C 00:00:00:00 00:00:00:00 01:05:09;19 01:05:09;19
0008 AUX V D 020 00:00:18:03 00:00:22:27 01:05:09;19 01:05:14;13
0009 6510 V C 01:26:19;15 01:26:27;18 01:10:41;07 01:10:57;13
PEG A 050 00:00:00 6510
0010 6510 V C 01:26:19;15 01:26:22;22 01:11:00;17 01:11:07;01
PEG A 050 00:00:00 6510
0011 6510 V C 01:26:22;22 01:26:22;22 01:11:07;01 01:11:07;01
0011 BLK V D 060 00:00:00:00 00:00:02:00 01:11:07;01 01:11:09;01
and it gives:
Code:
0004 BLK 00:00:00:00 00:00:00:00 01:02:46;02 01:02:46;02
0004 AUX 00:00:18:03 00:00:23:03 01:02:46;02 01:02:51;02
0005 AUX 00:00:23:03 00:00:23:03 01:02:51;02 01:02:51;02
0005 BLK 00:00:00:00 00:00:00:20 01:02:51;02 01:02:51;22
0006 BLK 00:00:00:00 00:00:00:00 01:03:01;25 01:03:01;25
0006 AUX 00:00:18:03 00:00:22:27 01:03:01;25 01:03:06;19
0007 BLK 00:00:00:00 00:00:00:00 01:04:15;17 01:04:15;17
0007 AUX 00:00:18:03 00:00:22:27 01:04:15;17 01:04:20;11
0008 BLK 00:00:00:00 00:00:00:00 01:05:09;19 01:05:09;19
0008 AUX 00:00:18:03 00:00:22:27 01:05:09;19 01:05:14;13
0009 6510 01:26:19;15 01:26:27;18 01:10:41;07 01:10:57;13
0010 6510 01:26:19;15 01:26:22;22 01:11:00;17 01:11:07;01
0011 6510 01:26:22;22 01:26:22;22 01:11:07;01 01:11:07;01
0011 BLK 00:00:00:00 00:00:02:00 01:11:07;01 01:11:09;01
If it still does not work for you it would be helpful if you could describe what happens instead.
-
Senior User
Join Date: Jul 2001
Status:
Offline
When I try the sed command, I get this:
[filemaker:~/Desktop/w] filemake% sed -e '/^[0-9]\{4,\}/!d;s/\([^ ]* [^ ]*\)[A-Z0-9 ]* \(.*\)/\1 \2/' inputfile > a.txt
tcsh: d: Event not found.
[filemaker:~/Desktop/w] filemake%
I have cd'd to the w directory, and in there I have the file named a.txt, so I type a.txt.
Then I tried just dragging the file into the window; same deal.
I did a "man sed", so I know that it is there.
I am on 10.3.2, if that helps.
I know a little bit about the terminal. I know that I am in tcsh on ttyp2.
Also, if I do it like this:
[filemaker:~/Desktop/w] filemake% sed -e '/^[0-9]\{4,\}/\!d;s/\([^ ]* [^ ]*\)[A-Z0-9 ]* \(.*\)/\1 \2/' testfornick.EDL > g.edl
[filemaker:~/Desktop/w] filemake%
It does make a new file, but nothing is in it.
(Last edited by loren s; Mar 12, 2004 at 02:22 PM.)
Mac Elite
Join Date: May 2001
Status:
Offline
Okay, the second version should work for you.
It does not matter whether you use a full pathname or only a filename for infile and outfile, but for convenience I suggest you create a special folder for your tries and 'cd' to it so you only have to deal with the filenames.
Now, it is possible that your file has "Mac line endings" and not Unix-style ones. Sed works only with Unix linefeeds; otherwise it will consider the whole file a single line. Try to check that, and either save your file as Unix text (or whatever your editor calls that option), or try:
Code:
cat infile | tr "\r" "\n" | sed -e '/^[0-9]\{4,\}/\!d;s/\([^ ]* [^ ]*\)[A-Z0-9 ]* \(.*\)/\1 \2/' > outfile
(The tr "\r" "\n" will translate all Mac \r returns into Unix \n newlines).
-
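If it is unclear which line endings a file actually uses, a small Python check along these lines can tell. This is a sketch only; the function names count_endings and normalize_newlines are made up here. Unlike a plain tr "\r" "\n", it also handles Windows-style "\r\n" pairs without turning them into blank lines.

```python
def count_endings(data: bytes) -> dict:
    """Count Windows (CRLF), classic Mac (bare CR) and Unix (bare LF) endings."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "cr": data.count(b"\r") - crlf,
        "lf": data.count(b"\n") - crlf,
    }


def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to Unix LF."""
    # Replace CRLF first so each pair becomes one newline, not two.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    raw = b"0004 BLK V C 00:00:00:00\r0004 AUX V D 020 00:00:18:03\r"
    print(count_endings(raw))
    print(normalize_newlines(raw).decode("ascii"), end="")
```

To use it on a real file, read the file in binary mode ("rb"), pass the bytes to normalize_newlines, and write the result back out in binary mode ("wb") before running sed on it.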
Senior User
Join Date: Jul 2001
Status:
Offline
Originally posted by Moonray:
Okay, the second version should work for you.
It does not matter whether you use a full pathname or only a filename for infile and outfile, but for convenience I suggest you create a special folder for your tries and 'cd' to it so you only have to deal with the filenames.
Now, it is possible that your file has "Mac line endings" and not Unix-style ones. Sed works only with Unix linefeeds; otherwise it will consider the whole file a single line. Try to check that, and either save your file as Unix text (or whatever your editor calls that option), or try:
Code:
cat infile | tr "\r" "\n" | sed -e '/^[0-9]\{4,\}/\!d;s/\([^ ]* [^ ]*\)[A-Z0-9 ]* \(.*\)/\1 \2/' > outfile
(The tr "\r" "\n" will translate all Mac \r returns into Unix \n newlines).
-
Hi, sorry for not getting back to you on this.
I could not get the command to work for the life of me. I tried and tried. I am sure that I am doing something wrong.
I have figured out a workable method for now. I will post here again when I am ready with new questions. Thank you.
Mac Elite
Join Date: May 2001
Status:
Offline
Oh well, you're welcome.
But if you're ever going to investigate this more, start with
Code:
cat infile | sed -e 's/a/z/g' > outfile
That should substitute every 'a' in infile with 'z' globally.
-