
Need to split a text file into many files
Mac Elite
Join Date: Mar 2001
Location: England
Dec 28, 2002, 08:41 PM
 
I have a large text file. It's actually words that accompany our worship books at church. Anyway, it contains about 1200 song words, all in one long file. I want to split it into one file per song.

I managed to put a <SONG> tag in between each one, by using a clever combination of search and replace. But what now? How on earth do I split it?

My first urge was to use Hypercard, and I wrote a script that seemed to work for smaller files... but either this one is too big or there's some other problem. At any rate, it won't run on this file. (I am using OS X, btw; Hypercard is just what I've used for things like this in the past.)

Anyone got any suggestions?


Amorya
What the nerd community most often fail to realize is that all features aren't equal. A well implemented and well integrated feature in a convenient interface is worth way more than the same feature implemented crappy, or accessed through a annoying interface.
     
Amorya  (op)
Mac Elite
Join Date: Mar 2001
Location: England
Dec 28, 2002, 08:45 PM
 
Here's the script I was trying to use, if we have any Hypercard people here...

(I just need a solution, btw. It can be in any method. So mods, don't go moving this to the classic forum just because I mentioned Hypercard please.)

Code:
on mouseUp
  close file nm
  answer file "Choose a file"
  put it into nm
  open file nm
  repeat
    read from file nm until return
    if char 1 of it is "<" then
      domenu "New Card"
      go last
    end if
    put it after bg field text
  end repeat
end mouseUp
I know it'd loop forever; that's not the problem. When I try my large file, it gives me the error "File not open". As I say, it works fine on smaller files.

Amorya
     
Mac Enthusiast
Join Date: Nov 2001
Location: Adelaide, South Australia
Dec 29, 2002, 09:08 AM
 
Hmm, it's not clear how you want file names to be generated, but here's a script that will do something like what you're after. (It just uses the first fifteen characters of the song, prepended with a unique number, as the filename.) I'm assuming that your songfile looks something like:

Song for Sunday
This is the first line of the song for Sunday
This is the second such line
La la la
Tra la la
<SONG>
Song for Monday
This is the first line of the song for Monday
This is the second such line
La la la
Tra la la
<SONG>
Another song for the book
Yeah yeah
Ok
<SONG>

That is, no leading tag, a tag between each song and a tag at the end. Should be pretty easy to get it in that form if it's not there already. Here's the script to split it up: save it as a file called "split.pl" or whatever you wish to call it, set the $dir variable within the script to where you want the files to appear (I'm assuming that your big song file also sits in this directory), change the name of the song file to whatever it's called and you should be right to go. See below for running instructions in case it's not obvious.

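#!/usr/bin/perl -w
use strict;
my $dir='/Users/pmccann/junk/';   # output directory; must end with a slash
my $songfile='songs.txt';         # the big file, assumed to sit in $dir
open my $in, '<', $dir.$songfile or die "no such file?: $!";
my ($i,$song) = (0,'');           # file counter and text of the current song
while (<$in>) {
    if (/<SONG>/) {               # a tag marks the end of the current song
        my $title = ++$i . "_" . substr($song,0,15);
        $title =~ s/\n.*//s;      # short first line: drop everything after it
        $title =~ s/[^\w.-]/_/g;  # change spaces, slashes etc. to underscores
        open my $out, '>', $dir.$title or die "opening outfile $title : $!";
        print $out $song;
        close $out or die "closing outfile $title : $!";
        $song = '';               # back to scratch if we see a tag
    } else {
        $song .= $_;
    }
}
close $in;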


To run the thing just go to the directory in which you've saved "split.pl" and enter the following two commands:

chmod u+x split.pl

./split.pl
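
If you'd rather see the idea at work on a tiny sample before pointing anything at the real file, here is a rough shell sketch of the same approach using awk. It is only a sketch under the layout assumed above (no leading tag, a tag after every song): the scratch directory from mktemp, the three made-up sample songs and the plain numbered file names (1.txt, 2.txt, ...) are all assumptions for illustration, not part of split.pl.

```shell
#!/bin/sh
# Trial run on a made-up three-song sample in a scratch directory.
work=$(mktemp -d)            # hypothetical scratch location
cat > "$work/songs.txt" <<'EOF'
Song for Sunday
La la la
<SONG>
Song for Monday
Tra la la
<SONG>
Another song for the book
Yeah yeah
<SONG>
EOF

# Every line is written to the current numbered file;
# a <SONG> line closes that file and moves on to the next number.
awk -v dir="$work" '
  BEGIN    { n = 1 }
  /<SONG>/ { close(dir "/" n ".txt"); n++; next }
           { print > (dir "/" n ".txt") }
' "$work/songs.txt"

# Sanity check: there should be one output file per tag.
tags=$(grep -c "<SONG>" "$work/songs.txt")
files=$(ls "$work" | grep -c '^[0-9][0-9]*\.txt$')
echo "tags: $tags, files: $files"
```

On the sample this should print "tags: 3, files: 3". The same comparison (number of tags against number of files created) is a quick way to check the result of the real run too.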

Let me know if you have any problems running the thing, or if the instructions given here aren't enough to allow you to get it to work. (In particular, if you're not comfortable editing files in the terminal). It's not the most elegant script in the world but it should get your job done, and done extremely quickly.

Cheers,
Paul
(Last edited by Paul McCann; Dec 29, 2002 at 09:14 AM. )
     
Amorya  (op)
Mac Elite
Join Date: Mar 2001
Location: England
Dec 29, 2002, 05:37 PM
 
Great, thanks. I'll try it soon. Quickly would be good... I got my Hypercard one to run, but it would have taken 40 minutes (I only let it get up to 3 songs, haven't let it finish yet).

I'm OK in the terminal - never tried perl though. Looks reasonably simple, and I can't see anything there that'll erase my hard drive or anything.

Amorya
All times are GMT -5.