Welcome to the MacNN Forums.

Help Me Grep, Please
immsav
Fresh-Faced Recruit
Join Date: Dec 2002
Location: Oxford
Jun 19, 2003, 03:00 PM
 
I've got BBEdit open. I've got a webpage full of entries like this:

<strong>Anti-Slavery International</strong> - <a href="http://www.antislavery.org">http://www.antislavery.org</a>
– ASI is the oldest human rights organization in world and the sister
organization to Free the Slaves, based in London, England.

I want to change this to:

<dt><a href="http://www.antislavery.org">Anti-Slavery International</a></dt><dd>ASI is the oldest human rights organization in world and the sister organization to Free the Slaves, based in London, England.</dd>

This would take hours to do by hand, but I know there's a quick solution using BBEdit's grep search and replace feature. After trying out different patterns for half an hour or so, I haven't been having much luck.

Can anyone here help me?

Thanks,

JP
     
Paul McCann
Mac Enthusiast
Join Date: Nov 2001
Location: Adelaide, South Australia
Jun 19, 2003, 08:34 PM
 
Try something like:

Find:

<strong>(.*)</strong>.*(<a href.*?>).*</a>([^<]*)

Replace with:

<dt>\2\1</a></dt>\r<dd>\3</dd>\r

(obviously with "Use grep" ticked!). It's far from perfect, but it might get the job done. Parsing HTML is usually an imperfect sort of operation.
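
For anyone who would rather script this than run it inside BBEdit, here is a rough Python sketch of the same idea. It reuses the Find pattern above unchanged. The function name `to_definition_list`, the sample string, and the extra tidy-up step (dropping the leading dash and joining the wrapped lines, which the grep replacement on its own does not do) are assumptions added for illustration, not part of the original suggestion.

```python
import re

# The Find pattern from the post, unchanged.
# Group 1: organization name, group 2: opening <a ...> tag,
# group 3: everything after </a> up to the next "<" (the description).
PATTERN = re.compile(r'<strong>(.*)</strong>.*(<a href.*?>).*</a>([^<]*)')


def to_definition_list(html: str) -> str:
    """Rewrite each <strong>name</strong> - <a>url</a> entry as <dt>/<dd>."""
    def repl(match: re.Match) -> str:
        name, anchor, desc = match.group(1), match.group(2), match.group(3)
        # Assumed tidy-up: join wrapped lines, then drop the leading dash.
        # "\u2013" is the en dash that starts each description in the sample.
        desc = " ".join(desc.split()).lstrip("\u2013- ").strip()
        return f"<dt>{anchor}{name}</a></dt><dd>{desc}</dd>\n"

    return PATTERN.sub(repl, html)


sample = (
    '<strong>Anti-Slavery International</strong> - '
    '<a href="http://www.antislavery.org">http://www.antislavery.org</a>\n'
    '\u2013 ASI is the oldest human rights organization in world and the sister\n'
    'organization to Free the Slaves, based in London, England.\n'
)

print(to_definition_list(sample))
```

Like the grep version, this breaks if a description itself contains a `<`, because group 3 stops at the first one it meets.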

Best of luck,
Paul
     
immsav  (op)
Fresh-Faced Recruit
Join Date: Dec 2002
Location: Oxford
Jun 20, 2003, 04:28 PM
 
Thanks, Paul. Your solution worked out well, although I had to escape the brackets and slashes with backslashes--only those outside of the parentheses, though. Curious.

You definitely saved me a few hours of tedious editing. Thanks again.

-JP
     
Paul McCann
Mac Enthusiast
Join Date: Nov 2001
Location: Adelaide, South Australia
Jun 20, 2003, 10:49 PM
 
No problem whatsoever! Glad to hear that you got there in the end.

I'm still a bit confused about the escaping business, however: the pattern works fine in BBEdit 6.5.3 exactly as printed in my post above (for what it's worth). I guess they've just changed the rules for what counts as a metacharacter between versions.
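
As a small illustration of how these rules differ between engines (it says nothing certain about any particular BBEdit version): in Python's `re` module a backslash before punctuation such as `<`, `>` or `/` simply matches that character, so the escaped and unescaped spellings behave identically. In GNU grep, by contrast, `\<` and `\>` are word-boundary anchors. The variable names below are made up for the example.

```python
import re

# The pattern as originally posted.
plain = r'<strong>(.*)</strong>.*(<a href.*?>).*</a>([^<]*)'

# The same pattern with the angle brackets and slashes outside the
# parentheses escaped, roughly as described in the follow-up post.
escaped = r'\<strong\>(.*)\<\/strong\>.*(<a href.*?>).*\<\/a\>([^<]*)'

line = (
    '<strong>Anti-Slavery International</strong> - '
    '<a href="http://www.antislavery.org">http://www.antislavery.org</a>'
)

plain_groups = re.search(plain, line).groups()
escaped_groups = re.search(escaped, line).groups()

# In Python's engine both spellings capture exactly the same text.
same = plain_groups == escaped_groups
print(same)
```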

Best wishes,
Paul
     
absmiths
Mac Elite
Join Date: Sep 2000
Location: Edmond, OK USA
Jun 23, 2003, 04:30 PM
 
Originally posted by immsav:
I've got BBEdit open. I've got a webpage full of entries like this:

<dt><a href="http://www.antislavery.org">Anti-Slavery International</a></dt><dd>ASI is the oldest human rights organization in world and the sister organization to Free the Slaves, based in London, England.</dd>

While you are at it, you should correct a grammatical mistake you made in both versions. It should read: "ASI is the oldest human rights organization in the world and ..."
     
   
 
All times are GMT -4.