Java regular expression
Forum Regular
Join Date: Jan 2001
Sep 30, 2004, 10:34 AM
 
I need to capture strings like:

P(Risk) = 0.90
P(Disease | Risk) = .02
P(Disease | !Risk) = 0.005
Q(Disease | Symptom)
Q(Symptom | Risk)

and so on. '#' starts a comment, so for example I don't want to capture

# P(Disease | Symptom)=0.1

How can I do this with regular expressions?

This is how I do it now:

Code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// expresion holds the input text
String B = "(P.*\\((.*?)\\).*=(.*))";
Pattern p = Pattern.compile(B);
Matcher m = p.matcher(expresion);
/* find P(exp) == v */
while (m.find()) {
    System.out.println(m.group(0));
}
/* find Q(...) */
String C = "Q.*\\((.*?)\\)";
p = Pattern.compile(C);
m = p.matcher(expresion);
while (m.find()) {
    System.out.println(m.group(0));
}

But that also captures the #P(..) and #Q(..) expressions. How can I ignore the comments?

I tried adding [^#] in front of the expressions (B and C), but that didn't work. Shouldn't ^ inside a character class negate the matching, i.e. not match '#'?
     
Dedicated MacNNer
Join Date: Feb 2001
Location: Manhattan
Sep 30, 2004, 01:32 PM
 
Try adding ^\s*[^#] to the front of the pattern. It looks for optional whitespace at the start of a line followed by a character that is not '#'. Maybe; I haven't tried it.
     
geran  (op)
Forum Regular
Join Date: Jan 2001
Sep 30, 2004, 02:25 PM
 
That didn't work.

"^.*[^#](P.*\\((.*?)\\).*=(.*))"
only matches the commented lines. [^#] doesn't seem to negate the matching of '#'.

Correct me if I got it wrong, but shouldn't "^.*[^#]" mean all lines beginning with any whitespace character (zero or more) but NOT containing '#'?
     
Mac Elite
Join Date: Oct 1999
Location: San Jose, Ca
Sep 30, 2004, 02:41 PM
 
Originally posted by geran:
Correct me if I got it wrong, but shouldn't "^.*[^#]" mean all lines beginning with any whitespace character (zero or more) but NOT containing '#'?
No, that pattern matches any line starting with zero or more of any character, followed by one character other than '#'. The '.' matches any character, including '#', so this is not a very meaningful match.
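
To make that concrete, here is a small sketch in Java (the class and method names are made up for illustration, they are not from the original code). It applies "^.*[^#]" to one commented and one uncommented line from the first post and returns whatever the pattern matched:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegatedClassDemo {
    // Returns the text matched by "^.*[^#]" in the line, or null if nothing matches.
    static String prefixMatch(String line) {
        Matcher m = Pattern.compile("^.*[^#]").matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // ".*" happily consumes the leading '#'; "[^#]" only needs
        // one later character that is not '#'.
        System.out.println(prefixMatch("# P(Disease | Symptom)=0.1"));
        System.out.println(prefixMatch("P(Risk) = 0.90"));
    }
}
```

Both calls print the whole input line, so the commented line is not excluded. The negated class only says "the character at this position is not '#'"; it says nothing about the rest of the line.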

I don't ever do regexp's in Java, preferring to do them in Perl (where the gods meant regexp to be done... *chuckle*), but the pattern there would be something like:

/^\s*[PQ]\(/

Of course, I don't know exactly what the file you are filtering looks like, so there are a lot of variants on this pattern that would filter out data you don't want, or simplify things to parse out the data you do want.
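
For what it's worth, a rough Java spelling of the Perl pattern above might look like the sketch below. It assumes the input is one expression per line and that a comment always starts at the beginning of its line; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class AnchoredFilterDemo {
    // Java version of /^\s*[PQ]\(/ : optional leading whitespace,
    // then P or Q, then an opening parenthesis.
    static final Pattern STARTS_WITH_P_OR_Q = Pattern.compile("^\\s*[PQ]\\(");

    // Keeps only the lines that start with P( or Q(, trimmed of surrounding whitespace.
    static List<String> keepExpressions(String text) {
        List<String> kept = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            if (STARTS_WITH_P_OR_Q.matcher(line).find()) {
                kept.add(line.trim());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String text = "P(Risk) = 0.90\n"
                + "# P(Disease | Symptom)=0.1\n"
                + "  Q(Symptom | Risk)\n";
        for (String s : keepExpressions(text)) {
            System.out.println(s);
        }
    }
}
```

The commented line is dropped because '#' is neither whitespace nor P or Q, so the anchored pattern cannot match at the start of that line.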
     
Mac Elite
Join Date: Sep 2000
Location: in front of the keyboard
Oct 1, 2004, 11:00 AM
 
You could always use what you have, then check for comments.
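
One way to read that suggestion, as a sketch only: keep the two patterns from the first post unchanged and skip any line whose first non-blank character is '#' before matching. The line-by-line split and the class and method names are assumptions, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SkipCommentsDemo {
    // The two patterns from the first post, unchanged.
    static final Pattern P_EXPR = Pattern.compile("(P.*\\((.*?)\\).*=(.*))");
    static final Pattern Q_EXPR = Pattern.compile("Q.*\\((.*?)\\)");

    // Collects every P(...) = v and Q(...) match from lines that are not comments.
    static List<String> capture(String expression) {
        List<String> found = new ArrayList<>();
        for (String line : expression.split("\\r?\\n")) {
            if (line.trim().startsWith("#")) {
                continue; // whole-line comment: ignore it
            }
            Matcher m = P_EXPR.matcher(line);
            while (m.find()) {
                found.add(m.group(0));
            }
            m = Q_EXPR.matcher(line);
            while (m.find()) {
                found.add(m.group(0));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "P(Risk) = 0.90\n"
                + "# P(Disease | Symptom)=0.1\n"
                + "Q(Disease | Symptom)\n"
                + "#Q(Symptom | Risk)\n";
        for (String s : capture(input)) {
            System.out.println(s);
        }
    }
}
```

This does not handle a '#' that appears after an expression on the same line; that case would need its own rule.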
     
   
All times are GMT -5.
All contents of these forums © 1995-2011 MacNN. All rights reserved.