Decoding URL data
Junior Member
Join Date: Sep 2000
Location: Calgary, Alberta, Canada
I have a string that contains encoded URL data, it's not actually a real URL (in the form http://www.host.com/path/path), it's just a string encoded in the url format (with escape characters and + between words). ie: this+is+encoded+.. yada yada. How can I decode this string?
"The pool on the roof must have a leak"
Grizzled Veteran
Join Date: Sep 2000
Location: Springfield, MA
Originally posted by exrae:
"I have a string that contains encoded URL data, it's not actually a real URL (in the form http://www.host.com/path/path), it's just a string encoded in the url format (with escape characters and + between words). ie: this+is+encoded+.. yada yada. How can I decode this string?"
What language are you using? I know that there are libraries floating around for both Perl and C that do that sort of thing.
However, if you want to roll your own, it's not a terribly difficult thing to write. I did it in C a while back for a project.
We hope your rules and wisdom choke you / Now we are one in everlasting peace
-- Radiohead, Exit Music (for a film)
Junior Member
Join Date: Sep 2000
Location: Calgary, Alberta, Canada
I'm using Cocoa and Objective-C. I can create an NSURL with the string, but there only seem to be methods that extract the host etc., not ones that decode the whole string. I would write it myself, but a good library would save me the time.
Junior Member
Join Date: Mar 2001
There are some CGI libraries in C that have to do this sort of parsing. Try looking at cgi-lib or cgic. I was working on an Objective-C CGI library at some point, but haven't had time to take it beyond simple proof of concept.
Junior Member
Join Date: Sep 2000
Location: Calgary, Alberta, Canada
Well I've decided to write my own, replacing the '+' characters with spaces is obviously simple, but I need to convert occurrences of %XX (where XX is the ASCII value, in hex) to the actual ASCII character. What's the simplest way of accomplishing this?
Mac Elite
Join Date: Feb 2001
Location: Vancouver, WA
CFURL.h (in CoreFoundation) has some functions for encoding and decoding URL text. Others can be found in the open-source Omni frameworks: the stuff you'll find useful is split between the NSString extension in OmniFoundation and OWAddress in the OWF framework.
Junior Member
Join Date: Sep 2000
Location: Calgary, Alberta, Canada
I tried using CFURL and CFString to decode the URL data, but I'm having a problem..
[code]
coreString = CFStringCreateWithCString(NULL, decodedString, CFStringGetSystemEncoding());
coreURL = CFURLCreateWithString(NULL, coreString, NULL);
finalCFString = CFURLGetString(coreURL);
[/code]
The code works and I get a valid string (with the % escape characters still there) if I comment out the last line. Otherwise, I get a SIGBUS error in the program.
What am I doing incorrectly with the last line? I have no experience with the CoreFoundation types, so I'm a little lost.
Junior Member
Join Date: Sep 2000
Location: Calgary, Alberta, Canada
Oh, obviously I need to pass the allocator(?) in the CFURLCreateWithString.. but I haven't a clue what that is.
Grizzled Veteran
Join Date: Sep 2000
Location: Springfield, MA
Originally posted by exrae:
"Well I've decided to write my own, replacing the '+' characters with spaces is obviously simple, but I need to convert occurrences of %XX (where XX is the ASCII value, in hex) to the actual ASCII character. What's the simplest way of accomplishing this?"
Well, for the curious, here's my URL tokenizer. The '&' and '=' characters are used as delimiters, of course. I'm not saying that this is good or nice, but I don't think it's too ugly, and it does work :-)
BTW, "string" is defined by the following: "typedef char *string;"
[code]
#include <stdlib.h>   /* calloc, strtol */

/* Assumption: delim is a file-scope char holding the current delimiter ('&' or '='). */

string next_token (string input_string, int* pos)
{
    int i;
    int w_start;
    int w_len = 0;
    int t_pos;
    string output_string;
    char char_val[3];

    /* if string is empty, return NULL */
    if (input_string[*pos] == '\0')
        return NULL;

    w_start = *pos;

    /* First loop through the string once to get size */
    while (input_string[*pos] != '\0')
    {
        if (input_string[*pos] == delim)
            break;
        if (input_string[*pos] == '%') /* '%' plus the 2 hex chars after it collapse to 1 char */
            w_len--;
        else
            w_len++;
        *pos = *pos + 1;
    }
    if (w_len < 0) /* a lone trailing '%' would otherwise give a negative length */
        w_len = 0;

    output_string = (string) calloc(w_len + 1, sizeof(char)); /* add space for null char */
    if (output_string == NULL)
        return NULL;

    /* now we copy the input string to the output string, decoding as we go */
    t_pos = w_start;
    for (i = 0; i < w_len; i++)
    {
        if (input_string[t_pos] == '%')
        {
            char_val[0] = input_string[++t_pos]; /* copy hex value to temp variable */
            char_val[1] = input_string[++t_pos];
            char_val[2] = '\0'; /* strtol needs a terminated string */
            output_string[i] = (char) strtol(char_val, NULL, 16);
        } else if (input_string[t_pos] == '+') {
            output_string[i] = ' ';
        } else {
            output_string[i] = input_string[t_pos];
        }
        t_pos++;
    }

    /* skip over delimiter, terminate string and return value */
    if (input_string[*pos] == delim)
        *pos = *pos + 1;
    output_string[w_len] = '\0';
    return output_string;
}
[/code]
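For the %XX step on its own, here is a minimal sketch in C. It converts the two hex digits by hand instead of going through strtol, and it rejects anything that is not a hex digit. The names hex_value and decode_escape are made up for this illustration and do not come from any library; the digit arithmetic assumes an ASCII-compatible character set.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper: s points at a '%'. Returns the decoded byte (0-255),
 * or -1 if the '%' is not followed by two hex digits.
 */
static int decode_escape(const char *s)
{
    int hi, lo;

    if (s == NULL || s[0] != '%')
        return -1;
    hi = hex_value(s[1]);
    if (hi < 0)               /* also catches a string that ends right after '%' */
        return -1;
    lo = hex_value(s[2]);     /* safe to read: s[1] was not the terminator */
    if (lo < 0)
        return -1;
    return hi * 16 + lo;
}
```

A caller would check for -1 and decide whether to copy the '%' through unchanged or to treat the input as malformed.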
Junior Member
Join Date: Sep 2000
Location: Calgary, Alberta, Canada
Well, I actually got the time to sit down and sift through the CF docs a little, and the solution was quite clear and simple. Here it is...
[code]
- (NSString *)urlToString:(NSString *)urlString {
    NSUInteger length = [urlString length];
    char decodedString[length + 1];   /* +1 for the terminating NUL */
    CFStringRef coreString;
    CFStringRef finalCFString;
    NSUInteger i;

    /* Replace '+' with a space first; the %XX escapes are handled below. */
    for (i = 0; i < length; i++) {
        if ([urlString characterAtIndex:i] == '+')
            decodedString[i] = ' ';
        else
            decodedString[i] = [urlString characterAtIndex:i];
    }
    decodedString[length] = '\0';

    coreString = CFStringCreateWithCString(NULL, decodedString, CFStringGetSystemEncoding());
    if (coreString == NULL)
        return nil;
    /* An empty "characters to leave escaped" string means every escape is replaced. */
    finalCFString = CFURLCreateStringByReplacingPercentEscapes(kCFAllocatorDefault,
                                                               coreString, CFSTR(""));
    CFRelease(coreString);
    if (finalCFString == NULL)   /* an escape could not be converted */
        return nil;
    CFStringGetCString(finalCFString, decodedString, length + 1,
                       CFStringGetSystemEncoding());
    CFRelease(finalCFString);
    return [NSString stringWithCString:decodedString];
}
[/code]
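Where CoreFoundation is not available, the same two steps (plus signs to spaces, then percent escapes) can be approximated in portable C. The sketch below is a stand-in under stated assumptions, not a reproduction of the CoreFoundation behaviour: url_decode is a hypothetical name, it works on raw bytes without interpreting any text encoding, and it copies malformed escapes through unchanged instead of failing.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a newly allocated, decoded copy of src,
 * or NULL if src is NULL or the allocation fails. The caller frees the result.
 */
char *url_decode(const char *src)
{
    size_t len, i, out = 0;
    char *dst;

    if (src == NULL)
        return NULL;
    len = strlen(src);
    /* Decoding never lengthens the text, so len + 1 bytes always suffice. */
    dst = malloc(len + 1);
    if (dst == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        if (src[i] == '+') {
            dst[out++] = ' ';
        } else if (src[i] == '%'
                   && isxdigit((unsigned char)src[i + 1])
                   && isxdigit((unsigned char)src[i + 2])) {
            char hex[3];
            hex[0] = src[i + 1];
            hex[1] = src[i + 2];
            hex[2] = '\0';               /* strtol needs a terminated string */
            dst[out++] = (char)strtol(hex, NULL, 16);
            i += 2;                      /* skip the two hex digits */
        } else {
            dst[out++] = src[i];         /* ordinary character or malformed escape */
        }
    }
    dst[out] = '\0';
    return dst;
}
```

Because everything happens in a single pass, an escaped plus sign (%2B) comes out as a literal + and is not turned into a space afterwards.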