Unicode Literals
Junior Member
Join Date: Dec 2000
Location: Houston, TX, USA
Mar 6, 2001, 07:52 PM
 
I was wondering how to display a Unicode string in a text field. I'm doing the standard

[textField setStringValue:string];

call, and before that

NSString *string = [NSString stringWithCharacters:???? length:?];

So, what goes where the ?'s are? My problem is that I can't figure out how to make a string from the Unicode code points that I've looked up for the characters I need. Thanks!
Mike
     
ali
Forum Regular
Join Date: Sep 2000
Mar 6, 2001, 08:12 PM
 
Usually you would do something like:

const unichar u[] = {'H', 0x41, 0x2022};
// length: takes the number of unichars, not the size of the array in bytes
NSString *string = [NSString stringWithCharacters:u length:sizeof(u) / sizeof(u[0])];

There is no way to include Unicode literals in string constants.

It's best to avoid any chars > 127 in string constants, actually.
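To make the length point concrete, here is a minimal plain-C sketch (it should also compile as Objective-C). The `unichar16` typedef, the `ARRAY_COUNT` macro, and the two helper functions are names made up for this illustration; they are not Cocoa API, and `unichar16` only assumes that unichar is a 16-bit code unit.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Cocoa's unichar, assumed here to be a 16-bit code unit
   so that the sketch compiles as plain C. Hypothetical name. */
typedef uint16_t unichar16;

/* Number of elements in a true array (does not work on pointers). */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

static const unichar16 kSample[] = {'H', 0x41, 0x2022}; /* H, A, bullet */

/* Size of the array in bytes: what a bare sizeof(kSample) yields. */
size_t sample_byte_size(void) { return sizeof(kSample); }

/* Number of code units: what a length: argument should receive. */
size_t sample_unit_count(void) { return ARRAY_COUNT(kSample); }
```

With three 16-bit elements, `sample_byte_size()` is 6 while `sample_unit_count()` is 3, so passing the bare `sizeof` as a length would read past the end of the array.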
Ali

     
Mac Enthusiast
Join Date: Feb 2000
Location: Storrs,Connecticut, USA
Mar 7, 2001, 05:46 AM
 
Wait, in the file encodings menu, in the Format menu, Unicode is an option. Isn't it okay to just type Unicode characters if you save your project files in Unicode format? I thought that you could easily make a string object with any Unicode character in it using the compiler construct @"This is my string with unicode in it". Is there any reason why this wouldn't work if you saved your source files in Unicode format?
     
ali
Forum Regular
Join Date: Sep 2000
Mar 7, 2001, 11:55 AM
 
Well, "Unicode" is actually a 2-byte format, meaning even ASCII chars get represented as a zero byte followed by the ASCII byte. Most C compilers will choke on such a source file, so that is not an option in general. (I believe this works in Java though, so you can use it if you're doing Cocoa Java.)
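The zero-byte problem can be seen in a small plain-C sketch that writes 16-bit code units out as bytes. The function name `utf16be_bytes` is hypothetical, and big-endian byte order is an assumption chosen to match the "zero byte followed by the ASCII byte" description.

```c
#include <stddef.h>
#include <stdint.h>

/* Serialize 16-bit code units as big-endian bytes, two bytes per unit.
   Returns the number of bytes written, or 0 if out is too small. */
size_t utf16be_bytes(const uint16_t *units, size_t count,
                     unsigned char *out, size_t out_cap)
{
    if (out_cap < count * 2) {
        return 0;
    }
    for (size_t i = 0; i < count; i++) {
        out[2 * i]     = (unsigned char)(units[i] >> 8);   /* high byte */
        out[2 * i + 1] = (unsigned char)(units[i] & 0xFF); /* low byte  */
    }
    return count * 2;
}
```

Serializing "Hi" this way gives the bytes 00 48 00 69. A tool that treats the data as a NUL-terminated 8-bit C string stops at the very first byte, which is one way an 8-bit compiler front end can choke on such a file.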

Another option is to use regular 8-bit source files, but with non-ASCII chars in string literals. However, whether you use actual chars > 127 or the safer \nnn octal escape syntax, you are still at the mercy of the compiler or the environment in which the program is running. If you are the only one interpreting those bytes, then you can certainly choose an encoding and stick with it. For instance, UTF-8, which is the 8-bit, Unix-safe representation of Unicode, is a reasonable choice.
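As a rough illustration of the UTF-8 choice, the following plain-C sketch encodes a single 16-bit code point into UTF-8 bytes. The function name `utf8_encode_bmp` is made up for this example, and it deliberately handles only code points that fit in 16 bits (no surrogate pairs).

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Basic Multilingual Plane code point as UTF-8.
   Writes 1 to 3 bytes into out and returns the byte count.
   Returns 0 for surrogate values (0xD800..0xDFFF), which are not
   characters on their own. */
size_t utf8_encode_bmp(uint16_t cp, unsigned char out[3])
{
    if (cp >= 0xD800 && cp <= 0xDFFF) {
        return 0;
    }
    if (cp < 0x80) {                       /* ASCII: one byte, unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* two-byte sequence */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));         /* three bytes */
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}
```

For the em-dash, U+2014, this yields the bytes E2 80 94, which in the \nnn octal syntax would be written "\342\200\224". ASCII stays one byte per character, which is why UTF-8 text survives tools that expect 8-bit files.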

For CFSTR("...") or @"...", the choice is a bit more complicated, as the runtime interprets your bytes. Apple is trying to move to UTF-8 as a standard for those, but for now it's best to avoid non-ASCII chars in literal strings altogether.
     
Junior Member
Join Date: Dec 2000
Location: Houston, TX, USA
Mar 8, 2001, 11:37 AM
 
What I wanted to do was create some strings of text that are not ASCII at all, such as a delta followed by two superscript +'s. I wanted to put this in code to display in a text field. I figured Unicode was the best way, but if there is another option, I'm all ears. Thanks for the replies...
     
ali
Forum Regular
Join Date: Sep 2000
Mar 8, 2001, 11:58 AM
 
Unicode is the best, most robust way. For instance, for em-dash use:

unichar ch = 0x2014;
NSString *emdash = [NSString stringWithCharacters:&ch length:1];

Another option is to use 8-bit chars, but explicitly specifying their encoding. For instance, for em-dash, you can specify 0xd1 and an encoding of MacOSRoman. This is not possible with constant strings (CFSTR("...") or @"...").
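What "specifying an encoding" buys you is a lookup from each 8-bit value to a Unicode code point. A minimal plain-C sketch of that idea follows; the function name `macroman_to_unicode` is hypothetical, and the table covers only a handful of Mac OS Roman punctuation bytes, so it is an illustration rather than a usable converter. In a real program you would let NSString do the conversion with the Mac OS Roman encoding.

```c
#include <stdint.h>

/* Map one Mac OS Roman byte to a 16-bit Unicode code point.
   Only a few punctuation characters are covered in this sketch;
   any other byte above 0x7F comes back as U+FFFD (replacement char). */
uint16_t macroman_to_unicode(unsigned char b)
{
    if (b < 0x80) {
        return b;                 /* ASCII range is identical */
    }
    switch (b) {
    case 0xA5: return 0x2022;     /* bullet */
    case 0xD0: return 0x2013;     /* en dash */
    case 0xD1: return 0x2014;     /* em dash */
    case 0xD2: return 0x201C;     /* left double quotation mark */
    case 0xD3: return 0x201D;     /* right double quotation mark */
    case 0xD4: return 0x2018;     /* left single quotation mark */
    case 0xD5: return 0x2019;     /* right single quotation mark */
    default:   return 0xFFFD;     /* not covered by this partial table */
    }
}
```

The same byte 0xD1 means something else in other 8-bit encodings, which is why the byte alone is not enough and the encoding has to be stated.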

Ali


     
Junior Member
Join Date: Dec 2000
Location: Houston, TX, USA
Mar 8, 2001, 12:07 PM
 
Is the Unicode character set complete in the OS X beta?
     
ali
Forum Regular
Join Date: Sep 2000
Mar 8, 2001, 12:19 PM
 
As far as "Unicode" handling goes, any character will be handled properly (and not lost). Display, however, depends on the availability of an appropriate font, and I am not sure how complete the coverage is. Many of the symbol characters and Asian characters are handled. If there is no font, you'll probably get some fallback glyph (a box?).

Note that you don't have to worry about specifying a font which can display a given Unicode character; the Cocoa text system handles this by looking for a font that "covers" the characters you've specified.

Ali
     
Junior Member
Join Date: Dec 2000
Location: Houston, TX, USA
Mar 8, 2001, 12:32 PM
 
Yeah, I was curious about the display implementation. I think maybe it's not all there, but then some of the stuff I see could also be my error.
     
Dedicated MacNNer
Join Date: Jan 2001
Location: Virginia, US
Mar 11, 2001, 05:55 AM
 
Another hacky way to get Unicode characters into your string is to use property lists, which will parse \Uxxxx escapes into Unicode characters.

NSString *myString = [@"\"my string with a \\U2022 couple of embedded \\U1ABC Unicode characters\"" propertyList];

Note that the property list string must be surrounded by literal " characters for the parsing to work, and the \ must be escaped like \\ so a literal \ ends up in the constant string to be parsed by -propertyList.
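For anyone curious what parsing a \Uxxxx escape involves, here is a minimal plain-C sketch of the idea. It is not how -propertyList is implemented; the function names `hex_value` and `decode_u_escapes` are made up for this example, and the input is assumed to be ASCII text containing escapes with exactly four hex digits.

```c
#include <stddef.h>
#include <stdint.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Copy src into out as 16-bit code units, turning each \Uxxxx escape
   (backslash, 'U', four hex digits) into a single unit. Anything that
   is not a well-formed escape is copied through byte by byte.
   Stops when out is full; returns the number of units written. */
size_t decode_u_escapes(const char *src, uint16_t *out, size_t out_cap)
{
    size_t n = 0;
    while (*src != '\0' && n < out_cap) {
        if (src[0] == '\\' && src[1] == 'U') {
            uint16_t unit = 0;
            int i;
            for (i = 0; i < 4; i++) {
                int v = hex_value(src[2 + i]);
                if (v < 0) break;          /* also stops at the terminator */
                unit = (uint16_t)((unit << 4) | v);
            }
            if (i == 4) {                  /* well-formed escape */
                out[n++] = unit;
                src += 6;
                continue;
            }
        }
        out[n++] = (unsigned char)*src++;  /* ordinary character */
    }
    return n;
}
```

Note that the C literal passed in needs the same doubled backslash as the Objective-C one: decoding "a\\U2022b" produces the three units 'a', 0x2022, 'b'.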

Property list files read off of disk (with NSArray's or NSDictionary's -initWithContentsOfFile:) can also have these escapes in them (or can be fully Unicode files themselves). Source code cannot be Unicode: the C compiler can't parse it.



[This message has been edited by lindberg (edited 03-11-2001).]
     
   