Windows 7: UCS-2 or UTF-16? The Real Scoop?

Forbin
Earned a small fee
Posts: 21
Joined: Mon Oct 31, 2016 7:26 pm

Post by Forbin »

Platform: Windows 7 Pro (and maybe Win-10)
Compiler: MS Visual Studio Pro 2013 (C++)
wxWidgets: 3.0.2

I've searched both this site and the web in general, and found conflicting information regarding what form of Unicode is actually supported in Windows 7 -- natively, and in wxWidgets 3.0.2. Comments elsewhere on this site suggest it is still limited to UCS-2. Comments elsewhere on the web claim that Microsoft eventually expanded support to the complete UTF-16 representation of some version of Unicode (typically stated to be Unicode 5.x).

So, what's the real story here?

For example, what if a Chinese user had a name that requires surrogate pairs in the string (i.e. one or more characters have code points greater than U+FFFF), and they try to type that into my UI based on wxWidgets 3.0.2? What happens? I don't read any Asian languages, so I'm not confident that I could craft a string that would actually test this, or that I could tell whether there was a problem. (Otherwise, I would just test this myself!)
  • Could the correct string be displayed in all the different wx control types that could contain a string?
  • Would wxString::Len() give me the correct character (code point) count? Or would it return the number of code units (i.e. the number of 16-bit units) instead?
  • If I tried to create a filename from that Chinese user's name, would it appear correctly in File Mangler?
Here's hoping an Asian colleague is out there with the extensive experience to be able to answer all this categorically!


Cheers,
-- forbin
doublemax
Moderator
Posts: 19103
Joined: Fri Apr 21, 2006 8:03 pm
Location: $FCE2

Re: Windows 7: UCS-2 or UTF-16? The Real Scoop?

Post by doublemax »

It's been a while since I touched this problem and my memory is a little hazy, so take everything I say with a grain of salt.

wxWidgets doesn't handle surrogate pairs itself: on Windows, a wxString holds UTF-16 code units, so accessing or counting the individual characters of such a string can give wrong results. However, as long as you treat the string as a whole, everything should work as expected.
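
To make the code unit vs. character difference concrete, here is a minimal sketch in plain standard C++ (no wxWidgets calls). CountCodePoints is a hypothetical helper name, not part of wxWidgets, and the sketch assumes the text is available as UTF-16 code units:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a wxWidgets function): counts Unicode code
// points in a UTF-16 string. A well-formed surrogate pair (high unit
// 0xD800-0xDBFF followed by low unit 0xDC00-0xDFFF) counts as one
// character; an unpaired surrogate is counted as one character too.
std::size_t CountCodePoints(const std::u16string& s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        const char16_t unit = s[i];
        const bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        if (isHigh && i + 1 < s.size())
        {
            const char16_t next = s[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF)
                ++i; // skip the low half of the pair
        }
        ++count;
    }
    return count;
}
```

For the three code units 0xD842, 0xDFB7, 0x91CE (U+20BB7 followed by U+91CE), size() reports 3 while CountCodePoints reports 2. Indexing such a string at position 0 or 1 yields half of a pair, which is the "wrong results" case above.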

What you can try: build a simple GUI with a wxTextCtrl and a wxStaticText. Enter some text into the wxTextCtrl, then set that text on the wxStaticText. If it appears correctly, you have your answer.
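
If typing a suitable character is the obstacle, the test text can be generated from a known code point instead. The following is only a sketch in standard C++: AppendCodePoint and MakeTestName are hypothetical names, and U+20BB7 is simply one CJK character that lies above U+FFFF. How the result is handed to wxString is left open here; on Windows, wchar_t is 16 bits wide, so the same code units should carry over.

```cpp
#include <string>

// Hypothetical helper: appends one Unicode code point to a UTF-16
// string, emitting a surrogate pair for anything above U+FFFF.
// No validation is done: cp is assumed to be a valid scalar value.
void AppendCodePoint(std::u16string& out, char32_t cp)
{
    if (cp < 0x10000)
    {
        out.push_back(static_cast<char16_t>(cp));
    }
    else
    {
        const char32_t v = cp - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));   // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF))); // low surrogate
    }
}

// Builds a two-character test string: U+20BB7 (needs a surrogate pair)
// followed by U+91CE (fits in a single code unit).
std::u16string MakeTestName()
{
    std::u16string s;
    AppendCodePoint(s, 0x20BB7);
    AppendCodePoint(s, 0x91CE);
    return s;
}
```

MakeTestName() produces the code units 0xD842, 0xDFB7, 0x91CE. Whether the first character is actually drawn still depends on an installed font that covers it, so an empty box in the control does not by itself mean the string was damaged.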

Slightly related, unsolved issue with a little additional information about the subject:
http://trac.wxwidgets.org/ticket/11827
Use the source, Luke!