UTF8 characters corrupt on GUI in Windows (Topic is solved)

MJaoune
Earned a small fee
Posts: 22
Joined: Tue Aug 20, 2019 7:37 pm

UTF8 characters corrupt on GUI in Windows

Post by MJaoune » Sun Aug 25, 2019 8:08 pm

Hi,

I am building an application which has some Arabic in its GUI; however, the text appears corrupted and looks nothing like Arabic (see screenshot).

I am using the "u" libraries and have defined "_UNICODE". I am using Visual Studio 2010, and have set "Character Set" to "Use Unicode Character Set". I have put each Arabic string in a wxT().

The Arabic block in Unicode starts at code point U+0600, and my strings are stored as UTF-8.

The application works flawlessly on Linux using wxGTK (Under GTK3).

I am using wxWidgets 3.0.4, the prebuilt binaries found in http://wxwidgets.org/downloads/.

Thanks in advance.
Attachment: unicode-problem.png (screenshot, 8.48 KiB)

Kvaz1r
Earned some good credits
Posts: 132
Joined: Tue Jun 07, 2016 1:07 pm

Re: UTF8 characters corrupt on GUI in Windows

Post by Kvaz1r » Sun Aug 25, 2019 8:31 pm

Are you sure that Visual Studio uses UTF-8 encoding for the file?

MJaoune
Earned a small fee
Posts: 22
Joined: Tue Aug 20, 2019 7:37 pm

Re: UTF8 characters corrupt on GUI in Windows

Post by MJaoune » Mon Aug 26, 2019 1:35 am

Kvaz1r wrote:
Sun Aug 25, 2019 8:31 pm
Are you sure that Visual Studio uses UTF-8 encoding for the file?
Well, how do I check? The strings look normal in the Visual Studio IDE. Also, I didn't create the files in VS but in another text editor, and then added them to the VS project.

PB
Part Of The Furniture
Posts: 2070
Joined: Sun Jan 03, 2010 5:45 pm

Re: UTF8 characters corrupt on GUI in Windows

Post by PB » Mon Aug 26, 2019 6:22 am

How do you convert those string literals to wxStrings? If they are in UTF-8, wxString(literal) will not work, as on MSW wxString expects a (Unicode) literal to be UTF-16. Have you tried wxString::FromUTF8(literal)?

I always say that non-ASCII text (anything beyond 7-bit ASCII) in source files is brittle (it depends on the platform and compiler used), and one is better off using English strings and having them translated with _().

See also
https://docs.microsoft.com/en-us/cpp/bu ... ew=vs-2019

utelle
Moderator
Posts: 897
Joined: Tue Jul 05, 2005 10:00 pm
Location: Cologne, Germany

Re: UTF8 characters corrupt on GUI in Windows

Post by utelle » Mon Aug 26, 2019 6:47 am

MJaoune wrote:
Mon Aug 26, 2019 1:35 am
Kvaz1r wrote:
Sun Aug 25, 2019 8:31 pm
Are you sure that Visual Studio uses UTF-8 encoding for the file?
Well, how do I check? The strings look normal in the Visual Studio IDE. Also, I didn't create the files in VS but in another text editor, and then added them to the VS project.
Visual Studio usually offers to save a source file as Unicode (UCS-2 encoding) if you enter non-ASCII characters in the IDE. However, if you used a different text editor to create the file, this mechanism doesn't apply. If the file is indeed encoded in UTF-8, then using the wxT macro will not work: it only makes the literal a wide-character (UTF-16 on Windows) one and does not convert it from UTF-8.

Make sure that your source file is really encoded in UTF-8 (for example, by opening it in Notepad++, which shows the encoding of the file in the status bar). Then use the wxString UTF-8 conversion method to make sure that wxWidgets uses the correct string representation:

Code:

wxString::FromUTF8("...")
Note that no wxT macro is used here.
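
If no editor such as Notepad++ is at hand, the encoding check described above can also be approximated in code. The following is only a minimal sketch in standard C++, not part of wxWidgets: the helper names hasUtf8Bom and isValidUtf8 are made up for illustration, and the caller is assumed to have read the raw bytes of the source file into a std::string (for example with std::ifstream opened in binary mode).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if the buffer starts with the UTF-8 BOM (EF BB BF).
bool hasUtf8Bom(const std::string& bytes) {
    return bytes.size() >= 3 &&
           static_cast<unsigned char>(bytes[0]) == 0xEF &&
           static_cast<unsigned char>(bytes[1]) == 0xBB &&
           static_cast<unsigned char>(bytes[2]) == 0xBF;
}

// Hypothetical helper: true if the whole buffer is well-formed UTF-8.
// Rejects stray or missing continuation bytes, overlong forms,
// surrogate code points and values above U+10FFFF.
bool isValidUtf8(const std::string& bytes) {
    static const unsigned long minForLen[5] = {0, 0, 0x80, 0x800, 0x10000};
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        std::size_t len;
        unsigned long cp;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;                      // invalid lead byte
        if (i + len > n) return false;          // sequence is cut off
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false;   // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (len > 1 && cp < minForLen[len]) return false;  // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;    // surrogate range
        if (cp > 0x10FFFF) return false;                   // beyond Unicode
        i += len;
    }
    return true;
}
```

A file containing Arabic text in a legacy single-byte code page will almost always fail isValidUtf8, while a pure ASCII file passes trivially, so the result is a hint rather than proof. hasUtf8Bom is relevant because a BOM can change how the compiler decodes the file.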

MJaoune
Earned a small fee
Posts: 22
Joined: Tue Aug 20, 2019 7:37 pm

Re: UTF8 characters corrupt on GUI in Windows

Post by MJaoune » Mon Aug 26, 2019 9:05 am

PB wrote:
Mon Aug 26, 2019 6:22 am
Have you tried wxString::FromUTF8(literal)?
utelle wrote:
Mon Aug 26, 2019 6:47 am
And then use the wxString UTF-8 conversion method to make sure that wxWidgets uses the correct string representation:

Code:

wxString::FromUTF8("...")
Note that no wxT macro is used here.
This did it, thanks!

I was using "wxT()" because I thought it would do such conversions automatically... What is its purpose then?

PB
Part Of The Furniture
Posts: 2070
Joined: Sun Jan 03, 2010 5:45 pm

Re: UTF8 characters corrupt on GUI in Windows

Post by PB » Mon Aug 26, 2019 10:58 am

wxT() turns a narrow string literal into a wide (Unicode) one, or keeps it narrow in a non-Unicode build.

However, as I wrote before, on MSW a Unicode string is expected to be UTF-16, not UTF-8. You could perhaps change this compiler behaviour using the MSVC setting I linked in my previous post. I think this MSVC behaviour could also be altered by the source files having a UTF-8 BOM, but I believe other compilers may not like that.
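
To illustrate why the text comes out garbled, here is a rough sketch in standard C++ (no wxWidgets involved). It contrasts proper UTF-8 decoding with widening each byte on its own, which is a simplified stand-in for what happens when the UTF-8 bytes of a literal are read as a single-byte code page. The function names decodeUtf8 and widenBytewise are invented for this example, the input is assumed to be well-formed UTF-8, and the byte-wise mapping matches Windows-1252 only for bytes such as 0xD8 and 0xB3; the code page actually used depends on the system.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode well-formed UTF-8 into UTF-16 code units.
// No validation is done here; malformed input is outside this sketch.
std::u16string decodeUtf8(const std::string& bytes) {
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        unsigned long cp;
        std::size_t len;
        if (b0 < 0x80)                { cp = b0;        len = 1; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else                          { cp = b0 & 0x07; len = 4; }
        for (std::size_t k = 1; k < len && i + k < bytes.size(); ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp >= 0x10000) {
            // Code points above U+FFFF need a surrogate pair in UTF-16.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
        i += len;
    }
    return out;
}

// Hypothetical helper: treat every byte as one character of its own,
// which is how mojibake arises from UTF-8 input.
std::u16string widenBytewise(const std::string& bytes) {
    std::u16string out;
    for (char c : bytes) {
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(c)));
    }
    return out;
}
```

For the Arabic letter U+0633, stored as the two bytes D8 B3, decodeUtf8 yields the single code unit 0x0633, whereas widenBytewise yields the two unrelated code units 0x00D8 and 0x00B3. wxString::FromUTF8() performs the first kind of conversion, which is why it fixed the display.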
