UTF8 characters corrupt on GUI in Windows Topic is solved

MJaoune
Earned a small fee
Posts: 24
Joined: Tue Aug 20, 2019 7:37 pm

UTF8 characters corrupt on GUI in Windows

Post by MJaoune »

Hi,

I am building an application that has some Arabic text in its GUI. However, the text appears corrupted and looks nothing like Arabic (see screenshot).

I am using the "u" libraries and have defined "_UNICODE". I am using Visual Studio 2010, and have set "Character Set" to "Use Unicode Character Set". I have put each Arabic string in a wxT().

The Arabic block starts at code point U+0600 in Unicode, and my strings are stored as UTF-8.

The application works flawlessly on Linux using wxGTK (Under GTK3).

I am using wxWidgets 3.0.4, the prebuilt binaries found in http://wxwidgets.org/downloads/.

Thanks in advance.
Attachment: unicode-problem.png (screenshot)
Kvaz1r
Super wx Problem Solver
Posts: 357
Joined: Tue Jun 07, 2016 1:07 pm

Re: UTF8 characters corrupt on GUI in Windows

Post by Kvaz1r »

Are you sure that Visual Studio uses UTF-8 encoding for the file?
MJaoune
Earned a small fee
Posts: 24
Joined: Tue Aug 20, 2019 7:37 pm

Re: UTF8 characters corrupt on GUI in Windows

Post by MJaoune »

Kvaz1r wrote: Sun Aug 25, 2019 8:31 pm Are you sure that Visual Studio uses UTF-8 encoding for the file?
Well, how do I check? The strings look normal in the Visual Studio IDE. Also, I didn't create the files in VS; I created them in another text editor and then added them to the VS project.
PB
Part Of The Furniture
Posts: 4193
Joined: Sun Jan 03, 2010 5:45 pm

Re: UTF8 characters corrupt on GUI in Windows

Post by PB »

How do you convert those string literals to wxStrings? If they are in UTF-8, wxString(literal) will not work, because on MSW wxString expects a (Unicode) literal to be UTF-16. Have you tried wxString::FromUTF8(literal)?

I always say that anything outside 7-bit ASCII in source files is brittle (it depends on the platform and compiler used), and one is better off using English strings and having them translated with _().

See also
https://docs.microsoft.com/en-us/cpp/bu ... ew=vs-2019
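
To illustrate the mismatch described above, here is a minimal standalone sketch in plain C++ (no wxWidgets). It assumes the UTF-8 bytes of a literal get treated as single-byte characters, using Latin-1 for simplicity; the code page a compiler actually applies depends on the system. `widenBytewise` is a hypothetical helper, not a wxWidgets or MSVC function.

```cpp
#include <string>

// Hypothetical helper: widen every byte on its own, as if the input were
// Latin-1 text. This is an assumption made for illustration; it only
// approximates what happens when UTF-8 bytes are read in a single-byte
// code page.
std::u16string widenBytewise(const std::string& bytes) {
    std::u16string out;
    for (unsigned char b : bytes) {
        // Each byte becomes one UTF-16 code unit with the same numeric value.
        out.push_back(static_cast<char16_t>(b));
    }
    return out;
}
```

The Arabic letter alef (U+0627) is the two bytes D8 A7 in UTF-8. Passing "\xD8\xA7" to this helper yields two unrelated code units, U+00D8 and U+00A7, instead of the single code unit U+0627, which is the kind of corruption shown in the screenshot.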
utelle
Moderator
Posts: 1125
Joined: Tue Jul 05, 2005 10:00 pm
Location: Cologne, Germany

Re: UTF8 characters corrupt on GUI in Windows

Post by utelle »

MJaoune wrote: Mon Aug 26, 2019 1:35 am
Kvaz1r wrote: Sun Aug 25, 2019 8:31 pm Are you sure that Visual Studio uses UTF-8 encoding for the file?
Well, how do I check? The strings look normal in the Visual Studio IDE. Also, I didn't create the files in VS; I created them in another text editor and then added them to the VS project.
Visual Studio usually offers to save a source file as Unicode (UCS-2 encoding), if you enter non-ASCII characters in the IDE. However, if you used a different text editor to create the file, then this mechanism doesn't work. If the file is indeed encoded in UTF-8, then using the wxT macro will not work as it assumes strings in UCS-2 encoding.

Make sure that your source file is really encoded in UTF-8 (for example, by opening it in Notepad++, which shows the file's encoding in the status bar). Then use the wxString UTF-8 conversion method to make sure that wxWidgets uses the correct string representation:

wxString::FromUTF8("...")
Note that no wxT macro is used here.
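
For readers curious what a UTF-8 to UTF-16 conversion involves, the following is a rough standalone sketch in plain C++. It is not wxWidgets code and makes no claim about how wxString::FromUTF8 is implemented; `utf8ToUtf16` is a hypothetical helper, and it deliberately skips some validation (overlong forms and encoded surrogates are not rejected).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units.
// Simplified on purpose: overlong encodings and surrogate code points
// are not rejected.
std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP need a surrogate pair in UTF-16.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With this helper, the UTF-8 bytes D8 A7 decode to the single code unit U+0627 (Arabic alef). In a real wxWidgets application the library's own conversion should be used rather than a hand-written one.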
MJaoune
Earned a small fee
Posts: 24
Joined: Tue Aug 20, 2019 7:37 pm

Re: UTF8 characters corrupt on GUI in Windows

Post by MJaoune »

PB wrote: Mon Aug 26, 2019 6:22 am Have you tried wxString::FromUTF8(literal)?
utelle wrote: Mon Aug 26, 2019 6:47 am And then use the wxString UTF-8 conversion method to make sure that wxWidgets uses the correct string representation:

wxString::FromUTF8("...")
Note that no wxT macro is used here.
This did it, thanks!

I was using "wxT()" because I thought it would do such conversions automatically. What is its purpose, then?
PB
Part Of The Furniture
Posts: 4193
Joined: Sun Jan 03, 2010 5:45 pm

Re: UTF8 characters corrupt on GUI in Windows

Post by PB »

wxT() turns a narrow string literal into a wide (Unicode) one in a Unicode build, or keeps it narrow in a non-Unicode build.

However, as I wrote before, on MSW a Unicode string is expected to be encoded in UTF-16, not UTF-8. You could perhaps change this compiler behaviour using the MSVC setting I linked in my previous post. I think this MSVC behaviour could also be altered by giving the source files a UTF-8 BOM, but I believe other compilers may not like that.
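
As a side note on the BOM mentioned above, here is a small standalone sketch for checking whether a buffer (for example, the contents of a source file read as raw bytes) begins with the UTF-8 BOM, the byte sequence EF BB BF. `hasUtf8Bom` is a hypothetical helper name, not part of wxWidgets or MSVC.

```cpp
#include <string>

// Hypothetical helper: returns true if the data starts with the
// UTF-8 byte order mark (bytes EF BB BF).
bool hasUtf8Bom(const std::string& data) {
    return data.size() >= 3 &&
           static_cast<unsigned char>(data[0]) == 0xEF &&
           static_cast<unsigned char>(data[1]) == 0xBB &&
           static_cast<unsigned char>(data[2]) == 0xBF;
}
```

A missing BOM does not prove the file is not UTF-8; it only means the file carries no explicit marker, so a tool has to guess or be told the encoding.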