UTF8 characters corrupt on GUI in Windows

Posted: Sun Aug 25, 2019 8:08 pm
by MJaoune
Hi,

I am building an application which has some Arabic text in its GUI. However, the text appears corrupted and looks nothing like Arabic (see screenshot).

I am using the "u" libraries and have defined "_UNICODE". I am using Visual Studio 2010, and have set "Character Set" to "Use Unicode Character Set". I have put each Arabic string in a wxT().

The Arabic block in Unicode starts at code point U+0600, so the text can be encoded in UTF-8.

The application works flawlessly on Linux using wxGTK (Under GTK3).

I am using wxWidgets 3.0.4, the prebuilt binaries found at http://wxwidgets.org/downloads/.

Thanks in advance.

Re: UTF8 characters corrupt on GUI in Windows

Posted: Sun Aug 25, 2019 8:31 pm
by Kvaz1r
Are you sure that Visual Studio uses UTF-8 encoding for the file?

Re: UTF8 characters corrupt on GUI in Windows

Posted: Mon Aug 26, 2019 1:35 am
by MJaoune
Kvaz1r wrote: Sun Aug 25, 2019 8:31 pm Are you sure that Visual Studio uses UTF-8 encoding for the file?
Well, how do I check? The strings look normal in the Visual Studio IDE. Also, I didn't create the files in VS but in another text editor, and then added them to the VS project.

Re: UTF8 characters corrupt on GUI in Windows

Posted: Mon Aug 26, 2019 6:22 am
by PB
How do you convert those string literals to wxStrings? If they are in UTF-8, wxString(literal) will not work, as on MSW wxString expects a (Unicode) literal to be UTF-16. Have you tried wxString::FromUTF8(literal)?

I always say that text outside 7-bit ASCII in source files is brittle (it depends on the platform and compiler used), and one is better off using English strings and having them translated with _().

See also
https://docs.microsoft.com/en-us/cpp/bu ... ew=vs-2019

Re: UTF8 characters corrupt on GUI in Windows

Posted: Mon Aug 26, 2019 6:47 am
by utelle
MJaoune wrote: Mon Aug 26, 2019 1:35 am
Kvaz1r wrote: Sun Aug 25, 2019 8:31 pm Are you sure that Visual Studio uses UTF-8 encoding for the file?
Well, how do I check? The strings look normal in the Visual Studio IDE. Also, I didn't create the files in VS but in another text editor, and then added them to the VS project.
Visual Studio usually offers to save a source file as Unicode (UTF-16 encoding) if you enter non-ASCII characters in the IDE. However, if you used a different text editor to create the file, this mechanism doesn't kick in. If the file is indeed encoded in UTF-8, then using the wxT macro will not work, as it assumes strings in UTF-16 encoding.

Make sure that your source file is really encoded in UTF-8 (for example, by opening it in Notepad++, which shows the encoding of the file in the status bar). Then use the wxString UTF-8 conversion method to make sure that wxWidgets uses the correct string representation:

wxString::FromUTF8("...")
Note that no wxT macro is used here.
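
To illustrate why the bytes have to be decoded as UTF-8 rather than taken one byte per character, here is a minimal sketch in plain C++ (no wxWidgets needed). The helper names `widen_bytes` and `decode_utf8` are made up for this example and only cope with well-formed input; in a real application wxString::FromUTF8 does this work. The Arabic word is written with byte escapes so the sketch does not depend on the source file's encoding.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Arabic "salam" (U+0633 U+0644 U+0627 U+0645) spelled as UTF-8 byte
// escapes, so the source file itself stays pure ASCII.
static const char kSalamUtf8[] = "\xD8\xB3\xD9\x84\xD8\xA7\xD9\x85";

// Wrong interpretation: every byte becomes one character, roughly what
// happens when UTF-8 bytes are read through a single-byte code page.
std::vector<char32_t> widen_bytes(const std::string& bytes) {
    std::vector<char32_t> out;
    for (unsigned char b : bytes) {
        out.push_back(b);
    }
    return out;
}

// Right interpretation: combine each UTF-8 sequence into one code point.
// Assumes well-formed input; it does not reject overlong or truncated forms.
std::vector<char32_t> decode_utf8(const std::string& bytes) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;   // number of continuation bytes to read
        char32_t cp = 0xFFFD;    // replacement character for bad lead bytes
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1;
            cp = lead & 0x1F;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2;
            cp = lead & 0x0F;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3;
            cp = lead & 0x07;
        }
        ++i;
        // Each continuation byte contributes its low 6 bits.
        for (; extra > 0 && i < bytes.size(); --extra, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}
```

Calling `decode_utf8(kSalamUtf8)` yields four code points in the Arabic block, while `widen_bytes(kSalamUtf8)` yields eight unrelated values below U+0100, which is the kind of garbage that shows up in the GUI.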

Re: UTF8 characters corrupt on GUI in Windows

Posted: Mon Aug 26, 2019 9:05 am
by MJaoune
PB wrote: Mon Aug 26, 2019 6:22 am Have you tried wxString::FromUTF8(literal)?
utelle wrote: Mon Aug 26, 2019 6:47 am And then use the wxString UTF-8 conversion method to make sure that wxWidgets uses the correct string representation:

wxString::FromUTF8("...")
Note that no wxT macro is used here.
This did it, thanks!

I was using "wxT()"; I thought it would do such conversions automatically... What is its purpose then?

Re: UTF8 characters corrupt on GUI in Windows

Posted: Mon Aug 26, 2019 10:58 am
by PB
wxT() converts a narrow string literal to a wide (Unicode) one, or keeps it narrow in a non-Unicode build.

However, as I wrote before, on MSW a Unicode string is expected to be UTF-16, not UTF-8. You could perhaps change this compiler behaviour with the MSVC setting I linked in my previous post. I think MSVC would also treat the literals differently if the source files had a UTF-8 BOM, but I believe other compilers may not like that.
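
A rough sketch of that behaviour, assuming wxT() in a Unicode build boils down to pasting an L prefix onto the literal. `DEMO_T` is a hypothetical stand-in written for this example, not the real wxWidgets macro. The letters are given as universal character names (\uXXXX), which keeps the source pure ASCII, so the result should not depend on how the compiler decodes the file, provided the compiler supports such escapes in wide literals.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical stand-in for wxT() in a Unicode build: the token-pasting
// operator glues an L prefix onto the literal, making it a wide literal.
#define DEMO_T(x) L##x

// Arabic "salam" written with universal character names instead of raw
// non-ASCII bytes, so the source file encoding no longer matters.
static const wchar_t kSalamWide[] = DEMO_T("\u0633\u0644\u0627\u0645");

// Number of wchar_t units in the literal, excluding the terminator.
std::size_t salam_length() {
    return std::wcslen(kSalamWide);
}
```

The macro itself performs no encoding conversion; it only changes the literal's type. Whatever the compiler believes the characters between the quotes to be is what ends up in the wide string.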