UTF8 characters corrupt on GUI in Windows
Posted: Sun Aug 25, 2019 8:08 pm
by MJaoune
Hi,
I am building an application that has some Arabic text in its GUI. However, the text appears corrupted and looks nothing like Arabic (see screenshot).
I am using the "u" libraries and have defined "_UNICODE". I am using Visual Studio 2010, and have set "Character Set" to "Use Unicode Character Set". I have put each Arabic string in a wxT().
The Arabic block in Unicode starts at code point U+0600.
The application works flawlessly on Linux using wxGTK (Under GTK3).
I am using wxWidgets 3.0.4, the prebuilt binaries found at http://wxwidgets.org/downloads/.
Thanks in advance.
Re: UTF8 characters corrupt on GUI in Windows
Posted: Sun Aug 25, 2019 8:31 pm
by Kvaz1r
Are you sure that Visual Studio uses UTF-8 encoding for the file?
Re: UTF8 characters corrupt on GUI in Windows
Posted: Mon Aug 26, 2019 1:35 am
by MJaoune
Kvaz1r wrote: ↑Sun Aug 25, 2019 8:31 pm
Are you sure that Visual Studio uses UTF-8 encoding for the file?
Well, how do I check? The strings look normal in the Visual Studio IDE. Also, I didn't create the files in VS but in another text editor, and then added them to the VS project.
Re: UTF8 characters corrupt on GUI in Windows
Posted: Mon Aug 26, 2019 6:22 am
by PB
How do you convert those string literals to wxStrings? If they are in UTF-8, wxString(literal) will not work, as on MSW wxString expects a (Unicode) literal to be UTF-16. Have you tried wxString::FromUTF8(literal)?
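To illustrate the mismatch described above, here is a minimal, standalone C++ sketch (no wxWidgets needed). It assumes the literal's bytes are UTF-8 and compares decoding them properly with widening each byte on its own, which roughly mimics what happens when UTF-8 bytes are read as a single-byte code page. The function names decode_utf8 and widen_bytewise are hypothetical, and the byte-wise path uses a plain Latin-1 style mapping for simplicity rather than an actual Windows code page.

```cpp
#include <cstddef>
#include <string>

// Correct path: decode UTF-8 bytes into Unicode code points.
// Minimal sketch: handles 1- to 4-byte sequences and assumes well-formed input.
std::u32string decode_utf8(const std::string& bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (b < 0x80)      { cp = b;        extra = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; extra = 1; }
        else if (b < 0xF0) { cp = b & 0x0F; extra = 2; }
        else               { cp = b & 0x07; extra = 3; }
        ++i;
        for (; extra > 0 && i < bytes.size(); --extra, ++i) {
            // Each continuation byte contributes its low 6 bits.
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}

// Faulty path: treat every byte as one character (Latin-1 style widening).
// This is an assumed stand-in for "UTF-8 bytes read as a single-byte code page".
std::u32string widen_bytewise(const std::string& bytes) {
    std::u32string out;
    for (char c : bytes) {
        out.push_back(static_cast<unsigned char>(c));
    }
    return out;
}
```

For the bytes D8 B3 (the UTF-8 encoding of U+0633, ARABIC LETTER SEEN), decode_utf8 yields the single code point U+0633, while widen_bytewise yields two unrelated characters, U+00D8 and U+00B3. In the application itself, wxString::FromUTF8() is what does the proper decoding.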
I always say that non-ASCII characters in source files are brittle (the result depends on the platform and compiler used), and one is better off using English strings and having them translated with _().
See also
https://docs.microsoft.com/en-us/cpp/bu ... ew=vs-2019
Re: UTF8 characters corrupt on GUI in Windows
Posted: Mon Aug 26, 2019 6:47 am
by utelle
MJaoune wrote: ↑Mon Aug 26, 2019 1:35 am
Kvaz1r wrote: ↑Sun Aug 25, 2019 8:31 pm
Are you sure that Visual Studio uses UTF-8 encoding for the file?
Well, how do I check? The strings look normal in the Visual Studio IDE. Also, I didn't create the files in VS but in another text editor, and then added them to the VS project.
Visual Studio usually offers to save a source file as Unicode (UCS-2 encoding) if you enter non-ASCII characters in the IDE. However, if you used a different text editor to create the file, then this mechanism doesn't work. If the file is indeed encoded in UTF-8, then using the wxT macro will not work, as it assumes strings in UCS-2 encoding.
Make sure that your source file is really encoded in UTF-8 (for example, by opening it in Notepad++, which shows the encoding of the file in the status bar). Then use the wxString UTF-8 conversion method, wxString::FromUTF8(), on the plain string literal to make sure that wxWidgets uses the correct string representation.
Note that no wxT macro is used here.
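As a rough alternative to checking the encoding in an editor, the file's raw bytes can be tested for UTF-8 structure. The sketch below is only a heuristic under stated assumptions: the function name looks_like_utf8 is hypothetical, the bytes are assumed to have been read from the source file in binary mode, and the check covers lead and continuation byte structure only (overlong forms and surrogates are not rejected). Pure ASCII also passes, so a positive result is only meaningful if the file contains non-ASCII text.

```cpp
#include <cstddef>
#include <string>

// Returns true if `bytes` is structurally well-formed UTF-8: every lead byte
// is followed by the expected number of continuation bytes (10xxxxxx).
bool looks_like_utf8(const std::string& bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b = static_cast<unsigned char>(bytes[i++]);
        std::size_t extra;
        if (b < 0x80)                extra = 0;  // ASCII
        else if ((b & 0xE0) == 0xC0) extra = 1;  // 110xxxxx
        else if ((b & 0xF0) == 0xE0) extra = 2;  // 1110xxxx
        else if ((b & 0xF8) == 0xF0) extra = 3;  // 11110xxx
        else return false;  // stray continuation byte or invalid lead byte
        for (; extra > 0; --extra) {
            if (i >= bytes.size()) return false;  // sequence cut off at end of input
            unsigned char c = static_cast<unsigned char>(bytes[i++]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
        }
    }
    return true;
}
```

Arabic text saved in a legacy single-byte code page typically fails this check, because its high bytes rarely line up as valid lead and continuation pairs.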
Re: UTF8 characters corrupt on GUI in Windows
Posted: Mon Aug 26, 2019 9:05 am
by MJaoune
PB wrote: ↑Mon Aug 26, 2019 6:22 am
Have you tried wxString::FromUTF8(literal)?
utelle wrote: ↑Mon Aug 26, 2019 6:47 am
And then use the wxString UTF-8 conversion method to make sure that wxWidgets uses the correct string representation:
Note that no wxT macro is used here.
This did it, thanks!
I was using wxT() because I thought it would do such conversions automatically... What is its purpose then?
Re: UTF8 characters corrupt on GUI in Windows
Posted: Mon Aug 26, 2019 10:58 am
by PB
wxT() turns a narrow string literal into a wide (Unicode) one, or keeps it narrow in a non-Unicode build.
However, as I wrote before, on MSW it expects UTF-16 encoding, not UTF-8, for a Unicode string. You could perhaps change this compiler behaviour using the MSVC setting I linked in my previous post. I think this MSVC behaviour could also be altered by giving the source files a UTF-8 BOM, but I believe other compilers may not like that.
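Whether a source file already carries a UTF-8 BOM can be checked by looking at its first three bytes, EF BB BF. The following is a minimal sketch; the function name has_utf8_bom is hypothetical, and the bytes are assumed to come from reading the file in binary mode.

```cpp
#include <string>

// True if the buffer starts with the UTF-8 byte order mark (EF BB BF).
bool has_utf8_bom(const std::string& bytes) {
    return bytes.size() >= 3 &&
           static_cast<unsigned char>(bytes[0]) == 0xEF &&
           static_cast<unsigned char>(bytes[1]) == 0xBB &&
           static_cast<unsigned char>(bytes[2]) == 0xBF;
}
```

A file without the mark may still be UTF-8; the BOM is optional and only signals the encoding to tools that look for it.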