Converting Unicode wxString to UTF-8 (solved)

geralds
I live to help wx-kind
Posts: 186
Joined: Tue Nov 01, 2005 9:22 am

Post by geralds » Tue Nov 01, 2005 1:41 pm

I'm using the Unicode build of wxW and I was delighted that wxW will convert Expat's UTF-8 text nodes to wxStrings as follows:

Code:

// s is a std::string containing a UTF-8 string
wxString node = wxString(s.c_str(), wxConvUTF8, s.size());

I can then set the text of my Scintilla component to the wxString. My question is: how do I then convert the text back to UTF-8 for final output?

Sorry if this is a very lame question; I've just started using wxW!

heda
Knows some wx things
Posts: 32
Joined: Sun Jul 10, 2005 1:11 pm

Post by heda » Tue Nov 01, 2005 2:04 pm

Hi,
AFAIK, you can do that using:
s = node.mbstr();

(s is a std::string variable and node is a wxString one)

Good Luck

geralds
I live to help wx-kind
Posts: 186
Joined: Tue Nov 01, 2005 9:22 am

Post by geralds » Tue Nov 01, 2005 2:08 pm

Thanks!

eco
Filthy Rich wx Solver
Posts: 203
Joined: Tue Aug 31, 2004 7:06 pm
Location: Behind a can of Mountain Dew

Post by eco » Wed Nov 02, 2005 12:25 am

It's actually wxString::mb_str(), rather than wxString::mbstr(). Also, you should probably be specifying the target encoding like so:

Code:

const wxCharBuffer buf = node.mb_str(wxConvUTF8); // owns the UTF-8 bytes; keep it alive while s is in use
const char* s = buf.data();

And std::string has no concept of UTF-8, so I'd recommend avoiding stuffing a UTF-8 string in one.

heda
Knows some wx things
Posts: 32
Joined: Sun Jul 10, 2005 1:11 pm

Post by heda » Wed Nov 02, 2005 2:16 am

eco wrote: It's actually wxString::mb_str(), rather than wxString::mbstr()
Oh yes, you're right. It was just a typing mistake; sorry.

Thanks for your comment.

Ryan Norton
Moderator
Posts: 1319
Joined: Mon Aug 30, 2004 6:01 pm

Post by Ryan Norton » Wed Nov 02, 2005 4:00 am

eco wrote: And std::string has no concept of UTF-8 so I'd recommend avoiding stuffing a UTF-8 string in one.
It should be OK actually. std::string and wxString are just a bunch of bytes and don't check for zeros etc. so you should be ok with any encoding as long as you remember to explicitly specify the length.
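
The point about specifying the length can be illustrated with std::string alone. This is only a minimal sketch: from_bytes and from_c_string are hypothetical helper names made up for the illustration, not part of wxWidgets or the standard library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copy exactly `len` bytes, including any zero bytes.
std::string from_bytes(const char* data, std::size_t len) {
    return std::string(data, len);
}

// Hypothetical helper: copy up to (not including) the first zero byte,
// which is what the C-string constructor does when no length is given.
std::string from_c_string(const char* data) {
    return std::string(data);
}
```

Given the three bytes 'a', '\0', 'b', from_bytes(raw, 3) keeps all three, while from_c_string(raw) stops after the 'a'.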

geralds
I live to help wx-kind
Posts: 186
Joined: Tue Nov 01, 2005 9:22 am

Post by geralds » Wed Nov 02, 2005 9:19 am

Thanks heda, eco and Ryan! eco's suggestion works perfectly. I only wish I'd known about wxWidgets a year ago! This removes the last GPL function from my program, so I can finally switch to LGPL...

leio
Can't get richer than this
Posts: 802
Joined: Mon Dec 27, 2004 10:46 am
Location: Estonia, Tallinn

Post by leio » Wed Nov 02, 2005 10:09 am

And UTF-8 text doesn't really contain \0 bytes; \0 is used to mark the end of a string, as is common in C strings.
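
As a rough check of this, the sketch below encodes single code points by hand and looks for zero bytes. encode_utf8 and has_zero_byte are hypothetical helpers written for this illustration (assuming a C++11 compiler for char32_t); they are not wxWidgets functions, and the encoder does no validation.

```cpp
#include <string>

// Hypothetical helper: encode one code point (up to U+10FFFF) as UTF-8.
// No validation is done; surrogates and out-of-range values are not rejected.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// True if any byte of the string is zero.
bool has_zero_byte(const std::string& bytes) {
    return bytes.find('\0') != std::string::npos;
}
```

Every lead byte and continuation byte of a multi-byte sequence has its high bit set, so only the code point U+0000 itself produces a zero byte. For example, U+0100 encodes as 0xC4 0x80 here.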

eco
Filthy Rich wx Solver
Posts: 203
Joined: Tue Aug 31, 2004 7:06 pm
Location: Behind a can of Mountain Dew

Post by eco » Wed Nov 02, 2005 9:45 pm

Ryan Norton wrote: It should be OK actually. std::string and wxString are just a bunch of bytes and don't check for zeros etc. so you should be ok with any encoding as long as you remember to explicitly specify the length.
leio wrote: And UTF-8 text doesn't really contain \0 bytes; \0 is used to mark the end of a string, as is common in C strings.
Both good points. I suppose the best advice would be to simply not use any character (byte) specific operations on the data. If you do that, either should work great.
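
To show why byte-specific operations are the risky part, here is a small sketch that counts code points in a std::string holding UTF-8. count_code_points is a hypothetical helper for illustration only, and it assumes the input is already well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count code points in well-formed UTF-8 by counting
// every byte that is not a continuation byte (bit pattern 10xxxxxx).
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

For the UTF-8 bytes of "naïve" (where ï is 0xC3 0xAF), size() reports 6 while count_code_points reports 5, and substr(0, 3) would cut the ï in half. Storing, copying, and writing the bytes out unchanged is unaffected by any of this.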
