Encoding conversions - help req'd

eager2no
Earned a small fee
Posts: 22
Joined: Sat Sep 27, 2008 7:32 pm

Encoding conversions - help req'd

Post by eager2no »

The problem I need to solve is this (using the Unicode build of wxWidgets):
My program takes a text file as input and needs to convert it to UTF-32 for further processing.
The input file is user-supplied, so its encoding could be anything.
I check the BOM to determine the encoding, but the input file is not guaranteed to have one.
Up to now I have worked byte by byte (u_char) with hand-coded conversion routines, but the wxWidgets conversions would be a lot simpler to code, if only I knew how.
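
For reference, the BOM check described above might look roughly like the sketch below. It is plain standard C++ with no wxWidgets calls, and `BomEncoding`, `BomInfo` and `DetectBom` are hypothetical names made up for this illustration. It only recognises a BOM; a file without one still needs a user-supplied encoding or some other guess.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Encodings that can be recognised from a byte order mark (BOM).
enum class BomEncoding { None, Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

// Result of the check: which BOM was found and how many bytes to skip.
struct BomInfo {
    BomEncoding encoding;
    std::size_t length;
};

// Hypothetical helper (not a wxWidgets function): looks at the first bytes
// of the buffer and reports which BOM, if any, is present.
inline BomInfo DetectBom(const std::string& data)
{
    struct Entry {
        BomEncoding encoding;
        unsigned char bytes[4];
        std::size_t length;
    };
    // The UTF-32 entries come first because the UTF-32LE BOM (FF FE 00 00)
    // begins with the UTF-16LE BOM (FF FE). Caveat: a UTF-16LE file whose
    // first character is U+0000 would be misread as UTF-32LE.
    static const Entry table[] = {
        { BomEncoding::Utf32LE, { 0xFF, 0xFE, 0x00, 0x00 }, 4 },
        { BomEncoding::Utf32BE, { 0x00, 0x00, 0xFE, 0xFF }, 4 },
        { BomEncoding::Utf8,    { 0xEF, 0xBB, 0xBF, 0x00 }, 3 },
        { BomEncoding::Utf16LE, { 0xFF, 0xFE, 0x00, 0x00 }, 2 },
        { BomEncoding::Utf16BE, { 0xFE, 0xFF, 0x00, 0x00 }, 2 },
    };
    for (const Entry& e : table) {
        if (data.size() >= e.length &&
            std::memcmp(data.data(), e.bytes, e.length) == 0) {
            return { e.encoding, e.length };
        }
    }
    return { BomEncoding::None, 0 };  // no BOM: the caller has to decide
}
```

The returned `length` says how many leading bytes to drop before handing the rest of the buffer to whichever converter matches `encoding`.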

What functions should I use to achieve the purpose?
Specifically, how can I convert from UTF-anything to UTF-32 and back?
Some code samples would be a great help.
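
To make the UTF-8 case of "to UTF-32 and back" concrete, here is a minimal hand-rolled sketch in standard C++. It does not use the wxWidgets conversion classes, and `Utf8ToUtf32` and `Utf32ToUtf8` are hypothetical helper names for this illustration only. It assumes malformed input should become U+FFFD, and it does not reject overlong forms, encoded surrogates, or values above U+10FFFF.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into UTF-32 code points.
// Bad lead bytes and truncated or broken sequences become U+FFFD.
inline std::u32string Utf8ToUtf32(const std::string& in)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }

        bool ok = i + extra < in.size();  // are all continuation bytes there?
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;      // not a 10xxxxxx byte
            else cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out.push_back(0xFFFD); ++i; continue; }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// Hypothetical helper: encode UTF-32 code points as UTF-8 bytes.
// Assumes every value is a valid code point (no range check).
inline std::string Utf32ToUtf8(const std::u32string& in)
{
    std::string out;
    for (char32_t cp : in) {
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Calling `Utf32ToUtf8(Utf8ToUtf32(bytes))` on well-formed UTF-8 should give back the same bytes, which makes for an easy sanity test.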

My apologies for the dumb questions. I am relatively new to wxWidgets and C++, but I have a useful algorithm I am obsessed with and want to code.
Thanks in advance for any help.
Youka
Experienced Solver
Posts: 51
Joined: Thu Feb 16, 2012 2:24 pm

Re: Encoding conversions - help req'd

Post by Youka »

Have a look at the wxMBConv overview in the wxWidgets documentation.
eager2no
Earned a small fee
Posts: 22
Joined: Sat Sep 27, 2008 7:32 pm

Re: Encoding conversions - help req'd

Post by eager2no »

Youka,
Thank you.
doublemax
Moderator
Posts: 19116
Joined: Fri Apr 21, 2006 8:03 pm
Location: $FCE2

Re: Encoding conversions - help req'd

Post by doublemax »

This is a non-trivial issue, and it also depends on the platform you're on. Under Windows it's a little more complicated because the native string format uses 16-bit units (UTF-16, often loosely called UCS-2), so you run into problems with Unicode characters above 65535 (U+FFFF), which do not fit into a single unit.
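
To illustrate the > 65535 problem: in a 16-bit string format such a character has to be split into two units (a surrogate pair), so one character no longer equals one array element. The sketch below is plain standard C++, not wxWidgets code, and `EncodeUtf16` and `Utf16ToUtf32` are hypothetical names for this example. It assumes unpaired surrogates should be replaced with U+FFFD.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: encode one code point as one or two UTF-16 units.
inline std::u16string EncodeUtf16(char32_t cp)
{
    // Precondition of this sketch: a valid, non-surrogate code point.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::u16string out;
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));  // fits in one unit
    } else {
        cp -= 0x10000;  // 20 bits remain, split 10 / 10
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));    // high
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));  // low
    }
    return out;
}

// Hypothetical helper: decode UTF-16 units into UTF-32 code points,
// joining surrogate pairs. Unpaired surrogates become U+FFFD.
inline std::u32string Utf16ToUtf32(const std::u16string& in)
{
    std::u32string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char32_t unit = in[i];
        const bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        const bool isLow  = unit >= 0xDC00 && unit <= 0xDFFF;
        if (isHigh && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            const char32_t low = in[++i];  // consume the low surrogate too
            out.push_back(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00));
        } else if (isHigh || isLow) {
            out.push_back(0xFFFD);  // surrogate without its partner
        } else {
            out.push_back(unit);    // ordinary BMP character
        }
    }
    return out;
}
```

For example, U+1F600 comes out of `EncodeUtf16` as the two units 0xD83D 0xDE00, while anything up to U+FFFF (outside the surrogate range) stays a single unit.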