Converting between Chinese characters and Unicode escape text

Post by winner4love »

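Suppose the text content is
"\u4f26\u6566", that is, the Unicode escape sequences (\uXXXX code points, not UTF-8 bytes) for the two Chinese characters "伦敦" (London).
After reading it into a wxString instance, the string holds those twelve characters literally, i.e. "\\u4f26\\u6566" when written as a C++ string literal.
How can I convert it into a string of Chinese characters?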

Post by winner4love »

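Code:

// Returns the value (0-15) of a hexadecimal digit, or -1 if ch is not one.
// Takes the full wxUniChar so that non-ASCII characters are never
// truncated into something that looks like a hex digit.
int IntFromHexChar(wxUniChar ch)
{
    const int v = static_cast<int>(ch.GetValue());
    if (v >= '0' && v <= '9') return v - '0';
    if (v >= 'A' && v <= 'F') return v - 'A' + 10;
    if (v >= 'a' && v <= 'f') return v - 'a' + 10;
    return -1;
}

// Replaces every \uXXXX escape (four hex digits) in instring with the
// character it names; all other text is copied unchanged.
// Despite the name, the input is escaped text, not UTF-8.
// Code points above U+FFFF written as surrogate pairs are not combined.
wxString Utf8stringTowxString(const wxString &instring)
{
    wxString outstring;
    const size_t len = instring.length();
    for (size_t i = 0; i < len; )
    {
        // A complete escape needs 6 characters: '\', 'u' and 4 hex digits.
        // Checking the length first avoids reading past the end.
        if (i + 6 <= len && instring[i] == '\\' && instring[i + 1] == 'u')
        {
            const int a = IntFromHexChar(instring[i + 2]);
            const int b = IntFromHexChar(instring[i + 3]);
            const int c = IntFromHexChar(instring[i + 4]);
            const int d = IntFromHexChar(instring[i + 5]);
            if (a >= 0 && b >= 0 && c >= 0 && d >= 0)
            {
                outstring += wxUniChar((a << 12) | (b << 8) | (c << 4) | d);
                i += 6;
                continue;
            }
        }
        // Ordinary character, or a malformed escape: copy it as-is.
        outstring += instring[i];
        ++i;
    }
    return outstring;
}
I wrote this myself. Is there a simpler way?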
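
For readers who want to try the decoding logic without wxWidgets, here is a minimal sketch of the same idea using only the standard library. It is not the poster's code: the names `HexValue`, `ReadEscape`, `AppendUtf8` and `DecodeUnicodeEscapes` are made up for illustration, the output is assumed to be wanted as UTF-8, and, unlike the version above, two consecutive escapes that form a UTF-16 surrogate pair are combined into one code point.

```cpp
#include <cstddef>
#include <string>

// Value (0-15) of a hexadecimal digit, or -1 if c is not one.
static int HexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Reads "\uXXXX" starting at pos. Returns the 16-bit value, or -1 if the
// text there is not a complete, well-formed escape.
static long ReadEscape(const std::string &s, std::size_t pos)
{
    if (pos + 6 > s.size() || s[pos] != '\\' || s[pos + 1] != 'u') return -1;
    long value = 0;
    for (std::size_t k = 2; k < 6; ++k)
    {
        const int digit = HexValue(s[pos + k]);
        if (digit < 0) return -1;
        value = value * 16 + digit;
    }
    return value;
}

// Appends code point cp to out, encoded as UTF-8.
static void AppendUtf8(std::string &out, unsigned long cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: replaces \uXXXX escapes with UTF-8 bytes and copies
// everything else (including malformed escapes) unchanged.
std::string DecodeUnicodeEscapes(const std::string &in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); )
    {
        const long unit = ReadEscape(in, i);
        if (unit < 0) { out += in[i++]; continue; }

        unsigned long cp = static_cast<unsigned long>(unit);
        std::size_t used = 6;
        if (cp >= 0xD800 && cp <= 0xDBFF)
        {
            // High surrogate: look for the matching low surrogate.
            const long low = ReadEscape(in, i + 6);
            if (low >= 0xDC00 && low <= 0xDFFF)
            {
                cp = 0x10000 + ((cp - 0xD800) << 10)
                     + static_cast<unsigned long>(low - 0xDC00);
                used = 12;
            }
        }
        if (cp >= 0xD800 && cp <= 0xDFFF)
        {
            // Unpaired surrogate: leave the escape text as it is.
            out += in[i++];
            continue;
        }
        AppendUtf8(out, cp);
        i += used;
    }
    return out;
}
```

In a wxWidgets program the result could then be handed to `wxString::FromUTF8(result.c_str())`.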

Post by fancyivan »

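Code:

// The compiler translates \u4f26\u6566 into the characters themselves; this
// assumes the execution character set is UTF-8 (the GCC default).
const char *c = "\u4f26\u6566";
wxString cs(c, wxConvUTF8);
wxLogMessage(cs);    // shows the two characters 伦敦 in the UI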
OS: Win7 Ultimate SP1 x64(Windows XP Pro SP3 in VirtualBox)
Compiler: MinGW32 (gcc4.8.1 + gdb7.6.1)
IDE: Code::Blocks 12.11
Lib: wxWidgets3.0.0

Post by winner4love »

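If this sequence is stored in a text document rather than in source code, it is not displayed as Chinese characters.
Once it has been read in, the string is "\\u4f26\\u6566", that is, a literal backslash followed by u4f26 and so on.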
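
The distinction drawn here can be checked with a small sketch (the constant names are hypothetical). In source code the compiler replaces `\u4f26` and `\u6566` with the characters themselves, whereas text read from a file keeps the backslash and the hex digits as ordinary characters. The `u8` prefix is used only to pin the encoding to UTF-8 regardless of compiler settings.

```cpp
#include <cstddef>

// Escapes in source code: translated at compile time into two characters,
// each of which takes 3 bytes in UTF-8.
constexpr std::size_t kCompiledLength = sizeof(u8"\u4f26\u6566") - 1;

// The same twelve characters as they would arrive from a text file:
// backslash, 'u' and four hex digits, twice. Nothing has been decoded.
constexpr std::size_t kFileTextLength = sizeof("\\u4f26\\u6566") - 1;
```

Under these assumptions `kCompiledLength` is 6 and `kFileTextLength` is 12, which is why the constructor-based conversion works for a literal in source code but not for text loaded at run time.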

Post by hvenus »

How can I convert Chinese text into Unicode text?

Post by Ellan »

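hvenus wrote: How can I convert Chinese text into Unicode text?
It is simpler to use a system call than to write this yourself.
For example, on Windows: MultiByteToWideChar(), which converts a multibyte (code page) string to UTF-16.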
Thanks

Best Regards

Ellan
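
If "Unicode text" in hvenus's question means the `\uXXXX` escaped form used earlier in this thread, the reverse conversion can be sketched portably as below. This is an illustration under assumptions, not an answer given in the thread: the input is assumed to already be UTF-8 (text in a legacy code page would first need converting, for instance with the system call mentioned above), and the names `AppendEscape` and `EncodeUnicodeEscapes` are hypothetical. Overlong UTF-8 forms are not rejected.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Appends one UTF-16 code unit to out as "\uXXXX" (lowercase hex).
static void AppendEscape(std::string &out, unsigned long unit)
{
    char buf[8];
    std::snprintf(buf, sizeof buf, "\\u%04lx", unit & 0xFFFFul);
    out += buf;
}

// Hypothetical helper: copies ASCII unchanged and writes every other
// character as \uXXXX, using a surrogate pair above U+FFFF.
// Bytes that are not valid UTF-8 are copied through unchanged.
std::string EncodeUnicodeEscapes(const std::string &utf8)
{
    std::string out;
    for (std::size_t i = 0; i < utf8.size(); )
    {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        if (lead < 0x80) { out += utf8[i++]; continue; }

        // Sequence length and payload bits taken from the lead byte.
        std::size_t len = 0;
        unsigned long cp = 0;
        if ((lead & 0xE0) == 0xC0)      { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }

        // Every following byte must look like 10xxxxxx.
        bool ok = len != 0 && i + len <= utf8.size();
        for (std::size_t k = 1; ok && k < len; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            ok = (c & 0xC0) == 0x80;
            cp = (cp << 6) | (c & 0x3F);
        }
        ok = ok && cp <= 0x10FFFF;
        if (!ok) { out += utf8[i++]; continue; }

        if (cp >= 0x10000)
        {
            // Split into a UTF-16 surrogate pair.
            cp -= 0x10000;
            AppendEscape(out, 0xD800 + (cp >> 10));
            AppendEscape(out, 0xDC00 + (cp & 0x3FF));
        }
        else
        {
            AppendEscape(out, cp);
        }
        i += len;
    }
    return out;
}
```

In a wxWidgets program the UTF-8 input could be obtained from a wxString with `utf8_str()`.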