Strange problems with à â etc. Topic is solved
-
- Ultimate wxWidgets Guru
- Posts: 675
- Joined: Tue Jul 26, 2016 2:00 pm
Strange problems with à â etc.
Hello
In my app I have a strange problem. I tried to use a text with accents like ^ or ` on letters. I copied it into a wxString, and my app then generates an HTML file from it. I did this with three texts which all contained accented letters. With two of them I got the correct accents in Edge, but with one of them I got a strange sign for à and ê. Interestingly, it didn't happen with é. I then tried to put the HTML entity for à (e.g. "&agrave;") instead of the literal à, but only for one à. The funny thing was that the other accented letters then also showed up correctly.
That was yesterday. Today I tried to reproduce the problem, but now all three texts are correct. I'm quite confused. Is this maybe something about the encoding of the HTML file (like UTF-8 etc.)? But then I wonder why it seems to work only by chance.
If I create a text file with the .html ending, what is the encoding usually and how could I influence it?
Thanks,
Thomas
Re: Strange problems with à â etc.
How exactly do you create and save the HTML file? IOW: Which encoding does it use?
Assuming it's UTF-8, the charset should be defined in the header:
Code: Select all
<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>some title</title>
</head>
<body>
</body>
</html>
Use the source, Luke!
-
- Ultimate wxWidgets Guru
- Posts: 675
- Joined: Tue Jul 26, 2016 2:00 pm
Re: Strange problems with à â etc.
I don't do anything special when creating the file, just a file with the .html ending:
Code: Select all
wxString HtmlStringFinal = CreateHtmlString();
std::ofstream htmlFile;
htmlFile.open("C:\\users\\thomas\\documents\\programmprojekte\\birkenbihl-test\\testhtml.html");
htmlFile << HtmlStringFinal;
htmlFile.close();
But I'll try your suggestion right now. I only wonder why this only happens erratically.
EDIT:
I just tried your suggestion but now it's even worse: now all the special letters are shown as a question mark in a tilted black rectangle.
- doublemax@work
- Super wx Problem Solver
- Posts: 474
- Joined: Wed Jul 29, 2020 6:06 pm
- Location: NRW, Germany
Re: Strange problems with à â etc.
Code: Select all
htmlFile << HtmlStringFinal
Can you try this:
Code: Select all
#include <wx/file.h>
wxFile htmlFile("C:\\users\\thomas\\documents\\programmprojekte\\birkenbihl-test\\testhtml.html", wxFile::write);
if( htmlFile.IsOpened() ) {
    htmlFile.Write( HtmlStringFinal, wxConvUTF8 );
    htmlFile.Close();
}
-
- Ultimate wxWidgets Guru
- Posts: 675
- Joined: Tue Jul 26, 2016 2:00 pm
Re: Strange problems with à â etc.
Your suggestion works fine.
Before your suggestion, the string was the same but without the <!doctype html> and <meta charset="utf-8"> parts. So how do I know what encoding the file is created with if I use my original version? But I guess your version is the safest one, right?
HtmlStringFinal is a wxString which is created in a function. This is how the content of HtmlStringFinal is created:
Code: Select all
// WortAbstandInPixel, totalWords, separateWords, AnzahlSpaltenStaticTexts
// and HtmlData1 are defined elsewhere in the app.
wxString CreateHtmlString()
{
    wxString HtmlString;
    wxString ZeilenAbstandInPixel = "6";
    wxString AbsatzAbstandInPixel = "14";
    wxString Schriftart = "Calibri";
    wxString string_WortAbstandInPixel;
    string_WortAbstandInPixel << WortAbstandInPixel;
    wxString fontSize = "12";
    // Document head: doctype, charset declaration and the CSS rules.
    HtmlString = "<!doctype html><html><head><meta charset=\"utf-8\"><style type=\"text/css\">td{padding-right:14px;padding-bottom:" + ZeilenAbstandInPixel + "px;font-family:" + Schriftart + ";font-size:" + fontSize + "pt !important;white-space: nowrap;}";
    HtmlString = HtmlString + "table{padding-right:0;padding-bottom:" + AbsatzAbstandInPixel + "px;font-family:" + Schriftart + ";font-size:" + fontSize + "pt;}";
    HtmlString = HtmlString + "body{margin:0px;padding:0px;overflow:hidden;}</style></head><body>";
    int wordCount = 0;
    int int_AnzahlSpalten = 0;
    int int_FlexGridSizerCount = 0;
    // One two-row table per group of words.
    while(wordCount < totalWords)
    {
        HtmlString = HtmlString + "<table><tr>";
        // First row: the original words.
        while(int_AnzahlSpalten < AnzahlSpaltenStaticTexts[int_FlexGridSizerCount])
        {
            HtmlString = HtmlString + "<td>" + separateWords[wordCount] + "</td>";
            int_AnzahlSpalten = int_AnzahlSpalten + 1;
            wordCount = wordCount + 1;
        }
        int_AnzahlSpalten = 0;
        // Step back to the first word of this group for the second row.
        wordCount = wordCount - AnzahlSpaltenStaticTexts[int_FlexGridSizerCount];
        HtmlString = HtmlString + "</tr><tr>";
        // Second row: the combo box entries in bold.
        while(int_AnzahlSpalten < AnzahlSpaltenStaticTexts[int_FlexGridSizerCount])
        {
            HtmlString = HtmlString + "<td><b>" + HtmlData1.ComboBox[wordCount] + "</b></td>";
            int_AnzahlSpalten = int_AnzahlSpalten + 1;
            wordCount = wordCount + 1;
        }
        int_AnzahlSpalten = 0;
        int_FlexGridSizerCount = int_FlexGridSizerCount + 1;
        // The original appended a stray "</b>" here; every <b> is already closed inside its cell.
        HtmlString = HtmlString + "</tr></table>";
    }
    HtmlString = HtmlString + "</body></html>";
    return HtmlString;
}
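On the question of how to tell which encoding a file was actually written in: there is no fully reliable way, but one common heuristic is to read the raw bytes and check whether they form well-formed UTF-8. Text in an 8-bit encoding such as ISO-8859-1 that contains accented letters almost never passes this check. The following is only a sketch using the standard library; the function name is_valid_utf8 is made up for illustration and is not part of wxWidgets.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if `bytes` is well-formed UTF-8.
// Rejects stray or missing continuation bytes, overlong forms,
// surrogates and values above U+10FFFF.
bool is_valid_utf8(const std::string& bytes) {
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const char32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t len = 0;
        char32_t cp = 0;
        if (lead < 0x80)                { len = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
        else return false;  // continuation byte or invalid lead byte
        assert(len >= 1 && len <= 4);
        if (i + len > bytes.size()) return false;  // sequence is cut off
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min_cp[len]) return false;                  // overlong form
        if (cp > 0x10FFFF) return false;                     // out of range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;      // surrogate
        i += len;
    }
    return true;
}
```

A file that contains only ASCII passes the check as well, so a positive result only says the bytes are compatible with UTF-8, not that the writer intended it.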
- doublemax@work
- Super wx Problem Solver
- Posts: 474
- Joined: Wed Jul 29, 2020 6:06 pm
- Location: NRW, Germany
Re: Strange problems with à â etc.
I guess in your version the local encoding of your machine is used, probably "iso-8859-1". Try using that as value for the charset.
Using UTF-8 is safer though, as it can represent all Unicode characters.
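To make the difference between the two encodings concrete, here is a small standard-library sketch (the function names encode_utf8 and encode_latin1 are hypothetical helpers, not wxWidgets calls). In ISO-8859-1 the letter à (U+00E0) is the single byte 0xE0, while in UTF-8 it is the two bytes 0xC3 0xA0. A lone 0xE0 is not valid UTF-8, so a browser that was told charset="utf-8" but receives ISO-8859-1 bytes would typically show the replacement character, which would match the question mark in a black shape described above.

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-8 (1 to 4 bytes).
std::string encode_utf8(char32_t cp) {
    // Surrogates and values above U+10FFFF are not valid code points.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Append one code point to `out` as ISO-8859-1.
// Only U+0000..U+00FF exist in that encoding; returns false otherwise.
bool encode_latin1(char32_t cp, std::string& out) {
    if (cp > 0xFF) return false;
    out += static_cast<char>(cp);
    return true;
}
```

The second function also shows why UTF-8 is the safer choice: a character such as the euro sign (U+20AC) simply has no ISO-8859-1 byte, whereas encode_utf8 handles every code point.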
-
- Ultimate wxWidgets Guru
- Posts: 675
- Joined: Tue Jul 26, 2016 2:00 pm
Re: Strange problems with à â etc.
Hm okay, so is it advisable to always convert texts which could contain any of the "special" characters to UTF-8 or is this only important for html files?
Your version uses wxFile. I guess there is no real difference to using my version... except, I assume, that this conversion to UTF-8 only works with wxFile?
Re: Strange problems with à â etc.
Wanderer82 wrote: Tue Jan 31, 2023 9:12 pm
Hm okay, so is it advisable to always convert texts which could contain any of the "special" characters to UTF-8 or is this only important for html files?
If it's possible that the text contains non-ASCII characters, it should be saved as UTF-8. It almost guarantees that it can be read and displayed correctly on any platform.
Wanderer82 wrote: Tue Jan 31, 2023 9:12 pm
Your version uses wxFile. I guess there is no real difference to using my version... except, I assume, that this conversion to UTF-8 only works with wxFile?
I used wxFile because I knew that it has no internal hidden functionality that might change the result. I don't know that about std::ofstream.
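As a side note on the std::ofstream question: the conversion to UTF-8 is not tied to wxFile as such. A stream opened in binary mode writes the bytes it is given without translating them, so if the string is converted to UTF-8 first (in wxWidgets this could be done with wxString::utf8_str(), for example), std::ofstream should produce the same file. The sketch below uses only the standard library and a plain std::string that is assumed to already hold UTF-8 bytes; write_bytes and read_bytes are hypothetical helper names.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Write already-encoded bytes to `path` without any translation.
bool write_bytes(const std::string& path, const std::string& bytes) {
    assert(!path.empty());
    // Binary mode: no newline or character conversion by the stream.
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out.flush());
}

// Read a whole file back as raw bytes (empty string if it cannot be opened).
std::string read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Reading the file back with read_bytes and comparing it to the input is a quick way to confirm that nothing was changed on the way to disk.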
-
- Ultimate wxWidgets Guru
- Posts: 675
- Joined: Tue Jul 26, 2016 2:00 pm
Re: Strange problems with à â etc.
Alright.
Having had a look at wxFile, I noticed that there is an option "write_excl" which is said to be useful when opening files that are vulnerable to race conditions. I had this problem in another app where different users can write to or read from a database file. I ended up using a lockfile to make sure that two users aren't accessing the file at the same time. Now that I've seen this functionality of wxFile, I wonder if there is any catch. I mean, it seems so easy to use this option, so why would anyone ever use a more complicated lockfile solution? I know that you, @Doublemax, already mentioned this in my earlier thread and somehow I just skipped that.
Re: Strange problems with à â etc.
I'm definitely not a big fan of using a flat text file that multiple processes write to. It would be better to use a simple database like SQLite for this. But if you're using a text file, using the write_excl flag will make it more robust.
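Regarding the possible catch: as far as I understand the wxFile documentation, write_excl opens a file for writing but fails if the file already exists. That makes "check and create" a single step, which is what a lockfile needs when it is created, but it does not lock an existing data file against other writers. The standard-library sketch below shows the same idea with the C11 "x" mode of fopen. The helper names and the use of a separate marker file are assumptions for illustration, not the wxWidgets implementation.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Try to create `path` exclusively. With mode "wx", fopen fails if the
// file already exists, so only one caller can succeed at a time.
bool try_create_exclusive(const std::string& path) {
    assert(!path.empty());
    std::FILE* f = std::fopen(path.c_str(), "wx");
    if (f == nullptr) return false;  // exists already, or cannot be created
    std::fclose(f);
    return true;
}

// Remove the marker file again so that the next caller can create it.
bool release_exclusive(const std::string& path) {
    return std::remove(path.c_str()) == 0;
}
```

A process would call try_create_exclusive on the marker file before touching the shared data file and release_exclusive afterwards. If the process crashes in between, the marker stays behind, which is one of the reasons a database such as SQLite tends to be the more robust option.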