Possible improvement on richtext XML -> HTML

nevd
Earned a small fee
Posts: 21
Joined: Mon May 02, 2005 11:11 am
Location: England

Post by nevd » Mon Jan 14, 2008 11:49 am

I am writing an application that uses a richtext control for user input and then uses wxHtmlEasyPrinting for report printing. I know that I can print the XML directly, but I needed a report structure that is familiar to the end user, and RTC doesn't support tables yet.

Anyhow, the XML->HTML handler didn't give me the same output as seen within the control, so I have modified it a bit. I have pasted the code below, firstly as it may help someone else and secondly so that others can test it and see if it works in situations other than my own.

If anyone does test this, tell me if it's worth posting for inclusion in the next wx release. BTW I currently compile against 2.8.7.

Header

Code: Select all

#pragma once
/////////////////////////////////////////////////////////////////////////////
// Name:        ndwxRichTextHTMLHandler.h
// Purpose:     
// Author:      Neville Dastur (Surgeons Net Ltd)
// Modified by: 
// Created:     25/11/2006 21:52:33
// RCS-ID:      
// Copyright:   Neville Dastur (Surgeons Net Ltd). All rights reserved
// Licence:     
/////////////////////////////////////////////////////////////////////////////
#if defined(__GNUG__) && !defined(NO_GCC_PRAGMA)
#pragma interface "ndwxRichTextHTMLHandler.h"
#endif

#include <wx/richtext/richtexthtml.h>

class ndwxRichTextHTMLHandler:public wxRichTextHTMLHandler {
	// DoSaveFile is a virtual function in the base class and is overridden here
	bool DoSaveFile(wxRichTextBuffer *buffer, wxOutputStream& stream);

	/// Redefined Begin paragraph formatting
	void BeginParagraphFormatting(const wxTextAttrEx& WXUNUSED(currentStyle), const wxTextAttrEx& thisStyle, wxTextOutputStream& str);

	/// Redefined End paragraph formatting
	void EndParagraphFormatting(const wxTextAttrEx& WXUNUSED(currentStyle), const wxTextAttrEx& thisStyle, wxTextOutputStream& stream);

	// My additional function to simulate tabs in HTML by replacing tabs with &nbsp; entities
	wxString SimulateTabs(wxString text, int Spaces2Tab);

private:
	// Flag that we are in a row that will need closing
	bool m_inRow;
	int m_SpacesPerTab;

};
Implementation

Code: Select all

/////////////////////////////////////////////////////////////////////////////
// Name:        ndwxRichTextHTMLHandler.cpp
// Purpose:     
// Author:      Neville Dastur (Surgeons Net Ltd)
// Modified by: 
// Created:     25/11/2006 21:52:33
// RCS-ID:      
// Copyright:   Neville Dastur (Surgeons Net Ltd). All rights reserved
// Licence:     
/////////////////////////////////////////////////////////////////////////////
#if defined(__GNUG__) && !defined(NO_GCC_PRAGMA)
#pragma implementation "ndwxRichTextHTMLHandler.h"
#endif

// For compilers that support precompilation, includes "wx/wx.h".
#include "wx/wxprec.h"

#ifdef __BORLANDC__
#pragma hdrstop
#endif

#ifndef WX_PRECOMP
#include "wx/wx.h"
#endif

#include "ndwxRichTextHTMLHandler.h"

bool ndwxRichTextHTMLHandler::DoSaveFile(wxRichTextBuffer *buffer, wxOutputStream& stream)
{
	m_SpacesPerTab = 5;

    m_buffer = buffer;

    ClearTemporaryImageLocations();

    buffer->Defragment();

    wxTextOutputStream str(stream);

    wxTextAttrEx currentParaStyle = buffer->GetAttributes();
    wxTextAttrEx currentCharStyle = buffer->GetAttributes();

    if ((GetFlags() & wxRICHTEXT_HANDLER_NO_HEADER_FOOTER) == 0)
        str << wxT("<html><head></head><body>\n");

    str << wxT("<table border=0 cellpadding=0 cellspacing=0><tr><td width=\"100%\">\n");	// Added \n

    OutputFont(currentParaStyle, str);
	 // NBD: Add \n after the font tag for easy reading
	 str << wxT("\n");

    m_font = false;
    m_inTable = false;
	 m_inRow	= false;				// NBD: Added mem var

    m_indents.Clear();
    m_listTypes.Clear();

    wxRichTextObjectList::compatibility_iterator node = buffer->GetChildren().GetFirst();
    while (node)
    {
        wxRichTextParagraph* para = wxDynamicCast(node->GetData(), wxRichTextParagraph);
        wxASSERT (para != NULL);

        if (para)
        {
            wxTextAttrEx paraStyle(para->GetCombinedAttributes());

            BeginParagraphFormatting(currentParaStyle, paraStyle, str);

            wxRichTextObjectList::compatibility_iterator node2 = para->GetChildren().GetFirst();
				
				// NBD: Flag if the paragraph node is empty
				bool paraEmpty	= true;

            while (node2)
            {
                wxRichTextObject* obj = node2->GetData();
                wxRichTextPlainText* textObj = wxDynamicCast(obj, wxRichTextPlainText);
                if (textObj && !textObj->IsEmpty())
					 {
						 paraEmpty = false;	// NBD: Found node inside and not empty
                    wxTextAttrEx charStyle(para->GetCombinedAttributes(obj->GetAttributes()));
                    BeginCharacterFormatting(currentCharStyle, charStyle, paraStyle, str);

                    wxString text = textObj->GetText();
						
						  // NBD: If text block is empty just output a non-breaking space
						  if (text.length() == 0) {
							  str << wxT("&nbsp;<!-- TEXT LEN ZERO -->");
						  }
						  else {
								if (charStyle.HasTextEffects() && (charStyle.GetTextEffects() & wxTEXT_ATTR_EFFECT_CAPITALS))
									text.MakeUpper();

								// Escape the HTML-significant characters before adding any markup of our own
								text.Replace(wxT("&"), wxT("&amp;"));		// Must be first entity replaced
								text.Replace(wxT("<"), wxT("&lt;"));
								text.Replace(wxT(">"), wxT("&gt;"));
								text = SimulateTabs(text, m_SpacesPerTab);

								// Line breaks go last so that the <br> tag itself is not escaped
								wxString toReplace = wxRichTextLineBreakChar;
								text.Replace(toReplace, wxT("<br>"));
								str << text;
						  }

                    EndCharacterFormatting(currentCharStyle, charStyle, paraStyle, str);
					 } else {
						// NBD: There is a paragraph but it's empty, so as for empty <text> output a non-breaking space
							  str << wxT("&nbsp;<!-- NO TEXT NODE FOUND -->");
					 }

                wxRichTextImage* image = wxDynamicCast(obj, wxRichTextImage);
                if( image && !image->IsEmpty())
                    WriteImage( image, stream );

                node2 = node2->GetNext();
            }

				if (paraEmpty)	str << wxT("&nbsp;<!-- PARA NODE EMPTY -->");

            EndParagraphFormatting(currentParaStyle, paraStyle, str);

            str << wxT("\n");
        }
        node = node->GetNext();
    }

    CloseLists(-1, str);

    str << wxT("</font>");

    str << wxT("</td></tr></table><p>");

    if ((GetFlags() & wxRICHTEXT_HANDLER_NO_HEADER_FOOTER) == 0)
        str << wxT("</body></html>");

    str << wxT("\n");

    m_buffer = NULL;

    return true;
}

/// Begin paragraph formatting
void ndwxRichTextHTMLHandler::BeginParagraphFormatting(const wxTextAttrEx& WXUNUSED(currentStyle), const wxTextAttrEx& thisStyle, wxTextOutputStream& str)
{
    if (thisStyle.HasPageBreak())
    {
        str << wxT("</td></tr></table>");		// Close the cell before the row
        str << wxT("<div style=\"page-break-after:always\"></div>\n");
        str << wxT("<table border=0 cellpadding=0 cellspacing=0><tr><td width=\"100%\">\n");		// NBD: Added a \n
    }

    if (thisStyle.HasLeftIndent() && thisStyle.GetLeftIndent() != 0)
    {
        if (thisStyle.HasBulletStyle())
        {
            int indent = thisStyle.GetLeftIndent();

            // Close levels higher than this
            CloseLists(indent, str);

            if (m_indents.GetCount() > 0 && indent == m_indents.Last())
            {
                // Same level, no need to start a new list
            }
            else if (m_indents.GetCount() == 0 || indent > m_indents.Last())
            {
                m_indents.Add(indent);

                wxString tag;
                int listType = TypeOfList(thisStyle, tag);
                m_listTypes.Add(listType);

                wxString align = GetAlignment(thisStyle);
                str << wxString::Format(wxT("<p align=\"%s\">"), align.c_str());

                str << tag;
            }

            str << wxT("<li> ");
        }
        else
        {
            CloseLists(-1, str);

            wxString align = GetAlignment(thisStyle);
            str << wxString::Format(wxT("<p align=\"%s\">"), align.c_str());

            // Use a table
            int indentTenthsMM = thisStyle.GetLeftIndent() + thisStyle.GetLeftSubIndent();
            // TODO: convert to pixels
            int indentPixels = indentTenthsMM/4;
            str << wxString::Format(wxT("<table border=0 cellpadding=0 cellspacing=0><tr><td width=\"%d\"></td><td>"), indentPixels);

            OutputFont(thisStyle, str);

            if (thisStyle.GetLeftSubIndent() < 0)
            {
                str << SymbolicIndent( - thisStyle.GetLeftSubIndent());
            }

            m_inTable = true;
        }
    }
    else
    {
        CloseLists(-1, str);

        wxString align = GetAlignment(thisStyle);
		  // NBD: Changes <p to <td in output
		  m_inRow = true;
        str << wxString::Format(wxT("<tr><td align=\"%s\">"), align.c_str());
		  // NBD: Because we are putting this text inside table cells now we need to re-output font
		  OutputFont(thisStyle, str);
    }
}

/// End paragraph formatting
void ndwxRichTextHTMLHandler::EndParagraphFormatting(const wxTextAttrEx& WXUNUSED(currentStyle), const wxTextAttrEx& thisStyle, wxTextOutputStream& stream)
{
    if (m_inTable)
    {
        if (thisStyle.HasFont())
            stream << wxT("</font>");

        stream << wxT("</td></tr></table>\n");
        m_inTable = false;
    }

	 // NBD: Added logic to close off table rows for paragraphs
	 if (m_inRow) {
		 if (thisStyle.HasFont() ) {
			stream << wxT("</font>");
		 }
        stream << wxT("</td></tr>\n");
        m_inRow = false;
	 }
}

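wxString ndwxRichTextHTMLHandler::SimulateTabs(wxString text, int Spaces2Tab) {
	// Guard against a zero or negative tab width
	if (Spaces2Tab < 1) Spaces2Tab = 1;

	wxString rStr;
	int col = 0;	// Position within the current tab stop, 0 .. Spaces2Tab-1

	for ( size_t c = 0; c < text.length(); c++ ) {
		if ( text[c] == wxT('\t') ) {
			// Pad to the next tab stop; a tab sitting on a stop advances a full stop
			for ( int pad = Spaces2Tab - col; pad > 0; pad-- )
				rStr << wxT("&nbsp;");
			col = 0;
			continue;
		}

		// Within a text block we also convert all spaces to &nbsp;
		if ( text[c] == wxT(' ') )
			rStr << wxT("&nbsp;");
		else
			rStr << text[c];

		// A signed counter with modulo avoids the unsigned wrap-around of a countdown
		col = (col + 1) % Spaces2Tab;
	}

	return rStr;
}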
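
For anyone who wants to try the escaping and tab logic without building against wx, here is a minimal standalone sketch using std::string. It is not part of the handler: ReplaceAll and EscapeHtml are hypothetical helper names made up for this example, and SimulateTabs here is a free function that only mirrors the idea of the member function above (pad to the next multiple of the tab width, turn spaces into &nbsp;).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replace every occurrence of `from` in `s` with `to`.
// Assumes `from` is not empty.
std::string ReplaceAll(std::string s, const std::string& from, const std::string& to)
{
    std::size_t pos = s.find(from);
    while (pos != std::string::npos) {
        s.replace(pos, from.size(), to);
        // Continue after the inserted text so it is never rescanned
        pos = s.find(from, pos + to.size());
    }
    return s;
}

// Hypothetical helper: escape the three HTML-significant characters.
// '&' has to be handled first, otherwise the '&' that starts "&lt;" and
// "&gt;" would itself be escaped a second time.
std::string EscapeHtml(std::string s)
{
    s = ReplaceAll(s, "&", "&amp;");
    s = ReplaceAll(s, "<", "&lt;");
    s = ReplaceAll(s, ">", "&gt;");
    return s;
}

// Standalone version of the tab simulation: a tab is padded with &nbsp;
// up to the next multiple of spacesPerTab, and plain spaces become &nbsp;.
std::string SimulateTabs(const std::string& text, int spacesPerTab)
{
    if (spacesPerTab < 1)
        spacesPerTab = 1;   // guard against a zero or negative tab width

    std::string out;
    int col = 0;            // position within the current tab stop

    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\t') {
            for (int pad = spacesPerTab - col; pad > 0; --pad)
                out += "&nbsp;";
            col = 0;
            continue;
        }
        if (text[i] == ' ')
            out += "&nbsp;";
        else
            out += text[i];
        col = (col + 1) % spacesPerTab;
    }
    return out;
}
```

One caveat if you chain them as SimulateTabs(EscapeHtml(text), 5): an entity such as &lt; is counted as four characters rather than one, so a tab that follows escaped text may come out narrower than it looks in the control. That seemed acceptable for my reports, but it is a simplification.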
Last edited by nevd on Wed Jan 16, 2008 10:21 pm, edited 1 time in total.
_________________
OS: Win XP Pro, Ubuntu, CE
wx: 2.8.7
Compiler: VC 8.0, eVC 4
Tea: Earl Grey

Post by nevd » Mon Jan 14, 2008 1:50 pm

Just realised this is probably better in "Code Dump". I can't find a way to move it, so could a mod do so please?