wxRegEx & HTML (topic is solved)

illnatured
Filthy Rich wx Solver
Posts: 234
Joined: Mon May 08, 2006 12:31 pm
Location: Krakow, Poland

Post by illnatured » Sat Apr 04, 2009 6:43 pm

Here's a nice tiny regular expression that was meant to remove all HTML tags from a string:

wxRegEx ExByeByeTags (wxT("<(.|\\n)+?>"), wxRE_ICASE|wxRE_ADVANCED);
It works, but I have no idea how to modify the expression in order to make it match only tags contained in a table (assuming that all tables look simply like this: <TABLE>...</TABLE>). Are there any regex-geeks here? I'm already having nightmares involving strange ASCII characters ;).
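For reference, here is a minimal sketch of what the expression above does. It is written with std::regex instead of wxRegEx so that it compiles without wxWidgets; that swap and the function name strip_all_tags are assumptions made for illustration, not the code from the post.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: removes every "<...>" run from the input.
// The pattern mirrors the one in the post: "(.|\n)+?" lazily matches any
// character, newlines included, up to the first closing '>'.
std::string strip_all_tags(const std::string& html)
{
    static const std::regex tag("<(.|\\n)+?>", std::regex::icase);
    return std::regex_replace(html, tag, "");
}
```

Because the quantifier is lazy, each match stops at the nearest '>' and the text between tags survives. The pattern has no notion of where a table starts or ends, which is why it cannot be limited to <TABLE>...</TABLE> on its own.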

protocol
Moderator
Posts: 680
Joined: Wed Jan 18, 2006 6:13 pm
Location: Dallas, TX

Post by protocol » Mon Apr 06, 2009 2:24 am

Please provide a test subject string.

Also check out my app, QuRegExmm, on SourceForge; it may be able to help you make the match.
/* UIKit && wxWidgets 2.8 && Cocoa && .Net */
QuRegExmm
wxPCRE & ObjPCRE - Regex It!

Post by illnatured » Mon Apr 06, 2009 5:39 pm

protocol wrote: Please provide a test subject string. Also check out my app, QuRegExmm, on SourceForge; it may be able to help you make the match.

Cool app, surely much better than compiling my wx program every time :). Thank you. This is a test string:
<P>
These tags should remain untouched...
</P>

<table>
<tr><td><span class=A></span>...while these ones should disappear.</span></td></tr>
<tr><td><span class=A></span>&nbsp;It would be nice if this "nbsp" disappeared too</span></td></tr>
<tr><td><span class=A></span>Some text</span></td></tr>
</table>
(not 100% valid HTML, but this is intentional)

Post by illnatured » Wed Apr 15, 2009 5:00 pm

It seems that regular expressions alone aren't powerful enough to easily accomplish this task, so I mixed them with good old std::string functions and it works for me. Nevertheless, your app is so helpful that 5 wxAwards go to you ;).
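For anyone landing here later, a minimal sketch of one way such a mix could look. This is not the actual code used. It assumes std::regex in place of wxRegEx, tables that are not nested, and that the <table> tags themselves and any "&nbsp;" inside them should be removed too. The names to_lower and strip_tags_in_tables are made up for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <regex>
#include <string>

// Lower-cased copy, used only to locate <table>...</table> case-insensitively.
// The length is unchanged, so offsets are valid in the original string too.
static std::string to_lower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: strips tags and "&nbsp;" only inside table regions.
std::string strip_tags_in_tables(const std::string& html)
{
    static const std::regex junk("<[^>]+>|&nbsp;", std::regex::icase);
    const std::string open = "<table";
    const std::string close = "</table>";
    const std::string lower = to_lower(html);

    std::string out;
    std::size_t pos = 0;
    for (;;) {
        const std::size_t start = lower.find(open, pos);
        if (start == std::string::npos) break;
        std::size_t end = lower.find(close, start);
        if (end == std::string::npos) break;   // unterminated table: leave the rest alone
        end += close.size();
        out.append(html, pos, start - pos);    // text before the table stays untouched
        out += std::regex_replace(html.substr(start, end - start), junk, "");
        pos = end;
    }
    out.append(html, pos, std::string::npos);  // whatever follows the last table
    return out;
}
```

The string searches decide where the regex is allowed to act, and the regex itself stays trivial. Applied to the test string above, the <P> block would be left as it is while the table rows would be reduced to their text.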

Post by protocol » Fri Apr 17, 2009 4:00 am

Excellent. I'm glad you enjoy the app.

Regards.
