Hi all,
Is there a sample that illustrates how to use the wxHtmlParser or wxHtmlWinParser class to simply parse an HTML page given as an argument?
I'm not able to find such documentation anywhere.
Thanks for your help.
Parsing HTML pages
-
- Super wx Problem Solver
- Posts: 323
- Joined: Sun Jun 08, 2008 11:59 am
- Location: Bordeaux, France
OS: Ubuntu 11.10
Compiler: g++ 4.6.1 (Eclipse CDT Indigo)
wxWidgets: 2.9.3
-
- Super wx Problem Solver
- Posts: 323
- Joined: Sun Jun 08, 2008 11:59 am
- Location: Bordeaux, France
Re: Parsing HTML pages
As an example, I have a web page in which I can find a tag like this:
<div class="thumbinner" ...>
What I want to achieve is just to get some information that comes after the div tag.
I guess I must use the parser:
Code:
wxHtmlParser* parser = new wxHtmlWinParser( "mypage.html" ) ;
wxHtmlCell* top_level_object = parser->Parse();
top_level_object->Find( wxHTML_COND_ISANCHOR, ????????????????????);
What should I put in the Find method to obtain what I want?
Thanks a lot.
Bye.
OS: Ubuntu 11.10
Compiler: g++ 4.6.1 (Eclipse CDT Indigo)
wxWidgets: 2.9.3
Re: Parsing HTML pages
I don't think this class can be used that easily (I'm not sure, though).
Depending on what you need, maybe it can be done with wxRegEx or wxXmlDocument.
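To make the regular-expression suggestion concrete, here is a minimal sketch. It uses std::regex from the C++ standard library as a stand-in for wxRegEx, so that it compiles without wxWidgets. The helper name ExtractDivContents is hypothetical, and the sketch assumes the class attribute holds exactly one class name, that the name contains no regex metacharacters, and that the div has no nested div inside it.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of wxWidgets): returns the raw markup found
// between each <div ... class="cssClass" ...> tag and the first </div> after it.
// Assumption: cssClass contains no regex metacharacters.
// Limitation: nested <div> elements are not handled correctly.
std::vector<std::string> ExtractDivContents(const std::string& html,
                                            const std::string& cssClass)
{
    // [^>]* tolerates other attributes before and after class="...".
    // [\s\S]*? matches lazily across newlines and stops at the first </div>.
    const std::regex re(
        "<div\\b[^>]*\\bclass=\"" + cssClass + "\"[^>]*>([\\s\\S]*?)</div>");

    std::vector<std::string> results;
    for (std::sregex_iterator it(html.begin(), html.end(), re), end;
         it != end; ++it)
    {
        results.push_back((*it)[1].str());   // capture group 1: the div body
    }
    return results;
}
```

Calling ExtractDivContents(pageSource, "thumbinner") on the page text would return one string per matching div. A similar pattern may work with wxRegEx, but that is untested here and its regex syntax can differ. For anything beyond a simple, flat div this approach becomes fragile, and a real parser is the safer choice.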
Use the source, Luke!
- evstevemd
- Part Of The Furniture
- Posts: 2409
- Joined: Wed Jan 28, 2009 11:57 am
- Location: United Republic of Tanzania
Re: Parsing HTML pages
With wxWebView (in development on trunk) you can get the page source and, as DM said, analyze it to get what you want!
Chief Justice: We have trouble dear citizens!
Citizens: What it is his honor?
Chief Justice:Our president is an atheist, who will he swear to?
Re: Parsing HTML pages
evstevemd wrote: with wxWebview (in development on trunk) you can get source code and as DM said you can analyze to get what you want!
wxWebView is only used for displaying the page; AFAIK it doesn't have an API to expose the DOM.
"Keyboard not detected. Press F1 to continue"
-- Windows
- evstevemd
- Part Of The Furniture
- Posts: 2409
- Joined: Wed Jan 28, 2009 11:57 am
- Location: United Republic of Tanzania
Re: Parsing HTML pages
evstevemd wrote: with wxWebview (in development on trunk) you can get source code and as DM said you can analyze to get what you want!
Auria wrote: webview is only used for displaying the page, AFAIK it doesn't have an API to expose the DOM
Maybe, but you can get the page source and analyze it:
http://docs.wxwidgets.org/2.9.2/classwx ... f26764f6d9
Chief Justice: We have trouble dear citizens!
Citizens: What it is his honor?
Chief Justice:Our president is an atheist, who will he swear to?
-
- Super wx Problem Solver
- Posts: 323
- Joined: Sun Jun 08, 2008 11:59 am
- Location: Bordeaux, France
Re: Parsing HTML pages
Hello,
thanks.
Wow, I don't have the time to get the sources and analyse them. I would have thought a solution had already been developed in the wxWidgets API.
In fact, after posting these messages, I read more about the wxXmlDocument class, since HTML is just an implementation of XML norms. So I think it could do the trick, but I have not tested it yet.
Again thanks, bye.
OS: Ubuntu 11.10
Compiler: g++ 4.6.1 (Eclipse CDT Indigo)
wxWidgets: 2.9.3
Re: Parsing HTML pages
Actually, it depends: if you need to parse XHTML, then wxXmlDocument will indeed do. If you parse non-XHTML, it won't work, because ordinary HTML is often not well-formed XML (for example, unclosed tags such as <br>).
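To illustrate why an XML parser rejects ordinary HTML, here is a small sketch in standard C++. It is not a real XML parser and does not use wxXmlDocument; the helper name TagsAreBalanced is hypothetical. It only checks that every opening tag has a matching closing tag in the right order, which is one of the rules a strict XML parser enforces. Comments, CDATA sections and '>' characters inside attribute values are not supported.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, deliberately simplified check (not a real XML parser):
// returns true when every opening tag is closed by a matching closing tag
// in the correct order. Self-closing tags such as <br/> are accepted.
bool TagsAreBalanced(const std::string& markup)
{
    std::vector<std::string> open;   // stack of currently open element names
    std::size_t pos = 0;
    while ((pos = markup.find('<', pos)) != std::string::npos)
    {
        const std::size_t end = markup.find('>', pos);
        if (end == std::string::npos)
            return false;                        // tag is never terminated
        const std::string tag = markup.substr(pos + 1, end - pos - 1);
        pos = end + 1;
        if (tag.empty())
            return false;                        // "<>" is not a valid tag
        if (tag[0] == '!' || tag[0] == '?')
            continue;                            // skip <!DOCTYPE ...> and <?xml ...?>
        if (tag.back() == '/')
            continue;                            // self-closing element
        const bool closing = (tag[0] == '/');
        // Element name: text up to the first whitespace, without a leading '/'.
        const std::size_t nameStart = closing ? 1 : 0;
        const std::string name = tag.substr(
            nameStart, tag.find_first_of(" \t\r\n", nameStart) - nameStart);
        if (!closing)
            open.push_back(name);
        else if (open.empty() || open.back() != name)
            return false;                        // closing tag without matching open
        else
            open.pop_back();
    }
    return open.empty();                         // leftover open tags mean failure
}
```

With this check, "<p>line one<br/>line two</p>" passes, while the HTML-style "<p>line one<br>line two</p>" fails because <br> is never closed. That is the kind of input on which an XML-based approach would be expected to give up.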
"Keyboard not detected. Press F1 to continue"
-- Windows