Some regex help needed

vdell
Ultimate wxWidgets Guru
Posts: 536
Joined: Fri Jan 07, 2005 3:44 pm
Location: Finland

Post by vdell »

I'm trying to get some movie information from IMDb.com (I actually got connected to the damn page :)) but it seems that my regex skills are a bit rusty ATM.

So, the problem is that there's data like:

Code:

<b class="blackcatheader">Directed by</b><br>
<a href="/name/nm0905152/">Andy Wachowski</a><br><a href="/name/nm0905154/">Larry Wachowski</a>

and I'm trying to extract every director from that data. So far I have the following regex:

Code:

(?:<b\s+[^>]*>Directed by</b><br>\s?\n){1}(?:<a\s+[^>]*>([^<]*)</a>(?:<br>)?)*

This almost works. The problem is that it captures all the directors into the same group (screenshot1.png), whereas I need each director in a group of its own. Any suggestions?

I'm using boost::regex_search.
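
One possible way around the single-group problem, offered as a rough sketch rather than a tested solution: because the repeated `(...)*` is one capture group, a single search has only one slot for all the names. Splitting the work into two passes avoids that. The first regex finds the "Directed by" header and captures the whole run of links after it; the second is applied repeatedly to that run, so each name arrives as its own match. Assumptions: `std::regex` stands in for `boost::regex` here (the two interfaces are very similar, but this has not been checked against Boost), `extract_directors` is a made-up helper name, and the markup is assumed to look exactly like the snippet above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Sketch only: std::regex is used as a stand-in for boost::regex.
// extract_directors is a hypothetical helper, not part of any library.
std::vector<std::string> extract_directors(const std::string& html) {
    std::vector<std::string> directors;

    // Pass 1: find the "Directed by" header and capture the run of
    // <a>...</a> links (optionally separated by <br>) that follows it.
    static const std::regex block_re(
        R"(<b\s+[^>]*>Directed by</b><br>\s*((?:<a\s+[^>]*>[^<]*</a>(?:<br>)?)*))");
    std::smatch block;
    if (!std::regex_search(html, block, block_re)) {
        return directors;  // no "Directed by" section found
    }

    // Pass 2: walk over every link inside the captured run. Each step of
    // the iterator is a separate match, so each name has its own capture.
    static const std::regex link_re(R"(<a\s+[^>]*>([^<]*)</a>)");
    const std::string links = block[1].str();
    for (std::sregex_iterator it(links.begin(), links.end(), link_re), end;
         it != end; ++it) {
        directors.push_back((*it)[1].str());
    }
    return directors;
}
```

Calling `extract_directors` on the HTML snippet above should give a vector with one entry per director. Restricting the second pass to the captured run keeps it from picking up unrelated links elsewhere on the page.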
Last edited by vdell on Sat Jul 02, 2005 9:17 am, edited 1 time in total.
Visual C++ 9.0 / Windows XP Pro SP3 / wxWidgets 2.9.0 (SVN) | Colligere
tiwag
Earned some good credits
Posts: 123
Joined: Tue Dec 21, 2004 8:51 pm
Location: Austria

Post by tiwag »

help yourself with the regex-coach
http://weitz.de/regex-coach/
vdell

Post by vdell »

tiwag wrote: help yourself with the regex-coach
http://weitz.de/regex-coach/

Well, I have tried that, along with the tool in the screenshot, but those tools aren't much help when I have no idea how to get the regex working in the first place. Thanks anyway.
Ryan Norton
wxWorld Domination!
Posts: 1319
Joined: Mon Aug 30, 2004 6:01 pm

Post by Ryan Norton »

I'd respond, but I'm not much of a regex master myself. Try asking on the IRC channel #wxwidgets, though: BrianHV, who hangs out there, is a regex master and has answered many of my questions on this.
[Mostly retired moderator, still check in to clean up some stuff]