How to extract words? Topic is solved

anonbeat
Earned a small fee
Posts: 21
Joined: Tue Jul 01, 2008 10:48 am

Post by anonbeat »

Hello,
I need to extract words from a wxString, with one special case: the text between two " characters should be treated as a single word.
For example, the wxString "One Two "Three Four" Five" should be converted into:
One
Two
Three Four
Five

I've tried wxStringTokenizer without success.
Could you please let me know if there is an obvious way to do it?

Thanks in advance
Last edited by anonbeat on Wed Nov 12, 2008 7:26 am, edited 1 time in total.
anonbeat
Earned a small fee
Posts: 21
Joined: Tue Jul 01, 2008 10:48 am

Post by anonbeat »

For now, I have this code, which does what I need:

Code:

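    wxArrayString Words;
    wxString SearchStr = SearchTextCtrl->GetLineText( 0 );
    size_t index, len;
    // Group 1 is either a quoted phrase or a run of non-space characters.
    // The quoted alternative comes first so it wins under any matching rule,
    // and ^ keeps every match anchored at the start of the remaining text.
    wxRegEx RegEx( wxT( "^ *(\"[^\"]*\"|[^ ]+) *" ) );
    while( SearchStr.Length() && RegEx.Matches( SearchStr ) )
    {
        if( !RegEx.GetMatch( &index, &len ) || len == 0 )
            break;
        wxString Word = RegEx.GetMatch( SearchStr, 1 );
        // Drop the surrounding quotes from a quoted phrase.
        if( Word.Length() >= 2 && Word[0] == wxT( '"' ) && Word.Last() == wxT( '"' ) )
            Word = Word.Mid( 1, Word.Length() - 2 );
        Words.Add( Word );
        // Continue after the text consumed by this match.
        SearchStr = SearchStr.Mid( index + len );
    }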
Auria
Site Admin
Posts: 6695
Joined: Thu Sep 28, 2006 12:23 am

Post by Auria »

vsp
Knows some wx things
Posts: 35
Joined: Mon Feb 21, 2005 12:52 pm

Post by vsp »

You should use wxStringTokenizer to tokenize the string based on your choice of delimiter.

Code:

wxStringTokenizer tkz(wxT("first:second:third:fourth"), wxT(":"));
while ( tkz.HasMoreTokens() )
{
    wxString token = tkz.GetNextToken();

    // process token here
}
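
A note on how this applies to the question in this thread: splitting on one delimiter handles the colon-separated example, but with a space as the delimiter a quoted phrase is still cut in two. The sketch below illustrates this in standard C++. `SplitOnDelimiter` is a hypothetical helper, and `std::getline` only stands in for the wxStringTokenizer loop, so treat it as an approximation rather than as wxWidgets behaviour.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not a wxWidgets API): splits `text` on one delimiter
// character, much like the tokenizer loop above, and skips empty tokens.
std::vector<std::string> SplitOnDelimiter(const std::string& text, char delimiter)
{
    std::vector<std::string> tokens;
    std::istringstream stream(text);
    std::string token;
    while (std::getline(stream, token, delimiter))
    {
        if (!token.empty())   // repeated delimiters produce empty pieces
            tokens.push_back(token);
    }
    return tokens;
}
```

With the input `One Two "Three Four" Five` and a space as the delimiter, this yields five tokens, with `"Three` and `Four"` as separate entries, so the quoted phrase needs extra handling.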
anonbeat
Earned a small fee
Posts: 21
Joined: Tue Jul 01, 2008 10:48 am

Post by anonbeat »

What I need is to extract search words from an input text control. The user can type a single word, several words, or several words with some of them enclosed in " characters.
Words enclosed in " characters are treated as a literal and should be searched for as is, even if the literal contains spaces.
So the separator is the space, but spaces are allowed inside a pair of " characters.
If you know a better way to do it than mine, please let me know.

Thanks in advance
Grrr
Earned some good credits
Posts: 126
Joined: Fri Apr 11, 2008 8:48 am
Location: Netherlands

Post by Grrr »

I think using wxRegEx is a bit too much for this.

You could just loop through all the characters in the string. Add characters to a temporary string until you reach a separator (space or quote), then store the temporary string as a search term. Keep a boolean to indicate whether you are inside quotes; if you are, spaces shouldn't count as separators. Continue until the end of the input string.

Note that continuously appending characters to a string is not very efficient, so you may want to use wxStringBuffer for better performance, or preallocate a large enough buffer with wxString::Alloc().
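
The loop described above could look roughly like this. It is a minimal sketch in standard C++, assuming `std::string` as a stand-in for wxString and `reserve()` in place of wxString::Alloc(). `SplitSearchTerms` is a hypothetical name, not a wxWidgets function.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of wxWidgets): splits `input` on spaces,
// treating the text between two double quotes as a single search term.
std::vector<std::string> SplitSearchTerms(const std::string& input)
{
    std::vector<std::string> terms;
    std::string current;
    current.reserve(input.size());   // preallocate once instead of growing per character
    bool inQuotes = false;

    // Stores the pending term, if any, and starts a new one.
    auto flush = [&terms, &current]() {
        if (!current.empty())
        {
            terms.push_back(current);
            current.clear();
        }
    };

    for (char ch : input)
    {
        if (ch == '"')
        {
            flush();                 // a quote always ends the pending term
            inQuotes = !inQuotes;    // and toggles quoted mode
        }
        else if (ch == ' ' && !inQuotes)
        {
            flush();                 // a space separates terms only outside quotes
        }
        else
        {
            current += ch;           // ordinary character, or a space inside quotes
        }
    }
    flush();                         // store whatever is left at the end of the input
    return terms;
}
```

For the string from the first post, `One Two "Three Four" Five`, this returns the four terms One, Two, Three Four and Five. An empty pair of quotes produces no term, and an unterminated quote keeps the rest of the input together as one term; both are choices made for this sketch.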