Simple Regex question

samsam598
Super wx Problem Solver
Posts: 340
Joined: Mon Oct 06, 2008 12:55 pm

Simple Regex question

Post by samsam598 »

Given the code below, with the regular expression ^\d+\. I failed to remove the line numbers at the beginning of each line of a string array like this one:

Code: Select all

0001. This is line 0001.;
0002. This is line 0002.;
0003. This is line 0003. ;
0004. This is line 0004. ;
Any help would be appreciated!

Code: Select all

void RemoveTrailingLineNumbersFrame::OnSearchClick(wxCommandEvent& event)
{
    wxString exp=text1->GetValue();// ^\d+\.
    wxRegEx reg(exp);
    if (reg.IsValid()) wxLogStatus(wxT("Valid regular expression."));
    long cnt=text2->GetNumberOfLines();
    for(int i=0;i<cnt;i++)
    {
        //multi line wxTextCtrl text2 with initial values as listed above 0001. this is line 0001.; etc
        wxString content=text2->GetLineText(i);
        if( reg.Matches( content))
        {
            reg.Replace(&content,wxEmptyString);
            *text2<<content<<wxT("\r\n");
        }
    }

}
Regards,
Sam
-------------------------------------------------------------------
Windows 10 64bit
VS Community 2019
msys2-mingw13.2.0 C::B character set: UTF-8/GBK(Chinese)
wxWidgets 3.3/3.2.4 Unicode Mono Static gcc static build
PB
Part Of The Furniture
Posts: 4204
Joined: Sun Jan 03, 2010 5:45 pm

Re: Simple Regex question

Post by PB »

I won't help you with wxRegEx, but if all you need is to remove the numbers at the beginning of each line (i.e. everything up to and including the first space), just

Code: Select all

content = text2->GetLineText(i).AfterFirst(wxS(' '));
seems to be a simpler and perhaps more efficient solution. You can also use wxString::Mid(), assuming the string you want to "remove" has a fixed length.
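A minimal sketch of the two approaches, written against std::string so that it runs without wxWidgets. The helpers after_first() and drop_prefix() are hypothetical stand-ins for wxString::AfterFirst() and wxString::Mid(); they are not wxWidgets functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper mirroring wxString::AfterFirst(): everything after the
// first occurrence of ch, or an empty string if ch does not occur at all.
std::string after_first(const std::string& s, char ch)
{
    const std::size_t pos = s.find(ch);
    return pos == std::string::npos ? std::string() : s.substr(pos + 1);
}

// Hypothetical helper mirroring wxString::Mid(first): drop a prefix of
// fixed length n and keep the rest.
std::string drop_prefix(const std::string& s, std::size_t n)
{
    return n >= s.size() ? std::string() : s.substr(n);
}

// Small self-check using one of the lines from the question.
void check_examples()
{
    const std::string line = "0001. This is line 0001.;";
    assert(after_first(line, ' ') == "This is line 0001.;");
    // "0001. " is four digits, a dot and a space: six characters.
    assert(drop_prefix(line, 6) == "This is line 0001.;");
    assert(after_first("no-space", ' ').empty());
}
```

The fixed-length variant only works while every number has the same width; the "after first space" variant does not care about the width but assumes the number is always followed by a space.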
samsam598

Re: Simple Regex question

Post by samsam598 »

Thanks. It should work perfectly, but in my case the result is weird: both the wxTextCtrl and the wxMessageBox show the output below, that is, with one extra, wrong line 'is line 0001.;'.

Code: Select all

This is line 0001.;
This is line 0002.;
This is line 0003.;
This is line 0004.;
is line 0001.;
Function:

Code: Select all

void RemoveTrailingLineNumbersFrame::OnSearchClick(wxCommandEvent& event)
{
    wxString content;
    text2->AppendText("\n");
        long cnt=text2->GetNumberOfLines();
    for(int i=0;i<cnt;i++)
    {
        content=text2->GetLineText(i).AfterFirst(' ');
        text2->AppendText(content);
        text2->AppendText(wxT("\n"));
        wxLogMessage(content);

    }
}
PB

Re: Simple Regex question

Post by PB »

My guess would be that it is because you're reading from and writing to the text control at the same time.

I would probably adjust the code to look like this (not tested):

Code: Select all

for ( int i = 0; i < cnt; i++ )
{
   content << text2->GetLineText(i).AfterFirst(wxS(' ')) << wxS('\n');
}
text2->SetValue(content);
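One plausible reading of the extra line, sketched without wxWidgets: the line count is taken once, after a blank line has been appended, so the last loop iteration reads a line that the loop itself wrote earlier. In the sketch below a std::vector of strings stands in for the text control, and append_text(), strip_in_place() and strip_to_buffer() are hypothetical names; the model is an assumption about how the control behaves, not a copy of wxTextCtrl.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Lines = std::vector<std::string>;

// Everything after the first occurrence of ch, or empty if ch is absent.
std::string after_first(const std::string& s, char ch)
{
    const std::size_t pos = s.find(ch);
    return pos == std::string::npos ? std::string() : s.substr(pos + 1);
}

// Stand-in for AppendText(): text is added to the last line and every
// '\n' starts a new (empty) line.
void append_text(Lines& lines, const std::string& text)
{
    if (lines.empty()) lines.emplace_back();
    for (char c : text) {
        if (c == '\n') lines.emplace_back();
        else lines.back() += c;
    }
}

// Reads and appends in the same loop, like the handler that produced the
// extra line. Returns what would have been logged on each iteration.
std::vector<std::string> strip_in_place(Lines lines)
{
    std::vector<std::string> logged;
    append_text(lines, "\n");
    const std::size_t cnt = lines.size();  // already includes the new blank line
    for (std::size_t i = 0; i < cnt; ++i) {
        const std::string content = after_first(lines[i], ' ');
        append_text(lines, content);       // fills a line the loop will read later
        append_text(lines, "\n");
        logged.push_back(content);
    }
    return logged;
}

// Collects the result in a separate buffer first; the caller would then
// replace the control's content in one step (SetValue in the thread).
std::string strip_to_buffer(const Lines& lines)
{
    std::string result;
    for (const std::string& line : lines)
        result += after_first(line, ' ') + '\n';
    return result;
}
```

With the four sample lines, strip_in_place() logs five entries, the last one being a line that was stripped twice, while strip_to_buffer() yields exactly four stripped lines.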
samsam598

Re: Simple Regex question

Post by samsam598 »

Thanks. That solved the issue of how to remove the line numbers.

I would still like to know why the regex version failed, as I am not so familiar with wxRegEx.
PB

Re: Simple Regex question

Post by PB »

I am sorry, I haven't used wxRegEx before. Are you sure your regular expression, "^\d+\", is correct? For me (wxWidgets 2.9.5 on MSW), wxRegEx::IsValid() returns false for it; perhaps it doesn't like the trailing backslash. Anyway, I have tried this:

Code: Select all

        wxRegEx regex(wxS("^(\\d+. )"));
        wxString line(wxS("0001. This is line 0001.;"));

        wxASSERT( regex.IsValid() );                        
        if ( regex.Matches(line) /* regex.Replace(&line, wxEmptyString) == 1 */ )
        {            
            wxMessageBox(regex.GetMatch(line));
             // wxMessageBox(line);
        } 
        else
        {
            wxMessageBox(_("Couldn't replace a part of the string"));            
        }
and I still don't get a match (the regular expression is meant to match not just the number but also the dot and the space that follow it). No idea what I'm doing wrong here.
samsam598

Re: Simple Regex question

Post by samsam598 »

^\d+\. (there is a dot at the end) should be correct, as I have a working test app in another language (Free Pascal + Lazarus).
Yes, you are right: in wx, reg.IsValid() returned true but reg.Matches(text2->GetLineText(i)) returned false, and I don't know where the issue is.
PB

Re: Simple Regex question

Post by PB »

It appears that wxRegEx has to be instructed to use the advanced syntax for \d to work:

Code: Select all

        wxRegEx regex(wxS("^\\d+\\. "), wxRE_ADVANCED);
        wxString line(wxS("0001. This is line 0001.;"));

        wxASSERT( regex.IsValid() );                        
        if ( regex.Replace(&line, wxEmptyString) == 1 )
        {            
            wxMessageBox(line); // produces "This is line 0001.;"
        } 
        else
        {
            wxMessageBox(_("Line string is not in valid format"));            
        }
In the default (= extended) syntax mode one can use a bracket expression instead, i.e. the expression (written without the extra backslash needed for C++ escaping) would be "^[0-9]+\. ".
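The same contrast can be tried without wxWidgets. The sketch below swaps in std::regex, which is only an analogy: its default ECMAScript grammar accepts \d, and its std::regex::extended flag selects POSIX extended syntax, where the digit class is written as [0-9]. The function names are hypothetical, and std::regex flags are not the same thing as wxRE_ADVANCED and wxRE_EXTENDED. Raw string literals are used so the backslashes do not have to be doubled.

```cpp
#include <cassert>
#include <regex>
#include <string>

// ECMAScript grammar (the std::regex default): the \d shorthand is available.
std::string strip_number_ecmascript(const std::string& line)
{
    static const std::regex re(R"(^\d+\. )");
    return std::regex_replace(line, re, "");
}

// POSIX extended grammar: spell the digit class out as [0-9].
std::string strip_number_extended(const std::string& line)
{
    static const std::regex re(R"(^[0-9]+\. )", std::regex::extended);
    return std::regex_replace(line, re, "");
}

// Small self-check using one of the lines from the question.
void check_examples()
{
    const std::string line = "0001. This is line 0001.;";
    assert(strip_number_ecmascript(line) == "This is line 0001.;");
    assert(strip_number_extended(line) == "This is line 0001.;");
    // Lines without a leading number are returned unchanged.
    assert(strip_number_extended("No number here.") == "No number here.");
}
```

Because the pattern is anchored with ^, only the number at the start of the string is removed; the "0001." later in the line is left alone.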