What is the error in my regexp?

ereinion

Member
Hi! I wrote this little code-snippet to parse what effects I currently have active:
Code:
//script "extend_all"
//author "ereinion"

void main() {
    string charpane = visit_url("charpane.php");
    string pattern = "(?<=\"Increase rounds of \")([\\w\\s]+)(?=\\\")"; //"
    matcher effects = create_matcher(pattern, charpane);
    int i;
    
    print(pattern);
    if (effects.find()) {
        for i from 0 to group_count(effects) {
            print(i + " - " + effects.group(i));
        }
    }
        
}
However, it doesn't seem to work. It's not a big deal if I can't get this working, since I discovered the my_effects() function, but I am curious about where I went wrong in creating the pattern. It seemed to work when I tested it at http://www.myregextester.com/ (see images), but I suppose I slipped up somewhere in the translation to ASH :p


- edit - So after thinking a bit more on this, I remembered that visit_url() probably captures the unaltered HTML of charpane.php, while I was testing against the HTML that had been altered by mafia. Having had a look at the original HTML, I changed the pattern to "(?<=alt\\=\\\"Click to cast )([\\w\\s]+)(?=\\.)"; //", which seems to work, but now my for-loop only prints the first (last?) entry in each capturing group (and why are there two?). So how do I print out each of the effects found?
 

Fluxxdog

Active member
if (effects.find()) {
for i from 0 to group_count(effects) { <== This only counts the capturing groups in the pattern the matcher is using (one here), not the matches found, so the loop prints just group 0 (the whole match) and group 1
print(i + " - " + effects.group(i));

You would probably be better off using group_string() (see the function's wiki page) to sort through your captures.
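To make the difference between "groups in the pattern" and "matches in the text" concrete, here is a minimal sketch in plain Java. It assumes ASH matchers behave like Java's java.util.regex (KoLmafia is written in Java, as far as I know). The class name, the helper names and the sample HTML are made up for illustration; only the pattern comes from the post above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EffectMatches {
    // Hypothetical helper: collect group 1 from every match in the text.
    static List<String> allMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        List<String> found = new ArrayList<>();
        // Each successful find() moves on to the next match.
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    // Number of capturing groups in the pattern; it does not depend on the input.
    static int groupsInPattern(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        // Made-up stand-in for the charpane HTML, not real game output.
        String html = "alt=\"Click to cast Leash of Linguini.\" "
                    + "alt=\"Click to cast Empathy of the Newt.\"";
        String regex = "(?<=alt=\"Click to cast )([\\w\\s]+)(?=\\.)";

        System.out.println(groupsInPattern(regex));
        for (String effect : allMatches(regex, html)) {
            System.out.println(effect);
        }
    }
}
```

The group count stays at one however many effects are in the text, because the lookbehind and lookahead are not capturing groups. Walking through all the effects needs repeated find() calls. Group 0 and group 1 print the same text because the only capturing group spans the whole match.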
 

ereinion

Member
Thanks for your reply. I am now using a code snippet that looks like this:
Code:
void main() {
    string charpane = visit_url("charpane.php");
    string pattern = "(?<=\\<font size\\=2\\>)[\\w\\s\\p{P}]+(?=(\\s)+\\()";
    string[int, int] effects = group_string(charpane, pattern);
    int i, j;
    
    print(pattern);
    foreach i,j in effects {
        if (j==0) {
            print("Index(" + i + ", " + j + "): " + effects[i][j]);
        }
    }
        
}
It seems to do what I want (just like my_effects) :) And yeah, I got two lines of print-out with my first attempt, both containing the same text. I guess I have a long way to go before I even start understanding how regexes work.
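For anyone wondering what the two indexes mean, here is a rough Java analog of the shape used above: one row per match, one column per group, with column 0 holding the whole match. This is a sketch under assumptions, not how group_string() is actually implemented. The class and method names and the sample HTML are hypothetical, and the escapes that Java does not need have been dropped from the pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupRows {
    // Hypothetical helper: rows.get(i).get(j) is group j of the i-th match.
    static List<List<String>> groupRows(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        List<List<String>> rows = new ArrayList<>();
        while (m.find()) {
            List<String> row = new ArrayList<>();
            // Group 0 is the whole match; groups 1..groupCount() are the captures.
            for (int g = 0; g <= m.groupCount(); g++) {
                row.add(m.group(g));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up stand-in for the charpane HTML, not real game output.
        String html = "<font size=2>Leash of Linguini (10)</font><br>"
                    + "<font size=2>Empathy (5)</font>";
        String regex = "(?<=<font size=2>)[\\w\\s\\p{P}]+(?=(\\s)+\\()";

        for (List<String> row : groupRows(regex, html)) {
            System.out.println("whole match: [" + row.get(0) + "], group 1: [" + row.get(1) + "]");
        }
    }
}
```

Column 1 only holds the whitespace captured by the (\s) inside the lookahead, which is why filtering on j==0 gives the effect names.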
 

Winterbay

Active member
I don't think anyone really, really understands regexp... It may be like thermodynamics, where a famous professor was once asked why he hadn't written a textbook on the subject and responded: "When you first study thermodynamics you don't understand it, the second time you think that you do, and the third time you know that you don't, but you can still do calculations so you don't care" :)
 

Fluxxdog

Active member
Code:
void main() {
    string charpane = visit_url("charpane.php");
    string pattern = "(?<=\\<font size\\=2\\>)[\\w\\s\\p{P}]+(?=(\\s)+\\()";
    string[int, int] effects = group_string(charpane, pattern);
    int i, j; <== Double defined
    
    print(pattern);
    foreach i,j in effects {
        if (j==0) {
            print("Index(" + i + ", " + j + "): " + effects[i][j]);
        }
    }
        
}
You have i and j defined twice. Eliminate the first.

Don't be afraid to give variables unique names, it'll make it easier for you to understand your code.
As part of a foreach, mafia can also provide the value of the parsed keys.
Try this:
Code:
void main() {
    string charpane = visit_url("charpane.php");
    string pattern = "(?<=\\<font size\\=2\\>)[\\w\\s\\p{P}]+(?=(\\s)+\\()";
    string[int, int] effects = group_string(charpane, pattern);
    
    print(pattern);
    foreach capture,subgroup,result in effects {
        if (subgroup==0) { //subgroup is already an integer
            print("Index(" + capture + ", " + subgroup + "): " + result); //result is already the value of the matched keys.
        }
    }
}
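For comparison, here is a small sketch of the same "keys plus value in one loop" idea in Java, where iterating over map entries hands you the key and the value together. The class name, method name and nested-map layout are assumptions chosen to mirror the string[int, int] map above; this is an analogy, not ASH.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class KeyValueIteration {
    // Hypothetical helper: format the whole-match entry (subgroup 0) of each capture.
    static List<String> wholeMatches(Map<Integer, Map<Integer, String>> effects) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Integer, Map<Integer, String>> capture : effects.entrySet()) {
            for (Map.Entry<Integer, String> sub : capture.getValue().entrySet()) {
                // Each entry carries both the key and the value, so no second lookup is needed.
                if (sub.getKey() == 0) {
                    lines.add("Index(" + capture.getKey() + ", " + sub.getKey() + "): " + sub.getValue());
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up data standing in for the group_string() result.
        Map<Integer, Map<Integer, String>> effects = new TreeMap<>();
        effects.put(0, new TreeMap<>(Map.of(0, "Leash of Linguini", 1, " ")));
        effects.put(1, new TreeMap<>(Map.of(0, "Empathy", 1, " ")));

        for (String line : wholeMatches(effects)) {
            System.out.println(line);
        }
    }
}
```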

Winterbay said:
I don't think anyone really really understands regexp...
Save the person/group that came up with it ^^ Seriously though, regex is more like trigonometry. It's confusing and overwhelming at first, but as you plod through it, you learn what each part of it does and how it operates. And there are a LOT of examples and tutorials. That said, you don't have to understand it completely to use it, just enough for what you want. Better understanding simply gives you more versatility.
 

ereinion

Member
Fluxxdog said:
You have i and j defined twice. Eliminate the first.
Don't be afraid to give variables unique names, it'll make it easier for you to understand your code.
As part of a foreach, mafia can also provide the value of the parsed keys.
Thanks for your suggestions and for the improvements to my code. I have been programming a bit of VBA lately, which does not declare variables for you when you use them as loop counters, which is probably why I declared them at the top of the script. I'll keep it in mind for later :)

And the fact that you can add the value of the parsed keys as one of the counters of the foreach is amazing news - I think I have quite a few scripts I need to rewrite ;) Do you know if this is common in other programming languages as well?

Save the person/group that came up with it ^^ Seriously though, regex is more like trigonometry. It's confusing and overwhelming at first, but as you plod through it, you learn what each part of it does and how it operates. And there are a LOT of examples and tutorials. That said, you don't have to understand it completely to use it, just enough for what you want. Better understanding simply gives you more versatility.
Yeah, I had a look at the general background of regexes on Wikipedia, and the language theory behind them was quite interesting, even if I don't quite get all of it. I've also been reading this tutorial, which has helped me understand a bit more, but I have had some trouble finding good training exercises. Do any of you have suggestions as to where I can find some good ones?
 