Monster Manuel Relay checker

ckb

Minion
Staff member
This is a fight script companion to my monmancheck script:

http://kolmafia.us/showthread.php?11166-Monster-Manuel-checker

This uses a lot of the same methodology to add a number (0-3) next to the monster name, showing how many bits of information (factoids) you have collected for it. Useful if you are trying to fill your Manuel.
There is also some debug info output to the GCLI for checking monster names.

Drop into relay folder.
 

Attachments

  • fight.ash
    1.5 KB

Aankhen

Member
Nice script. Small problem: the checks for a/an/the/some cause errors if the monster’s name is too short. A simple way to fix this is to remove those checks and instead handle qname like this:
Code:
	if (find(nam)) { monname = group(nam); }

	matcher stripped_name = create_matcher("^\\s*(?:(?:a|an|the|some) )?(.+)$", monname);
	find(stripped_name);
	string qname = group(stripped_name, 1);

	vprint(qname, "blue", 2);
 

Bale

Minion
Aankhen, I've got a question. Is there an advantage to "(?:(?:a|an|the|some) )?" instead of "(?:a |an |the |some )?"?

To me, the latter looks easier to read. I'm asking because there is a lot of subtlety to regex that I sometimes miss.
 

Erich

Member
Bale said:
    Aankhen, I've got a question. Is there an advantage to "(?:(?:a|an|the|some) )?" instead of "(?:a |an |the |some )?"?

    To me, the latter looks easier to read. I'm asking because there is a lot of subtlety to regex that I sometimes miss.

The (?: ) makes the group non-capturing, so it can't get called back later, as with a \1 or \2. I can't tell you the practical applications of that in terms of scripting, because I only know regex for fixing webpages.
 

Bale

Minion
Yes, I know what (?: ) means, which is why I included it in my alternate regexp. That wasn't my question. Still waiting for Aankhen's answer.
 

Erich

Member
Fair, sorry, I didn't know how much you knew and misread.

... I love me some regex though.
 

Bale

Minion
I love-hate regex. We have a turbulent relationship.

My question was about why he separated the article from the following space, requiring it to be tucked into another non-capturing group contained within the first.
 

Catch-22

Active member
Bale said:
    Aankhen, I've got a question. Is there an advantage to "(?:(?:a|an|the|some) )?" instead of "(?:a |an |the |some )?"?

    To me, the latter looks easier to read. I'm asking because there is a lot of subtlety to regex that I sometimes miss.

Neither expression gets the job done properly; what you'd want instead is "(?:\b(?:a|an|the|some)\b )". This will match the "a" in "a tan elephant", whereas the old expressions would match "a" and "an" in "a tan elephant".
 

Aankhen

Member
Sorry for the delay. I was asleep.
Bale said:
    Aankhen, I've got a question. Is there an advantage to "(?:(?:a|an|the|some) )?" instead of "(?:a |an |the |some )?"?

    To me, the latter looks easier to read. I'm asking because there is a lot of subtlety to regex that I sometimes miss.

I have a confession to make: I’ve never quite understood how | works with regard to spaces. It’s probably something simple, but I’ve never actually looked it up, so I always over‐specify it using an inner group. Your way would likely work too, and more simply.

Catch-22 said:
    Neither expression gets the job done properly; what you'd want instead is "(?:\b(?:a|an|the|some)\b )". This will match the "a" in "a tan elephant", whereas the old expressions would match "a" and "an" in "a tan elephant".

That’s why the code I posted anchors it to the start of the string using ^. Your version will also match where you don’t want it to. For example: ‘foo-a’, ‘foo’an’, and ‘foo a bar’.

By the way, \b is redundant if it’s followed by a mandatory space.
 

Bale

Minion
Thanks. Spaces are simply characters in regexp regardless of where they are, so it is good either way. If paranoid, it's probably better to use \\s.
 

Catch-22

Active member
Aankhen said:
    That’s why the code I posted anchors it to the start of the string using ^. Your version will also match where you don’t want it to. For example: ‘foo-a’, ‘foo’an’, and ‘foo a bar’.

    By the way, \b is redundant if it’s followed by a mandatory space.

Eh? Sorry, what I said was only meant to apply to that group of the regex, I wasn't suggesting you replace the entire regular expression with it. Yes, the 2nd \b is redundant, but I'd keep it there for clarity. You want to match the word boundaries and a space.

The pattern as a whole would be ^\s*(?:\b(?:a|an|the|some)\b )?(.+)$.

Bale said:
    If paranoid, it's probably better to use \\s.

Keeping in mind that \s matches any white space character, including a newline. You could use a character set such as [ \t] to be a little more specific.
 

Bale

Minion
Catch-22 said:
    Keeping in mind that \s matches any white space character, including a newline. You could use a character set such as [ \t] to be a little more specific.

Thanks. "[ \\t]+" FTW, since KoL sometimes uses more than one space in weird places.
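
A minimal sketch of how the "[ \\t]+" idea might be folded into the anchored pattern from earlier in the thread. The helper name strip_article_ws is hypothetical and not part of fight.ash. Allowing a run of spaces or tabs after the article is easiest to write with the inner group, since the whitespace part is then stated only once.

```ash
// Hypothetical helper (not from fight.ash): strips one leading article,
// tolerating several spaces or tabs before and after it.
string strip_article_ws(string name) {
	matcher m = create_matcher("^[ \\t]*(?:(?:a|an|the|some)[ \\t]+)?(.+)$", name);
	if (find(m)) {
		return group(m, 1);
	}
	return name; // reached only when nothing matches, e.g. an empty name
}

print(strip_article_ws("some  bees"));  // bees
print(strip_article_ws("an"));          // an (too short to carry an article)
```

With the single-space version, "some  bees" would come back with a leading space, because only one space belongs to the article group.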
 

ckb

Minion
Staff member
Heh - just saw this. I actually noticed the problem a while back, but my work and travel have kept me from uploading my fix. The regex is nice, but it makes my brain hurt most of the time.
I did this:
Code:
string kmon = monname;
if (substring(kmon,0,1)==" ") { kmon = substring(kmon,1) ; }
if (substring(kmon,0,2)=="a ") { kmon = substring(kmon,2) ; }
if (substring(kmon,0,3)=="an ") { kmon = substring(kmon,3) ; }
if (length(kmon)>3 && substring(kmon,0,4)=="the ") { kmon = substring(kmon,4) ; }
if (length(kmon)>4 && substring(kmon,0,5)=="some ") { kmon = substring(kmon,5) ; }

ckb
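
A note on the snippet above: only the "the " and "some " checks are guarded by a length test, so the shorter substring calls can presumably still fail on very short names, assuming substring raises an error when the end index runs past the end of the string (which would be consistent with the error reported earlier in the thread). A minimal sketch of a length-safe variant using ASH's starts_with follows. The helper name strip_prefix is hypothetical.

```ash
// Hypothetical helper (not from fight.ash): same idea as the substring
// checks, but starts_with never reads past the end of a short name.
string strip_prefix(string name) {
	string kmon = name;
	if (starts_with(kmon, " ")) { kmon = substring(kmon, 1); }
	if (starts_with(kmon, "a ")) { kmon = substring(kmon, 2); }
	else if (starts_with(kmon, "an ")) { kmon = substring(kmon, 3); }
	else if (starts_with(kmon, "the ")) { kmon = substring(kmon, 4); }
	else if (starts_with(kmon, "some ")) { kmon = substring(kmon, 5); }
	return kmon;
}

print(strip_prefix("the guy made of bees"));  // guy made of bees
print(strip_prefix("a"));                     // a
```

The else-if chain strips at most one article, whereas independent if statements could strip two in a row. That is a design choice of this sketch, not something the original script is known to need.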
 

Aankhen

Member
Catch-22 said:
    Eh? Sorry, what I said was only meant to apply to that group of the regex, I wasn't suggesting you replace the entire regular expression with it. Yes, the 2nd \b is redundant, but I'd keep it there for clarity. You want to match the word boundaries and a space.

    The pattern as a whole would be ^\s*(?:\b(?:a|an|the|some)\b )?(.+)$.

Ah I see. In that case, both the \bs seem like unnecessary, unwanted, undesired and unwarranted repetition & redundancy to me. :p

EDIT:
It would be nicer to use Bale’s suggestion too: ^\s*(?:a |an |the |some )?(.+)$. I suppose this is a moot point, however, given ckb’s post.

hey ckb what if the monster name is 0 characters huh didnt think of that did you
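
For reference, a minimal sketch of the simplified anchored pattern wrapped in a helper, including the zero-character case. The name strip_article is hypothetical and not taken from fight.ash. Checking the result of find before calling group is what keeps an empty name from causing trouble.

```ash
// Hypothetical helper (not from fight.ash): strips one leading article
// using the anchored pattern ^\s*(?:a |an |the |some )?(.+)$
string strip_article(string name) {
	matcher m = create_matcher("^\\s*(?:a |an |the |some )?(.+)$", name);
	if (find(m)) {
		return group(m, 1);
	}
	return name; // reached only when nothing matches, e.g. an empty name
}

print(strip_article("a tan elephant"));  // tan elephant
print(strip_article("an"));              // an
print(strip_article(""));                // prints an empty line
```

Because of the ^ anchor, the "an " inside "tan elephant" is never considered; only an article at the very start of the name is removed.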
 

lostcalpolydude

Developer
Staff member
Making the regex as efficient as possible seems like a minor concern for a script that visits the questlog every combat round. When I first grabbed this script I expected it to work with data produced by the other script.
 

ckb

Minion
Staff member
lostcalpolydude said:
    Making the regex as efficient as possible seems like a minor concern for a script that visits the questlog every combat round. When I first grabbed this script I expected it to work with data produced by the other script.

Interesting... but then it would not update dynamically, and would likely give wrong numbers, and that was kind of the point. I should probably only check for factoids at the start of combat (round 1) and maybe the end of combat. That would be more server friendly.
I'll add that to the todo list.
 

lostcalpolydude

Developer
Staff member
You could potentially update the data file with this script whenever a new factoid shows up, if you really wanted to switch to that approach.

I don't think you need a check at the end of combat. If you find the monster listed with Manuel at the start of combat, then you have a value, and you could add 1 if Manuel shows up at the end of the fight; you could pre-emptively cap the value at 3 in case it later becomes possible to have factoids reappear for exhaustively researched monsters. If you don't find the monster in the quest log but the fight page has Manuel data, then there is some issue (with mafia, the script, maybe KoL weirdness) that means you won't find the monster there and then there's no need to check at the end of combat. You could even check for a lack of Manuel data on the fight page as an indicator that you can skip checking the quest log.
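
A rough sketch of the counting rule described above, under the assumption that the script already knows the factoid count read from the quest log at round 1 and whether Manuel showed up on the end-of-fight page. Both parameters and the function name are hypothetical, and how they would be obtained is left out.

```ash
// Hypothetical helper (not from fight.ash).
// start_count: factoids found in the quest log at the start of combat.
// manuel_at_end: whether Manuel showed up on the end-of-fight page.
int factoids_after_fight(int start_count, boolean manuel_at_end) {
	int count = start_count;
	if (manuel_at_end) {
		count = count + 1;
	}
	// Cap pre-emptively at 3 in case factoids can ever reappear.
	if (count > 3) {
		count = 3;
	}
	return count;
}

print(to_string(factoids_after_fight(1, true)));   // 2
print(to_string(factoids_after_fight(3, true)));   // 3
print(to_string(factoids_after_fight(2, false)));  // 2
```

This keeps the quest log visit to one per fight, which is the server-friendly part of the suggestion.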
 

Aankhen

Member
Just noticed that the script has issues with monster names that contain accented characters. This is easily fixed by capturing the name with ([^<]+):
Code:
	matcher nam = create_matcher("(?<=(id='monname'>))([^<]+)",results);
 	string monname = "";
 