Bug - Fixed Quest Log Parsing Issue

Darzil

Developer
It appears that the current quest log is not parsed correctly for all quests. Unfortunately, my limited regex understanding is currently throwing up its hands at: private static final Pattern BODY_PATTERN = Pattern.compile( "(?<=<b>)(.*?[^<>]*?)</b><br>(.*?)(?=<p>$|<p><b>)", Pattern.DOTALL );

The issue can be seen in an online regex tester, where several quests in my quest log aren't recognised (namely "Ooh, I Think I Smell a Bat.", "Make War, Not... Oh, Wait", "What's Up, Doc?" and "Lady Spookyraven's Babies"). They all work fine on the completed quests tab. The source I'm looking at is:

Code:
<body>
<centeR><table  width=95%  cellspacing=0 cellpadding=0><tr><td style="color: white;" align=center bgcolor=blue><b>Your Quest Log</b></td></tr><tr><td style="padding: 5px; border: 1px solid blue;"><center><table><tr><td><Center>[current quests]   [<a href="questlog.php?which=2">completed quests</a>]   [<a href="questlog.php?which=3">other accomplishments</a>]   [<a href="questlog.php?which=4">notes</a>]   [<a href="questlog.php?which=5">hobo code binder</a>]   [<a href="questlog.php?which=6">Monster Manuel</a>]</center><p><center><b>Current Quests:</b></center><b>Council Quests:</b><blockquote><b>Ooh, I Think I Smell a Bat.</b><br> Defeat the Boss Bat, in the <a class=nounder target=mainpane href=bathole.php><b>Bat Hole</b></a>.<p><b>Trial By Friar</b><br> Talk to the Deep Fat Friars in the <a class=nounder target=mainpane href=woods.php><b>Distant Woods</b></a>.<p><b>The Goblin Who Wouldn't Be King</b><br> Find your way inside Cobb's Knob, in the <b><a class=nounder target=mainpane href="plains.php">Nearby Plains</a></b>.<p><b>Am I My Trapper's Keeper?</b><br> <a class=nounder target=mainpane href=place.php?whichplace=mclargehuge><b>The Trapper</b></a> wants:<br>   * 3 wedges of goat cheese (you have 2)<br>   * 3 chunks of linoleum ore<br><p><b>There Can Be Only One Topping</b><br> Find a way across the <a class=nounder target=mainpane href=place.php?whichplace=orc_chasm><b>Orc Chasm</b></a>.<p><b>Cyrptic Emanations</b><br> Get rid of the evil in <b><a class=nounder target=mainpane href="crypt.php">The Cyrpt</a></b>.<p><b><a class=nounder target=mainpane href="inv_use.php?whichitem=4964&pwd=d39c349911dc5cc9326854ad2bef17b3">Evilometer:</a> 200</b><p><b>Make War, Not... Oh, Wait</b><br>You've managed to get the war between the hippies and frat boys started, and now the Council wants you to finish it.
<p>
You can aid the war effort by fighting on the Battlefield, or you can help out some of the other residents of the island in the hopes that they'll aid the side you're fighting for.<p></blockquote><p><b>Other Quests:</b><blockquote><b>What's Up, Doc?</b><br> <a class=nounder target=mainpane href=place.php?whichplace=town_market><b>Doc Galaktik</b> needs some medicinal herbs:<br>   *  Swindleblossoms from the <a class=nounder target=mainpane href=cobbsknob.php><b>Cobb's Knob Harem</b></a> (0/3)<br>   *  Fraudwort from ninjas in <a class=nounder target=mainpane href=friars.php><b>Hey Deze</b></a> (0/3)<br>   *  Shysterweed from a <a class=nounder target=mainpane href=place.php?whichplace=plains><b>a graveyard</b></a> (0/3)<p><b>Driven Crazy</b><br> Find the Untinker's screwdriver at Degrassi Knoll, on the <a class=nounder target=mainpane href=plains.php><b>Nearby Plains</b></a>.<p><b>Lady Spookyraven's Babies</b><br> Gather up Lady Spookyraven's babies on the <a class=nounder target=mainpane href=place.php?whichplace=manor3><b>third floor</b></a> of <a class=nounder target=mainpane href=place.php?whichplace=manor1><b>Spookyraven Manor</b></a>.<p></blockquote><p><p><center><a href="campground.php">Back to your Campsite</a></center></td></tr></table></center></td></tr><tr><td height=4></td></tr></table></center></body>

Can any regex experts suggest a new regex for the body which will pick up all the quests correctly?
 

Theraze

Active member
Well, the problem with 2 of those is that they have the <blockquote> instead of <p><b> to start things off... but that doesn't catch Lady Spookyraven or Make War, because their quests span multiple lines. But you can go from 7 matches to 8 by changing:
(?<=<p><b>)(.*?[^<>]*?)</b><br>(.*?)(?=<p>$|<p><b>)
to
(?<=<p><b>|<blockquote><b>)(.*?[^<>]*?)</b><br>(.*?)(?=<p>$|<p><b>)

Which will add the level 4 quest. We now need to avoid the "Other Quests" catch so that we get the Doc quest named properly, so we tell it not to grab that specific blockquote set with a Negative Lookbehind, like this:
(?<!</blockquote><p><b>)(?<=<p><b>|<blockquote><b>)(.*?[^<>]*?)</b><br>(.*?)(?=<p>$|<p><b>)

That will get 8 quests out of it, all properly named. As to how to get it to span multiple lines without choking...
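For anyone who wants to try the lookbehind version outside a regex designer, here is a minimal Java sketch. It assumes the same Pattern.DOTALL flag as BODY_PATTERN, and the class name, the titles helper, and the heavily trimmed SAMPLE string are all made up for illustration; only the pattern itself comes from the post above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindSketch {
    // Pattern from the post above; DOTALL is assumed because BODY_PATTERN uses it.
    static final Pattern THERAZE = Pattern.compile(
        "(?<!</blockquote><p><b>)(?<=<p><b>|<blockquote><b>)(.*?[^<>]*?)</b><br>(.*?)(?=<p>$|<p><b>)",
        Pattern.DOTALL);

    // Hypothetical, trimmed stand-in for the quest log HTML (not the real page).
    static final String SAMPLE =
        "<b>Council Quests:</b><blockquote>"
        + "<b>Ooh, I Think I Smell a Bat.</b><br> Defeat the Boss Bat.<p>"
        + "<b>Trial By Friar</b><br> Talk to the Deep Fat Friars.<p>"
        + "</blockquote><p><b>Other Quests:</b><blockquote>"
        + "<b>What's Up, Doc?</b><br> Gather some herbs.<p>"
        + "<b>Driven Crazy</b><br> Find the screwdriver.<p>"
        + "</blockquote>";

    // Collects capture group 1 (the quest title) from every match.
    static List<String> titles(Pattern p, String html) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(html);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        titles(THERAZE, SAMPLE).forEach(System.out::println);
    }
}
```

On this sample it prints the Bat, Friar and Doc titles cleanly and skips the "Other Quests:" header. The final entry, "Driven Crazy", is still missed because nothing after it satisfies the trailing lookahead.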
 

Crowther

Active member
I was confused by "(.*?[^<>]*?)". When I changed that to simply "([^<>]*?)" then I matched 9 quests.
Code:
void main()
{
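    // test is assumed to be a string holding the quest log HTML from the first post; it is not defined in this snippet.
    matcher m = create_matcher("(?<=<b>)([^<>]*?)</b><br>(.*?)(?=<p>$|<p><b>)", test);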
    while (find(m)) {
        print("1: " + group(m, 1));
    }
}
Code:
1: Ooh, I Think I Smell a Bat.
1: Trial By Friar
1: The Goblin Who Wouldn't Be King
1: Am I My Trapper's Keeper?
1: There Can Be Only One Topping
1: Cyrptic Emanations
1: Make War, Not... Oh, Wait
1: What's Up, Doc?
1: Driven Crazy
I'm guessing someone was trying to match a quest with "<>" somewhere in the name? I didn't find one in the example.
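A rough Java equivalent of the ASH test above, in case it helps to compare the two title groups side by side. The class name, helper, and SAMPLE string are hypothetical and trimmed down from the quest log in the first post (the Evilometer link is shortened); Pattern.DOTALL is assumed, as in BODY_PATTERN.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TitleGroupSketch {
    // Title group as it appears in BODY_PATTERN.
    static final Pattern ORIGINAL = Pattern.compile(
        "(?<=<b>)(.*?[^<>]*?)</b><br>(.*?)(?=<p>$|<p><b>)", Pattern.DOTALL);
    // Title group narrowed so it cannot cross a tag.
    static final Pattern NARROWED = Pattern.compile(
        "(?<=<b>)([^<>]*?)</b><br>(.*?)(?=<p>$|<p><b>)", Pattern.DOTALL);

    // Hypothetical, trimmed stand-in for the quest log HTML.
    static final String SAMPLE =
        "<b>Council Quests:</b><blockquote>"
        + "<b>Cyrptic Emanations</b><br> Get rid of the evil.<p>"
        + "<b><a href=inv_use.php>Evilometer:</a> 200</b><p>"
        + "<b>Make War, Not... Oh, Wait</b><br>Finish the war.<p>"
        + "<b>Driven Crazy</b><br> Find the screwdriver.<p>"
        + "</blockquote>";

    // Collects capture group 1 (the quest title) from every match.
    static List<String> titles(Pattern p, String html) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(html);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("original: " + titles(ORIGINAL, SAMPLE));
        System.out.println("narrowed: " + titles(NARROWED, SAMPLE));
    }
}
```

With the original group, the first title swallows the "Council Quests:" header and the second one starts at the Evilometer link. With the narrowed group, the titles come out as plain "Cyrptic Emanations" and "Make War, Not... Oh, Wait". In both cases the last entry in the sample is missed.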
 

Theraze

Active member
Huh... Rad Software Regular Expression Designer doesn't find the war with your matcher, but... if mafia will, great. You can catch the last quest by adding a |<p></blockquote> to the end of the matcher, like so:
(?<=<b>)([^<>]*?)</b><br>(.*?)(?=<p>$|<p><b>|<p></blockquote>)

If Crowther's regex actually finds Make War for mafia, then that should also find Lady Spookyraven and we're golden...
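A small Java sketch of what the extra alternative buys, assuming Pattern.DOTALL as in BODY_PATTERN. The class name, helper, and trimmed SAMPLE string are invented for the example; the two patterns are the ones quoted in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastQuestSketch {
    // Without the extra terminator.
    static final Pattern BEFORE = Pattern.compile(
        "(?<=<b>)([^<>]*?)</b><br>(.*?)(?=<p>$|<p><b>)", Pattern.DOTALL);
    // With |<p></blockquote> added to the trailing lookahead.
    static final Pattern AFTER = Pattern.compile(
        "(?<=<b>)([^<>]*?)</b><br>(.*?)(?=<p>$|<p><b>|<p></blockquote>)", Pattern.DOTALL);

    // Hypothetical, trimmed stand-in for the end of the quest log HTML.
    static final String SAMPLE =
        "<b>Other Quests:</b><blockquote>"
        + "<b>Driven Crazy</b><br> Find the Untinker's screwdriver.<p>"
        + "<b>Lady Spookyraven's Babies</b><br> Gather up Lady Spookyraven's babies.<p>"
        + "</blockquote><p><p><center>Back to your Campsite</center>";

    // Collects capture group 1 (the quest title) from every match.
    static List<String> titles(Pattern p, String html) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(html);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("before: " + titles(BEFORE, SAMPLE)); // only Driven Crazy
        System.out.println("after:  " + titles(AFTER, SAMPLE));  // Driven Crazy and Lady Spookyraven's Babies
    }
}
```

The last quest in a blockquote is followed by <p></blockquote> rather than <p><b>, so without the extra alternative the lazy body group runs to the end of the input and the match fails.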
 

Darzil

Developer
Thanks, folks. r14127 implements this. I ended up just adding |<p></blockquote>, but I also changed the function that maps a parsed title to a quest log title and returns the preference: it now checks for the quest log title within the parsed title, rather than the other way around. This fixed the issue where the Evilometer text was considered the start of a title and the end of the following title was considered its end.
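One possible reading of the lookup change described above, sketched in Java. This is not the code from r14127: the class, the table, and both method names are hypothetical, and "warQuestPref" is a made-up placeholder (only questL08Trapper is a name that appears in this thread).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TitleLookupSketch {
    // Hypothetical table: known quest log title -> preference name.
    static final Map<String, String> TITLE_TO_PREF = new LinkedHashMap<>();
    static {
        TITLE_TO_PREF.put("Am I My Trapper's Keeper?", "questL08Trapper");
        TITLE_TO_PREF.put("Make War, Not... Oh, Wait", "warQuestPref"); // placeholder name
    }

    // New direction: look for a known title inside the (possibly noisy) parsed title.
    static String prefForParsedTitle(String parsedTitle) {
        for (Map.Entry<String, String> e : TITLE_TO_PREF.entrySet()) {
            if (parsedTitle.contains(e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }

    // Old direction, for contrast: look for the parsed title inside a known title.
    static String prefReversed(String parsedTitle) {
        for (Map.Entry<String, String> e : TITLE_TO_PREF.entrySet()) {
            if (e.getKey().contains(parsedTitle)) {
                return e.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // A parsed title polluted by the Evilometer line, as the original title group would capture it.
        String noisy = "<a href=inv_use.php>Evilometer:</a> 200</b><p><b>Make War, Not... Oh, Wait";
        System.out.println(prefForParsedTitle(noisy)); // warQuestPref
        System.out.println(prefReversed(noisy));       // null
    }
}
```

Checking in this direction tolerates extra markup captured in front of the real title, which is why the regex itself could stay mostly unchanged.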
 

Crowther

Active member
And shouldn't that be equivalent to just "(.*?)", anyway? It's not like [^<>]*? can't match empty string ...
Yes, but [^<>]*? is correct, while .*? matches too much. At least in the example I saw.


I don't know if this is related or not, but I had trouble going to The Icy Peak on a non-ascending multi and had to revert KoLmafia to automate that. Mafia kept telling me I had to complete the quest. It was completed. I didn't have time before work to collect any details, so this is not a bug report, but just a "heads up".
 

Darzil

Developer
What do you have for questL08Trapper? What quest text do you have for the Trapper quest? Is it by any chance a multi that has never done that quest? (If so, as I didn't change the Icy Peak checking, I guess the change I made which broke this was correctly checking the quest status!)
 

Crowther

Active member
Code:
> equip drunk wine

Holding Drunkula's wineglass...
Equipment changed.

> adv the peak

You must complete a trapper task first.
That area is not available.

Autorecovery failed.

> get questL08Trapper

finished
Under completed quests:
Am I My Trapper's Keeper?
You have helped the Trapper, and brought (relatively speaking) peace to Mt. McLargeHuge. Shazam!

EDIT: I recovered HP just to be sure and got the same result (without "Autorecovery failed.").
 

Darzil

Developer
Try r14130. The Icy Peak check used a function, isQuestFinished, which (wrongly) compared the name of the quest preference to "finished", rather than the value of the quest preference.

This also fixed a couple of other issues, including unlocking the second floor based on necklace quest status.
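To illustrate the kind of mix-up described above, here is a minimal Java sketch. It is not KoLmafia's code: the PREFS map stands in for the real preference store, and both method signatures are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

class QuestFinishedSketch {
    // Hypothetical stand-in for the preference store: name -> value.
    static final Map<String, String> PREFS = new HashMap<>();

    // The bug as described: the preference NAME is compared to "finished",
    // so the result is false for every real quest preference.
    static boolean isQuestFinishedBuggy(String prefName) {
        return prefName.equals("finished");
    }

    // The fix as described: the preference VALUE is looked up and compared.
    static boolean isQuestFinishedFixed(String prefName) {
        return "finished".equals(PREFS.get(prefName));
    }

    public static void main(String[] args) {
        PREFS.put("questL08Trapper", "finished");
        System.out.println(isQuestFinishedBuggy("questL08Trapper")); // false
        System.out.println(isQuestFinishedFixed("questL08Trapper")); // true
    }
}
```

This matches the symptom in the log above: questL08Trapper was "finished", yet the Icy Peak still reported that a trapper task had to be completed.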
 

Crowther

Active member
Cool, that fixed it! Thanks.
 

Crowther

Active member
I'm not sure. I believed '.*?[^<>]*?' was equivalent to '.*?', but it might not be. Is laziness distributive? Lazy matches weren't covered when I learned regular expressions, so I'm a bit fuzzy on exactly how they work. Sometimes they work exactly like I expect; sometimes they seem to match more than I expect. I should really look up how they are handled with a state machine. When thinking about regular expressions with set theory fails me, sometimes thinking about the automata that parse them helps.
 

xKiv

Active member
I'm not sure. I believed '.*?[^<>]*?' was equivalent to '.*?', but it might not be.

By definition, they should be. T is matched by ^.*?[^<>]*?$ iff it's matched by ^.*?$ (because [^<>]*? can match the empty string), and the empty string is the shortest thing that can be matched by anything, so the two should match at least the same lengths.
But in practice, there are implementation details. And those can conspire to give wrong answers if you don't pick the right way to express the same thing.
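A quick way to check this against Java's engine, as a sketch. The class name, the firstTitle helper, and the trimmed sample string are made up for the example, and Pattern.DOTALL is assumed as in BODY_PATTERN. The flag matters here: without it, '.' refuses line terminators while [^<>] accepts them, so the two groups would not be strictly interchangeable.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyEquivalenceSketch {
    // Returns capture group 1 of the first match of (?<=<b>) + titleGroup + </b><br>, or null.
    static String firstTitle(String titleGroup, String html) {
        Pattern p = Pattern.compile("(?<=<b>)" + titleGroup + "</b><br>", Pattern.DOTALL);
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical, trimmed stand-in for the quest log HTML.
        String html = "<b>Other Quests:</b><blockquote><b>What's Up, Doc?</b><br> Doc Galaktik needs herbs.";
        System.out.println(firstTitle("(.*?[^<>]*?)", html)); // header plus markup plus title
        System.out.println(firstTitle("(.*?)", html));        // same capture as the line above
        System.out.println(firstTitle("([^<>]*?)", html));    // What's Up, Doc?
    }
}
```

On this input the first two groups capture the same text, so the difference seen earlier in the thread comes from dropping .*? altogether, not from any gap between '.*?[^<>]*?' and '.*?'.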

I should really look up how they are handled with a state machine

They can't, iiuc. Not a finite state machine anyway. Afaik, Perl-based engines use recursion; other engines might use something with sets of states.
 

Crowther

Active member
They can't, iiuc. Not a finite state machine anyway.
If an expression is regular, then a minimal state machine exists and can be made. I don't know if that's always the best choice. Grammars are usually handled with recursion.
 

xKiv

Active member
If an expression is regular, then a minimal state machine exists and can be made. I don't know if that's always the best choice. Grammars are usually handled with recursion.

The requirement of non-greediness on quantifiers makes the expression non-regular, if you are using the definitions from the underlying mathematical theories, and not the ones used in software development.
(Only the former are equivalent to finite-state machines.)
 