Bug - Fixed Performance issue with large number of effects active

Linknoid

Member
I have a character which has 1300+ effects active. Refreshing the char pane is really, really slow. I run a bunch of combats in a row, it can sit there for several minutes with the char pane not updating.

I finally ran a profiler on it, and it reported it's spending most of its time in matcher.find() in setLastAdventure for expandedLastAdventurePattern in CharPaneRequest.java:

Code:
	private static final Pattern expandedLastAdventurePattern =
		Pattern.compile( "<a.*href=\"([^\"]*)\".*>Last Adventure:</a>.*?<a.*?href=\"([^\"]*)\">([^<]*)</a>.*?</table>" );

	private static final void setLastAdventure( final String responseText )
	{
		if ( ChoiceManager.handlingChoice || FightRequest.currentRound != 0 || FightRequest.inMultiFight || FightRequest.choiceFollowsFight )
		{
			return;
		}

		String adventureName = null;
		String adventureURL = null;
		String container = null;
		if ( CharPaneRequest.compactCharacterPane )
		{
			Matcher matcher = CharPaneRequest.compactLastAdventurePattern.matcher( responseText );
			if ( matcher.find() )
			{
				adventureName = matcher.group( 1 );
				adventureURL = matcher.group( 2 );
			}
		}
		else
		{
			Matcher matcher = CharPaneRequest.expandedLastAdventurePattern.matcher( responseText );
			if ( matcher.find() )
			{
				adventureName = matcher.group( 3 );
				adventureURL = matcher.group( 2 );
				container = matcher.group( 1 );
			}
		}

My guess is the .* stuff in the regex is the cause. However, the actual data it's searching for looks like this (I inserted a few newlines for slightly better readability):

Code:
<center><font size=2><b><a onclick='if (top.mainpane.focus) top.mainpane.focus();' class=nounder href="place.php?whichplace=zeppelin" target=mainpane>Last Adventure:</a></b></font><br>
<table cellspacing=0 cellpadding=0><tr><td><font size=2><a onclick='if (top.mainpane.focus) top.mainpane.focus();' target=mainpane href="adventure.php?snarfblat=385">The Red Zeppelin</a><br>
</font></td></tr></table>

The entire charpane.php for this is over a megabyte in size.

I'm not sure where it's going wrong, since I would expect .* to consume the minimum amount required to find the next element, so I don't know why it would get so slow with a large page. Maybe finding the text "Last Adventure:" and truncating the string at a kilobyte past that for the regex match would be the easiest fix. Or maybe there's a way to modify the regex to speed it up, I don't know. Or maybe ditching regex for this altogether would be better.
 

Veracity

Developer
Staff member
Using regexp to parse HTML is not best practice - but we do it all over the place.
Almost certainly should be using .*? Rather than .*
 

xKiv

Active member
I'm not sure where it's going wrong, since I would expect .* to consume the minimum amount required to find the next element,

It's actually supposed to match as *many* characters as possible while the rest of the pattern still matches (which is why putting multiple .* in a pattern can lead to unbelievably exponentially worse performance).
.*? matches "as few as possible", but it has exactly the same backtracking performance problem.

This pattern has .* .*? .*? .*? - for points at which it has to try all possible match lengths before concluding that it has to try a different match length at the previous point.
In practice, I believe if there is a short match, using only .*? should find it faster.
Using .* when there is *only* a short match in a long string - that's probably giving the regex engine enough rope to shoot itself in the bathwater.
 

Veracity

Developer
Staff member
The pattern contained two .* in it. Revision 20268 changes one to .*? and the other to [^>]*
Tell me how it works for you.
 
Top