Bug - Fixed Open Request Sequence stack traces

roippi

Developer
Huh, I'm not used to stack traces actually being so.. helpful.

Code:
Found one Java-level deadlock:
=============================
"Thread-139":
  waiting to lock monitor 0x00000000057c2fc8 (object 0x00000000dd464220, a net.sourceforge.kolmafia.textui.Interpreter),
  which is held by "LocalRelayAgent0"
"LocalRelayAgent0":
  waiting to lock monitor 0x0000000006a36ca8 (object 0x00000000e757db28, a java.lang.Class),
  which is held by "Thread-139"

Code:
Java stack information for the threads listed above:
===================================================
"Thread-139":
	at net.sourceforge.kolmafia.KoLmafiaASH.getClientHTML(KoLmafiaASH.java:104)
	- waiting to lock <0x00000000dd464220> (a net.sourceforge.kolmafia.textui.Interpreter)
	at net.sourceforge.kolmafia.KoLmafiaASH.getClientHTML(KoLmafiaASH.java:91)
	at net.sourceforge.kolmafia.request.RelayRequest.run(RelayRequest.java:1378)
	at net.sourceforge.kolmafia.request.CharPaneRequest.run(CharPaneRequest.java:133)

Code:
"LocalRelayAgent0":
	at net.sourceforge.kolmafia.textui.parsetree.LibraryFunction.execute(LibraryFunction.java:123)
	- waiting to lock <0x00000000e757db28> (a java.lang.Class for net.sourceforge.kolmafia.textui.parsetree.LibraryFunction)
	at net.sourceforge.kolmafia.textui.parsetree.FunctionCall.execute(FunctionCall.java:167)
	at net.sourceforge.kolmafia.textui.parsetree.Operator.applyTo(Operator.java:392)
	at net.sourceforge.kolmafia.textui.parsetree.Expression.execute(Expression.java:221)
	at net.sourceforge.kolmafia.textui.parsetree.Assignment.execute(Assignment.java:102)
	at net.sourceforge.kolmafia.textui.parsetree.BasicScope.execute(BasicScope.java:454)
	at net.sourceforge.kolmafia.textui.parsetree.UserDefinedFunction.execute(UserDefinedFunction.java:130)
	at net.sourceforge.kolmafia.textui.parsetree.FunctionCall.execute(FunctionCall.java:167)
	at net.sourceforge.kolmafia.textui.parsetree.BasicScope.execute(BasicScope.java:454)
	at net.sourceforge.kolmafia.textui.parsetree.UserDefinedFunction.execute(UserDefinedFunction.java:130)
	at net.sourceforge.kolmafia.textui.Interpreter.executeScope(Interpreter.java:356)
	at net.sourceforge.kolmafia.textui.Interpreter.execute(Interpreter.java:283)
	at net.sourceforge.kolmafia.textui.Interpreter.execute(Interpreter.java:276)
	at net.sourceforge.kolmafia.KoLmafiaASH.getClientHTML(KoLmafiaASH.java:109)
 

AlbinoRhino

Active member
Here is another one. This time I was playing in the browser and I clicked on the Haunted Pantry (from town right, not manor) and it froze while the restoration script was running.
 

Attachments

  • DEBUG_20120215.txt
    31.7 KB · Views: 31

AlbinoRhino

Active member
This time I was playing in the browser, adventuring at Sonofa Beach with the browser cli open.
 

Attachments

  • DEBUG_20120217.txt
    76.2 KB · Views: 70

Veracity

Developer
Staff member
These stack traces are very helpful. In particular, the "deadlock" analysis. For the benefit of those who are interested, I attach a detailed and technical analysis of the issues - since there are many. I believe I can fix this particular problem with either of two changes, but I think both should be done.

I see two threads, each waiting for a resource locked by the other.

LocalRelayAgent0:

It is running between battle checks from the Relay Browser.
It is executing a recovery script.
That script is inside the ASH Runtime Library executing a buy() function.
The Runtime Library sets a global static variable - LibraryFunction.interpreter - so that library functions can get Interpreter variables for the interpreter that is executing the script.
-> The Runtime Library locks LibraryFunction.class to keep other scripts from interfering
The buy() function restores your outfit, since it changed it to do the purchase. (Why? Do you have travoltan trousers, or something?)
KoL says that a charpane request is needed
We invoke CharpaneRequest which is, for some reason, a RelayRequest and is thus subject to overrides.
Since you have a charpane override, we look up the relay script and find the interpreter for it.
-> We attempt to seize the lock for that specific interpreter. Another thread has it, so we wait.

LocalRelayAgent1:

In response to KoL saying that a charpane request is needed, the browser asked for charpane.php
Since you have a charpane override, we look up the relay script and find the interpreter for it.
-> We seize the lock for that specific interpreter.
Your charpane script executes. It wants to execute a LibraryFunction
-> The Runtime Library attempts to lock LibraryFunction.class to keep other scripts from interfering. It is already locked, so it waits.

And that is the deadlock.

Here are various things that contribute to this.

1) When you execute a relay script, we find it, compile the script, and cache the Interpreter to execute it. Next time you try to run the same script, we find the Interpreter with the pre-compiled script and execute it. We have a lock to prevent the same Interpreter from executing simultaneously in more than one thread.

I believe that is fine.

2) LibraryFunction has a static global variable which holds the Interpreter that is inside a Library Function. We have that because of laziness; LibraryFunction knows which Interpreter it is in, but we'd have to pass the Interpreter as a parameter to every library function method. That is not difficult, but was tedious.

-> We should eliminate LibraryFunction.interpreter and remove the lock on LibraryFunction.class in LibraryFunction.java. When that class invokes the library function's method, it should pass the Interpreter as the first parameter. Every single method in RuntimeLibrary needs to accept the Interpreter as parameter 1 and use that rather than LibraryFunction.interpreter.

-> RuntimeLibrary has its own static variables - GenericRequest VISITOR and RelayRequest RELAYER. Get rid of those.

3) Why is CharPaneRequest a RelayFunction? Answer: because we do "tricky" things with the charpane.

- When we (or a browser) issues a request to KoL, sometimes the request will result in changes that are reflected in the charpane. KoL signals this by including a Javascript directive in the response to refresh the charpane. The browser dutifully refreshes the charpane and the player sees the update. When KoLmafia submits its own request, KoL signals charpane updates as usual. KoLmafia sees that and issues a charpane update.

- Any time a charpane update comes back from KoL, either requested by KoLmafia or by the Relay Browser, KoLmafia parses it. It then saves the response. When the request comes from the Relay Browser, the RelayAgent runs the CharPaneRequest, gets the "last response" from CharPaneRequest, and gives it to the Relay Browser. Unfortunately, this made charpane.php overrides not work, since we didn't go through the path to execute overrides.

hola changed this in revision 9652: "Change CharpaneRequest to extend RelayRequest so that relay overrides are always applied, even during automation. That way, when the browser is loaded and we load a cached version of the character pane, it will already contain the desired decorations."

-> This means that we run relay override scripts even when we execute a charpane request internally, because KoL told us to update. That is why we ran the charpane override while we were running the restore script, even though we were never going to display the results.

I think the issue here is that we are caching the results of charpane requests and delivering them to the browser. Yes, if we do that, we have to execute the charpane override script. But, we had better be darned sure that we are delivering the current charpane. There are numerous bug reports about the "charpane not updating in the Relay Browser" or "stale information in the charpane" or what have you.

I believe we should stop caching the charpane.

- Make CharPaneRequest a GenericRequest again, not a RelayRequest
- Every time KoL tells us we need a charpane update, in response to a request that WE made, submit a CharPaneRequest. We won't run the relay script on it (since it is not a RelayRequest), but that is fine; we don't need it.
- Every time the browser asks for the charpane, submit the request! If there is a relay override, invoke it as necessary - which will happen automatically because a browser-requested charpane is a RelayRequest. Feed the decorated results back to the browser. No caching.

Doing this should fix all of the "charpane not refreshing" bugs as well as eliminating charpane overrides from running during between battle actions, which are all "automation", essentially, regardless of whether they are invoked by the GUI, by CLI commands, or by the Relay Browser.

In summary, there are two areas of change needed. Either will fix this particular issue. Each, however, will fix other issues, as well. Both should be done.

1) Eliminate LibraryFunction.Interpreter, RuntimeLibrary.RELAYER, and RuntimeLibrary.VISITOR
2) Make CharPaneRequest a GenericRequest and stop caching the results. If the browser submits a call to charpane.php, submit it like any other RelayRequest, complete with override processing.
 

Veracity

Developer
Staff member
Revision 10523 does change #1: it eliminates the static variables and locking in ASH LibraryFunction and RuntimeLibrary. This should solve your problem. Try it!

Change #2 is still worth doing.
 

Veracity

Developer
Staff member
Revision 10525 makes CharPaneRequest NOT be a RelayRequest. We no longer call the relay override for charpane.php unless it is being requested by the Relay Browser.

This should complete the fixes for this bug. Tell me how things work!
 

AlbinoRhino

Active member
Bravo, Veracity. I just played 287 turns hunting hobo codes, with the browser cli open the entire time. This was with r10524, since r10525 wasn't on the builds page yet when I started. Everything worked smoothly ! Your mastery of logic knows no bounds. What would we all do without you around here ?
 

Veracity

Developer
Staff member
We have a fresh batch of enthusiastic young developers on the project. I'm sure they will excel. :)

I'm marking this fixed.
 
Top