continuationState and relay scripts

roippi

Developer
Then we added Relay Scripts and suddenly we had ASH making internal requests in response to browser events and everything changed: the GUI and a Relay Script could be operating simultaneously in different threads and both could modify the continuation simultaneously in different threads. I recall adding the code to force the continuation state to CONTINUE before executing a relay script, since otherwise the script was not guaranteed to execute correctly.

Can we reexamine this (perhaps in a new thread (haha))? I feel that ideologically, relay scripts should not be able to interact with KoLMafia.continuationState in any way - whether it's setting things from "okay" to "abort", or vice versa. There is of course a huge gap between ideology and implementation but that's my general feeling.

(note: we should still move as many routines - like the hedge solver - away from using continuationState as much as possible. It's great for a "user pressed escape and wants us to abort" check but bad as an internal flag that something went wrong)
 

roippi

Developer
To expand the discussion (and de-jargon some things):

continuationState is "the color of the sidebar." Red == ABORT, gray == PENDING, etc. You can hit buttons to put mafia into an ABORT state - and since we frequently "check in" on continuationState during most operations, that has the net effect of shutting everything down.

Relay scripts are just regular ASH scripts that have access to a couple of specialized functions. Outside of that they are mostly indistinguishable from regular scripts. Unfortunately there is this interaction with continuationState: some months ago, complaints came up that if KoLMafia.continuationState == ABORT (i.e. the sidebar was red), then no relay scripts would work until you cleared that state. Obviously that's a bummer ("why is my charpane override not working?!") so we changed it to allow relay scripts to punch through an ABORT - by setting continuationState to CONTINUE. But now we have a problem - relay scripts can and will be fired at arbitrary timings; what happens when we were legitimately depending on that ABORT to do its job but a relay script intervenes to say, "nope, I'm running now, everything is okay again!" We programmers call this type of bug a Race Condition.

So, here's what I propose: we further differentiate relay scripts from non-relay scripts. Relay scripts simply should not be able to interact with KoLMafia.continuationState[SUP]1[/SUP]. This means that even if we're in an ABORT, you can freely run a relay script and mafia stays in ABORT. If your relay script has a runtime exception, that too does not affect continuationState. (we'd still print out the error message)

I've not looked at a single line of code regarding this matter. This could be anywhere from straight-up impossible to easy-as-pie; I have no idea. My guess is difficult-but-doable.

[sup]1[/sup] though I think parse errors still should set us to an ABORT state.
 

Bale

Minion
[sup]1[/sup] though I think parse errors still should set us to an ABORT state.

I'm not sure about that. It would mean that a relay script could abort the normal functioning of KoLmafia's automation. Should an automation script abort just because the user loaded up the relay browser and a relay override was borked? A parse error in a relay script should certainly fail and throw an error message, but it shouldn't set the continuationState to ABORT any more than it should set it to CONTINUE.
 

fronobulax

Developer
Staff member
Old school. With no threading (or one thread) the abort state should stop execution for everything. Taking baby steps into multi-threading the abort state should apply to just the current thread and there should be a mechanism for one thread to say "OMG!!! Everyone abort. Now!!!". Generalizing it seems to me that we have two thread analogs - KoLmafia and Relay. I wonder if they should have separate/independent "Abort" states and we need to implement a means by which one can set or clear the state of the other if required. So I wonder if we need to states? I say that without looking at any code so have no idea how hard it would be to get there from here.
 

roippi

Developer
Old school. With no threading (or one thread) the abort state should stop execution for everything. Taking baby steps into multi-threading the abort state should apply to just the current thread and there should be a mechanism for one thread to say "OMG!!! Everyone abort. Now!!!". Generalizing it seems to me that we have two thread analogs - KoLmafia and Relay. I wonder if they should have separate/independent "Abort" states and we need to implement a means by which one can set or clear the state of the other if required. So I wonder if we need to states? I say that without looking at any code so have no idea how hard it would be to get there from here.

I don't think we need a "thread-global" abort state for interpreters off in their own thread. If there's a problem, that will shut the interpreter down just fine - and since we're not doing anything else in that thread, there's no need to have some sort of continuationState to stop other stuff from happening. There is no other stuff to stop.

I'm not sure about that. It would mean that a relay script could abort the normal functioning of KoLmafia's automation. Should an automation script abort just because the user loaded up the relay browser and a relay override was borked? A parse error in a relay script should certainly fail and throw an error message, but it shouldn't set the continuationState to ABORT any more than it should set it to CONTINUE.

I'm ambivalent. A parse error is more catastrophic than a runtime error so I can see reasoning for going into ABORT. Of course, according to the rest of the proposal relay scripts simply ignore continuationState entirely, so that ABORT wouldn't interact with future runs of relay scripts - but it would freeze the main interface once the user tabs back over to it and say "hey, go fix this." So I can see implementing it either way.
 

roippi

Developer
Hm. This may be a non-starter. BasicScope.java:

Code:
			Iterator it = this.getCommands();
			while ( it.hasNext() )
			{
				ParseTreeNode current = (ParseTreeNode) it.next();
				result = current.execute( interpreter );

				// Abort processing now if command failed
				if ( !KoLmafia.permitsContinue() )
				{
					interpreter.setState( Interpreter.STATE_EXIT );
				}

So basically: a given node (think, a function call) is executed. It can "fail" and this failure is detected solely through continuationState. This could be anything from.. a maximization failure, to a failure to retrieve an item. etc. Current script semantics demand that the interpreter exit at this point of failure.

I think this is this idea's death knell. There's simply no way to "sandbox" an interpreter from the rest of mafia. Oh well.
 

jasonharper

Developer
The continuationState could be made a ThreadLocal instead of a global variable. I'm not sure if the GUI is running in the same thread as normal script execution, so it might be difficult to get the GUI to turn red/gray at appropriate times.
 

Veracity

Developer
Staff member
Well, there might be. I was wondering if we could have a per-thread variable. Which is to say, KoLmafia.continuationState would be unique for each thread. Java actually does have that concept: look at java.lang.ThreadLocal.

I think it would be a complicated refactoring task, venturing into completely unfamiliar territory, but if all requests made by an ASH script execute within that same thread - and RequestThread.makeRequest does execute in-line - if each thread had its own continuationState, the script could modify it to its heart's content and it wouldn't affect what the GUI/CLI were doing.

Edit: ninja'd by Jason. I believe that the GUI/CLI/ASH scripts executed from the GUI are a single thread. If relay scripts are in separate threads, we don't WANT the GUI to change color when they have errors.
 

roippi

Developer
Hm, interesting idea! I like the idea of just making the points that update continuationState (mostly KoLMafia.updateDisplay) handle making the value thread-local and leaving the rest of the codebase that checks for permitsContinue() naive of the difference.

I'll look into it.
 

roippi

Developer
roippi's log, developerdate one forty three point seven

I've ventured into RequestThread (and beyond) to see how threaded mafia really is. While you can consider that the majority of things do eventually happen in one thread, there are plenty of other threads that get fired off:

  • RelayAgents
  • The EDT, which Swing runs on top of
  • One-off Thread subclasses to run a specific task in parallel, like KoLmafia.UpdateCheckThread that checks if we've released a new point version
  • Anything run by RequestThread.runInParallel, which includes a metric ton of stuff. Actions linked to ThreadedButtons, ThreadedListeners, and various frame creation routines

Many of these things are capable of interacting with continuationState. The only ones we want to sandbox are the RelayAgents. So really what we want to do is have everything but RelayAgents share a singleton global continuationState, and have each individual RelayAgent get its own ThreadLocal one.

I am still in the formulating-a-plan stage. I am beginning to believe in this case it is better to blacklist rather than whitelist. That is to say, when fetching continuationState we simply check if we're in a RelayAgent thread - if yes, use the ThreadLocal value, otherwise all other threads get the global one.

Explorations continue. I need to see if a RelayAgent/RelayRequest is capable of itself spawning a new thread - I hope not.
 

roippi

Developer
roippi's log, supplemental

I've cleaned up a number of the random Thread subclasses scattered around mafia; the majority of the ones left are long-running threads that don't need to be put through the new ThreadPoolExecutor that RequestThread is sporting.

After going through our codebase (not exhaustively, but...) I think that for the most part, a RelayAgent will not spawn any new threads in the course of executing a relay script. Whew. I think there are a few corner cases (like, if you fire off a faxbot command and have not yet initialized the faxbot database, that will fire off a thread to go fetch faxbot data) but I haven't found one yet that will meaningfully interact with continuationState.

There is one goofy oddity: RelayAgent keeps alive a separate thread (RelayAutoCombatThread) to run combats in. It doesn't do much - just awaits a wake() signal, runs a combat, then goes back to sleep. I don't necessarily see the point of this - maybe it was a performance microoptimization, maybe it was just a half-implemented feature - but I can just lump that thread in with all the RelayAgent threads when checking if I should give it the ThreadLocal value.
 

Veracity

Developer
Staff member
I have no problem with simplifying and removing half-implemented features. Don't we run combats using FightRequest.instance, or something? You had really only better be in one combat at a time.
 

roippi

Developer
Hm, right. FightRequest definitely needs to respond to continuationState. And since yeah, you'd only better be in one combat at a time, it actually makes sense to have a singleton thread to do combats in. It wasn't necessary *before*, but it actually is now if I'm moving forward with this ThreadLocal thing. Heh.

I think when all is said and done this implementation will be (and needs to be) minimal and maintenance-free. I really don't want this to be a maintenance burden; multithreading bugs are notoriously awful so it's best if I minimize our exposure to them.
 

roippi

Developer
So... I think this is working? This is too scary of a feature to just recklessly push upstream; haven't tested it nearly enough. Might be prudent to gate it behind an opt-in preference while it's new.

Pretty happy with it, though. It's unobtrusive.
 

Attachments

  • threadlocalcontinuation.patch
    5.1 KB · Views: 34

Veracity

Developer
Staff member
That looks remarkably simple. What could possibly go wrong? :)

I'm looking at how relay scripts get executed. KoLmafiaASH.getClientHTML( RelayRequest request ) looks up the script and then calls KoLmafiaASH.getClientHTML( RelayRequest request, File toExecute ) to actually run it. That latter method gets the interpreter and executes the script. In particular, it does relayScript.initializeRelayScript( relayRequest ). Interpreter.initializeRelayScript() does the KoLmafia.forceContinue(). So, assuming that KoLmafiaASH.getClientHTML is called ONLY when StaticEntity.isRelayThread() will return true, the continuationState is sandboxed, just as you desire.

It looks like the KoLmafiaASH method is called only in the run() method of a RelayRequest, which should only be called in its own thread from the RelayAgent, so that all looks good.

So, the only thing to check: is it really the case that the only time we have RelayRequest is via the RelayAgent? grepping for "new RelayRequest" says "not quite".

Code:
./KoLmafiaASH.java:161:			RelayRequest relayRequest = new RelayRequest( false );
./KoLmafiaASH.java:197:			RelayRequest relayRequest = new RelayRequest( false );
./request/UseItemRequest.java:1147:			RelayRequest request = new RelayRequest( false );
./textui/RuntimeLibrary.java:2067:			new GenericRequest( "" ) : new RelayRequest( false );
./webui/RelayAgent.java:104:		this.request = new RelayRequest( true );
The KoLmafiaASH and RelayAgent calls are exactly what are expected. But what about the others?

UseItemRequest:

Code:
		case ItemPool.MACGUFFIN_DIARY:
			// Make it a RelayRequest since we don't want a charpane refresh
			RelayRequest request = new RelayRequest( false );
			request.constructURLString( "diary.php?textversion=1" );
			RequestThread.postRequest( request );
Well, that looks harmless. If you do this in the browser, it will be in its own thread, anyway.

The RuntimeLibrary one is in visit_url:

Code:
		// See if we are inside a relay override
		RelayRequest relayRequest = interpreter.getRelayRequest();

		// If so, use a RelayRequest rather than a GenericRequest
		GenericRequest request = ( relayRequest == null ) ? 
			new GenericRequest( "" ) : new RelayRequest( false );
So that looks good too.

I'd be tempted to just do a bit more testing and then submit it into the wild.
 

roippi

Developer
Ah, missed one place that interacted with the sidebar display.

This seems pretty guide-proof right now, so that's a good sign. I've also tested some bogus relay scripts and they fail properly. So... I'll probably give this a few days of random testing before committing. What's the worst that could happen.
 

Attachments

  • threadlocalcontinuationv2.patch
    5.3 KB · Views: 27

roippi

Developer
Alrighty, r14732. Let's see how good my local testing was.

One thing to note. My commit says relay scripts "... should not be able to interact with the main KoLMafia display, other than to print things to the gCLI." This isn't *quite* true - relay threads have the ability to submit things to the CLI queue, which will behave just like if any other thread submitted something to the CLI queue (graying out the display while the command runs). I don't see any issue with this since the CommandQueueHandler doesn't really care where a command came from.
 
Top