Shall it be closed?

fronobulax

Developer
Staff member

The PR was opened in August and has pretty much languished. Should it be committed or closed?

My memory, which can probably be corrected by someone who wants to search KoLmafia.us, was that a PR was written to deal with an absence of synchronization when writing preference files. The PR was committed. A couple of users reported a problem when cheating with the Deck of Every Card. Out of an abundance of caution the PR was reverted. Subsequently, the original PR was cherry-picked for changes not related to synchronization (which have since been committed) and what was left was retained to support research and debugging.

@gausie has raised the question about closing or committing it since nothing has happened in several months.

I have spend a fair amount of time trying to reproduce the deck cheating problem and have not been able to do so. I have also been building and running the code with the synchronization changes and have seen no problems. But I also don't do things like leave KoLmafia running 24/7 and dealing with logout/in at rollover.

I agree that this has languished and it would be nice to move on. I see several possibilities:
  • Someone besides me could try and reproduce the Deck cheat issue or otherwise decide it is "safe" to commit.
  • It could be committed as is and reported problems will be addressed in some way besides reverting it.
  • It could be closed because the benefits of improved synchronization are less than the cost of resolving other issues.
I am inclined to commit it and deal with the fallout, if there is any. But I also understand that Crimbo is coming. The devs in the time zones ahead of Eastern may very well have to deal with the fallout if it is severe and there is pressure to fix it ASAP. I also acknowledge that no one has complained about file corruption here and so the issue is not occurring as frequently as feared/believed.

Tagging @MCroft and @heeheehee since they were most involved.

Discuss.

Thanks.
 

heeheehee

Developer
Staff member
If you feel comfortable with committing it (since you've apparently been using this patch for months without any observed ill effects), that's fine by me.

If it breaks again, there's nothing wrong with reverting it again.
 

fronobulax

Developer
Staff member
My hypothesis - that the failing tests were due to characteristics of the test runner - was questioned in a way that encourages further exploration. So I am going to do that. An expected result will be more proof for my hypothesis or changes that reduce the frequency of failed tests. I am no advocating pushing this until my assertions about failing tests have been "proven" and/or the frequency of failures has been reduced so that it is comparable to the main branch.

Thanks to @gausie for asking the right question :)
 

fronobulax

Developer
Staff member
Well, well well.

I was running KoLmafia-27784-M.jar built locally from the referenced PR. I was running VMF and it had just finished drawing a Micky Mantle Card. It was frozen. I discovered the attached debug log. I also used jstack to get the attached stack dump. When I killed the process at the operating system and restarted mafia it was in a choice - the page where the user responds "Whoa" after getting the valuable card and is taken to the next draw.

I was running two other characters at the time. What might be a factor or just a red herring was that autoscend was running. If that is significant it is because autoscend forces the global preference to log preference changes to be true.

If anyone (@heeheehee ?) wants to look at these that would be helpful. Since the PR was reverted in response to reports that there was a problem with drawing from the deck of cards I'm thinking if we cannot explain and fix that with this data then we probably do need to close the PR. (In which case I will cherry pick some of my changes that were made to improve tests and the deadlock detection code).

I will also look at the stack traces but I am posting them now in case someone has more time than I do at the moment.
 

Attachments

  • DEBUG_20240107.txt
    11.4 KB · Views: 0
  • deadlock.txt
    179.3 KB · Views: 0

heeheehee

Developer
Staff member
Today's debug log looks like a deadlock between
net.sourceforge.kolmafia.swingui.widget.RequestPane$WrappedHtmlEditorKit$WrappedHtmlFactory$WrapInlineView.getMinimumSpan(RequestPane.java:83)

and

net.sourceforge.kolmafia.request.DeckOfEveryCardRequest.run(DeckOfEveryCardRequest.java:360)

(which does some choice adventure stuff and preference setting)

I'll need some time to remind myself exactly what we tried , and what we might be able to do.
 

heeheehee

Developer
Staff member
(I am very happy I added the deadlock detector in that PR, since now we can know with certainty where the deadlock is.)
 

heeheehee

Developer
Staff member
Okay. So the problem is that the Swing UI can request preferences (e.g. wrapLongLines), thereby grabbing the Preferences lock, while setting a preference (which holds the Preferences lock) can fire listeners which can try to grab the Swing UI lock.

In particular, the fix in Preferences.java schedules PreferenceListenerRegistry.fireAllPreferencesChanged(); on the Swing update thread, but not either of the instances of PreferenceListenerRegistry.firePreferenceChanged();
 

heeheehee

Developer
Staff member
Trying to push these two instances (setObject and removeProperty) onto the event dispatch thread naively causes many tests (which expect listeners to be executed synchronously in response to preference updates) to fail.

IMO one reasonable way to do this is to update only the swingui elements such that they always run their update() methods on the Swing event dispatch thread, e.g. by creating an abstract SwingListener class that dispatches update() to SwingUtilities.InvokeLater(() -> this.updateSwing());
 

MCroft

Developer
Staff member
ages ago we thought about ReadWriteLock or ReentrantReadWriteLock. readLock() doesn't block and won't be blocked unless something grabs the writeLock().

Would that help here? The swing thread seems to be reading, not writing.

Edit: We still want to make sure we don't write concurrently, but there are also patterns for acquiring locks inside a try block with a timeout...

And that led me down the rabbit hole of try-with-resources and Autoclosable locks, which seem pretty cool.
 
Last edited:

heeheehee

Developer
Staff member
ages ago we thought about ReadWriteLock or ReentrantReadWriteLock. readLock() doesn't block and won't be blocked unless something grabs the writeLock().

Would that help here? The swing thread seems to be reading, not writing.
No, I think there are still possibilities of deadlock from multiple concurrent writes between the Swing event dispatch thread (e.g. clicking a checkbox to change some settings in Preferences) and main thread trying to affect the Swing UI.
 

fronobulax

Developer
Staff member
I'm still building off of the PR even though the Deadlock Detector is in mafia :) (but the issue is probably caused by the PR).

Did not hang yesterday but did today so drawing Mickey has frozen 3 days out of four.
 

heeheehee

Developer
Staff member
Yeah. I think I understand the race condition now. I just haven't had time to put together a fix yet. (If anyone else wants to take a stab at it, be my guest.)
 

fronobulax

Developer
Staff member
Yeah. I think I understand the race condition now. I just haven't had time to put together a fix yet. (If anyone else wants to take a stab at it, be my guest.)
I don't have enough of an understanding to propose a fix but until I finish Standard runs I am willing to keep running with this version so that, when there is a change and I don't get freezes, there is some confidence about the fix :)
 
Top