Bug - Fixed file_to_map needs https handling?

fronobulax

Developer
Staff member
Still poking. With the http or https OR restored to DataUtilities, plus some logging, I get

https://sourceforge.net/p/kolmafia/code/HEAD/tree/src/net/sourceforge/kolmafia/KoLConstants.java has IOException javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure

http://kolmafia.us/scripts/updateprices.php?action=getmap has IOException java.io.IOException: Server returned HTTP response code: 403

and

Success for https://zachbardon.com/mafiatools/autoupdate.php?f=batfactors&act=getmap

The first is mafia itself trying to do some version checking. Superficial research suggests mafia doesn't trust SourceForge's SSL cert but this seems to have been going on for longer than the recent changes so I will defer.

The second is the price update call. I'm still at the stage where "That can't happen" but I am not giving up yet.

The final is the update batfactors test and it works as expected.

P.S. I get the 403 whetther I ask for http:// or https:
 
Last edited:

xKiv

Active member
https://sourceforge.net/p/kolmafia/code/HEAD/tree/src/net/sourceforge/kolmafia/KoLConstants.java has IOException javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure

And *without* the "OR https" included, it just doesn't find https://sourceforge.net/p/kolmafia/code/HEAD/tree/src/net/sourceforge/kolmafia/KoLConstants.java on the local filesystem, returns empty stream, and doesn't do what it's supposed to do anyway.

The actual problem probably is that oracle didn't include geotrust's CA certificate with java. But mozilla has it, so it's probably not too untrusthworthy?

A "fix" will likely consist of mafia telling java to ignore SSL trust validation problems. Because java is designed the make simple things insanely complicated.
No, it can't use system truststore. No, it can't just use multiple different truststores. No, you can't combine truststores without knowing both passwords and doing tons of black magic.
And don't get me started on the (lack of proper) feedback for errors.

http://kolmafia.us/scripts/updateprices.php?action=getmap has IOException java.io.IOException: Server returned HTTP response code: 403

(oh my, I need to login before visit_url returns *anything*? I was hoping to test things in sessionless gCLI x-D )

If you are going through visit_url, that 1) doesn't use DataUtilities and this shouldn't be affected and 2) works with status 200 for me (and then mafia spams CPU for a very long time ... because ... reasons? Actually looks like it doesn't like having all that printed to the gCLI; even a simple echo 1 afterwards spammed CPU again) 3) works with https:// the same

file_to_map ....

Without the "or https:":
Code:
> ash string[int][int] q; boolean z = file_to_map("http://kolmafia.us/scripts/updateprices.php?action=getmap", q); print(z + "\n" + count(q));

8-125004 is out of range, returning 0
true9164
Returned: void

> ash string[int][int] q; boolean z = file_to_map("https://kolmafia.us/scripts/updateprices.php?action=getmap", q); print(z + "\n" + count(q));

true0
Returned: void

With it:
Code:
> ash string[int][int] q; boolean z = file_to_map("http://kolmafia.us/scripts/updateprices.php?action=getmap", q); print(z + "\n" + count(q));

8-125004 is out of range, returning 0
true9164
Returned: void

> ash string[int][int] q; boolean z = file_to_map("https://kolmafia.us/scripts/updateprices.php?action=getmap", q); print(z + "\n" + count(q));

8-125004 is out of range, returning 0
true9164
Returned: void

Note that file_to_map does not even return false when the given file/url is not found. There's no abort state, no error message, just empty "success".
 
Last edited:

fronobulax

Developer
Staff member
I'm still here.

I did tests above and did not get the same results. Plenty of chance for operator error or a hiccup at kolmafia.us.

My philosophical opinion is that price updating is more of a core feature than file_to_map. Thus I am reluctant to commit any code that fixes file_to_map but appears to break price updates. I am open for discussion on that point but that is why I have not restored the second OR.

This discussion has identified three problems.

One, there is a https: call to SourceForge to get version information that does not work and probably has not worked for quite some time. The problem is that KoLmafia does not trust SourceForge's certificate. The solutions I am aware of have their own problems so I will try and remember this and break it out in its own report for discussion, especially if I decide the update check is OBE.

Second, file_to_map does not work for https: I have yet to find code that fixes this and does not break price updates. Still working on it.

Third, file_to_map does not return correct/meaningful/useful status when it fails. This is consistent with the underlying Spellcast routines which are philosophically inclined to always return something that can be used without checking. I'm not sure this can be addressed without rewriting a lot of the Spellcast code but we should at least note it.

My results may differ from xKiv's if the prices file is being cached. My initial failure loads an empty copy in the cache which is then returned by both the file_to_map calls. I could argue that caching the price file is not a good thing.

There the https connection is a subclass of the http connection but I can't find a difference between the http: connections for the price file in the OR and no OR cases. There is a deprecated http connection but using it breaks things sooner and differently.

This is a non-progress report but I am still fiddling with things.

View attachment Results.txt
 

fronobulax

Developer
Staff member
I installed Wireshark and captured the outgoing Get packet for both OR and NoOR. The packets are not the same. Window Size Value in the header seems to be the only difference that might be significant but I am at a loss to explain why adding the OR changes the packet. Perhaps something is being unintentionally reused? I will ponder but this is in case there are some insomniacs reading.
 

xKiv

Active member
I installed Wireshark and captured the outgoing Get packet for both OR and NoOR.

How? By my reading of kolmafia code, there should be no outgoing packet (through DataUtilities.getReader) for https URLs if it only creates HttpURLConnection for http and not https ...

What's different in you environment that you are making connections that are not getting made in the (trunk) code?

Or are you testing this on price updates, which are http url, and thus completely unaffected by the difference?

The packets are not the same. Window Size Value in the header seems to be the only difference that might be significant but I am at a loss to explain why adding the OR changes the packet.

It doesn't. Window size is controlled by the OS.
 
Last edited:

fronobulax

Developer
Staff member
How? By my reading of kolmafia code, there should be no outgoing packet (through DataUtilities.getReader) for https URLs if it only creates HttpURLConnection for http and not https ...

What's different in you environment that you are making connections that are not getting made in the (trunk) code?

Or are you testing this on price updates, which are http url, and thus completely unaffected by the difference?



It doesn't. Window size is controlled by the OS.

Sorry. Tired.

I only looked at the packets for the price update. Since the known "fix" for file_to_map support for https: breaks the price update my concern is only the price update. In both the OR and NoOR versions the price update is the same http: URL. In the NoOR version the response is a 200 and the update succeeds. In the OR version the response is a 403. (Both responses are observed in the debugger). The packets leaving KoLmafia appear to be different and I am comfortable blaming the difference on KoLmafia since the tests are being run on the same computer, OS and environment.

Price update succeeds on the NoOR and fails on the OR and that failure happens whether the conditional tests for http: or https: first.

My current hypothesis (which may not have any validity given my tendency to jump to conclusions or to run with my first idea) is that the difference occurs because the code "inside" the conditional is run more often in the OR case than the NoOR case. That difference can be explained if there is something that persists that probably should not.

In my tests the only https: call going through before the price update call is the version checking. One thing to do is comment that out and see if the price update behaves the same for the OR and NoOR cases. I can imagine that making the conditional code happen in a new object that is allocated and disposed of for each execution might make a difference. I did play with synchronizing the conditional code but that did not seem to change anything.
 

xKiv

Active member
That sounds more like something is caching the previous failure (from version check) and returning it for the next connection. Are you connecting through a proxy?

("http://...".startsWith("http://")) and ("http://...".startsWith("http://") || "https://...".startsWith("http://")) should really not make a difference.


With the NoOR version, the version check shouldn't even be trying to make a connection, so there's a big difference right there, if that breaks things.
 

xKiv

Active member
OK, now I am seeing the same thing.
Code:
				HttpURLConnection connection = (HttpURLConnection) new URL( null, filename ).openConnection();
				connection.setRequestProperty( "Connection", "close" ); // no need to keep-alive
				InputStream istream = connection.getInputStream();

filename is http://kolmafia.us/scripts/updateprices.php?action=getmap
connection.getInputStream throws IOException with http code 403

And sure, when I comment out the sourceforge version check, I get:
Code:
6801 prices updated from http://kolmafia.us/scripts/updateprices.php?action=getmap
 

xKiv

Active member
I tried recreating the situation in a running mafia afterwards (using eclipse debugger and code hotswap). But I couldn't manage it.

The only thing that I see that's different is that KolMafia.main() calls the version check in another thread:
Code:
RequestThread.runInParallel( new UpdateCheckRunnable(), false );

So maybe the parallel connection is messing up other connections done at the same time?
(and also LoginManager calls the price update in another thread too: MallPriceDatabase.updatePricesInParallel( "http://kolmafia.us/scripts/updateprices.php?action=getmap" ))
...
Nope, I replaced that code in KolMafia.main with a synchronous call to (new UpdateCheckRunnable()).run(), and all future price updates all still messed up.
(I also removed the "don't try to update from the same url twice in the same session" check, so I can test repeated "update prices http://kolmafia.us/scripts/updateprices.php?action=getmap" from CLI).

But of I remove it from KolMafia.main() completely and instead put in in MallPriceDatabase.updatePrices, right before the getReader call that fails before, it "works".

QCF?

In any case, the part that doesn't work (and already didn't work!) and breaks other things is the version check.
 

xKiv

Active member
I also tried triggering both in parallel later, but that also didn't break anything. I can only break things with version check on startup.
It doesn't even break when I move the version check to LoginManager.
 

fronobulax

Developer
Staff member
r18250 restores the https:// to DataUtilities and disables the SourceForge version check. I will note the latter in a new bug report.

Everything worked acceptably on my system so perhaps this is done for now?
 

fronobulax

Developer
Staff member
I ran a test that tried file_to_map to update BatFactors via https: from a bad url and then followed it with one that I expected to work. The first failure did not impact the second which is good to know ;-)
 
Top