Nitpick: all you should be able to say (and all that should matter) is that java strings are *unicode*.
Yeah, yeah.
But, I just tried downloading faxbot's and easybot's XML, using wget, and ... while faxbot's appears to be consistent with iso-8859-1 (specifically, no weird characters in the art teacher), easybot's seems to be in utf-8 already? So the incorrect thing there is that it advertises iso-8859-1 encoding when it's actually in utf-8.
Here is what I get when I retrieve faxbot.xml:
Retrieved: http://www.hogsofdestiny.com/faxbot/faxbot.xml
8 header fields
Field: null = [HTTP/1.1 200 OK]
Field: Date = [Sat, 28 Jun 2014 22:20:13 GMT]
Field: Content-Length = [31952]
Field: Last-Modified = [Mon, 16 Dec 2013 21:19:40 GMT]
Field: Accept-Ranges = [bytes]
Field: Connection = [keep-alive]
Field: Content-Type = [application/xml]
Field: Server = [nginx/1.6.0]
Note that the HTTP headers don't say anything about the encoding. We read it as UTF-8 - and it works because Faxbot's art teacher says this:
<monsterdata>
<name>Francois Verte, Art Teacher</name>
<actual_name>Francois Verte, Art Teacher</actual_name>
<command>art_teacher</command>
<category>KOLHS</category>
</monsterdata>
It really doesn't have a c-cedilla there. Since the Request Fax frame doesn't look up monsters, it doesn't care what you call it. Let's see what the faxbot() ASH function does with this monster:
> ash faxbot( $monster[ art teacher ] )
Changing "art teacher" to "François Verte\, Art Teacher" would get rid of this message ()
Configuring faxable monsters.
Configuring FaxBot (2194132)
Configuring FaustBot (2504770)
Configuring Easyfax (2504737)
Faxable monster lists fetched.
Returned: false
It can't find a faxbot that has anything that it recognizes as that monster name. So, how does KoL refer to this monster?
Retrieved: http://www.kingdomofloathing.com/desc_item.php?whichitem=835898159
10 header fields
Field: null = [HTTP/1.1 200 OK]
Field: Date = [Sat, 28 Jun 2014 22:31:11 GMT]
Field: Content-Length = [1919]
Field: Expires = [Thu, 19 Nov 1981 08:52:00 GMT]
Field: Connection = [keep-alive]
Field: Content-Type = [text/html; charset=UTF-8]
Field: X-Powered-By = [PHP/5.3.3]
Field: Server = [nginx/1.0.15]
Field: Pragma = [no-cache]
Field: Cache-Control = [no-store, no-cache, must-revalidate, post-check=0, pre-check=0]
The desc is specified to be charset=UTF-8
This is a sheet of copier paper with a grainy, blurry likeness of a François Verte, Art Teacher on it.
And, by golly, there is the UTF-8 character in the monster description.
So, if easyfax was trying to have the exact output from receiving the fax, it really needs UTF-8 characters, not ISO-8859-1 characters, since that's what KoL itself uses.
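To make the byte-level difference concrete, here is a minimal sketch in plain Java. It is not KoLmafia code; the class and method names are made up for illustration. It shows that the c-cedilla is one byte in ISO-8859-1 and two bytes in UTF-8, and that Faxbot's plain-ASCII "Francois" comes out identical in both, which is why reading that file as UTF-8 happens to work.

Code:
```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of KoLmafia.
public class CedillaBytes {
    // Hex dump of a string's bytes in the given charset, e.g. "C3 A7".
    static String hex(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    // Encode as UTF-8, then (wrongly) decode those bytes as ISO-8859-1.
    static String decodeUtf8BytesAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // True if the string has the same bytes in both encodings (pure ASCII does).
    static boolean sameBytesInBoth(String s) {
        return Arrays.equals(s.getBytes(StandardCharsets.UTF_8),
                             s.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        String cedilla = "\u00E7"; // c-cedilla, written as an escape to dodge source-encoding issues

        System.out.println(hex(cedilla, StandardCharsets.ISO_8859_1)); // E7
        System.out.println(hex(cedilla, StandardCharsets.UTF_8));      // C3 A7

        // Misreading UTF-8 bytes as ISO-8859-1 yields two characters: U+00C3 U+00A7.
        System.out.println(decodeUtf8BytesAsLatin1(cedilla).length()); // 2

        // Faxbot's ASCII spelling is byte-identical either way; KoL's spelling is not.
        System.out.println(sameBytesInBoth("Francois Verte, Art Teacher"));      // true
        System.out.println(sameBytesInBoth("Fran\u00E7ois Verte, Art Teacher")); // false
    }
}
```

So a file that advertises one encoding but is written in the other only goes wrong on names like this one, which is what makes the problem easy to miss.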
Now, when I typed this into easyfax's tab in the chat window, the browser submitted this:
Requesting: http://www.kingdomofloathing.com/submitnewchat.php?playerid=121572&graf=%2Fmsg+easyfax+Fran%E7ois+Verte%2C+Art+Teacher&pwd
Notice that %E7 is the ISO-8859-1 byte for c-cedilla, not its UTF-8 encoding (which would be %C3%A7). In fact, looking at GenericRequest.addFormField, I see this:
Code:
String charset = this.isChatRequest ? "ISO-8859-1" : "UTF-8";
... so apparently that really is what we have to submit to chat. However, that echoed into the chat window like this:
Field: Content-Type = [text/html; charset=UTF-8]
<font color=blue><b>private to <a class=nounder target=mainpane href="showplayer.php?who=2504737"><font color=blue>Easyfax</font></a></b>: François Verte, Art Teacher</font></br>
Notice the HTML entity.
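For reference, the effect of that charset choice on the submitted form value can be reproduced with the standard java.net.URLEncoder. This is only a sketch of the idea, not the actual body of GenericRequest.addFormField; the class name, method name and the isChatRequest parameter are assumptions for illustration.

Code:
```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class; it mimics only the charset selection quoted above.
public class ChatFieldEncoding {
    // URL-encode a form value using ISO-8859-1 for chat requests, UTF-8 otherwise.
    static String encode(String value, boolean isChatRequest) {
        String charset = isChatRequest ? "ISO-8859-1" : "UTF-8";
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            // Both charsets are required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String graf = "/msg easyfax Fran\u00E7ois Verte, Art Teacher";

        // Chat request: matches the graf= value in the request logged above.
        System.out.println(encode(graf, true));
        // %2Fmsg+easyfax+Fran%E7ois+Verte%2C+Art+Teacher

        // Non-chat request: the same text, but the cedilla becomes two UTF-8 bytes.
        System.out.println(encode(graf, false));
        // %2Fmsg+easyfax+Fran%C3%A7ois+Verte%2C+Art+Teacher
    }
}
```

The in-memory Java string is the same in both cases; only the bytes put on the wire differ.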
This is a can of worms. I'm going to have to think about this a bit more.