Feature - Implemented Add case-sensitivity for string comparisons in ASH

heeheehee

Developer
Staff member
I have no idea why ASH was coded that way. heeheehee recently noted it with bemusement. I also have no idea how many scripts exist that depend on that - perhaps comparing user input vs. strings in the script is better, but there is no reason that comparing strings in the script with item names, monster names, etc. should depend on inexact comparisons. Perhaps we should start a Bug/Feature to discuss the implications of changing ASH string equality to be case sensitive, since that would allow scripts to do things with KoL page text that it can't do right now. We certainly would have to augment the Runtime Library with case-sensitive and case-insensitive string matching functions, just like Java has.
Here is that bug / feature.

I'm not totally sure I understand why we'd need to add case-insensitive string matching functions, and all that. We have to_lower_case(), and Java requires String.equals because the String class is required to invoke methods on strings.

aside: it's really awkward, in my opinion, that it has that weird separation between Objects and primitives (several modern languages have adopted the "everything is a class" mantra, which at times makes more sense), and doubly weird that it has special wrapper classes for the various numeric primitive types but not for Strings, but whatever.
 

Veracity

Developer
Staff member
How about index_of_ignore_case() and contains_ignores_case() and starts_with_ignores_case()?

Google found me an implementation of that last one using Java's regionMatches() method - which has an "ignores case" parameter.

For all the others, the solution seems to be to apply to_lower_case() to the arguments before using index_of() and contains_text(), just as you say.

(Fun fact: ASH's runtime library just calls Java's indexOf and contains - which are case sensitive - even though ASH strings are case insensitive for equality.)

So, yes - if we make == case insensitive, we could just require the program to use to_lower_case() wherever needed to get case insensitive behavior - just like they already have to do now for index_of and last_index_of and contains_text. Looks like the ONLY effect would be strings and the == operator, actually.

Or we could write little utilities that Java does not have to make things a little easier.

(What "it" has "that weird separation"? Java or ASH?_
 

heeheehee

Developer
Staff member
Other major places where case-insensitivity occurs in ASH that I can think of off the top of my head:
* indexing into aggregates, although changing == to be case-sensitive *should* resolve this
* variable and function names. This may also have the same root cause? I haven't actually looked into that.

(Java has the weird separation. You can't, for instance, do something like
Code:
int x;
System.out.println(x.toString());
and you need the wrapper Integer class to store said values in any Containers other than arrays, which are another bizarre artifact of Java.)
 

Veracity

Developer
Staff member
Agreed that Integer and Float and Long wrapper classes are weird. You can use String.valueOf( INT ), even if you can't do INT.to_string().

ASH variable and function names being case insensitive is a different matter. It's intentional, and I don't think it's a problem. "names" are not a data type in ASH. "strings" are.
 

heeheehee

Developer
Staff member
ASH variable and function names being case insensitive is a different matter. It's intentional, and I don't think it's a problem. "names" are not a data type in ASH. "strings" are.

Certainly. I can see that as being a distinct design decision, and it's not one that should cause too much confusion. I just see it as something that deviates from what I've experienced in other languages.
 
Languages with case insensitive symbols: Visual Basic, SQL, and, reaching waaay back, Fortran, maybe? It's uncommon in modern languages.

As for ASH case handling, a frequent pattern in other settings is to have an optional options parameter, one option of which is case insensitivity. Overloads of existing functions (is that possible in ASH?) to do this while leaving existing operation alone would be minimally disruptive.

Wrapper objects, aka boxing: https://en.wikipedia.org/wiki/Object_type_(object-oriented_programming). This also exists in .Net for the same reasons. I imagine it's also pretty common in other contexts.
 

heeheehee

Developer
Staff member
As for ASH case handling, a frequent pattern in other settings is to have an optional options parameter, one option of which is case insensitivity. Overloads of existing functions (is that possible in ASH?) to do this while leaving existing operation alone would be minimally disruptive.

Sure (and yes, this is a thing in ASH). Technically the whole method notation in ASH is syntactic sugar for just passing in the object as the first argument (not unlike python), but since there isn't actually a class system, the function table is just that for the current scope.
 

Veracity

Developer
Staff member
Operator.compareValues:

Code:
		if ( Operator.isStringLike( ltype ) || Operator.isStringLike( rtype ) )
		{
			int c = leftValue.toString().compareToIgnoreCase( rightValue.toString() );
			bool = this.operator.equals( "==" ) ? c == 0 :
			       this.operator.equals( "!=" ) ? c != 0 :
			       this.operator.equals( ">=" ) ? c >= 0 :
			       this.operator.equals( "<=" ) ? c <= 0 :
			       this.operator.equals( ">" ) ? c > 0 :
			       this.operator.equals( "<" ) ? c < 0 :
			       false;
		}
Seems like simply removing "ignoreCase" might do it. What is a "string like" data type?

Code:
	public static boolean isStringLike( Type type )
	{
		return  type.equals( DataTypes.TYPE_STRING ) ||
			type.equals( DataTypes.TYPE_BUFFER ) ||
			type.equals( DataTypes.TYPE_LOCATION ) ||
			type.equals( DataTypes.TYPE_STAT ) ||
			type.equals( DataTypes.TYPE_MONSTER ) ||
			type.equals( DataTypes.TYPE_ELEMENT ) ||
			type.equals( DataTypes.TYPE_COINMASTER ) ||
			type.equals( DataTypes.TYPE_PHYLUM ) ||
			type.equals( DataTypes.TYPE_BOUNTY );
	}
Those are all things that aren't uniquely identified by numbers.

Monsters are heading in the direction of NOT being "stringlike", since so many of them have a unique ID, which we should use, if possible. We should probably change the comparison to just to Value.compareTo.

Locations, elements, phyla, coinmasters, and stats, when created from a string, get their names normalized, so case comparison shouldn't matter.

Bounties do NOT get normalized, but only parse with an exact case-sensitive match. If we want case-insensitivity, that belongs in parseBountyValue, not here.

Which leaves just strings and buffers.

There is also Value.compareTo to consider:

Code:
		if ( this.contentString != null && it.contentString != null )
		{
			return this.contentString.compareToIgnoreCase( it.contentString );
		}
This is what will be used for Value objects as map keys, so, this too should be case sensitive.
 

Veracity

Developer
Staff member
I don't use many scripts. My tests worked fine, but if somebody wants to install this patch and see what breaks, I'd be interested in seeing what, if anything, breaks.

Code:
Index: src/net/sourceforge/kolmafia/textui/parsetree/Operator.java
===================================================================
--- src/net/sourceforge/kolmafia/textui/parsetree/Operator.java	(revision 16177)
+++ src/net/sourceforge/kolmafia/textui/parsetree/Operator.java	(working copy)
@@ -227,7 +227,7 @@
 		// If either side is non-numeric, perform string comparison
 		if ( Operator.isStringLike( ltype ) || Operator.isStringLike( rtype ) )
 		{
-			int c = leftValue.toString().compareToIgnoreCase( rightValue.toString() );
+			int c = leftValue.toString().compareTo( rightValue.toString() );
 			bool = this.operator.equals( "==" ) ? c == 0 :
 			       this.operator.equals( "!=" ) ? c != 0 :
 			       this.operator.equals( ">=" ) ? c >= 0 :
Index: src/net/sourceforge/kolmafia/textui/parsetree/Value.java
===================================================================
--- src/net/sourceforge/kolmafia/textui/parsetree/Value.java	(revision 16177)
+++ src/net/sourceforge/kolmafia/textui/parsetree/Value.java	(working copy)
@@ -355,7 +355,7 @@
 
 		if ( this.contentString != null && it.contentString != null )
 		{
-			return this.contentString.compareToIgnoreCase( it.contentString );
+			return this.contentString.compareTo( it.contentString );
 		}
 
 		return -1;
 

Veracity

Developer
Staff member
You know, I haven't heard a peep out of anybody who actually publishes scripts: just from people interested in programming language design.

Perhaps I should just submit that patch into the wild and let us find out just how it affects currently released scripts.

Yea or nay?
 

fronobulax

Developer
Staff member
You know, I haven't heard a peep out of anybody who actually publishes scripts: just from people interested in programming language design.

Perhaps I should just submit that patch into the wild and let us find out just how it affects currently released scripts.

Yea or nay?

You have never had a bad idea, although it sometimes took me a while to recognize something was a good idea and that sometimes "different" was "not bad". :) That said I will grab your patch, apply it and run with it tomorrow and report back midday or later.
 

Crowther

Active member
You know, I haven't heard a peep out of anybody who actually publishes scripts: just from people interested in programming language design.

Perhaps I should just submit that patch into the wild and let us find out just how it affects currently released scripts.

Yea or nay?
I'm in both categories. I say "yea". Easy fax currently does this.
Code:
        if (contains_text(to_lower_case(has[source]), to_lower_case(s))) {
However, one of the things I want to do is have it search for an exact match first. As it is now if I did == it would match things with the wrong case, so rather than use one of the more complicated solutions, I decided to wait and see if this change is made, before I add an exact search. So as a script writer, I say "yea".

I also like the change from a design perspective. Consistency and symmetry are important.
 

heeheehee

Developer
Staff member
Currently we can technically do exact matches via something like the following:
Code:
boolean equals(string a, string b) {
    return a.contains_text(b) && b.contains_text(a);
}
 

xKiv

Active member
As someone who sometimes privately writes a script or 95 ...
- I don't think I have ever needed == to be case insensitive
- I think I have, at least once, been (unpleasantly) surprised by == being case insensitive

Depends. Do you want to kill syntactic mice or catch syntactic flies?

Mostly I don't want to poison my syntactic mafia.
 

fronobulax

Developer
Staff member
Ran with the patch in #11. I saw a message that said something about not having enough null. I did not copy it and it was in the gCLI and not the session log. I'm not sure what generated it and it might have been due to the recent changes with items and IDs and not case sensitivity in a script. I had a lot of trouble with Ed_Ascend but that could have been me and not the script reacting to the patch. The script does expect that all turns are burned under its control and I know I violated that.

But since == being case sensitive is A Good Thing, I'd say submit the patch and deal with the consequences which look to be minor according to my limited experience.
 

Veracity

Developer
Staff member
Well, the purpsose of the exercise was to see if there would be widespread script problems with this change. I'm not sure if we answered that: it sounds like some scripts might have been reacting poorly, but, you know - they'll just need to change. I think it is time to release it into the wild and let scripts authors deal with the consequences. :)
 
Top