Feature - Rejected NaN support in ASH

fronobulax · Jul 7, 2012

See discussions here and here. See reasonable summary of IEEE NaN standard here.

The referenced change returned zero if a negative number was passed to sqrt which mimicked some aspects of a quiet NaN but is almost certainly not the "correct" solution for general purpose computing post 1985 or thereabouts.

A full blown implementation of NaNs with signaling NaNs, quiet NaNs, INF and exceptions sometimes thrown when created and otherwise thrown when used is probably overkill for ash and beyond the desires of most scripters.

I propose, for discussion, the following:

The ASH data type float be allowed to store a special value that is a NaN.
ASH functions will be written to test whether a float is a NaN.
Any computation that results in a NaN will return a NaN but otherwise continue execution.
Any computation that attempts to use a NaN as an operand will throw some kind of exception and script or command interpretation will stop.
The equivalent of float.toString(NaN) will display something reasonable but non-numeric.

The specific operations that will be "allowed" to generate a NaN are division by zero and square root of a negative number. (A quick look at the wiki suggests that ASH and expressions evaluated in ASH do not support the other operations that might logically generate NaNs).

It is believed that this may break some scripts but that is an acceptable consequence of preserving the integrity of ASH math.

Discuss, please.

slyz · Jul 7, 2012

If I understood correctly, the alternative is to treat sqrt(-1) like we currently treat division by zero: abort with an informative message.

Whether we use NaN or exceptions, scripters will have to adapt to handle or avoid these situations. Knowing nothing about floating point math, I wonder what the NaN approach brings that a simple exception doesn't?

Veracity said:
But you are not ending up with a negative number "by chance". You are ending up with such a number because you are not using floating point arithmetic correctly.

I guess the need to learn about floating point arithmetic is the underlying problem, in any case.

fronobulax · Jul 7, 2012

Throwing an exception stops execution of a script that does not check for negative numbers passed to sqrt before passing. NaN support allows the scripter to check the result afterwards and continue execution. It is admittedly an edge case but if a scripts computes sqrt(-1) and doesn't actually the result, the it can continue with NaN support but not with an exception.

Some of the motivation for NaN as opposed to exception seems to be fears about BatBrain. Both would break it but I infer from comments that NaN is preferred.

That said, I can certainly see just throwing another exception and letting things fall where they may until ASH supports other functions that would generate a NaN.

Veracity · Jul 7, 2012

Java throws an ArithmeticException if you try to divide by zero. ASH aborts with a runtime error.
Java returns a NaN if you take the sqrt of a negative number. ASH returns (returned) a NaN.

I'd say that ASH either throwing an exception or returning zero in the latter case is incompatible with Java. Personally, I prefer the former, since I think it is a coding error to attempt that.

As an experiment, I commented out the new "check arg for < 0" in square_root. This script:

Code:

boolean is_nan( float val )
{
    return val != val;
}

float val;

val = 1.5;
print( "val = " + val + " is_nan = " + is_nan( val ) );
val = square_root( -1 );
print( "val = " + val + " is_nan = " + is_nan( val ) );

yields:

> nan.ash

val = 1.5 is_nan = false
val = � is_nan = true

slyz · Jul 7, 2012

fronobulax said:
NaN support allows the scripter to check the result afterwards and continue execution.

A check for negative numbers before using sqrt() would have exactly the same result, right?

fronobulax said:
I infer from comments that NaN is preferred.

I infer that NaN is preferred only because it avoids an abort, but letting BatBrain carry on its calculations with a NaN probably won't fix it either. In either case, Z will have to account for the new floats.

I'm not advocating against NaN, I'm just trying to understand all the benefits it brings. I really do learn a lot about programming from the conversations here

Veracity · Jul 7, 2012

One other observation: I've seen comments about getting floating point crud via modifier_eval. Fun fact: although we have converted ASH arithmetic to use doubles, modifiers - and modifier_eval - still use floats.

Catch-22 · Jul 7, 2012

Veracity said:
One other observation: I've seen comments about getting floating point crud via modifier_eval. Fun fact: although we have converted ASH arithmetic to use doubles, modifiers - and modifier_eval - still use floats.

I noticed the internal use of floats instead of doubles, figured I would request that to be changed at a later date.

Another possible avenue of repair would be to pass the NaN through everything (as is already happening) until it gets to the SortBy which causes the original error, the sortBy can then hand back "Couldn't sort on line blahblah, value: NaN". That way people can see what's actually causing the error and fix it accordingly. Something that needs to be changed is the replacement unicode character being used instead of NaN because that confused me at first when I saw it.

Veracity · Jul 7, 2012

Revision 11263 changes modifiers and everywhere that uses them to use doubles rather than floats.

fronobulax · Jul 7, 2012

slyz said:
A check for negative numbers before using sqrt() would have exactly the same result, right?

Depends whether there are side effects when computing sqrt or not. Also depends upon whether the result of the calculation is actually used or not. On a practical basis, no difference, but I suspect something like

x = sqrt(-1)
...

If (x > 0)...

Would fail at the assignment for an exception and at the If statement for a NaN and so the relevance would be whether any interesting code took actions during the ...

Back in the days when the floating point coprocessor was a separate chip and real numerical geeks used the co-processor from Weitek and not Intel, it was common to write code that precomputed the results of several function calls before looking at any of the results. But the reasons to do that certainly do not apply to ash.

fronobulax · Jul 7, 2012

OK. Later today, or possibly tomorrow, I am going to change the code so sqrt(neg) throws an exception. Let me know if this is not a Good Idea or not an improvement over the current situation.

Catch-22 · Jul 7, 2012

fronobulax said:
OK. Later today, or possibly tomorrow, I am going to change the code so sqrt(neg) throws an exception. Let me know if this is not a Good Idea or not an improvement over the current situation.

Did you read my suggestion? sqrt(neg) isn't necessarily bad, it's the original report about sortBy that doesn't play nice with the unicode replacement character. This fixes no bugs, it just prevents some from having the opportunity to occur.

Veracity · Jul 7, 2012

I direct your attention to RuntimeLibrary.ash:

Code:

	public static Value square_root( Interpreter interpreter, final Value val )
	{
		double value = val.floatValue();
		if ( value < 0.0 )
		{
			throw interpreter.runtimeException( "Can't take square root of a negative value" );
		}
		return new Value( Math.sqrt( value ) );
	}

The ASH "square_root()" function already behaves like that. I assume you will be changing Expression, which is used by expression_eval() and modifier_eval(). I presume that you will need to make it throw an ArithmeticException for sqrt( -x) and that you will need to make those two ASH functions catch the ArithmeticException and throw a RuntimeException, passing up the reason and including the filename and linenumber.

I notice that Expression.eval() will blithely divide by zero for you, which will already throw an exception. Or maybe not; integers, yes, but perhaps floating point will return an infinity or something. In which case, both sqrt( -x) and (x / 0 ) in an expression should throw Arithmetic Expressions.

Veracity · Jul 7, 2012

Catch-22 said:
Did you read my suggestion? sqrt(neg) isn't necessarily bad, it's the original report about sortBy that doesn't play nice with the unicode replacement character. This fixes no bugs, it just prevents some from having the opportunity to occur.

Um.

So, the issue is that you are sorting something which needs to compare floats, and NaNs cannot be compared.
The Java function we call to print floats does not print NaNs nicely.
You believe that "sort by" on Values uses the printed representation for floats (pro tip: it does not).

So, you want to patch the symptom - once NaNs appear in data, we cannot compare or print them correctly - rather than address the root cause - why are we allowing NaNs to appear in data? You want a band-aid.

"This fixes no bugs" - in user scripts. Instead, it makes them visible, allowing the script writer to fix their bug.
"It prevents some from having the opportunity to occur" - you mean "it prevents some from manifesting later, far from the place where the bad data was generated."

Philosophically, I could not be farther from the position you just expressed.

Catch-22 · Jul 7, 2012

May I also throw something else into the mix here?

According to the wiki page for modifier_eval(): modifier_eval never generates errors - a malformed expression will just print a message, and return an arbitrary value.

This was entered by Bale not long after modifier_eval was added to KoLmafia. This has been the way all scripters who have read the wiki have come to believe this function will work.

Veracity · Jul 7, 2012

Times change.

Catch-22 · Jul 7, 2012

Veracity said:
You believe that "sort by" on Values uses the printed representation for floats (pro tip: it does not).

No, that's not what I meant, I probably could've been clearer. Printing the unicode character replacement is not helpful to the user, in the sortBy exception handler it would be useful if it said "hey buddy, you can't sort on NaN!"

Returning a NaN from sqrt is Java behaviour. NaN is still a valid float (float.intBitsToFloat(0x7fc00000)). The invalid behaviour is trying to sort on NaN.

Catch-22 · Jul 7, 2012

As far as I see it, there are only really two ways to go about "fixing" this issue. Either handle the exception in sortBy in a manner which provides useful feedback to the scripter, or disallow a NaN return value from internalEval entirely (not just sqrt).

I do firmly believe that modifer_eval is not broken it's behaving in exactly the way it was documented to behave. A malformed expression has been entered, and an arbitrary value has been returned. This only becomes an issue when you try to sort on that arbitrary value.

IMO the best avenue for repair? Return the NaN, print a message saying "Warning, NaN returned" (this is similar to if you did something like modifier_eval(")") it says, hey your expression is whack, but here's a number anyway).

If you try to sort on a NaN, regardless of where that NaN has come from, it should abort and say "invalid sort on line 324, couldn't sort element #32, value: NaN"

Veracity · Jul 7, 2012

Catch-22 said:
Returning a NaN from sqrt is Java behaviour.

ASH is not Java. It is implemented in Java and its operation is driven by that, but Java does not control what we present to the ASH script writers. I would venture to say that most ASH script writers are not trained in programming, much less Computer Science. I'd say that all the discussion we've had about floating point in which even experienced authors like zarqon and Winterbay simply Don't Get the implications of it makes that crystal clear.

Now, I see a few possible approaches, in general:

1) Make ASH a language in which you don't HAVE to understand fancy Computer Science and Numerical Analysis in order to solve problems which, in other languages (like Java) you do have to understand such concepts. The rejected BigDecimal proposal would fall in this camp.
2) Say that "ASH arithmetic is just like Java" and require script writers to deeply understand IEEE floating point arithmetic if they attempt to solve numerical problems.
3) Or something in the middle: Rather than providing +infinity as a "float" value, give a runtime error if you divide something by 0.0. Rather than providing NaN as a "float" value, give a runtime error if you generate such a value (currently, only via square_root). Give the user a chance to FIX THEIR SCRIPT.

Catch-22 said:
As far as I see it, there are only really two ways to go about "fixing" this issue. Either handle the exception in sortBy in a manner which provides useful feedback to the scripter, or disallow a NaN return value from internalEval entirely (not just sqrt).

Umm. So this is all about sort? Why not do both of the above?

I do firmly believe that modifer_eval is not broken it's behaving in exactly the way it was documented to behave.

Your point? We change modifier_eval, we change the documentation. I changed "^" to "**" a while ago. That was incompatible with the documentation. Should I have rejected it because of that reason? Hola moved "Breakfast" options to Automation/Login - and several people, following our "new user guide" have said they cannot find "Breakfast". Should he have not moved the options? (Should (somebody) fix the user guide? Yes.)

A malformed expression has been entered, and an arbitrary value has been returned. This only becomes an issue when you try to sort on that arbitrary value.

As a script author, wouldn't you prefer to fix the malformed expression so that it doesn't return an "arbitrary" value?

Or, perhaps, if you take sqrt( -anything ), how about if it returns 1234.5678? That's an "arbitrary" value. Surely your program won't mind. Huh. I sort of like that idea.

IMO the best avenue for repair? Return the NaN, print a message saying "Warning, NaN returned" (this is similar to if you did something like modifier_eval(")") it says, hey your expression is whack, but here's a number anyway).

"Printing" something is almost certainly the wrong "repair". It doesn't help the program itself do anything.

If you try to sort on a NaN, regardless of where that NaN has come from, it should abort and say "invalid sort on line 324, couldn't sort element #32, value: NaN"

Value.compareTo could, itself, throw an error, SortBy could catch it and abort with filename and line number. The "element #" is known only to the Java sort algorithm itself. I have no problem with this, but it is moot if we eliminate the (single) way which ASH can generate NaNs - in Expression.eval().

Catch-22 · Jul 7, 2012

Okay, so I just tried to sort on NaN and that's not even an issue, sort works fine on arrays filled with NaN. If you try to assign NaN to a map key though, you will get an NPE, but that's a different issue.

I believe I have also found another issue with BatBrain. As far as I can gather, the sort method modifies the object it is sorting on. This is a big no-no, as outlined in jasonharper's original documentation on sort.

Catch-22 · Jul 7, 2012

Veracity said:
Why not do both of the above?

I don't really have a problem if you do both, but if you do either one, the other becomes moot.

Here's the debug log for the NPE.

Generated like so: ash float NaN = modifier_eval("sqrt(-1)"); float[float] test; test[NaN] = 0.0; sort test by index;

The problem is that the sort won't work on indices with a value of NaN. Sort will, however, work on values with NaN (so if you did test[0.0] = NaN; in the above example, it would work fine).

Even though it's also currently possible to do so, I think it should be made that you can't have an index with a value of NaN.

I honestly don't think it's worth entering an abort state because of modifier_eval(). If you print a warning that NaN was returned in the modifier, then enter an abort state with useful information when you try to later on sort on an array/map indexed by that NaN, that's TWO messages the scripter is going to see.

If you are absolutely set on banishing NaN from ASH entirely, you'll need to change the way arithmetic works in ASH generally, not just in Expressions

> ash (-1**1.5) Returned: �

NaN is not really the enemy in all this.

Feature - Rejected NaN support in ASH

Developer

Developer

Developer

Developer

Developer

Developer

Active member

Developer

Developer

Developer

Active member

Developer

Developer

Active member

Developer

Active member

Active member

Developer

Active member

Active member

Attachments