Feature - Rejected ASH can't handle large numbers

zarqon · Sep 9, 2011

Found what I thought was a bug in ZLib's comma-adding function, but it turned out to be a limitation of ASH's truncate() function:

> ash truncate(123451.33);

Returned: 123451

> ash truncate(1234512.33);

Returned: 1234512

> ash truncate(12345123.33);

Returned: 12345123

> ash truncate(123451234.33);

Returned: 123451232

> ash truncate(1234512345.33);

Returned: 1234512384

> ash truncate(12345123456.33);

Returned: 2147483647

> ash truncate(123451234567.33);

Returned: 2147483647

Looks like it caps at 2,147,483,647... what an unusual number for an upper limit, eh? I tested and all of ASH's float-handling functions ("ashref float") have the same issue. Do we need to try to work within these limits or could mafia possibly use BigDecimals or whatever would be necessary to work with a larger range of numbers?

Veracity · Sep 9, 2011

ASH uses the Java "int" and "float" data types for its own ints and floats.

zarqon said:
Looks like it caps at 2,147,483,647... what an unusual number for an upper limit, eh?

2**31 -1 is the usual positive upper bound for a 32 bit signed integer. It is exactly what I would expect.

Do we need to try to work within these limits?

Yes. I'm going to pre-emptively reject your Feature Request. It would be a huge and bug-prone job to try to shift to data types with larger ranges.
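
A minimal Java sketch of the behavior in the first post, assuming (as stated above) that ASH's truncate() ends up narrowing a Java float to a Java int. The class and method names are illustrative only, not KoLmafia's actual code. In Java, a float-to-int cast drops the fraction and clamps anything too large to Integer.MAX_VALUE, which would account for both the slightly-off results and the cap at 2147483647:

```java
class TruncateDemo {
    // Hypothetical stand-in for ASH's truncate(): narrow a 32-bit float to a 32-bit int.
    static int truncate(float value) {
        // The cast drops the fraction; values too large for an int clamp to Integer.MAX_VALUE.
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println(truncate(12345123.33f));    // 12345123
        System.out.println(truncate(123451234.33f));   // 123451232
        System.out.println(truncate(1234512345.33f));  // 1234512384
        System.out.println(truncate(12345123456.33f)); // 2147483647
    }
}
```

The second and third results are already wrong before the cast: the float literal itself cannot hold that many significant bits.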

Veracity · Sep 9, 2011

Following up to myself:

You originally had this labeled as a "Bug". It is in no way a Bug. It is how binary computer arithmetic works. I changed it into a Feature request - and rejected it.

Your title claims that ASH cannot handle "large" numbers. I consider that misleading, since I think 2 billion is a "large" number. Can you give a real ASH application that needs integers with more precision than that? I can't. If you can, then that'd help your case for the feature - although I would probably still reject it.

Increasing the size of ASH ints and floats would increase the memory usage and slow the execution of all ASH arithmetic - not just the (extremely rare, I expect) cases when 32 bits of precision is insufficient. In fact, in the very early days of ASH, "floats" were stored internally as Java "doubles" - and holatuwol changed them all to Java "floats".

Or perhaps you want additional arithmetic data types - a "long" and a "double", say. That would geometrically increase the complexity of ASH's Operator class; we would not have to worry about simply mixing int and floats - 2 data types for the Left operand and 2 for the Right operand gives 4 cases - but would have to do the right thing with 4 data types on the left and 4 on the right - 16 cases.

zarqon · Sep 9, 2011

Veracity said:
2**31 -1 is the usual positive upper bound for a 32 bit signed integer. It is exactly what I would expect.

You are perfectly correct to interpret me literally since I didn't include a winky face afterwards.

Veracity said:
Your title claims that ASH cannot handle "large" numbers. I consider that misleading, since I think 2 billion is a "large" number.

I don't consider the thread title misleading. Note in the above example, truncate() starts delivering wrong answers with numbers as small as 123451234.33. That's a number which is probably smaller than the net worth of a lot of players running a script like networth.ash. I understand if it's something we have to live with, but it is a problem which inventory-handling scripts, display case management scripts, etc. may bump up against more regularly than you might think.

Veracity · Sep 9, 2011

You want positive numbers bigger than 2**31? Keep them as floats. You will not be able to have more than 32 bits of precision, but, so it goes.
Regarding Meat: on Dec 18, 2007, in this post, Ludwig von Miser (Bank of KoL) says:

Minor bug here:

My V11.9 Mafia keeps giving me unexpected errors when I have lots of meat on-hand (above the two billion mark). It says I have no shop and no DC when I attempt to access them. Also, it displays my meat as 0 and therefore does not allow me to purchase anything in the mall. I have to toss my funds into the closet before it operates normally.

Thanks for your attention

to which fronobulax replied

If you would transfer a couple billion meat to the dev team or an alpha tester I'm sure we could track this down for you

Seriously, I did note this at the alpha tester's forum so at least people know about it and the subsequent speculation that it is due to using ints instead of longs. Kind of tempted to try to verify and fix this one myself.

It doesn't look like frono - or any other KoLmafia dev - has found the motivation to change the internal storage and display of Meat. That would be a bug-prone change that is certainly untestable by any of us. It would also require the huge and error-prone change to ASH that you are requesting in this thread, since there is an ASH function that returns your amount of Meat.

zarqon · Sep 13, 2011

Yes, 2 billion is an acceptable upper limit, unlikely to be encountered by the vast majority of scripting operations. However, the point of this report was to show that the issue starts manifesting with much smaller numbers -- more testing shows that the problem starts with any number over 100 million (likely due to internal use of scientific notation), which is a number scripters will definitely bump up against.

It doesn't have a huge impact on anything I'm working on, but I'd wanted to bring it to the devs' attention since when working with numbers over 100M, the results will be unexpectedly imprecise. For example, max(100000100.0, 100000000.0) returns 100000096, which is neither of the numbers I entered. Where did the -4 come from? That's what led me to report it as a bug. If you think this behavior is fine, then it's fine -- with numbers that high, differences +/- 10 are not going to make much of a difference. However, it should be noted in the Wiki that floats with absolute values greater than 100 million are frequently changed from their literal value by a small amount.

Veracity · Sep 13, 2011

I said it before, but it bears repeating: that is how binary computer arithmetic works. ASH uses Java "int" and "float" types for its "int" and "float" types. Both of those are 32-bit numbers. I have assumed that people using a "float" would understand what a floating point number was. This thread proves that I am wrong.

An int is a 32-bit two's-complement signed integer with a range of -2**31 to 2**31-1
A float is an IEEE 754 32-bit float with 8 bits of exponent (2**-126 to 2**127) and 24 bits of precision

You do not understand what that means. You have proven that repeatedly in this thread - most recently with your statement that "floats ... are frequently changed from their literal value by a small amount." You use the word "float" and the phrase "literal value" in the same sentence. I've never heard the phrase "literal value" - which is a red flag. In context, you appear to mean "printed representation". You appear to imagine that "123.456" is the "literal value" for a "float" and that any such "literal value" can and should be storable.

Once again, that is not how binary computer arithmetic works. We do not store numbers - floats or any other kind - as strings of characters or other "literal values". We store them in a binary representation with a particular precision and range. All computers and (almost all) computer languages do that. Common Lisp has bignums, which are arbitrary precision integers, and rationals, which are a numerator and denominator of such, but I am not aware of any other computer language with built-in arbitrary precision arithmetic, integer or floating point. (I said "I am not aware" of any. I did not say "There are none". I also said "built in" - as opposed to "in a library". People do not need to chime in with "corrections".)

But, don't feel alone. Neither do most ASH programmers, I'd wager. Only those who are trained as computer programmers in (relatively) low-level languages like C, probably, understand or care about the actual data representation of data types.

Does the Wiki need a "lesson" on the data representation of its data types? Maybe. That'd be better than "noting in the Wiki" that "floats with absolute values greater than 100 million are frequently changed from their literal value by a small amount". I'd prefer to simply note that an integer has 32 bits of precision, a float has 24 bits of precision, and that converting an integer with more than 24 bits of precision to a float will lose the extra bits.

> ash ( 2 ** 24 )

Returned: 16777216

> ash ( 2 ** 24 + 1 )

Returned: 16777217

> ash ( to_float( 2 ** 24 ) == to_float( 2 ** 24 + 1 ) )

Returned: true
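
The same limits can be checked from Java, since ASH borrows its types directly. This is only a sketch: the class and helper names are made up for illustration, and the calls used (Math.ulp, Float.MIN_EXPONENT, Float.MAX_EXPONENT) are standard library members. It also reproduces the max() example from earlier in the thread: near 100 million, adjacent floats are 8 apart, so 100000100 cannot be stored and lands on 100000096.

```java
class TypeLimits {
    // Gap between a float and the next larger representable float.
    static float spacingAt(float value) {
        return Math.ulp(value);
    }

    public static void main(String[] args) {
        System.out.println(Integer.MIN_VALUE);  // -2147483648, i.e. -2**31
        System.out.println(Integer.MAX_VALUE);  // 2147483647, i.e. 2**31 - 1
        System.out.println(Float.MIN_EXPONENT); // -126
        System.out.println(Float.MAX_EXPONENT); // 127

        System.out.println(spacingAt(1.0e8f));  // 8.0

        // Neither argument is changed by max(); the first literal was already
        // rounded to a representable float when it was parsed.
        System.out.println((int) Math.max(100000100.0f, 100000000.0f)); // 100000096
    }
}
```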

xKiv · Sep 13, 2011

Veracity said:
A float is an IEEE 754 32-bit float with 8 bits of exponent (2**-126 to 2**127) and 24 bits of precision

Which means that once you go over 16M, you have worse precision than 1 ...
could we get doubles[1] here at least? (I mean across-the-board substitution)
I have a faint, vague recollection that there are some issues with how the values are aligned in memory, the arithmetic is done in the same FP registers and the results then truncated, etc, and other stuff (depends on HW architecture, of course) ... so I wouldn't be sure that using floats has *any* advantage over using doubles, ever, unless you explicitly *want* less precision.
But I might be warped by working on an application that requires precision specified in number of decimal places ...
(and the old-version pieces of code that use float end up causing endless headaches)

"literal value"

I would say that means "the value of the literal used to populate the value" (we do have literals in ASH, right?)

For the educamafation and reference of masses:
[1] Doubles are 52 bits of (stored) mantissa, that should cover whole numbers up to roughly 9*(10**15) [2] ... still not an unimaginably awesome and uselessly huge number, but a significant upgrade
[2] 52 bits of the number are stored, but it actually has 53 bits - the first bit is always 1, unless the number is zero, so storing it would be wasteful
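
A short Java sketch of where that limit sits, using a hypothetical helper that round-trips a long through a double. With 53 effective bits, every whole number up to 2**53 (about 9 * 10**15) survives, and 2**53 + 1 is the first one that does not:

```java
class DoublePrecision {
    // True if the long comes back unchanged after a round trip through a double.
    static boolean survivesDouble(long n) {
        return (long) (double) n == n;
    }

    public static void main(String[] args) {
        long limit = 1L << 53;                             // 9007199254740992
        System.out.println(limit);
        System.out.println(survivesDouble(limit));         // true
        System.out.println(survivesDouble(limit + 1));     // false: needs a 54th bit
        System.out.println(survivesDouble(123451234567L)); // true
    }
}
```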

Veracity · Sep 13, 2011

xKiv said:
could we get doubles[1] here at least? (I mean across-the-board substitution)

Some history:

r407 | veracity0 | 2006-05-10 09:45:27 -0500 (Wed, 10 May 2006) | 3 lines
Changed paths:
M /src/net/sourceforge/kolmafia/KoLmafiaASH.java

Since intermediate calculations are done in doubles, store float values as
doubles. Print float values as floating point, not rounded to integers.

On May 10, 2006, I made the ASH "float" data type internally be a Java "double".

r1399 | shwei | 2006-09-02 22:03:59 -0500 (Sat, 02 Sep 2006) | 4 lines
Changed paths:
...
M /src/net/sourceforge/kolmafia/KoLmafiaASH.java
...

Make StaticEntity.client private
All data is written/read with UTF-8
Centralize integer/float property parsing
Use single-precision floating point arithmetic

On September 2, 2006, holatuwol reverted my earlier change; he made them into Java "floats" again.

I throw up my hands. I'm not willing to go there again. Perhaps hola will say something.

slyz · Sep 13, 2011

Now that we know how ASH ints and floats work, maybe scripters can use alternative methods, like coding large numbers over several ints and/or floats?
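
One way that idea could look, sketched in Java rather than ASH and using only 32-bit ints. Everything here is hypothetical: the {hi, lo} pair layout, the base of one billion, and the helper names are choices made for the example. It handles non-negative values only and assumes the high parts themselves never overflow.

```java
class SplitNumber {
    static final int BASE = 1_000_000_000; // each part holds nine decimal digits

    // Adds two non-negative numbers stored as {hi, lo}, where value = hi * BASE + lo.
    static int[] add(int[] a, int[] b) {
        int lo = a[1] + b[1];         // at most 1,999,999,998, which still fits in an int
        int carry = lo / BASE;
        lo %= BASE;
        int hi = a[0] + b[0] + carry; // assumption: the high parts do not overflow
        return new int[] { hi, lo };
    }

    // Renders a {hi, lo} pair as decimal digits, zero-padding the low part when needed.
    static String show(int[] n) {
        return n[0] == 0 ? Integer.toString(n[1]) : n[0] + String.format("%09d", n[1]);
    }

    public static void main(String[] args) {
        int[] a = { 12, 345_123_456 }; // 12,345,123,456
        int[] b = { 0, 900_000_000 };  // 900,000,000
        System.out.println(show(add(a, b))); // 13245123456
    }
}
```

Addition is the easy case; multiplication and division over split values get complicated quickly.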

xKiv · Sep 13, 2011

Huh.
Well, switching to doubles wouldn't really be a "fix" anyway, just a "shove the problem under the carpet" kludge.

Hypothetically, there could be an API for arithmetic using BigDecimal under the hood and exposing the values as strings, but that would be seriously slow, hard to read/write the code, and just plain ... unclean?
(even worse: exposing the values as *handles* ...

Code:

int dividend;
int divisor;
int result;
dividend=horrible_number("1234569763.123445875"); # is now "the first horrible number" so dividend == 1
divisor=horrible_number("876535.43285"); # is now "the second horrible number" so divisor == 2
result = horrible_division(dividend,divisor, 2); # is "the third horrible number" so result == 3
print(horrible_to_string(result)); # prints "1408.47"

(where's a puking "smiley" when I need one?)

slyz · Sep 13, 2011

I'm waiting for horrible.ash then

xKiv · Sep 14, 2011

That would have to be mafia-side functions. Implementing arbitrary precision decimal arithmetic in ASH? No, please.

I mean, you *could* do it with strings or arrays, but it would be unnecessarily difficult, complex, reinventing wheel, error-prone, slow, memory-hungry, ... I don't want to do it that way.

heeheehee · Sep 16, 2011

And then of course you don't have clean operator overloading, so you'd have to be all like "bigDecimal.add(otherDecimal)" if you ever wanted to add these.
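
For reference, this is roughly what that method-call style looks like with Java's own java.math.BigDecimal, using the numbers from the horrible_number example above. The wrapper class and its divide helper are illustrative only; nothing like this is exposed to ASH in this thread.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class BigDecimalDemo {
    // Divides two decimal strings exactly, rounding the quotient to the given number of places.
    static BigDecimal divide(String dividend, String divisor, int scale) {
        return new BigDecimal(dividend).divide(new BigDecimal(divisor), scale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println(divide("1234569763.123445875", "876535.43285", 2)); // 1408.47

        // No operator overloading: addition is a method call too.
        BigDecimal total = new BigDecimal("12345123456.33").add(new BigDecimal("0.67"));
        System.out.println(total); // 12345123457.00
    }
}
```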

holatuwol · Sep 17, 2011

xKiv said:
But I might be warped by working on an application that requires precision specified in number of decimal places ... (and the old-version pieces of code that use float end up causing endless headaches)

That's understandable.

After working with doubles in school projects, I decided that whenever I was given a choice, I'd refuse to use 64-bit numbers when I knew the program would be running on 32-bit architectures OR 32-bit Java runtime environments.

The reason for this is that results that were very clearly wrong came out when we stored things in 64-bit doubles, and they could only be resolved if we switched to 32-bit floats. I ultimately decided that the wonky way in which 64-bit numbers were emulated in 32-bit Java did not go well with how 64-bit numbers would work on the actual 32-bit hardware.

Nowadays, 64-bit architectures are actually much more commonplace and ALUs have probably gotten orders of magnitude more sophisticated in handling them, so I imagine the bizarre results that I saw years ago in my undergraduate days are no longer happening.

fronobulax · Sep 17, 2011

holatuwol said:
Nowadays, 64-bit architectures are actually much more commonplace and ALUs have probably gotten orders of magnitude more sophisticated in handling them, so I imagine the bizarre results that I saw years ago in my undergraduate days are no longer happening.

That's probably true but the argument still holds as long as Java 1.4 is the supported target environment and KoLmafia doesn't work well with 1.7.

The experience that scarred me for life was finding a genuine bug in an optimizing compiler as a grad student. As a result I tend to distrust compiler options, and confirm that "debug" and "production" compilations both yield code that produces the same results.

xKiv · Sep 17, 2011

holatuwol said:
The reason for this is that results that were very clearly wrong came out when we stored things in 64-bit doubles

Curious ... remember any examples?

holatuwol · Sep 17, 2011

xKiv said:
Curious ... remember any examples?

Unfortunately not, or I'd test them now to see if there are still problems with those examples.

Veracity · Jun 4, 2012

We rejected this, originally, but now ASH uses longs and doubles. Is it still the case that ASH "can't handle" large numbers?

xKiv · Jun 4, 2012

Well, the original examples are no longer large; that's now more than ...

Code:

> ash truncate(12345678991231345230984562058087634583496.33);

Returned: 9223372036854775807
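
The new ceiling is consistent with a Java double being narrowed to a Java long. A sketch under that assumption, with a hypothetical truncate helper standing in for the ASH function: the original examples now come out exact, and only values beyond Long.MAX_VALUE are clamped.

```java
class LongTruncate {
    // Hypothetical stand-in for the newer truncate(): narrow a 64-bit double to a 64-bit long.
    static long truncate(double value) {
        // The cast drops the fraction; values too large for a long clamp to Long.MAX_VALUE.
        return (long) value;
    }

    public static void main(String[] args) {
        System.out.println(truncate(123451234.33));   // 123451234
        System.out.println(truncate(12345123456.33)); // 12345123456
        System.out.println(truncate(12345678991231345230984562058087634583496.33)); // 9223372036854775807
    }
}
```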
