Rational Number library

heeheehee · Jun 4, 2012

After reading up on the silly threads re: floating point precision, I threw together this script. I then forgot to publish it for a week or so. See attached.

(yes, all this provides is support for a rational number type, in the form of a record. Probably no clashes with other libraries, since I avoided defining an abs function)

Veracity · Jun 4, 2012

I love rationals. I enjoy them in Common Lisp. They work best if you have bignums, however, since the numerators and denominators can get large quickly. We now have "long" ints, which gives you some slack, but you'll still overflow, given enough operations. The following routine, for example:

Code:

Rational mul(Rational a, Rational b)
{
    return Rational(a.num * b.num, a.denom * b.denom);
}

... would be more robust if you detected overflow on those multiplications. Not clear what you would do other than aborting...

You have a typo on line 134: there is a stray "p" at the front of the line.

But, this is fun.

heeheehee · Jun 4, 2012

Ah yes, stray p's appear all the time as a result of clumsy fingers and navigation.

No explicit support for overflows at the moment; I am considering migrating those to use BigDecimal-equivalents eventually.

Will fix ASAP.

xKiv · Jun 4, 2012

That's a pretty concise implementation of GCD there!

heeheehee · Jun 5, 2012

I also mislabeled it as "Euler's" (it's actually Euclid's).

But yeah, I love that algorithm.

edit: also, Veracity -- another note: in the constructor of Rational, I actually reduce the fractions (hence the implementation of gcd). The numbers still get pretty large, but it's slightly less of an issue.

Catch-22 · Jun 5, 2012

Nice library heeheehee. I had been mulling over writing something similar ever since ints were switched over to longs.

There seems to be a few bugs in your pow function, here's what I did to fix it but I couldn't quite tell what you were trying to accomplish with if (power & 1) so I just assumed you meant ==.

Code:

Rational pow(Rational base, int power)
{
    if (power < 0) {
        // invert.
        return pow(invert(base), -power);
    } else if (power == 0) {
        return Rational(1, 1);
    } else if (power == 1) {
        // relies on integer division truncation.
        return mul(base, pow(mul(base, base), power / 2));
    } else {
        return pow(mul(base, base), power / 2);
    }
}

I also noticed that your GCD function doesn't return absolute values, which can be fixed like so:

Code:

int gcd(int a, int b)
{
    if (b == 0) {
        return (a < 0 ? -a : a);
    }
    return gcd(b, a % b);
}

heeheehee · Jun 5, 2012

What it should've been was power % 2 == 1, or perhaps (power & 1) == 1. Too used to languages where 1 evaluates to true.

Will reupload.

Also, the gcd function doesn't need to return absolute values, since denominators are (as enforced by the Rational constructor) positive; sign is carried by the numerators.

Catch-22 · Jun 6, 2012

There's 2 other bugs in pow that you missed in my fix.

PHP:

    } else if (power == 0) {
        return Rational(1, 1); //Need to return a rational.
    }

PHP:

    } else {
        return pow(mul(base, base), power / 2); //Need a semicolon.
    }

heeheehee said:
since denominators are (as enforced by the Rational constructor) positive

Ah, yes, I see what you did there.

Code:

    int neg = ((num < 0) ^ (denom < 0)) ? -1 : 1;
    if (num < 0) num = -num;
    if (denom < 0) denom = -denom;
    int g = gcd(num, denom);
    Rational r = new Rational(neg * num / g, denom / g);

Something tells me this could be accomplished with less instructions, unfortunately that something isn't telling me how.

heeheehee · Jun 6, 2012

Ah, right on. Didn't catch that edit.

Uploading again.

Regarding shorter code: for starters, num * denom < 0 would be shorter work, but in the event of overflow (which is quite possible until I start using BigNums, which I'm still writing -- arithmetic is a bit tricky!), we run into issues sooner. Actually, hmm.

Code:

int neg = 1;
if (num < 0) {
    neg = -1;
    num = -num;
}
if (denom < 0) {
    neg = -neg;
    denom = -denom;
}
... etc.

But then it's more verbose and legible. Also, so much mutation.

But yeah, uploaded version with stuff like `equals and `cmp`. Also took out caching, since, well, why bother?

Catch-22 · Jun 6, 2012

I don't know why, but I feel as though neg should be boolean.

Code:

boolean neg;
if (num < 0) {
    neg = ~neg;
    num = -num;
}
if (denom < 0) {
    neg = ~neg;
    denom = -denom;
}
int g = gcd(num, denom);
Rational r = new Rational((neg ? -num : num) / g, denom / g);

heeheehee · Jun 6, 2012

Perhaps; it doesn't actually change anything. I could've changed the name to "sign", if it really mattered. I think you're getting that vibe because that's effectively how I'm using it.

Also, wouldn't `!neg` be more intuitive? Bitwise NOT on 0... yields -1, not 1. Granted, this is not C, so that's also not really relevant.

Catch-22 · Jun 6, 2012

heeheehee said:
I think you're getting that vibe because that's effectively how I'm using it.

Eh, it just seems wasteful to store 1 bit of information in a 64 bit integer. I wouldn't mind so much if this weren't a maths library

Perhaps it's worthwhile profiling the two different functions and determining which one is faster.

heeheehee said:
Also, wouldn't `!neg` be more intuitive? Bitwise NOT on 0... yields -1, not 1. Granted, this is not C, so that's also not really relevant.

In ASH both operations yield the same result. To me, flipping the bit seems more intuitive (false becomes true, instead of not false). Performance-wise, in my tests, flipping consistently appears to be about 2% faster (for reasons I can't explain).

heeheehee · Jun 6, 2012

Multiplication by 1 or -1 is slower than a branch misprediction / negation? I don't know, seems unintuitive... a smart enough (JVM) compiler would optimize this away...

anyway, will update; doesn't really matter to me either way. This was more of a proof-of-concept library rather than a math library, but whatever.

xKiv · Jun 6, 2012

Make that "multiplication by a number (that can be -1 or 1 but the compiler can never optimize that away without knowing that no other thread can ever touch the variable) plus two comparisons and a xor" vs. "two (three?) comparisons and sometimes a binary-complement-negation-operator-or-two-or-three".
That "sometimes" can be important. If you are only working with positive numbers, you will only be doing the comparisons in variant B, but still always comparisons, xor *and* (general) multiplication in variant A.

Catch-22 · Jun 7, 2012

ASH is interpreted, so there's other performance overheads to consider. I don't know enough about the implementation of the ASH interpreter to tell you whether or not the Java interpreted ASH is then JIT compiled into Java bytecode or if the resulting Java is simply interpreted directly by the JVM (the behaviour may also depend on your JVM). The best way for you to see if an ASH solution is optimal or otherwise is to use the profile command built-in to KoLmafia.

xKiv · Jun 7, 2012

Catch-22 said:
Java interpreted ASH is then JIT compiled into Java bytecode or if the resulting Java is simply interpreted directly by the JVM

This is completely incompatible with my understanding of what "interpretted" means.
There's no resulting Java bytecode. The only java bytecode involved is the compiled code of the interpretter itself, which is (very simplistically) just a giant

Code:

program = array of instructions;
stillRunning = true;
ip=0;
while (stillRunning) {
 switch (program[ip].instruction) {
   case CALL_FUNCTION:
      call_function(program[ip]);
      break;
   case GOTO:
     ip = program[ip].target_ip;
     continue;
  ...
 }
 ip++;
}

Also, more operations just means more overhead, no matter how high the overhead is.

Catch-22 · Jun 7, 2012

Eh, what I'm trying to say about operations in ASH is that what you see isn't always what's happening behind the scenes:

Code:

float.to_int();

1 operation.

Code:

float.truncate();

1 operation.

Here's what actually happens with float.to_int();

PHP:

if ( value.getType().equals( DataTypes.TYPE_STRING ) )
{
	String string = value.toString();
	try
	{
		return new Value( StringUtilities.parseIntInternal1( string, true ) );
	}
	catch ( NumberFormatException e )
	{
	}
	
	// Try again with lax parsing
	
	try
	{
		int retval = StringUtilities.parseIntInternal2( string );
		Exception ex = interpreter.runtimeException( "The string \"" + string + "\" is not an integer; returning " + retval );
		RequestLogger.printLine( ex.getMessage() );
		return new Value( retval );
	}
	catch ( NumberFormatException e )
	{
		// Even with lax parsing, we failed.
		Exception ex = interpreter.runtimeException( "The string \"" + string + "\" does not look like an integer; returning 0" );
		RequestLogger.printLine( ex.getMessage() );
		return DataTypes.ZERO_VALUE;
	}
}

return new Value( value.intValue() );

and here's what actually happens with float.truncate();

PHP:

return new Value( (long) arg.floatValue() );

So, from this particular example you can see that, once interpreted, to_int has more overhead than truncate, even though in ASH they look like one operation. For converting a float to an integer, to_int might seem more intuitive but, if you were going for speed, truncate actually works out to be the better option as it doesn't check the DataType prior to conversion.

As for the way the interpreter works, I don't know. My (very limited) understanding of the JVM is that it can only execute Java Bytecode. JSR 223 came in with Java SE 6, whereas KoLmafia was originally written to be J2SE 1.4 compliant, so holatuwol must've employed some other method to handle ASH commands and meet the JVM requirements. I haven't read through the code on the interpreter and it's probably quite a kludge (due to poor support for dynamic languages in earlier versions of Java, not a shortcoming in holatuwol's abilities).

Veracity · Jun 7, 2012

Sigh.

Catch-22 said:
Here's what actually happens with float.to_int();

PHP:

if ( value.getType().equals( DataTypes.TYPE_STRING ) ) { // Irrelevant, since we are passed a TYPE_FLOAT, not a TYPE_STRING } return new Value( value.intValue() );

and here's what actually happens with float.truncate();

PHP:

public static Value truncate( Interpreter interpreter, final Value arg ) { return new Value( (long) arg.floatValue() ); }

So, the difference boils down to the difference between Value.intValue() and Value.floatValue() when value is a TYPE_FLOAT. Which is to say:

Code:

	public long intValue()
	{
		if ( this.type.equals( DataTypes.TYPE_FLOAT ) )
		{
			return (long) Double.longBitsToDouble( this.contentLong );
		}
		return this.contentLong;
	}

vs.

Code:

	public double floatValue()
	{
		if ( !this.type.equals( DataTypes.TYPE_FLOAT ) )
		{
			return (double) this.contentLong;
		}
		return Double.longBitsToDouble( this.contentLong );
	}

In either method, we do this:

- this.type.equals( DataTypes.TYPE_FLOAT )
- return Double.longBitsToDouble( this.contentLong )

One type.equals() comparison.
One call to Double.longBitsToDouble()

... which makes the final calculations, module type checks in this method vs. that method boil down to:

return new Value( (long) Double.longBitsToDouble( this.contentLong ) );

vs.

return new Value( (long) Double.longBitsToDouble( this.contentLong ) );

If you can see a difference there, you have better eyes than I do.

So, from this particular example you can see that, once interpreted, to_int has a much higher overhead than truncate,

Why "once interpreted"? When the interpreter executes the parse tree, it will call either to_int() or truncate(). The overhead to call either is a Java method call. Once inside one method or the other, the difference in overhead is this line in to_int():

Code:

		if ( value.getType().equals( DataTypes.TYPE_STRING ) )

That is "much higher" overhead? Really?

JSR 223 came in with Java SE 6, whereas KoLmafia was originally written to be J2SE 1.4 compliant,

What the heck is JSR 223?

so holatuwol must've employed some other method to handle ASH commands and meet the JVM requirements.

1) holatuwol did not write ASH.
2) ASH is a Java package which has:
- a hand-written compiler which reads ASH programs and converts them into a parse tree
- an interpreter which executes the parse tree
3) I have no idea what it means to "meet the JVM requirements" other than "is a valid Java program"

I haven't read through the code on the interpreter and it's probably quite a kludge (due to poor support for dynamic languages in earlier versions of Java, not a shortcoming in holatuwol's abilities).

Sheesh. Now YOU are pissing me off.

Catch-22 · Jun 8, 2012

Veracity, I did edit my post to be more clear "truncate actually works out to be the better option as it doesn't check the DataType prior to conversion", so I don't know if when you replied you were reading an older version of my post, but there was half an hour between my edit and your post.

JSR 232 was a specification request for better scripting support within Java.

1) My bad.
2) Ok
3) Ed Ort puts things pretty aptly when he writes "the Java virtual machine knows nothing of the Java programming language", so by "meeting JVM requirements" I mean "is valid Java Bytecode".

As I tried to make clear from the outset, I have a limited understanding of how ASH becomes bytecode to be executed by the JVM and what compile time optimizations, if any, get performed in the process.

I don't know why you're pissed off, don't take offence to use of the word "kludge". Java, by their own admission, had terrible support for scripting languages in earlier versions.

fronobulax · Jun 8, 2012

Catch-22 said:
how ASH becomes bytecode to be executed by the JVM and what compile time optimizations, if any, get performed in the process.

Misunderstanding or language problem. ASH does not become bytecode in most normal uses of the term. If you can construct a mental model where a string that is user input at run-time is converted to bytecode then you could apply the same model to ASH but for the types of analyses you seem to be trying to produce, you need to view ASH as data, not code. Since ASH is interpreted at run time and that process generates data and not Java bytecode you're probably not going to get anyone who understands what you are trying to say until you start making a distinction between Java byte code, data that byte code operates on, and the process by which ASH is interpreted into data that, in turn, is operated on by byte code.

Tangentially, I doubt that KoLmafia will ever embrace any Java scripting options (such as BeanShell) just because the opportunity for misunderstanding and mischief is so much greater.

Rational Number library

Developer

Attachments

Developer

Developer

Active member

Developer

Active member

Developer

Active member

Developer

Active member

Developer

Active member

Developer

Active member

Active member

Active member

Active member

Developer

Active member

Developer