Linear Regression

Veracity

Developer
Staff member
If you know what this is, you know whether you need it.

Suppose you have a data set for a function, mapping X to Y.
It is "linear" if the formula for the function is a "line":

y = B * x + A

Suppose you have a set of "points" - (x,y) - for which the "y" value is one possible output from the "x" input.
In pure math, there is no fuzz; a given x always maps to exactly one y.
In the real world, y may vary.

We see something like this in KoL. A monster may have "X" attack, but the game throws in randomness, so you may see anything from X-5 to X+5.
There may be a triangular distribution or a uniform distribution, but, whatever. The value varies.

We also see this in functions in KoL: given input X, the resulting Y may vary from y-10 to y+10, say.
Because KoL loves throwing in randomness.

If you want to figure out the linear formula - i.e. the A and B constants - given a set of (X,Y) values, Linear Regression is your friend.

I have two little ASH libraries you can use to do this. The base API:

Code:
record Point
{
    float x;
    float y;
};

record Coefficients
{
    // best-fit line: y = B x + A
    float intercept;    // intercept A of the best-fit line
    float slope;        // slope B of the best-fit line
};

Coefficients linear_regression(Point[] points);
string to_string(Coefficients c);
float predict(int x, Coefficients c);

Ezandora used the "least squares" algorithm to calculate this in BastilleRelay.ash. It's O(N) - it makes one pass through the data - to calculate A&B. I extracted it, inserted it into this API, and fixed several bugs.

Use
Code:
import <SimpleLinearRegression.ash>;
and there you are.
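
For readers who want to see what a one-pass least squares fit looks like, here is a minimal sketch. It is written in Python rather than ASH so it can be run outside KoLmafia, and it is not the code from BastilleRelay.ash or SimpleLinearRegression.ash; it only mirrors the names in the API above (Point, Coefficients, linear_regression, predict) and uses the textbook running-sum formulas.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


@dataclass
class Coefficients:
    intercept: float  # A in y = B * x + A
    slope: float      # B in y = B * x + A


def linear_regression(points):
    """Least squares fit in a single pass, using running sums."""
    n = 0
    sum_x = sum_y = sum_xx = sum_xy = 0.0
    for p in points:
        n += 1
        sum_x += p.x
        sum_y += p.y
        sum_xx += p.x * p.x
        sum_xy += p.x * p.y
    denom = n * sum_xx - sum_x * sum_x
    if n < 2 or denom == 0:
        raise ValueError("need at least two points with distinct x values")
    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_y - slope * sum_x) / n
    return Coefficients(intercept, slope)


def predict(x, c):
    """Evaluate the fitted line at x."""
    return c.slope * x + c.intercept


# Four points that lie exactly on y = 2 * x + 3.
c = linear_regression([Point(0, 3), Point(1, 5), Point(2, 7), Point(3, 9)])
print(c, predict(4, c))
```

The running sums are what make a single pass possible: nothing about an individual point needs to be remembered once it has been added in.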

I also found a Java implementation of the "least squares" algorithm on the Web, from Princeton, written to accompany a textbook. I translated it from Java to ASH (which was trivial) and wrapped it in this API. This is also O(N), although it makes two passes through the data. The first calculates A and B. The second calculates fit statistics, including R² (the coefficient of determination), a number from 0.0 to 1.0 that tells you how well the line "fits" your data. This implementation provides this:

Code:
record Coefficients
{
    // best-fit line: y = β x + α
    float intercept;    // intercept α of the best-fit line
    float slope;    // slope β of the best-fit line
    float r;        // coefficient of determination
    float svar0;    // intercept standard error
    float svar1;    // slope standard error
};

and the "to_string()" method also includes R².

Code:
import <LinearRegression.ash>;
to get this.
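
To make the two-pass idea concrete, here is a hedged sketch in Python (not ASH, and not a copy of the Princeton code or of LinearRegression.ash). The function name fit_with_r2 is made up for illustration. The first pass fits A and B; the second pass compares the residuals to the spread of y around its mean to get R². It does not attempt the standard errors.

```python
def fit_with_r2(points):
    """points: list of (x, y) pairs. Returns (intercept, slope, r2)."""
    n = len(points)

    # Pass 1: running sums give the slope B and intercept A.
    sum_x = sum_y = sum_xx = sum_xy = 0.0
    for x, y in points:
        sum_x += x
        sum_y += y
        sum_xx += x * x
        sum_xy += x * y
    denom = n * sum_xx - sum_x * sum_x
    if n < 2 or denom == 0:
        raise ValueError("need at least two points with distinct x values")
    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_y - slope * sum_x) / n

    # Pass 2: residual sum of squares versus total sum of squares.
    y_mean = sum_y / n
    ss_res = ss_tot = 0.0
    for x, y in points:
        ss_res += (y - (slope * x + intercept)) ** 2
        ss_tot += (y - y_mean) ** 2
    # If every y is identical, the fitted line is exact by construction.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return intercept, slope, r2


exact = [(0, 3), (1, 5), (2, 7), (3, 9)]    # exactly y = 2 * x + 3
noisy = [(0, 2), (1, 6), (2, 6), (3, 10)]   # same trend, with some fuzz
print(fit_with_r2(exact))
print(fit_with_r2(noisy))
```

With the exact points R² comes out as 1.0; the fuzzier set gives a lower value, which is the signal to watch as a data set grows.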

So. If you want to actually process your data at runtime, use "SimpleLinearRegression". If you want to keep building up your data set and see how well your data is "fitting" (i.e., R² is approaching 1.0), with the intent of building the actual formulas into your program (O(1), no regression at runtime), use "LinearRegression".

The former was created by Ezandora, refined and bug-fixed by me, and has no license.
The latter is under GPL - translated from Java to ASH - which still counts as a "derivative" work - so you have to provide your source code when you distribute it. Since you are publishing an ASH script, not a binary, duh. Not a problem.

I'm using this for a project I am working on, in which I am highly optimistic about being able to understand some KoL internals based on user-observed behavior.
 

Attachments

  • SimpleLinearRegression.ash (1.2 KB)
  • LinearRegression.ash (3.5 KB)

Veracity

I'll tell you a little about my project and how I need this.

I'm working to derive the internals for the Bastille Battalion game.
For that game, you have 6 stats.
You can use game rounds to work on offense, work on defense, or hunt cheese.
(Total cheese is the goal you are scored on.)
(You can use 1-round potions to improve your stats - 3 potions, each of which improves 2 stats.)

Things I'd like to derive:

- You start with stats. What values? There are "needles" on the console which shift when you select a configuration or add/subtract to individual stats. Based on a pixel offset, starting configurations will give you from 0-8 for each stat - although the needles can go below 0 or above 8.
- Potions improve stats - and you can see it in battles - but do not affect the needles. How much do they improve stats?
- 12 of the 16 cheese-seeking rounds scale according to stats: either favoring high or low values.
- You have up to 5 increasingly difficult battles selected from 6 kinds of enemy castle. They each have strengths and weaknesses. What are their stats, and how do they improve in later battles?

KoLmafia now collects (and saves, if you set the property to make it do so) details about every battle and every cheese.
Things I'd like to figure out:

- What are your initial stats? Cheese encounters that scale with them might help.
- What are potion effects? If they affect cheese encounters, I can derive that. If not, analyzing battles will do it.
- What are enemy castle stats vis a vis yours? Analysis of battles.
- Assuming the cheese round scaling is linear, what are the formulae?
- Does the type of enemy castle determine how much cheese you get? How much cheese DO you get?

I've been collecting battle data for 48 days - and have 2439 data points.
I've been collecting cheese data for 4 days - and have 960 data points.
That includes cheese from battles, which I have only been collecting for 2 days.

Yesterday, I wrote a program to crunch cheese data. I need a lot more data, but preliminary observations:

- The cheese hunting rounds look linear. My "x" values are integers from 0-8 - the "bonuses" displayed by the needles. Each formula has a y-intercept - the expected cheese when x is 0 - and a slope - positive if higher stats are better, negative if lower stats are better.
- The formulas don't look especially "round" - contrary to KoL expectations ;) - but they are based on stat bonuses, not stats.
- The x-intercept of each formula is about -15 for positive slopes and +15 for negative slopes.
=> Perhaps the stats all start at 15? I'll try deriving formulae with that expectation and see what the formulae look like.
- There does not appear to be any difference with or without potions. Apparently, those only affect battles.
- Castle type does not affect the reward. It appears to be about 45 per castle level, or so.
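
The step from the x-intercept observation to "perhaps the stats all start at 15" is just algebra on the fitted line, which can be spelled out as follows. Here x is the needle bonus, A and B are the fitted intercept and slope, and x_0 is the x-intercept; the value 15 is only the approximate figure observed above, not a confirmed game constant.

```latex
y = Bx + A, \qquad x_0 = -\frac{A}{B}

x_0 \approx -15 \ (B > 0) \;\Longrightarrow\; A \approx 15B
    \;\Longrightarrow\; y \approx B\,(x + 15)

x_0 \approx +15 \ (B < 0) \;\Longrightarrow\; A \approx -15B
    \;\Longrightarrow\; y \approx |B|\,(15 - x)
```

So if the hypothesis holds, refitting against (bonus + 15) for the positive-slope rounds, or (15 - bonus) for the negative-slope rounds, should give lines with an intercept near zero.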

Basically, I need a lot more data. Two of my multis recently bought the IOTM in the mall - for less than the price of a Mr. A. That's why my "spading multis" spend their time farming Meat.

I will not share my dataset, but I will show you what it looks like.

Here is a Game in which the character spent every "prep" round Working on Offense and ended up beating all 5 enemy castles.

Code:
20220528.116904.20.3    3    frenchcastle    NONE    1    false    56
20220528.116904.20.5    5    Trade soldiers for cheese    NONE    0    false    18
20220528.116904.20.6    6    bigcastle    NONE    2    false    103
20220528.116904.20.8    8    Levy the tax    NONE    0    false    20
20220528.116904.20.9    9    barracks    NONE    3    false    128
20220528.116904.20.12    12    masterofnone    NONE    4    false    175
20220528.116904.20.15    15    masterofnone    NONE    5    false    240

Cheese total was only 740, which is not especially good, but I wanted to collect battle data.
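
As a sanity check on the record layout, here is a small Python sketch that parses the lines above and adds up the last column. The field names (key, turn, name, stat, value, flag, cheese) are guesses made up for illustration and are not taken from KoLmafia; the only thing the sketch relies on is that each record has seven fields separated by tabs or runs of spaces and that the last one is the cheese gained.

```python
import re

# The seven records from the game shown above, copied as posted.
GAME_LOG = """\
20220528.116904.20.3    3    frenchcastle    NONE    1    false    56
20220528.116904.20.5    5    Trade soldiers for cheese    NONE    0    false    18
20220528.116904.20.6    6    bigcastle    NONE    2    false    103
20220528.116904.20.8    8    Levy the tax    NONE    0    false    20
20220528.116904.20.9    9    barracks    NONE    3    false    128
20220528.116904.20.12    12    masterofnone    NONE    4    false    175
20220528.116904.20.15    15    masterofnone    NONE    5    false    240
"""


def parse_line(line):
    """Split one record on tabs or runs of 2+ spaces (names contain single spaces)."""
    fields = [f.strip() for f in re.split(r"\t| {2,}", line.strip())]
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    key, turn, name, stat, value, flag, cheese = fields
    # Field names here are hypothetical labels, not KoLmafia's own.
    return {
        "key": key,
        "turn": int(turn),
        "name": name,
        "stat": stat,
        "value": int(value),
        "flag": flag == "true",
        "cheese": int(cheese),
    }


records = [parse_line(line) for line in GAME_LOG.splitlines() if line.strip()]
total_cheese = sum(r["cheese"] for r in records)
print(len(records), total_cheese)
```

The total of the last column matches the 740 quoted for this game.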

Here is a Game in which the character spent every "prep" round Looking for Cheese and only beat 3 enemy castles.

Code:
20220528.121572.6.1    1    Submit embarrassing catapult photos    CA    4    true    76
20220528.121572.6.2    2    Try the wall thing    CD    7    true    148
20220528.121572.6.3    3    shieldmaster    NONE    1    false    51
20220528.121572.6.4    4    Have the cheese contest    PD    5    true    127
20220528.121572.6.5    5    Use the wishing well    NONE    0    false    0
20220528.121572.6.6    6    berserker    NONE    2    false    95
20220528.121572.6.7    7    Grab the boulder    NONE    0    false    106
20220528.121572.6.8    8    Let the cheese horse in    MD    3    true    90
20220528.121572.6.9    9    frenchcastle    NONE    3    false    147
20220528.121572.6.10    10    Rob the suburb    PA    2    true    103
20220528.121572.6.11    11    Convert the barracks    MD    3    true    118

Cheese total was 1061. Still not stellar.

For your amusement, I attach my current cheese analysis program, which uses LinearRegression.ash.
 

Attachments

  • bbcheese.ash (9.5 KB)