Compact file format for arrays.

Veracity

Developer
Staff member
The ASH functions map_to_file and file_to_map have an optional third argument which specifies that the file format is "compact". Since that is the default, and I expect everybody wants that behavior, you may not even know what this means. Currently, it has an effect only for maps of records.

For a record, compact format puts all of the record's fields on a single line of the data file, rather than writing them one field per line.

For example, here is a script which has a map of records and wants to save/restore it from a file in the data directory.

Code:
record test {
    int a;
    string b;
    location c;
};

test [int] map;

map[0] = new test( 1, "abc", $location[ none ] );
map[1] = new test( 2, "def", $location[ The Spooky Forest ] );
map[2] = new test( 3, "ghi", $location[ Barf Mountain ] );

void print_map( string title, test [int] map )
{
    print( title );
    foreach index, val in map {
        print( "map[" + index + "] = (" + val.a + "," + val.b + ", " + val.c + ")" );
    }
    print( "" );
}
print_map( "Original map", map );

map_to_file( map, "arec1.txt", true );
test [int] map1;
file_to_map( "arec1.txt", map1, true );
print_map( "Compact file", map1 );

map_to_file( map, "arec2.txt", false );
test [int] map2;
file_to_map( "arec2.txt", map2, false );
print_map( "Non-compact file", map2 );
yields this:

Code:
> arecs.ash

Original map
map[0] = (1,abc, none)
map[1] = (2,def, The Spooky Forest)
map[2] = (3,ghi, Barf Mountain)

Compact file
map[0] = (1,abc, none)
map[1] = (2,def, The Spooky Forest)
map[2] = (3,ghi, Barf Mountain)

Non-compact file
map[0] = (1,abc, none)
map[1] = (2,def, The Spooky Forest)
map[2] = (3,ghi, Barf Mountain)
These are the two files generated by map_to_file.

Compact (arec1.txt):

Code:
0	1	abc	none
1	2	def	The Spooky Forest
2	3	ghi	Barf Mountain

Non-compact (arec2.txt):

Code:
0	a	1
0	b	abc
0	c	none
1	a	2
1	b	def
1	c	The Spooky Forest
2	a	3
2	b	ghi
2	c	Barf Mountain
The compact version has one record per line with all the fields in order, whereas the non-compact version sets one field at a time by name.
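
To make the difference concrete outside of ASH, here is a minimal Python sketch of the two layouts. It is only a model of the file format described above, not KoLmafia's implementation; the helper names `to_compact` and `to_non_compact` are hypothetical, and each record is represented as a dict whose keys are the field names in declaration order.

```python
# Hypothetical model of the two file layouts; not KoLmafia's actual code.

def to_compact(records):
    """One line per map key: the key, then every field value in order."""
    lines = []
    for key, rec in records.items():
        lines.append("\t".join([str(key)] + [str(v) for v in rec.values()]))
    return lines


def to_non_compact(records):
    """One line per field: the key, the field name, then the field value."""
    lines = []
    for key, rec in records.items():
        for name, value in rec.items():
            lines.append("\t".join([str(key), name, str(value)]))
    return lines


# Two of the records from the script above, with locations kept as strings.
records = {
    0: {"a": 1, "b": "abc", "c": "none"},
    1: {"a": 2, "b": "def", "c": "The Spooky Forest"},
}

print("\n".join(to_compact(records)))
print("\n".join(to_non_compact(records)))
```

The compact writer relies on field order alone, which is why the reader has to know the record type up front; the non-compact writer names each field, so the order of lines within a key does not matter in this model.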

What if you have a map of maps? Which is to say, effectively a multi-dimensional map.

Nested maps only have a non-compact representation. Whether or not you ask for "compact" in map_to_file or file_to_map, you will get multiple lines of data. The first field will be the index of the outer map, the second field will be the index of the nested map, and the third field will be the data. (That is for a two-dimensional map; things are more complicated for more dimensions, but they should work.)
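
As a rough illustration of that layout, the following Python sketch flattens a nested dict into one line per leaf value. It is a model only: the function name `flatten` is hypothetical, and it assumes each extra dimension simply adds one more key column, which the post above does not spell out.

```python
# Hypothetical model of the non-compact layout for nested maps.

def flatten(value, prefix=()):
    """Yield one tab-separated line per leaf: all keys first, then the data."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, prefix + (str(key),))
    else:
        yield "\t".join(prefix + (str(value),))


# A two-dimensional map: outer key, inner key, data.
nested = {"x": {1: "a", 2: "b"}, "y": {1: "c"}}
lines = list(flatten(nested))
print("\n".join(lines))
```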

What if you have a map of arrays?

Currently, they behave like maps and only have a non-compact representation. However, since we know what all the "keys" are - they are simply the integers from 0 to N-1 for an array of size N - we could have a compact representation which doesn't save the keys.

I have implemented a "compact" file representation for arrays. This script:

Code:
typedef string[3] type_v;
typedef string type_k;
type_v [type_k] map;

map[ "first" ] = string[3] { "a", "b", "c" };
map[ "second" ] = string[3] { "b", "c", "a" };
map[ "third" ] = string[3] { "c", "a", "b" };

// Not compact
map_to_file( map, "amaps1.txt", false );
type_v [type_k] amap1;
file_to_map( "amaps1.txt", amap1, false );

// Compact
map_to_file( map, "amaps2.txt", true );
type_v [type_k] amap2;
file_to_map( "amaps2.txt", amap2, true );

void print_map( string title, type_v [type_k] map )
{
    print( title );
    foreach key, val in map {
        foreach index, str in val {
            print( "map[" + key + "][" + index + "] = " + str );
        }
    }
    print( "" );
}

print_map( "Original map", map );
print_map( "Non-compact map", amap1  );
print_map( "Compact map", amap2 );
Yields this:

Code:
> amaps.ash

Original map
map[first][0] = a
map[first][1] = b
map[first][2] = c
map[second][0] = b
map[second][1] = c
map[second][2] = a
map[third][0] = c
map[third][1] = a
map[third][2] = b

Non-compact map
map[first][0] = a
map[first][1] = b
map[first][2] = c
map[second][0] = b
map[second][1] = c
map[second][2] = a
map[third][0] = c
map[third][1] = a
map[third][2] = b

Compact map
map[first][0] = a
map[first][1] = b
map[first][2] = c
map[second][0] = b
map[second][1] = c
map[second][2] = a
map[third][0] = c
map[third][1] = a
map[third][2] = b
Here are the two files generated by map_to_file.

Compact (amaps2.txt):

Code:
first	a	b	c
second	b	c	a
third	c	a	b

Non-compact (amaps1.txt):

Code:
first	0	a
first	1	b
first	2	c
second	0	b
second	1	c
second	2	a
third	0	c
third	1	a
third	2	b
Which is completely analogous to compact & non-compact records.

It also means that if you read questslog.txt as a file with a "compact" array (which it is, effectively), you can write out a file in the same format.
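
The round trip for compact arrays can be sketched in Python as follows. This is an illustrative model, not KoLmafia's code: `write_compact` and `read_compact` are hypothetical names, and rejecting lines with the wrong number of values is a simplifying assumption of the sketch rather than documented behavior.

```python
# Hypothetical model of the compact layout for a map of fixed-size arrays.

def write_compact(arrays):
    """One line per key: the key, then the array values with no indices."""
    return ["\t".join([key] + values) for key, values in arrays.items()]


def read_compact(lines, size):
    """Rebuild the map; array indices 0..size-1 are implied by position."""
    arrays = {}
    for line in lines:
        key, *values = line.split("\t")
        if len(values) != size:
            # Assumption: this sketch is strict about the value count.
            raise ValueError(f"expected {size} values for {key!r}, got {len(values)}")
        arrays[key] = values
    return arrays


original = {
    "first": ["a", "b", "c"],
    "second": ["b", "c", "a"],
    "third": ["c", "a", "b"],
}
lines = write_compact(original)
restored = read_compact(lines, 3)
print("\n".join(lines))
```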

I like this, but it is not completely backwards compatible: if you had a map whose values are arrays (not maps) and saved it with map_to_file, you previously always got the "non-compact" version - even if you didn't say you wanted that. Especially if you didn't bother saying that, since the default two-argument version of map_to_file and file_to_map assumed "compact".

The solution for scripts is simple: if you had a map of arrays (again, not maps) which you saved to a file, pass false (the boolean, not a string) as the optional third parameter to map_to_file and file_to_map wherever you save and restore it.

I may just submit this, but I thought I'd throw it out for comments first. For a while. :)
 

xKiv

Active member
So, arrays can only be compact if they are the last thing in each row?

(IOW, a record with two arrays in it cannot write the first array compactly, because how would file_to_map know where that array ends?)
 

Veracity

Developer
Staff member
It would know where the first array ends if it is a fixed-size array.

That said, there is currently code that will not write records compactly if they contain any aggregate. It used to be that the only kind of aggregate was a map, which can be of variable length. A zero-length array is of variable size, so it would also be disallowed. But it seems to me that a record containing one or more fixed-size arrays would be of known size. For that matter, an array containing fixed-size arrays - i.e. a multi-dimensional array, rather than a map - has a known number of elements.

Code:
record foo {
    string str;
    int[3] int_array;
    string[3] str_array;
};

typedef string[3,3] square3;
Given that, sizeof( foo ) is 7 and sizeof( square3 ) is 9. (expressed in number of "values" - or "fields" - in a line of the file).

I might make an internal method on CompositeType which is, essentially, sizeof the data type. Returns -1 if unknown - a map or zero-length array or a record containing such - or a positive integer counting the primitive data values. A composite for which sizeof is not -1 could be flattened into compact form.
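
The size calculation described above could look roughly like this. It is a Python sketch of the idea only: the real method would live on CompositeType in KoLmafia's Java source, and the tuple encoding of types and the name `data_size` are assumptions made for this example.

```python
# Hypothetical type encoding for this sketch:
#   "int", "string", ...             primitive, occupies one field
#   ("array", n, elem)               array of n elements; n == 0 is variable size
#   ("map", key, value)              map, always variable size
#   ("record", [(name, type), ...])  record with fields in declaration order

def data_size(t):
    """Return the number of primitive values in t, or -1 if it is not fixed."""
    if isinstance(t, str):
        return 1
    kind = t[0]
    if kind == "map":
        return -1
    if kind == "array":
        _, n, elem = t
        inner = data_size(elem)
        return -1 if n == 0 or inner < 0 else n * inner
    if kind == "record":
        total = 0
        for _name, field_type in t[1]:
            size = data_size(field_type)
            if size < 0:
                return -1
            total += size
        return total
    raise ValueError(f"unknown type: {t!r}")


foo = ("record", [
    ("str", "string"),
    ("int_array", ("array", 3, "int")),
    ("str_array", ("array", 3, "string")),
])
square3 = ("array", 3, ("array", 3, "string"))

print(data_size(foo), data_size(square3))
```

Any type for which this returns -1 would stay in the non-compact format; everything else has a known number of fields per line and could be flattened.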
 

Veracity

Developer
Staff member
Not too shabby.

Code:
record test {
    string str;
    int [3] ints;
    string [3] strings;
};

test [int] map;

map[0] = new test( "abc", { 1, 2, 3 }, { "a", "b", "c" } );
map[1] = new test( "def", { 4, 5, 6 }, { "d", "e", "f" } );
map[2] = new test( "ghi", { 7, 8, 9 }, { "g", "h", "i" } );

void print_map( string title, test [int] map )
{
    print( title );
    foreach index, val in map {
        buffer buf;
        buf.append( "map[" );
        buf.append( index );
        buf.append( "] = ( \"" );
        buf.append( val.str );
        buf.append( "\", {" );
        string delim = "";
        foreach x, i in val.ints {
            buf.append( delim );
            delim = ", ";
            buf.append( i );
        }
        buf.append( "}, {" );
        delim = "";
        foreach x, s in val.strings {
            buf.append( delim );
            delim = ", ";
            buf.append( s );
        }
        buf.append( "} )" );
        print( buf );
    }
    print( "" );
}
print_map( "Original map", map );

map_to_file( map, "arec3.txt", true );
test [int] map1;
file_to_map( "arec3.txt", map1, true );
print_map( "Compact file", map1 );

map_to_file( map, "arec4.txt", false );
test [int] map2;
file_to_map( "arec4.txt", map2, false );
print_map( "Non-compact file", map2 );
Yields:

Code:
> arecs.ash

Original map
map[0] = ( "abc", {1, 2, 3}, {a, b, c} )
map[1] = ( "def", {4, 5, 6}, {d, e, f} )
map[2] = ( "ghi", {7, 8, 9}, {g, h, i} )

Compact file
map[0] = ( "abc", {1, 2, 3}, {a, b, c} )
map[1] = ( "def", {4, 5, 6}, {d, e, f} )
map[2] = ( "ghi", {7, 8, 9}, {g, h, i} )

Non-compact file
map[0] = ( "abc", {1, 2, 3}, {a, b, c} )
map[1] = ( "def", {4, 5, 6}, {d, e, f} )
map[2] = ( "ghi", {7, 8, 9}, {g, h, i} )
where arec3.txt (the compact file) looks like this:

Code:
0	abc	1	2	3	a	b	c
1	def	4	5	6	d	e	f
2	ghi	7	8	9	g	h	i
and arec4.txt (the non-compact file) looks like this:

Code:
0	str	abc
0	ints	0	1
0	ints	1	2
0	ints	2	3
0	strings	0	a
0	strings	1	b
0	strings	2	c
1	str	def
1	ints	0	4
1	ints	1	5
1	ints	2	6
1	strings	0	d
1	strings	1	e
1	strings	2	f
2	str	ghi
2	ints	0	7
2	ints	1	8
2	ints	2	9
2	strings	0	g
2	strings	1	h
2	strings	2	i
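
Reading a compact line back works because every array in the record has a fixed size, so the reader always knows how many fields to consume next. Here is a minimal Python sketch of that idea; it is not KoLmafia's parser, and the tuple encoding of the record type and the names `unflatten` and `parse_line` are assumptions made for illustration.

```python
# Hypothetical model of reading a compact line for a record with arrays.

def unflatten(t, fields):
    """Consume values from the iterator `fields` according to type `t`."""
    if isinstance(t, str):
        raw = next(fields)
        return int(raw) if t == "int" else raw
    kind = t[0]
    if kind == "array":
        _, n, elem = t
        return [unflatten(elem, fields) for _ in range(n)]
    if kind == "record":
        return {name: unflatten(ftype, fields) for name, ftype in t[1]}
    raise ValueError(f"cannot read {kind!r} from a compact line")


# Shape of the "test" record from the script above.
TEST_TYPE = ("record", [
    ("str", "string"),
    ("ints", ("array", 3, "int")),
    ("strings", ("array", 3, "string")),
])


def parse_line(line):
    """Split one compact line into (key, record)."""
    fields = iter(line.split("\t"))
    key = int(next(fields))
    return key, unflatten(TEST_TYPE, fields)


key, rec = parse_line("0\tabc\t1\t2\t3\ta\tb\tc")
print(key, rec)
```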
 

Veracity

Developer
Staff member
I'm itching to submit this.

Nobody who might actually use this kind of thing, including zarqon, whose "how do I read questslog.txt" post inspired the original work, has had a single word to say about it.

I'm mostly concerned about the "not backwards compatible if you used file_to_map on an array of a fixed size" thing.

If I don't hear anything within a day or two, I'll submit it and we'll see what happens.
 

zarqon

Well-known member
Sorry, was out of town last weekend and playing catchup since I got back. Have been watching this with considerable interest though. Have also been getting a "null" error in my login script since these changes were made, which never used to happen, but until today I haven't had time to track that down and give you a reply. The following code:

Code:
record {
   string[int] dex;  // kmail ID => date
   int[item] goods;  // items received
   int packs;        // packages received
}[string] booty;     // indexed by player
file_to_map("kmailrecord_"+replace_string(my_name()," ","_")+".txt",booty);

boolean register_items(kmessage m) {
   if (m.fromname == "Your Pen Pal") return true;
   if (count(m.items) == 0) return vprint("Message "+m.id+" has no items.",-9);
   if (m.fromid == 0) return vprint("Message "+m.id+" is not from a PC.",-8);
   if (m.type != "normal") return vprint("Message "+m.id+" is not a normal message.",-7);
   if (booty[m.fromname].dex contains m.id) return vprint("Already parsed message "+m.id+".",-9);
   vprint("Parsing message "+m.id+" from "+m.fromname+"...","blue",2);
  // valid unparsed message -- let's add it!
   booty[m.fromname].dex[m.id] = m.localtime;  // <-- error happens here
   foreach i,n in m.items {
      print(rnum(n)+" "+to_plural(i),"gray");
      booty[m.fromname].goods[i] += n;
   }

now aborts operation on the line noted above and gives me:

Code:
class java.lang.NullPointerException: null
java.lang.NullPointerException
	at net.sourceforge.kolmafia.textui.parsetree.RecordValue.aset(RecordValue.java:115)

So to avoid my kmail parsing aborting I've been using r18039. Just tested on r18050 and the same error occurs. Let me know if you need a full debug log.

Perhaps I should also note the kmessage type as defined in ZLib:

Code:
record kmessage {
   int id;                   // message id
   string type;              // possible values observed thus far: normal, giftshop
   int fromid;               // sender's playerid (0 for npc's)
   int azunixtime;           // KoL server's unix timestamp
   string message;           // message (including items/meat)
   int[item] items;          // items included in the message
   int meat;                 // meat included in the message
   string fromname;          // sender's playername
   string localtime;         // your local time according to your KoL account, human-readable string
};

Otherwise, everything looks exciting and good and I will definitely make use of it for loading questslog.txt. I don't have any data files that contain arrays -- they're all maps or records, sometimes with maps in them -- so I wouldn't be affected personally by the backwards compatibility concern.
 

zarqon

Well-known member
After a bit more poking, I discovered BatMan RE is also currently broken by these changes. It loads a map from clover_adventures.txt (which is on the Map Manager) and ends up with all the record fields which contain maps being "null":

> ash record einclove { int[item] yield; boolean andor; string note; }; einclove[location] cloves; file_to_map("clover_adventures.txt",cloves); cloves

- is out of range, returning 0
- is out of range, returning 0
--- is out of range, returning 0
9- is out of range, returning 0
--- is out of range, returning 0
- is out of range, returning 0
Returned: aggregate einclove [location]
A-Boo Peak => record einclove
**yield => null
**andor => false
**note => 2
Camp Logging Camp => record einclove
**yield => null
**andor => false
**note => 1
Cobb's Knob Kitchens => record einclove
**yield => null
**andor => false
**note => 1
Cobb's Knob Laboratory => record einclove
**yield => null
**andor => false
**note => 3
Frat House => record einclove
**yield => null
**andor => false
**note => 1
Frat House In Disguise => record einclove
**yield => null
**andor => false
**note => 1
Guano Junction => record einclove
**yield => null
**andor => true
**note => 2
Hippy Camp => record einclove
**yield => null
**andor => false
**note => 1
Hippy Camp In Disguise => record einclove
**yield => null
**andor => false
**note => 1
Itznotyerzitz Mine => record einclove
**yield => null
**andor => false
**note => 1
Lemon Party => record einclove
**yield => null
**andor => false
**note => 1
Oil Peak => record einclove
**yield => null
**andor => false
**note => 3
Outskirts of Camp Logging Camp => record einclove
**yield => null
**andor => false
**note => 3
Post-Quest Bugbear Pens => record einclove
**yield => null
**andor => false
**note => 1
South of the Border => record einclove
**yield => null
**andor => false
**note => 1
Spectral Pickle Factory => record einclove
**yield => null
**andor => false
**note => 1
The "Fun" House => record einclove
**yield => null
**andor => false
**note => 3
The Black Forest => record einclove
**yield => null
**andor => false
**note => 1
The Castle in the Clouds in the Sky (Basement) => record einclove
**yield => null
**andor => false
**note => 1
The Degrassi Knoll Bakery => record einclove
**yield => null
**andor => false
**note => 1
The Haunted Ballroom => record einclove
**yield => null
**andor => true
**note =>
The Haunted Bathroom => record einclove
**yield => null
**andor => true
**note =>
The Haunted Billiards Room => record einclove
**yield => null
**andor => false
**note => 1
The Haunted Conservatory => record einclove
**yield => null
**andor => false
**note => 1
The Haunted Gallery => record einclove
**yield => null
**andor => true
**note =>
The Haunted Kitchen => record einclove
**yield => null
**andor => false
**note => 1
The Haunted Library => record einclove
**yield => null
**andor => true
**note =>
The Haunted Pantry => record einclove
**yield => null
**andor => false
**note => 1
The Hidden Park => record einclove
**yield => null
**andor => false
**note => 1
The Icy Peak => record einclove
**yield => null
**andor => false
**note => 1
The Knob Shaft => record einclove
**yield => null
**andor => false
**note => 3
The Limerick Dungeon => record einclove
**yield => null
**andor => false
**note => 1
The Oasis => record einclove
**yield => null
**andor => true
**note =>
The Outskirts of Cobb's Knob => record einclove
**yield => null
**andor => true
**note => 1
The Poker Room => record einclove
**yield => null
**andor => true
**note => 3
The Primordial Soup => record einclove
**yield => null
**andor => false
**note => 1
The Roulette Tables => record einclove
**yield => null
**andor => true
**note => 3
The Sleazy Back Alley => record einclove
**yield => null
**andor => false
**note => 3
The Smut Orc Logging Camp => record einclove
**yield => null
**andor => false
**note => 3
The Spooky Forest => record einclove
**yield => null
**andor => false
**note => 1
The Spooky Gravy Burrow => record einclove
**yield => null
**andor => false
**note => 1
The Unquiet Garves => record einclove
**yield => null
**andor => true
**note => 1
The VERY Unquiet Garves => record einclove
**yield => null
**andor => false
**note => 1
Thugnderdome => record einclove
**yield => null
**andor => false
**note => 1
Tower Ruins => record einclove
**yield => null
**andor => true
**note => 1
Twin Peak => record einclove
**yield => null
**andor => true
**note =>
 

Veracity

Developer
Staff member
Try revision 18051. Your record contains a map - not an array - so should not be able to write a compact datafile but I seem to have broken that.

I still have the REAL fix pending. ;)
 

zarqon

Well-known member
Still getting the mail parsing error in r18053. All the record fields that should be maps are "null" after calling file_to_map().
 

Veracity

Developer
Staff member
I broke it, I fixed half of it.

Fixing it all the way, which will be accomplished by releasing my new code - although a one line change in RecordValue would also suffice - will have to wait until I am home and have time to deal with it this evening. About 11 hours.
 

fronobulax

Developer
Staff member
I think this is an appropriate place.

I am trying to read coinmasters.txt. To save folks from looking it up, a line has a shop name, action (buy or sell), quantity, item being exchanged and an optional field which has row or row and number, separated by a comma. (For this application I am going to ignore everything after the item to be exchanged).

Code:
record CoinmasterRecord {
   string shop;
   string action;
   string quantity;
   string exchangeItem;
   string row;
};

CoinmasterRecord[] cmr;

What I would like is to say CoinmasterRecord[int] cmr; and have file_to_map automagically generate an integer line number as the index. But since the first field is a string, there are a lot of errors. If I say CoinmasterRecord[string] cmr; then I get some data, assigned to the expected fields, but since the shop values are not unique across lines, I only get one record for each shop.

Should I make the automagic assignment of the line number as index a feature request or is there a non-intuitive (to me) way to define the record?

Thanks.
 

zarqon

Well-known member
I would like to report a seamless experience using this to load questslog.txt. It enabled me to remove many lines of code from my script and write a programmatic method of printing all the quest steps -- plus I could test for something being the last step and thus know that that step was "finished" rather than "stepX".

Thanks very much Veracity!
 

Veracity

Developer
Staff member
fronobulax said:
Should I make the automagic assignment of the line number as index a feature request or is there a non-intuitive (to me) way to define the record?

The problem is that coinmasters.txt is unlike other built-in data files in that it is not laid out like a "map" with a unique "key" as the first field.

items.txt has item ID as first field
equipment.txt, concoctions.txt have item names
combats.txt has a location
questslog.txt has quest key
monsters.txt has monster name

But npcstores.txt and coinmasters.txt not only don't have a unique "key" as the first field, I'm not even sure if there is one. I vaguely recall there is at least one item which is obtainable from more than one shop (perhaps conditionally), which means that item name wouldn't work either.

You are asking for a different API which would allow you to read one line at a time from an arbitrary file and get back an array, say, of the lines in order. (An array has an implicit numeric index from 0 to X, which is what you want).

Given that, the values of each line would be a string, and you'd like the additional ability to parse a string into a record, where the individual fields are tab-separated.

The above two functions would let you do what you wanted. Write up a Feature Request.
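
The two functions being proposed could behave roughly like the Python sketch below. Both helpers are hypothetical: `file_lines` and `to_record` are names invented for this example, nothing here is an existing ASH function, and skipping blank lines and lines that start with "#" is an assumption about how data files would be treated. The two sample lines are reconstructed from the coinmasters output quoted later in the thread.

```python
# Hypothetical helpers modeling the proposed API; not existing ASH functions.

def file_lines(text):
    """Return the data lines of a file's contents, in order.

    Assumption: blank lines and lines starting with '#' are skipped.
    """
    return [ln for ln in text.split("\n") if ln.strip() and not ln.startswith("#")]


def to_record(line, field_names):
    """Split a tab-separated line into a dict keyed by field name.

    Missing trailing fields default to "" and extra fields are ignored.
    """
    values = line.split("\t")
    values += [""] * (len(field_names) - len(values))
    return dict(zip(field_names, values))


FIELDS = ["shop", "action", "quantity", "exchangeItem", "row"]

sample = (
    "Crimbo 2014\tbuy\t300\tCrimbot ROM: Rapid Prototyping\tROW406\n"
    "Crimbo 2014\tsell\t1\trecovered elf magazine\tROW389\n"
)

# The list index plays the role of the implicit line-number key.
cmr = [to_record(ln, FIELDS) for ln in file_lines(sample)]
for index, rec in enumerate(cmr):
    print(index, rec["shop"], rec["action"], rec["exchangeItem"])
```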

zarqon said:
I would like to report a seamless experience using this to load questslog.txt. It enabled me to remove many lines of code from my script and write a programmatic method of printing all the quest steps -- plus I could test for something being the last step and thus know that that step was "finished" rather than "stepX".

Thanks very much Veracity!

Glad you like it!
 

fronobulax

Developer
Staff member
Thanks.

I demonstrated the functionality I wanted by slicing and dicing a local copy of coinmasters. The result was a map, indexed by item, but I assumed no duplication which is probably wrong.

I was hoping I could get coinmasters into a map. My first thought was the name of the coinmaster was one key and using the compact flag the value would be a list of the rest of the line. Each rest of the line would be a map and so on. Needless to say the formulation was beyond me and perhaps not possible given that the guaranteed unique key would have to be the pair <shop, item>.

I played with code that implemented a new version of file_to_map that forced the line number as the index but I did not have success with the limited amount of time I put in. I started thinking of how often it would be used and did not come up with a lot of candidates.

I'll make a feature request (or ask for a code review) depending upon whether I can make something work.

Thank you again.
 

Veracity

Developer
Staff member
Well...

Maybe coinmasters.txt could go into a multi-dimensional map:

Code:
string[string,string,int,item] cm_txt;

file_to_map( "coinmasters.txt", cm_txt );

foreach name, direction, price, it, row in cm_txt {
    print( "You can " + direction + " '" + it + "' from '" + name + "' for " + price + " tokens" + ( row == "" ? "" : ( " via row " + row ) )  );
}
Yields:

Code:
...
You can buy 'Crimbot ROM: Rapid Prototyping' from 'Crimbo 2014' for 300 tokens via row ROW406
You can buy 'Crimbot ROM: Mathematical Precision' from 'Crimbo 2014' for 300 tokens via row ROW407
You can buy 'Crimbot ROM: Ruthless Efficiency' from 'Crimbo 2014' for 300 tokens via row ROW408
You can buy 'Mini-Crimbot crate' from 'Crimbo 2014' for 500 tokens via row ROW409
You can sell 'none' from 'Crimbo 2014' for 1 tokens via row ROW397
You can sell 'recovered elf magazine' from 'Crimbo 2014' for 1 tokens via row ROW389
You can sell 'recovered elf toothbrush' from 'Crimbo 2014' for 1 tokens via row ROW390
You can sell 'recovered elf sleeping pills' from 'Crimbo 2014' for 2 tokens via row ROW391
You can sell 'recovered elf underpants' from 'Crimbo 2014' for 2 tokens via row ROW392
You can sell 'recovered elf wallet' from 'Crimbo 2014' for 3 tokens via row ROW393
...
You could certainly build your own records.

The "none" for item came from "peppermint tailings (10)" which doesn't parse as an item; that would have to go as a string, too.
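
Why the multi-dimensional map avoids the one-record-per-shop problem can be seen in a small Python model. This is a sketch under assumptions, not a description of file_to_map's internals: `load_nested` is a hypothetical name, the sample lines are reconstructed from the output above, and two lines that agree on all four keys would still overwrite each other.

```python
# Hypothetical model: shop -> action -> price -> item -> row.

def load_nested(lines):
    """Build a four-level nested dict from tab-separated lines."""
    table = {}
    for line in lines:
        fields = line.split("\t")
        shop, action, price, item = fields[:4]
        row = fields[4] if len(fields) > 4 else ""  # the row field is optional
        by_price = table.setdefault(shop, {}).setdefault(action, {})
        by_price.setdefault(int(price), {})[item] = row
    return table


lines = [
    "Crimbo 2014\tbuy\t300\tCrimbot ROM: Rapid Prototyping\tROW406",
    "Crimbo 2014\tbuy\t300\tCrimbot ROM: Mathematical Precision\tROW407",
    "Crimbo 2014\tsell\t1\trecovered elf magazine\tROW389",
]
cm = load_nested(lines)

for shop, actions in cm.items():
    for action, prices in actions.items():
        for price, items in prices.items():
            for item, row in items.items():
                print(shop, action, price, item, row)
```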
 

fronobulax

Developer
Staff member
That looks like it will do. Thank you.

I'm writing a utility that will let a user identify opportunities for efficient currency exchanges. I do expect to release it when I'm happy :)
 