Global Static Library Namespace Context

Smelltastic · Nov 26, 2012

Hey, I don't often post around here but have been playing around with Mafia/ASH for a few months; forgive me if this has come up before but I didn't find anything in a search. I wanted to check here for any thoughts on the feasibility/usefulness of this to see if a feature request would be a good idea.

Everybody and their mother, assuming she plays KoL and uses Mafia, is using zlib and heading up all their ASH with it. Since zlib pulls its vars from a file, every single time any script is run it has to load this and go through its initializations again and again. Static variables have been implemented per this post, but I don't think zlib or other libraries could use them effectively because every script that imports it is a separate instance.

What I was thinking would be useful is to have a specified library script that's loaded on startup, with every script then loaded within its single context. Ideally the user could specify a script that also contains import lines; then, for any script called afterwards that imports one of the globally loaded scripts, that import would be ignored in favor of the preloaded instance.

There would be the problem of still having to reload the libraries if a change is made, which might make depending on static variables a problem. So possibly globally loaded libraries would need to just not be reloaded, and require a restart of Mafia to update. Or maybe not, and statics should be written to disk if they're going to be relied on like that anyhow.

Thoughts on this, or anything I'm not considering that'd make it a terrible idea?

Theraze · Nov 26, 2012

Personally, given the choice between slightly better memory usage and needing to restart mafia an extra time every time WHAM, BatBrain, CounterChecker, zlib, or any of the other scripts that repeatedly get called are updated...

Smelltastic · Nov 26, 2012

Theraze said:
Personally, given the choice between slightly better memory usage and needing to restart mafia an extra time every time WHAM, BatBrain, CounterChecker, zlib, or any of the other scripts that repeatedly get called are updated...

It'd only be if zlib (or whatever global library) were updated, and it certainly wouldn't have to be implemented that way anyway. You'd just lose your statics when the file is changed, so the script would have to be made with that in mind.

Really, what you'd be saving is disk reads by keeping some things in memory without reloading them.

Catch-22 · Nov 26, 2012

Whilst I don't disagree with what you are suggesting, I would have to say that time spent doing this when the end result is simply reducing KoLmafia's memory footprint would be time better spent working on other areas of KoLmafia that are already quite memory hungry.

Veracity · Nov 26, 2012

I don't see anything that can't be done with features we already have. It depends on a set of cooperating scripts deciding there is one shared static data script that they all load. You change that script, KoLmafia already checks modification dates and reloads it - and all scripts that import it.
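
The modification-date check can be pictured roughly like this. It is a Python sketch of the general idea for a single file, not KoLmafia's actual code, and it does not model reloading the scripts that import the changed file. `ScriptCache`, `load`, `compile_count`, and the file name `shared_data.ash` are made-up names for illustration.

```python
import os
import tempfile


class ScriptCache:
    """Hypothetical cache: re-read a script only when its mtime changes."""

    def __init__(self):
        self._entries = {}       # path -> (mtime_ns, "compiled" text)
        self.compile_count = 0   # how many times a file was actually re-read

    def load(self, path):
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]     # unchanged on disk: reuse the cached result
        with open(path, encoding="utf-8") as handle:
            compiled = handle.read()   # stand-in for parsing/compiling
        self.compile_count += 1
        self._entries[path] = (mtime, compiled)
        return compiled


def demo():
    cache = ScriptCache()
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "shared_data.ash")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("int x = 1;")
        cache.load(path)
        cache.load(path)                       # served from the cache
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("int x = 2;")
        # Force a clearly different timestamp so the edit is visible even
        # on filesystems with coarse mtime resolution.
        stat = os.stat(path)
        os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns + 10_000_000_000))
        latest = cache.load(path)              # mtime changed: re-read
    return cache.compile_count, latest
```

Calling `demo()` loads the file three times but only re-reads it twice: once on first use and once after the edit.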

I think you are asking for a "global static data" script designated by the user and used or not by other scripts. Meh. Any set of cooperative scripts can have a global static data script - or more than one. You can have multiple disjoint (or intersecting) sets of scripts with multiple static data scripts.

What, exactly, are you asking for that isn't already satisfied by the way KoLmafia already operates with scripts, modification times, static data, etc., etc.?

Smelltastic · Nov 26, 2012

It's true this wouldn't really offer any new functionality that I can think of, really. It's just that right now it can require a lot of disk reads.

Right now, if I were to execute WHAM in a combat, it will load SmartStasis, which then loads zlib. If I then later fire off EatDrink, it will also load zlib. Because each separate import is a new context, each has to run all of zlib's initialization, including loading all its vars. If I then go back into battle and run WHAM again, it has to re-initialize zlib again, and load its data again, because it has no way of knowing if other scripts might have changed its variables. So every time any script using zlib is run, it has to load everything, even if it only wants to use excise() or something. Even if zlib tried to keep things in memory with statics, it would still have to reload the data every time because it won't have any other way to know what other instances may have changed.

In this example you might not be saving very much, but some of the things I'm doing involve importing zlib as well as a library of my own a lot, and reloading the same data over and over. It does work, but I thought a way to load it in just once might be worth checking into. What with modern SSDs and all it isn't a huge deal though, and I have no idea how much work it'd be to make it happen.

Sorry to keep talking about zlib when I'm really thinking about my own stuff. Obviously I'm not involved with zlib itself, but it's the obvious example.

Catch-22 · Nov 26, 2012

Smelltastic said:
It's true this wouldn't really offer any new functionality that I can think of, really. It's just that right now it can require a lot of disk reads.

I can sympathize with you wanting to reduce disk I/O, but it may surprise you to see how often a disk actually reads/writes during normal day to day use.

If you're concerned about performance, perhaps try placing your KoLmafia directory on a RAM disk. I think you'll find that reducing disk I/O isn't really going to save a lot of time during execution.

I would be interested in hearing of your results though. SoftPerfect has a RAM disk program that is free for non-commercial use.

xKiv · Nov 26, 2012

Most current OSs reduce disk I/O already by caching. Large chunks of your RAM are spent on that. Using ramdisks won't help, because then your OS will just use RAM to cache data that's already elsewhere in RAM (unless your ramdisk driver and OS are somehow smart enough to negotiate disabling of cache; for example, if I recall correctly, Linux's "ramdisk" (tmpfs?) actually is implemented *in* the caching layer, so it never duplicates). Especially file metadata (like timestamp of last change) is something that the OS will like to keep in memory, even more so if you try to access the file often. If your system does a lot of disk I/O for repeatedly accessed files (or swapping), you need more RAM, not more RAM-eating caches (which will only end up being swapped out, thus defeating their purpose by *causing* more disk I/O).
On top of that, as Veracity noted, mafia already caches (compiled) scripts, so putting the script on ramdisk doesn't get you *any* win - you have to have the current version on persistent storage anyway (hard disk), read it from HDD, copy it to ramdisk (extra step), read into mafia, compile, reuse. (and if you change the script, you have to get the change to HDD too).
Really, it seems that the only thing this would do is that you wouldn't have to write "import <zlib>" in your scripts, and I think that's actually a *bad* change. The scripts would no longer declare what API they use.

Catch-22 · Nov 27, 2012

If it wasn't clear, I was suggesting that placing your KoLmafia directory on a RAMdisk (effectively caching everything) is unlikely to realize any noticeable performance benefits, but you're certainly welcome to try.

As I said in the original post, what the proposed change is actually likely to do is reduce KoLmafia's memory footprint, by maybe a few hundred kilobytes.

xKiv · Nov 27, 2012

Catch-22 said:
If it wasn't clear, I was suggesting that placing your KoLmafia directory on a RAMdisk (effectively caching everything) is unlikely to realize any noticeable performance benefits, but you're certainly welcome to try.

As I said in the original post, what the proposed change is actually likely to do is reduce KoLmafia's memory footprint, by maybe a few hundred kilobytes.

Oh. So one instance of the shared script's interpreter, with global variables (of that script) shared among all other scripts, without that being explicitly declared?
Isn't that already what happens? Variables declared at a script's top scope are "global" in some sense, iirc. At least if I interpret Veracity's post up there (#5) correctly. And that would mean that the only difference would *really* be whether you put "import <zlib.ash>" in each script by hand or automatically.

And even if mafia doesn't already cache scripts like that, I don't think that you would get a noticeable memory footprint change, because of how Java uses memory. There might be less used memory shown in the top-right indicator, on average, but that's not what the OS sees as memory allocated by the Java process.

Catch-22 · Nov 27, 2012

xKiv said:
Oh. So one instance of the shared script's interpreter, with global variables (of that script) shared among all other scripts, without that being explicitly declared?

Essentially, yes, but I don't think explicit declaration is being excluded here.

xKiv said:
Isn't that already what happens?

Not exactly. After a script is interpreted (which includes processing of imports), each consecutive call to that script will reuse the interpreted results from memory. This means if you have, say, a between battle script (call it myBetweenBattle.ash) which imports zlib.ash, and an after adventure script (call it myAfterAdventure.ash) which imports zlib.ash, you have a total of TWO instances. That would be the interpreted result of myBetweenBattle.ash (with all of zlib.ash included) and the interpreted result of myAfterAdventure.ash (with all of zlib.ash included). There is no interpreted result of zlib.ash itself, because zlib.ash has not been called; it has only been imported by 2 scripts.
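
To picture the counting, here is a toy Python model of that description. It is not KoLmafia's real internals; `InterpreterCache`, `run`, and the two script names are invented stand-ins. Each top-level script gets one cached interpreted result that bundles its imports, so zlib.ash ends up interpreted once per importing script and never on its own.

```python
class InterpreterCache:
    """Toy model: one interpreted result per top-level script."""

    def __init__(self, imports):
        self.imports = imports      # script name -> list of imported scripts
        self.instances = {}         # top-level script -> set of bundled scripts
        self.interpret_count = {}   # script name -> times it was interpreted

    def _interpret(self, name, bundle):
        if name in bundle:
            return                  # already pulled into this bundle
        bundle.add(name)
        self.interpret_count[name] = self.interpret_count.get(name, 0) + 1
        for dependency in self.imports.get(name, []):
            self._interpret(dependency, bundle)

    def run(self, top_level):
        # Repeat calls reuse the cached result for that top-level script.
        if top_level not in self.instances:
            bundle = set()
            self._interpret(top_level, bundle)
            self.instances[top_level] = bundle
        return self.instances[top_level]


cache = InterpreterCache({
    "between_battle.ash": ["zlib.ash"],
    "after_adventure.ash": ["zlib.ash"],
})
for _ in range(3):
    cache.run("between_battle.ash")
    cache.run("after_adventure.ash")

print(len(cache.instances))                 # 2 cached top-level results
print(cache.interpret_count["zlib.ash"])    # 2: once per importing script
print("zlib.ash" in cache.instances)        # False: never cached on its own
```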

My understanding of what Smelltastic is suggesting would be that interpreting the contents of zlib.ash twice (in this example) is wasteful, and that imports should be interpreted once in a session and have their own instance which can have work handed to it by other scripts that "import" it. A fair bit of work would need to be done to support it and you would see little (if any) measurable performance gain as a result, which is why I would say it's probably not worth doing at the moment or in the foreseeable future.

Something that could be worth doing (and would probably be fairly easy to do), would be having the parser remove dead code (such as unused functions from imports) before handing it to the interpreter. Again, you'd only be seeing like a few hundred kb in memory reduction, but it's certainly a lot less work than what has been proposed.

xKiv · Nov 28, 2012

Catch-22 said:
...
you have a total of TWO instances ...

(of zlib.ash, presumably)
But they do share their global data, even though they are just imported, right?

...
and that imports should be interpreted once in a session and have their own instance which can have work handed to it by other scripts that "import" it. A fair bit of work would need to be done to support it and you would see little (if any) measurable performance gain as a result, ...

This *might* be a perceivable win for large, repeatedly invoked scripts.
But such scripts are likely to drown those savings in their own large run time, so, yeah ...

Winterbay · Nov 28, 2012

While the global data in both are the same, I doubt that they actually share that information. They are after all in two different instances of memory.

Veracity · Nov 28, 2012

That is correct. Each top-level script has its own interpreter that has the code and data for that script and all included scripts. Static data - in any of said scripts - is initialized once per interpreter and is not shared with even the same script in other interpreters.

The problem is, I can envision scripts that want to operate in two different ways:

- store static data from fixed files and leave it that way forever more. Presumably, all instances of the library script would end up with the same static data and it would be nice to share it.
- store static data from fixed files but also manipulate it at runtime based on runtime parameters of the toplevel script, say. In that case, you want each instance of the library script to have its own static data.

The current mechanism allows the second model and doesn't prevent the first one, at the expense of extra memory usage.
A mechanism in which library scripts are, essentially, DLLs would cater to the first model and preclude the second model.
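
The trade-off between the two models can be sketched in Python (an illustration only, not ASH and not how KoLmafia is implemented; `Library`, `FILE_DATA`, and the `threshold` key are hypothetical). With per-interpreter copies, a runtime tweak stays private to one top-level script; with a single shared instance, the same tweak is visible to every importer.

```python
import copy

FILE_DATA = {"threshold": 10}   # stand-in for data loaded from a fixed file


class Library:
    """Stand-in for a library script holding static data."""

    def __init__(self, static_data):
        self.static = static_data


def per_interpreter():
    # Current behaviour as described: each top-level script gets its own copy.
    return Library(copy.deepcopy(FILE_DATA)), Library(copy.deepcopy(FILE_DATA))


def shared():
    # "DLL-style": every importer sees the same static data object.
    one = Library(copy.deepcopy(FILE_DATA))
    return one, one


a, b = per_interpreter()
a.static["threshold"] = 99                # runtime tweak by one script
private_result = b.static["threshold"]    # the other script is unaffected

c, d = shared()
c.static["threshold"] = 99
shared_result = d.static["threshold"]     # the tweak leaks to the other script

print(private_result, shared_result)
```

The per-interpreter version costs an extra copy of the data per top-level script, which is the memory overhead mentioned above.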

Catch-22 · Nov 28, 2012

Veracity said:
The current mechanism allows the second model and doesn't prevent the first one, at the expense of extra memory usage.
A mechanism in which library scripts are, essentially, DLLs would cater to the first model and preclude the second model.

The current mechanism is exactly how imports/includes work in many other scripting languages, I don't think it should change.

It may be possible to also support static imports, which could act similarly to DLLs, but the onus would be on the scriptwriters themselves to declare their imports as static for there to be any memory savings from it.

Again, we're talking about probably a few hundred kb of RAM. Any dev spending their time on something like that could probably spend the same amount of time somewhere else and save a lot more than a few hundred kb of RAM, but I guess if the idea particularly interests someone they could give it a crack.

Potentially, depending upon what scripts you're running and which ones would benefit from using static imports, removing dead code from unused functions in library imports at parse-time may actually result in more memory savings than what you'd get from static imports.

Edit: As an example, if you had two scripts which imported zlib.ash purely for the check_version() functionality, that's an additional 38k of RAM being used for each script. If both of those declared the import as static (if that functionality existed), you'd cut that down to just 38k of RAM used between both scripts. If you left the imports dynamic (for lack of a better word), but removed any unused functions from zlib.ash at parse time, you're only looking at roughly an extra 2k of RAM for each instance. Either way, we're talking about less than 100kb of RAM in all of this. Yes, these are rough estimates, but they wouldn't be too far off.
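
Spelled out as arithmetic, using only the rough figures from the example (38k per full copy, about 2k per trimmed copy, two scripts). The helper `memory_kb` is just an illustrative name, and it assumes the trimmed cost applies once per importing script.

```python
ZLIB_FULL_KB = 38      # rough cost of one full copy of zlib.ash
ZLIB_TRIMMED_KB = 2    # rough cost once unused functions are removed
SCRIPTS = 2            # scripts importing zlib.ash


def memory_kb(scripts, per_copy_kb, shared=False):
    """Total memory if each script holds a copy, or one copy if shared."""
    copies = 1 if shared else scripts
    return copies * per_copy_kb


current = memory_kb(SCRIPTS, ZLIB_FULL_KB)                     # 76
static_import = memory_kb(SCRIPTS, ZLIB_FULL_KB, shared=True)  # 38
dead_code_removed = memory_kb(SCRIPTS, ZLIB_TRIMMED_KB)        # 4

print(current, static_import, dead_code_removed)
```

All three totals stay under 100 kB, which is the point of the estimate.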
