Feature - Rejected Session Log Folder Support

Saklad5

Member
In order to support archive formats like ZIP and tar for session logs, there has to be support for multiple files inside an archive. This means they should be treated like a folder, so the first step is to add folder support for session_logs(). This can later be built upon to check for archives in the absence of such folders, in the same way that gzipped files are now accessed in the absence of text files.

If we allow any arbitrary folders or archives in the sessions directory, that could cause major performance hits when a file really doesn’t exist. We can’t have KoLmafia decompressing a decade of session logs because someone didn’t play yesterday. That means there should be a limited number of places to search for a specific file. I’d also caution against actually creating and writing to folders, since that would be a breaking change.

Here is how I think this should work, using the log file Saklad5_20180810.txt as an example:

  1. Try sessions/Saklad5_20180810.txt
  2. Try sessions/Saklad5_20180810.txt.gz
  3. Try sessions/2018/Saklad5_20180810.txt
  4. Try sessions/2018/Saklad5_20180810.txt.gz
  5. Try sessions/2018/08/Saklad5_20180810.txt
  6. Try sessions/2018/08/Saklad5_20180810.txt.gz

The user would have to create these folders and move things to them.
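
The lookup order above can be sketched in a few lines. This is only an illustration in Python, not KoLmafia's actual code; the function names `candidate_paths` and `find_log` are hypothetical, and the sketch assumes the year and month folders are named with four and two digits, as in the example.

```python
from datetime import date
from pathlib import Path
from typing import List, Optional


def candidate_paths(sessions: Path, player: str, day: date) -> List[Path]:
    """Return the six places to check, in the order listed above (hypothetical helper)."""
    name = f"{player}_{day:%Y%m%d}.txt"
    year_dir = sessions / f"{day:%Y}"
    month_dir = year_dir / f"{day:%m}"
    paths = []
    for folder in (sessions, year_dir, month_dir):
        paths.append(folder / name)            # plain text file first
        paths.append(folder / (name + ".gz"))  # then the gzipped copy
    return paths


def find_log(sessions: Path, player: str, day: date) -> Optional[Path]:
    """Return the first candidate that exists, or None. Never lists a directory."""
    for path in candidate_paths(sessions, player, day):
        if path.is_file():
            return path
    return None
```

A missing log costs at most six existence checks, no matter how many old logs are stored.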
 

fronobulax

Developer
Staff member
Currently session_logs() derives a filename and then looks for it. The name is derived from today's date and the number of days to be included. session_logs() never actually reads the directory at all; it just asks whether a file exists. Supporting this (and ultimately archives) would require inverting the logic: read the directory, and then either iterate over the contents looking for files to include, or make an ordered list of places to look for the generated file name and search them. I do not think it is an especially good idea to impose a sub-directory naming convention on the user, nor to have subdirectories that contain only a handful of files.
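
As a rough illustration of the current "derive a name, then ask if it exists" approach, here is a Python sketch. It is not the real KoLmafia implementation; `expected_log_names` and `existing_logs` are hypothetical names, and counting backwards from today (most recent first) is an assumption.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import List


def expected_log_names(player: str, today: date, days: int) -> List[str]:
    """Derive one filename per day, starting today and going back (assumed order)."""
    names = []
    for offset in range(days):
        day = today - timedelta(days=offset)
        names.append(f"{player}_{day:%Y%m%d}.txt")
    return names


def existing_logs(sessions: Path, player: str, today: date, days: int) -> List[Path]:
    """Keep only the derived names that exist. The directory itself is never read."""
    found = []
    for name in expected_log_names(player, today, days):
        path = sessions / name
        if path.is_file():
            found.append(path)
    return found
```

Because the names are generated rather than discovered, a file sitting in an unexpected subdirectory is simply never seen.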

While this is certainly feasible, it is more than I personally am willing to do, especially with my self-imposed requirement to test someone else's work if it is committed by me.
 

Saklad5

Member
I agree that this would be a much more complicated change, and that it would require a change to the way it currently operates.

It wouldn’t need to iterate through a directory, necessarily, as long as a naming convention is followed. It might have to for archives, but we could get to that later.

At any rate, archives and containers can’t be implemented until this is done. I wanted to make this feature request to facilitate that, eventually, if someone wants to work on it. As it is, gzip is effective enough to pretty much eliminate any storage problems for most people.
 

fronobulax

Developer
Staff member
Storage problems are relative. I have no problems with about 5 years' worth of logs for three characters :)

Why don't you suggest a naming convention that is useful to humans and does not duplicate information already in the file name? sessions/2018/08/ might be a good place to store Saklad5_10.txt, but since the file name of Saklad5_20180810.txt is the KoLmafia convention I see little to no benefit to doing so. I could conjure up something that used class and path, but then there is the problem of one session log that contains the end of one run and the beginning of the next. I am genuinely curious as to what benefit you find from subdirectories in sessions, especially since at the moment their creation and use is strictly a manual process.

If you see this as a required step towards archives then fine, but if I did decide archives were worth the effort, I'm not sure my 1.0 implementation would support them, so for me subdirectories would be an option, not a requirement or a necessary prerequisite.

Regardless, this is all hypothetical since I have no interest in or plans for doing the work to support archives or subdirectories, so someone else can decide.
 

Saklad5

Member
The entire point of this proposed convention is that you can establish the location of a file without additional information. The directories can eventually be archives, so it is important that we don’t have to search everything to make sure a file isn’t there.

Since having these directories is optional, there is no reason (or way) to store additional information in the folder names. KoLmafia breaks logs into days, months, and years, so that is the way the directories should be defined as well. It is guaranteed not to introduce additional problems. Sorting by class or path, meanwhile, would necessitate changing how logs are written entirely.

If we don’t have a way to use sub-directories, every file has to be independently compressed. As noted in the thread about gzip support, many compression formats are far more effective when files are compressed together.
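
The "compressed together" point can be checked with a quick experiment. The Python sketch below uses made-up log text (not the real session log format) and plain concatenation as a stand-in for a solid archive such as a gzipped tar; `make_fake_log` and `compare_sizes` are hypothetical names.

```python
import gzip
from typing import List, Tuple


def make_fake_log(day: int) -> bytes:
    """Build a small synthetic log; the line contents are invented for the demo."""
    lines = [f"Synthetic session log for day {day}"]
    for turn in range(1, 51):
        lines.append(f"[{turn}] The Haunted Pantry")
        lines.append("Encounter: fiendish can of asparagus")
    return "\n".join(lines).encode("utf-8")


def compare_sizes(logs: List[bytes]) -> Tuple[int, int]:
    """Return (total bytes when gzipped one by one, bytes when gzipped as one stream)."""
    separate = sum(len(gzip.compress(log)) for log in logs)
    combined = len(gzip.compress(b"\n".join(logs)))
    return separate, combined


if __name__ == "__main__":
    logs = [make_fake_log(day) for day in range(1, 31)]
    separate, combined = compare_sizes(logs)
    print(f"separate: {separate} bytes, combined: {combined} bytes")
```

How large the gap is on real logs depends on how much text repeats from day to day, so treat the synthetic numbers as a demonstration of the effect rather than a measurement.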
 
Yeah, all this talk about storage space issues caused by storing a moderately sized text file per character per day really does sound like overthinking things. My KoLmafia folder is less than 3 GB. Meanwhile, I am also storing ~4000 ebooks taking up 7 GB (of which I have read maybe 200) and 40 GB of music, both of which sound like much better places to start looking when worrying about storage space. Sure, plain text files compress very nicely, but that still doesn't mean much if they weren't all that big to begin with.

Also, if some hypothetical file system is supported (and therefore softly enforced), the first thing I would want it to do is split the logs by character name. That sounds much more helpful than splitting into arbitrary timeframes.
 

Saklad5

Member
…That’s a very good point, actually. Character name is also one of the ways KoLmafia splits logs up. It should be the top subdirectory, instead of the year.

I don’t have a multi, so that never even occurred to me.
 

Veracity

Developer
Staff member
Considering that I have a terabyte of disk space and the sessions folder for my 5 characters going back to 2007 consumes 3.96 GB on disk for 16,172 files, I shrug at this.

You want to gzip your session logs? Fine. We support that now.
Do we need to spend ANY more developer time trying to soup up a feature of negligible importance?
I say no.
 

Saklad5

Member
OK, there is clearly not much agreement that this feature is necessary. I thought a lot of people wanted to use ZIP, but I appear to have been mistaken. I was working on a patch, but I’ll stop that for now. Feel free to close this request.
 

fronobulax

Developer
Staff member
Perhaps you misunderstood how many people were already using zip files in an acceptable (to them) manner with no need for KoLmafia support? Zipping and unzipping is trivial in a GUI and available at a command line if it needs to be scripted or automated.
 

Saklad5

Member
I think the real value of the logs is the ability to retroactively spade using them. If they cannot be accessed by KoLmafia, they are much less useful.

Upon reflection, I think I’m happy with getting gzip support implemented. That’s effective enough to ensure logs don’t take up a significant amount of storage.
 

fronobulax

Developer
Staff member
Saklad5 said:
"I think the real value of the logs is the ability to retroactively spade using them. If they cannot be accessed by KoLmafia, they are much less useful."

They can be readily accessed in KoLmafia and nothing KoLmafia did changed their utility or accessibility. You made the decision to make your logs less accessible by compressing them.
 

heeheehee

Developer
Staff member
Saklad5 said:
"I think the real value of the logs is the ability to retroactively spade using them. If they cannot be accessed by KoLmafia, they are much less useful.

Upon reflection, I think I’m happy with getting gzip support implemented. That’s effective enough to ensure logs don’t take up a significant amount of storage."

I don't know about you, but I don't think I've ever used session_logs() for retroactive spading. I either manually take notes (if I'm doing something that requires a small number of carefully-selected datapoints, e.g. determining damage mitigation scaling), write to a (shared) datafile, or just (z)grep through session logs.
 