Thread: file_to_string() or file_to_map(string,string) or session_logs(string,int,int)

  1. #11
    fronobulax (Developer)
    Join Date
    Feb 2009
    Location
    Central Virginia, USA
    Posts
    4,182

    I asked out of a sense of composition. To continue my series of error-prone pontifications, I was thinking that using map_to_file() to write a string[int] would result in a file with line number, tab, string on each line, whereas the original file was just string.

    I'm not seeing the relevance of string interning, and not seeing how the existence of array_to_file() and file_to_array() changes that, but that could just be me needing to pay more attention.
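
    The asymmetry described above can be modeled outside of ASH. This is a minimal Python sketch, not KoLmafia's implementation: lines_to_int_map and write_map are hypothetical names, and numbering the keys from 1 is an assumption of the sketch. It only shows why writing a string[int] as key, tab, value does not reproduce a file that held bare strings.

```python
def lines_to_int_map(text):
    """Model a string[int]: one entry per line, keyed by line number.

    Starting the keys at 1 is an assumption of this sketch.
    """
    return {n: line for n, line in enumerate(text.splitlines(), start=1)}


def write_map(int_map):
    """Model the 'line number, tab, string' layout described in the post."""
    return "".join(f"{key}\t{value}\n" for key, value in sorted(int_map.items()))


original = "first line\nsecond line\n"
written = write_map(lines_to_int_map(original))

# Each written line now carries its key and a tab in front of the original text.
print(repr(written))
```

    Stripping everything up to and including the first tab on each line would recover the original text in this model.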

  2. #12

    I have string[int] file_to_array( string filename ) written, and it seems to work. I'll likely commit it tomorrow, in case I think of something to change.

    This change also adds sessions to the list of folders that can generally be accessed for file reading/writing. That seems fine since session_logs() already provided access to read from there.

    Unlike file_to_map(), if the file doesn't exist, the return value is an empty array. Since that's easy to check using count( array ) == 0, that seems fine.
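
    A rough Python model of the missing-file behavior described above. The function below is a hypothetical stand-in that borrows the ASH name for readability; it is not KoLmafia's code, and the 1-based keys are an assumption. One thing the model makes visible: a file that exists but has no lines also produces zero entries, so a count check alone may not distinguish "missing" from "empty". Whether ASH behaves the same way for empty files is not stated in the thread.

```python
import os
import tempfile


def file_to_array(path):
    """Hypothetical Python stand-in for the ASH function.

    Returns {line_number: line}; a missing file yields an empty dict.
    Keys start at 1 here, which is an assumption of this sketch.
    """
    if not os.path.isfile(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        return {n: line.rstrip("\n") for n, line in enumerate(handle, start=1)}


with tempfile.TemporaryDirectory() as folder:
    # Case 1: the file does not exist at all.
    missing = file_to_array(os.path.join(folder, "no_such_file.txt"))

    # Case 2: the file exists but contains no lines.
    empty_path = os.path.join(folder, "empty.txt")
    open(empty_path, "w", encoding="utf-8").close()
    empty = file_to_array(empty_path)

# Both cases have zero entries in this model.
print(len(missing) == 0, len(empty) == 0)
```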

  3. #13
    Junior Member
    Join Date
    Nov 2011
    Posts
    13

    Originally Posted by fronobulax
    I asked out of a sense of composition. To continue on my series of error prone pontifications I was thinking that using map_to_file to write a string[int] would result in a file with line number, tab, string whereas the original file was just string.

    Not seeing the relevance of string interning and not seeing how the existence of array_to_file and file_to_array changes that but that could just be me needing to pay more attention.

    Yeah, your understanding of map_to_file() applied to a string[int] here seems correct. The thing is, once you import the file with file_to_array(), you can manipulate it however you want. I suspect many of my applications will involve taking a string[int] given by file_to_array(), appending each string in there to a buffer, doing stuff and/or things to that buffer (most likely lots of group_string()-related parsing shenanigans), and then throwing the result or the buffer itself into a file with map_to_file(). Throwing the buffer into a file would result in a file that would be identical to the initial file with the exception of maybe a trailing or leading tab (if you, say, make it into a buffer[string] with the empty string as the only key) and no line breaks (unless you add those back in while appending). I guess what I'm trying to say is that aggregate manipulations that already exist in ASH can be used to make map_to_file() behave practically identically to the proposed array_to_file() with minimal effort on the user's end.

    Regarding string interning, I'm just saying that people doing lots of manipulations on the string[int] resulting from file_to_array() might unwittingly run into issues that they wouldn't if they were doing those to a buffer[int] aggregate, speaking as someone who has had problems in that vein. I don't think the existence of either array_to_file() or file_to_array() would necessarily exacerbate that, just that people might be less likely to encounter it if file_to_array() could return a buffer[int]. As described above, it's very straightforward for a user to convert a string[int] to a buffer or buffer-based aggregate, so people who encounter issues already have tools at their disposal to address them, just slightly less intuitively than if file_to_array() could be told to populate a buffer[int]. I'm certainly not suggesting spending any time at all addressing this, just sharing a potential complication of a new feature encouraging people to muck around with large-scale string manipulations.

  4. #14
    Veracity (Developer)
    Join Date
    Mar 2006
    Location
    The Unseelie Court
    Posts
    11,446

    A couple comments:

    1) When I refer to an "array" of strings in ASH, I am referring to the thing you get from, for example:

    string[10] array1; // elements indexed 0 - 9, an actual Java array underneath, not a map
    string[] array2 = { "abc", "def", "ghi" }; // elements indexed 0 - 2, ditto

    Which is to say an "array of strings" behaves programmatically like "string [int]", except it will throw a runtime exception if you don't stay within the bounds.

    2) String values, as read by file_to_map, effectively intern strings: they are "unescaped" by storing them, character by character, into a buffer, turning \t and \n and \\ into tab and newline and backslash, respectively, and then getting buffer.toString(). That detaches each string from the line it was read in and the page it came from, and so on.

    3) String values processed by map_to_file do the opposite transformation; newlines and tabs and backslashes turn into \n, \t, and \\.

    4) Given that file_to_array stores one line per string, should I assume you'd like the same transformation done to allow those escape characters?

    5) And both escape and unescape could be optimized to not bother going through a buffer if there are no characters that will be transformed, but in that case, string interning will have to be done explicitly, rather than as a side effect of moving to a buffer and back.
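
    The escape and unescape transformations in points 2, 3 and 5 can be sketched in Python. This is an illustration only, not the KoLmafia source: the function names are hypothetical, leaving unrecognized escape sequences untouched is an assumption, and the early return models only the "skip the buffer when nothing needs transforming" shortcut from point 5. String interning has no counterpart in this sketch.

```python
# Escape letter -> real character, for the three sequences named above.
UNESCAPES = {"t": "\t", "n": "\n", "\\": "\\"}


def unescape(text):
    """Replace backslash-t, backslash-n and double backslash with the real characters."""
    if "\\" not in text:
        return text  # nothing to transform, so skip the character-by-character pass
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == "\\" and nxt in UNESCAPES:
            out.append(UNESCAPES[nxt])
            i += 2
        else:
            out.append(ch)  # assumption: unknown escapes are kept as-is
            i += 1
    return "".join(out)


def escape(text):
    """Inverse transformation; backslashes are handled first so they are not doubled twice."""
    return text.replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")


sample = "name\tvalue\nC:\\temp"
# Escaping and then unescaping gives back the original string.
assert unescape(escape(sample)) == sample
```

    Because an escaped value never contains a raw tab or newline, it can sit in a tab-separated, one-record-per-line file without being mistaken for a field or record boundary.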

  5. #15

    Added file_to_array() in 18740.

  6. #16
    Veracity (Developer)
    Join Date
    Mar 2006
    Location
    The Unseelie Court
    Posts
    11,446

    You actually did "file to string[int]" - i.e., file to map of strings indexed by integer.
    Most programs will not notice the difference, although it is (minutely) less efficient under the hood.
    Also no "escape/unescape" transformation for reading/writing. Whatever, that can be done in the user program, and yojimbos_law had nothing to say to my question about whether that was important to him.

    Given that, I guess this is implemented.
