Help with extracting a portion of a string

ereinion

Member
I'm trying to write a function to extract what image-number hobopolis is at. I came up with the following, which works, but I have a suspicion it's not very efficient. Therefore I wondered if any of the skilled people here at the forums had any suggestions on how I could improve it.

From what I've read at the mafia-wiki I suspect using regular expressions might do the trick, but since I've never used them before nor have any idea of how they work, I would prefer to avoid using them.
Code:
void hoboAvailable() {
	int beginning; int finish; string hobopolis;
	
	if(visit_url("town_clan.php").contains_text("clanbasement.gif") || !visit_url("clan_basement.php?fromabove=1").contains_text("not allowed")) {
		if(visit_url("clan_basement.php").contains_text("opengrate.gif")) {
			print("Found the sewers!", "blue");
			
			if(visit_url("clan_hobopolis.php").contains_text("snarfblat=166")) {
				print("Only the sewers are open","blue");
			} else {
				hobopolis = visit_url("clan_hobopolis.php?place=2");
				beginning = index_of(hobopolis, "hobopolis/townsquare") + 10;
				hobopolis = substring(hobopolis, beginning);
				finish = index_of(hobopolis, ".gif");
				beginning = 10;
				hobopolis = substring(hobopolis, beginning, finish);
				print("Hobopolis image " + hobopolis, "blue");
			}
		}
	} else {
	  print("Can't find the sewers!","red");
	}
}

* Edit * Remembered that it might be useful for people with a wish to help to see the html for "clan_hobopolis.php?place=2". So here you go (this is for a hobopolis-instance which hasn't been adventured in yet):
Code:
<html><head><script language=Javascript><!--if (parent.frames.length == 0) location.href="game.php";//--></script><script language=Javascript src="http://images.kingdomofloathing.com/scripts/keybinds.min.2.js"></script><script language=Javascript src="http://images.kingdomofloathing.com/scripts/window.20111231.js"></script><script language="javascript">function chatFocus(){if(top.chatpane.document.chatform.graf) top.chatpane.document.chatform.graf.focus();}defaultBind(47, CTRL, chatFocus); defaultBind(190, CTRL, chatFocus);defaultBind(191, CTRL, chatFocus); defaultBind(47, META, chatFocus);defaultBind(190, META, chatFocus); defaultBind(191, META, chatFocus);</script><script language=Javascript src="http://images.kingdomofloathing.com/scripts/jquery-1.3.1.min.js"></script><link rel="stylesheet" type="text/css" href="http://images.kingdomofloathing.com/styles.css"></head><body><centeR><table width=95% cellspacing=0 cellpadding=0><tr><td style="color: white;" align=center bgcolor=blue><b>Hobopolis Town Square</b></td></tr><tr><td style="padding: 5px; border: 1px solid blue;"><center><table><tr><td><map name="townsquare"><area shape="rect" coords="120,265,378,435" href="adventure.php?snarfblat=167" alt="Hobopolis Town Square (1)"><area shape="rect" coords="427,602,500,647" href="clan_hobopolis.php?place=1" alt="Back to the Sewers"><area shape="poly" coords="371,670,437,539,298,530,244,600,255,694" href="clan_hobopolis.php?place=3" alt="Richard's Redoubt"></map><center><img src="http://images.kingdomofloathing.com/otherimages/hobopolis/townsquare0.gif" width=500 height=744 border=0 usemap="#townsquare" alt="Hobopolis Town Square (picture #0)" title="Hobopolis Town Square (picture #0)"></td></tr></table><center><p><a href="clan_basement.php">Go back to your Clan Basement</a></center></td></tr></table></center></td></tr><tr><td height=4></td></tr></table></center></body></html>
 

Catch-22

Active member
Jamie Zawinski said:
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

To answer your question, this should work a little better :)

Code:
void hoboAvailable() {
    int beginning; int finish; string hobopolis;
    
    hobopolis = visit_url("clan_hobopolis.php?place=2");
    if(length(hobopolis) > 0) {
        print("Found the sewers!", "blue");
        if(hobopolis.contains_text("snarfblat=166")) {
            print("Only the sewers are open","blue");
        } else {
            beginning = index_of(hobopolis, "hobopolis/townsquare") + 10;
            hobopolis = substring(hobopolis, beginning);
            finish = index_of(hobopolis, ".gif");
            beginning = 10;
            hobopolis = substring(hobopolis, beginning, finish);
            print("Hobopolis image " + hobopolis, "blue");
        }
    } else {
      print("Can't find the sewers!","red");
    }
}

It works mainly by limiting the number of hits to the KoL servers (to one, in fact), which is really the biggest bottleneck in all of this.
 

ereinion

Member
Thanks for the suggestion :) The reason I didn't do it that way in the first place was that it was implied in this post/thread that getting an empty page from visit_url() can cause some problems. So I stole those first few checks from there :D

But I'll give your suggestion a try, and see how it does in a clan with no basement/where I haven't passed the sewers. Again, thanks for the help :)
 

Theraze

Active member
I believe that changed at some point to not bail out if your visit_url is a failed load... but I couldn't tell you when without way more search-fu than I'm capable of after the latest encounter with the angry car gods. :)
 

ereinion

Member
Yup, the fix worked, except I didn't have to check if the string was empty at all.
Code:
if(!hobopolis.contains_text("You don't have access to Hobopolis"))
was the thing to check for in a clan without a basement.

Guess I'll have to check if visit_url handles empty strings some other time :)
 

Catch-22

Active member
ereinion said:
Yup, the fix worked, except I didn't have to check if the string was empty at all.
Code:
if(!hobopolis.contains_text("You don't have access to Hobopolis"))
was the thing to check for in a clan without a basement.

Guess I'll have to check if visit_url handles empty strings some other time :)

Well.. That will tell you if you don't have access to the basement, it won't tell you that your clan doesn't have a basement :) If your clan doesn't have a basement, you will get an empty page.

It's also worth noting, that if you're not in a clan at all, you'll be redirected to the clan_signup.php page, where I guess you could search for "<b>Apply to a Clan</b>" or something similar.

Depends how robust you want your function to be, I suppose.

This should work:

Code:
void hoboAvailable() {
    int beginning; int finish; string hobopolis;
    
    hobopolis = visit_url("clan_hobopolis.php?place=2");
    if(hobopolis.contains_text("snarfblat=166")) {
        print("Only the sewers are open","blue");
    } else if(hobopolis.contains_text("hobopolis/townsquare")) {
        beginning = index_of(hobopolis, "hobopolis/townsquare") + 10;
        hobopolis = substring(hobopolis, beginning);
        finish = index_of(hobopolis, ".gif");
        beginning = 10;
        hobopolis = substring(hobopolis, beginning, finish);
        print("Hobopolis image " + hobopolis, "blue");
    } else {
      print("Can't find the sewers!","red");
    }
}
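
The cases discussed above (empty page, redirect to the clan signup page, no access, sewers only, town square open) could be separated out roughly like this. It is only a sketch: the helper name hobopolis_state is made up, and every marker string is copied from posts in this thread rather than checked against the live pages.

```ash
// Hypothetical helper: classifies the text of clan_hobopolis.php?place=2.
// All marker strings come from this thread and are unverified assumptions.
string hobopolis_state( string page )
{
	if ( length( page ) == 0 )
		return "empty page";
	if ( page.contains_text( "<b>Apply to a Clan</b>" ) )
		return "not in a clan";
	if ( page.contains_text( "You don't have access to Hobopolis" ) )
		return "no access";
	if ( page.contains_text( "snarfblat=166" ) )
		return "sewers only";
	if ( page.contains_text( "hobopolis/townsquare" ) )
		return "town square open";
	return "unknown";
}

void main()
{
	// Still a single server hit; the helper only inspects the text.
	string page = visit_url( "clan_hobopolis.php?place=2" );
	print( "Hobopolis state: " + hobopolis_state( page ), "blue" );
}
```

Because the helper takes the page text as a parameter, it can be tried on a saved copy of the html without hitting the server at all.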
 

Grotfang

Developer
Code:
        beginning = index_of(hobopolis, "hobopolis/townsquare") + 10;
        hobopolis = substring(hobopolis, beginning);
        finish = index_of(hobopolis, ".gif");
        beginning = 10;
        hobopolis = substring(hobopolis, beginning, finish);
        print("Hobopolis image " + hobopolis, "blue");

I think this section is the same as this:
Code:
	matcher hobo_check = create_matcher( "hobopolis/townsquare(.+?).gif" , hobopolis );	
	if( hobo_check.find() )
		print( "Hobopolis image " + hobo_check.group( 1 ) , "blue" );
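
For anyone new to matchers, here is a slightly stricter variant of the snippet above, offered as a sketch rather than a replacement. It assumes the image number is always a run of digits, and the helper name town_square_image is invented for illustration.

```ash
// Hypothetical helper (name not from the thread): returns the town square
// image number found in the page text, or -1 if the image is not there.
int town_square_image( string page )
{
	// (\\d+) captures digits only (an assumption about the file names);
	// \\. matches a literal dot, where a bare . would match any character.
	matcher m = create_matcher( "hobopolis/townsquare(\\d+)\\.gif", page );
	if ( m.find() )
		return to_int( m.group( 1 ) );
	return -1;
}

void main()
{
	string page = visit_url( "clan_hobopolis.php?place=2" );
	print( "Hobopolis image " + town_square_image( page ), "blue" );
}
```

The original (.+?) pattern gives the same result on the page posted earlier in the thread; the stricter version just fails more loudly if the image name ever changes shape.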
 

ereinion

Member
Catch-22 said:
Well.. That will tell you if you don't have access to the basement, it won't tell you that your clan doesn't have a basement :) If your clan doesn't have a basement, you will get an empty page.
Nope, have checked it in clans with a basement (both one I had access to and one I didn't), without a basement, and one where the sewer was closed (which made some other trouble further down the script, but that was easy to fix :)), and still haven't found an empty string - this may be something that has changed recently, though.

Grotfang said:
(...)
I think this section is the same as this:
Code:
	matcher hobo_check = create_matcher( "hobopolis/townsquare(.+?).gif" , hobopolis );	
	if( hobo_check.find() )
		print( "Hobopolis image " + hobo_check.group( 1 ) , "blue" );
Thanks for the suggestion, even if it uses reg-ex ;) However, after poring over the mafia-wiki's page on those, I think I understand what it is the code snippet you posted is saying, so I think I'll introduce it to my script. It now looks something like this (in case anyone is interested):

Code:
int hoboAvailable() {
	string hobopolis; int imageNumber;
	
	hobopolis = visit_url("clan_hobopolis.php?place=2");
	
	if(!hobopolis.contains_text("You don't have access to Hobopolis")) {
		print("Found the sewers!", "blue");
		if(hobopolis.contains_text("snarfblat=166")) {
			print("Only the sewers are open","blue");
		} else if (hobopolis.contains_text("sewergrate.gif")) {
			print("The sewers are closed.","blue");
		} else {
                        // Thanks to Grotfang for setting up this matcher/reg-ex :) Guess I'll have to learn to do that stuff myself, some day
			matcher hobo_check = create_matcher( "hobopolis/townsquare(.+?).gif" , hobopolis );	
			if( hobo_check.find() ) {
				print( "Hobopolis image " + hobo_check.group( 1 ) , "blue" );
				imageNumber = to_int(hobo_check.group( 1 ));
			}
		}
			(...)
		
	} else {
	        print("Can't find the sewers!","red");
		return -1;
	}
	return -1;
}
It works like a charm :)
 

Grotfang

Developer
ereinion said:
It works like a charm :)

Glad it works. The part I posted, at least, is the code I use in my hamster script.

Just a suggestion here, but it looks as though the code you're posting here is part of something bigger. Unless hoboAvailable()'s return is irrelevant (in which case it should be void), or you do something within the (...) part to return something other than -1, your function will always return -1. May I suggest that it should either be a boolean function, finishing with return true (since you return false if you can't find the sewers), or you return the number that hobopolis is currently on.

If you do something with the imageNumber later, then the function does more than its name suggests and my advice is probably not very helpful :)
 

ereinion

Member
The entire function (thought it might be a bit much to post the entire thing here, but whatever :)):
Code:
int hoboAvailable() {
	string hobopolis; int imageNumber;
	
	hobopolis = visit_url("clan_hobopolis.php?place=2");
	
	if(!hobopolis.contains_text("You don't have access to Hobopolis")) {
		print("Found the sewers!", "blue");
		if(hobopolis.contains_text("snarfblat=166")) {
			print("Only the sewers are open","blue");
		} else if (hobopolis.contains_text("sewergrate.gif")) {
			print("The sewer is closed.","blue");
		} else {
			// Thanks to Grotfang for setting up this matcher/reg-ex :) Guess I'll have to learn to do that stuff myself, some day
			matcher hobo_check = create_matcher( "hobopolis/townsquare(.+?).gif" , hobopolis );	
			if( hobo_check.find() ) {
				print( "Hobopolis image " + hobo_check.group( 1 ) , "blue" );
				imageNumber = to_int(hobo_check.group( 1 ));
				hoboZonesAvailable[$location[Hobopolis Town Square]] = true;
			}
			if (imageNumber == 125) {
				imageNumber = 13;
			}
			switch {
				case imageNumber==26:
					print("Hodgeman is dead, nothing more to do here","red");
					hoboZonesAvailable[$location[Hobopolis Town Square]] = false;
					return 0;
					break;
				case imageNumber==25:
					print("Hodgeman is ready for a fight", "blue");
					hoboZonesAvailable[$location[Hobopolis Town Square]] = false;
					sideZoneAvailable($location[The Purple Light District]);
					sideZoneAvailable($location[The Ancient Hobo Burial Ground]);
					sideZoneAvailable($location[The Heap]);
					sideZoneAvailable($location[Exposure Esplanade]);
					sideZoneAvailable($location[Burnbarrel Blvd.]);
				case imageNumber>=11:
					print("All side-areas should be open", "blue");
					sideZoneAvailable($location[The Purple Light District]);
					sideZoneAvailable($location[The Ancient Hobo Burial Ground]);
					sideZoneAvailable($location[The Heap]);
					sideZoneAvailable($location[Exposure Esplanade]);
					sideZoneAvailable($location[Burnbarrel Blvd.]);
					return 6;
					break;
				case imageNumber>=9:
					print("All side-areas except PLD should be open", "blue");
					sideZoneAvailable($location[The Ancient Hobo Burial Ground]);
					sideZoneAvailable($location[The Heap]);
					sideZoneAvailable($location[Exposure Esplanade]);
					sideZoneAvailable($location[Burnbarrel Blvd.]);
					return 5;
					break;
				case imageNumber>=7:
					print("All side-areas except PLD and the burial ground should be open", "blue");
					sideZoneAvailable($location[The Heap]);
					sideZoneAvailable($location[Exposure Esplanade]);
					sideZoneAvailable($location[Burnbarrel Blvd.]);
					return 4;
					break;
				case imageNumber>=5:
					print("Burnbarrel Blvd. and Exposure Esplanade should be open", "blue");
					sideZoneAvailable($location[Exposure Esplanade]);
					sideZoneAvailable($location[Burnbarrel Blvd.]);
					return 3;
					break;
				case imageNumber>=3:
					print("Burnbarrel Blvd. should be open", "blue");
					sideZoneAvailable($location[Burnbarrel Blvd.]);
					return 2;
					break;
				case imageNumber<3:
					print("No side zones are open yet", "blue");
					return 1;
					break;
				default:
					print("Something went wrong with figuring out which areas are open", "red");
					return -1;
			}
		}
	} else {
	  print("Can't find the sewers!","red");
		return -1;
	}
	return -1;
}
I suppose I could have checked for the image text for availability of the side zones instead, but I like using the image number better :p
 

Catch-22

Active member
So this is probably worthy of a recap. One of the best things you can do to optimize your scripts is to minimize the number of hits your script needs to make on the KoL servers. Extract as much meaningful data as you can from as few page hits as possible and you'll be on the right track.

As far as string processing goes, well, I've done some fairly extensive tests in the past and in most cases you're not even going to be able to tell the performance difference between a good parsing function and a matcher. I would usually just go with whatever is the easiest to understand :)
 

Theraze

Active member
Hodgeman being ready for a fight doesn't have its own return number. Do you want it to? If not, you don't need to double-define the sideZoneAvailable on the others that it bleeds into... setting them once is fine. :)
 

ereinion

Member
Like so?
Code:
case imageNumber==25:
	print("Hodgeman is ready for a fight", "blue");
	hoboZonesAvailable[$location[Hobopolis Town Square]] = false;
case imageNumber>=11:
	print("All side-areas should be open", "blue");
	sideZoneAvailable($location[The Purple Light District]);
	sideZoneAvailable($location[The Ancient Hobo Burial Ground]);
	sideZoneAvailable($location[The Heap]);
	sideZoneAvailable($location[Exposure Esplanade]);
	sideZoneAvailable($location[Burnbarrel Blvd.]);
	return 6;
	break;

Oh, and I'm fairly certain the hoboAvailable() function only hits the server once - don't know if there's much more I can do to minimize pagehits :) Unless there's something I can do to get the html for all the zones in one hit? As it is I'm doing
Code:
string sidezone = visit_url("clan_hobopolis.php?place=3");
and so forth to get the strings to test for the sidezones. Any way to reduce this?
 

Theraze

Active member
Yep, that's cleaner. :)

Two hits, I think, but... The first hit is checking the basement to see if it's sewers, hobo, or no. Second hit is checking what image is being used in their zone. My guess is that they were commenting on your original code that did a ton of hits.
 

ereinion

Member
Nope, I can use the hit on "clan_hobopolis.php?place=2" to see if I have access to the sewers at all, which is really all I care about. After a ctrl+f of the function, I could only find one instance of visit_url :-D

Oh, and I really want to say thanks to all of you who have contributed here again, your feedback has really taught me quite a bit.
 

Theraze

Active member
True... I was looking at the matcher, which is just a search, not an actual hit. Teach me to try to parse after a full day of work. Bah. :)
 

StDoodle

Minion
For limiting server hits for scripts that do extensive parsing of page text, what I've done in the past is make use of a global string [string] map, which is page text keyed by url. Then, I use a cached_url() function that returns the results of visit_url() if the page has yet to be visited and saves it in the map, otherwise it just returns the saved result. You could use file_to_map() and map_to_file() for additional caching.
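
A minimal sketch of the caching idea described above, under a few assumptions: the map name page_cache and the helper forget_url are made up for illustration, and the cache lasts only as long as the script runs (the file_to_map()/map_to_file() step is left out).

```ash
// Page text keyed by URL.
string [string] page_cache;

// Returns the saved text for url, calling visit_url() only the first
// time that url is requested.
string cached_url( string url )
{
	if ( !( page_cache contains url ) )
	{
		string page = visit_url( url );
		page_cache[ url ] = page;
	}
	return page_cache[ url ];
}

// Hypothetical addition: drop one entry so the next cached_url() call
// fetches a fresh copy, e.g. after adventuring has changed that page.
void forget_url( string url )
{
	remove page_cache[ url ];
}

void main()
{
	string first = cached_url( "clan_hobopolis.php?place=2" );   // hits the server
	string second = cached_url( "clan_hobopolis.php?place=2" );  // served from the map
	print( "Same text both times: " + ( first == second ), "blue" );
}
```

The catch is staleness: anything that changes a page (adventuring in a zone, for instance) needs a matching forget_url() call, otherwise the script keeps parsing the old text.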
 

ereinion

Member
Thanks for the tip. Might try to implement that in my script later on, even if it runs fast enough for my tastes at the moment :) But right now I'm too tired to actually concentrate on what I need to change to make it work. Will probably have to add some kind of check for what areas/urls may have changed since I last checked them - probably another global map containing a boolean for each of the areas, maybe?

Another issue I ran into, which doesn't really have anything to do with the topic of this thread, is when trying to collect semi-rares from the side-zones - I've come up with these three options, only one of which works:
Code:
(1) adv1(semiRarePriority, 0, "");
(2) adventure(1, semiRarePriority);
(3) if (adventure(1, semiRarePriority)) {}
I can understand why option 2 doesn't work, but according to adv1's wiki page option 1 ought to work. Am I missing something, or is the information on the wiki-page wrong/the function bugged? I guess I can just use option 3, but it looks messier in its output than I'd have preferred :p

Hopefully someone will come by and give me a pointer, even if it has nothing to do with what I originally asked for help with ;)
 

Theraze

Active member
Easiest thing to do (if you have conditions set) is
Code:
(!adventure(1, semiRarePriority));
Looks like it shouldn't work, but all it really means is that, regardless of the outcome of your adventure, keep running automation.
 

ereinion

Member
Thanks! Will it work even if I haven't got conditions set? I probably will have most of the time, but not necessarily... Also, have you got an idea why adv1(loc, 0, "") doesn't work? It seems to me it should :p
 