Regular Expressions/Matchers

Alhifar

Member
I know that the Matcher datatype is for regular expressions. However, short of trying to understand the source code (which would be rather hard for me; I'm horrible with Java), I haven't been able to find anywhere that describes how matchers are to be used.

Is anyone able to help me out?

PS. For anyone who might be able to help but doesn't want to run to ashref, here are the (presumably) relevant functions:
Code:
buffer append_tail( matcher, buffer )
buffer append_replacement( matcher, buffer, string )
matcher create_matcher( string, string )
boolean find( matcher )
boolean start( matcher )
boolean end( matcher )
string group( matcher )
string group( matcher, int )
int group_count( matcher )
string replace_first( matcher, string )
string replace_all( matcher, string )
matcher reset( matcher )
matcher reset( matcher, string )
 

jasonharper

Developer
Other than create_matcher (which takes a regex pattern and a string to be matched), all of these are just passed through to the underlying Java implementation, and therefore the Java documentation is going to be the definitive source of info about them.

Unfortunately, the online version of the Java documentation is in the form of a frameset, making it impossible to link directly to specific sections. What you'll need to do is:
1. Go to http://java.sun.com/j2se/1.4.2/docs/api/.
2. In the top left frame, scroll down to java.util.regex and click on it.
3. In the bottom left frame, you can now click on Matcher for info on the matcher methods, or Pattern for info on the allowable forms of regular expressions.

Note that the first 'matcher' parameter in all these functions won't actually be shown in this documentation; it's implicit in Java, where each method is called on the Matcher object itself.
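
For anyone who wants to see the Java side of this, here is a minimal sketch of the find/group loop and of replace_all as they look in plain Java. The class and method names (MatcherBasics, allNumbers, maskNumbers) and the sample text are made up for illustration, and the mapping from each ASH function to a Java method is assumed from the description above rather than checked against the KoLmafia source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherBasics {
    // Collects group 1 of every match, the way a find()/group() loop would in a script.
    static List<String> allNumbers(String text) {
        // Assumption: create_matcher("(\\d+)", text) amounts to roughly this.
        Matcher m = Pattern.compile("(\\d+)").matcher(text);
        List<String> found = new ArrayList<>();
        while (m.find()) {          // find( matcher )
            found.add(m.group(1));  // group( matcher, 1 )
        }
        return found;
    }

    // replace_all( matcher, string ) is assumed to correspond to Matcher.replaceAll(String).
    static String maskNumbers(String text) {
        return Pattern.compile("\\d+").matcher(text).replaceAll("#");
    }

    public static void main(String[] args) {
        String text = "You gain 15 Meat and 3 adventures";
        System.out.println(allNumbers(text));   // [15, 3]
        System.out.println(maskNumbers(text));  // You gain # Meat and # adventures
    }
}
```

Note how the matcher never appears as a parameter on the Java side: it is the object the method is called on.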
 

Bale

Minion
Could someone tell me exactly what the buffer datatype is and how it is used? It seems to carry character data like a string, so I'm puzzled why there are two datatypes for character data.
 

jasonharper

Developer
A buffer is basically a string that can still be modified. They're primarily used to build up a string from individual pieces, much more efficiently than concatenating strings together a piece at a time. (Which is actually done with buffers internally: a new one is created, the pieces being concatenated are appended to it, then the buffer is converted to a string.)

Strings, being unmodifiable, are more efficiently stored in memory. They're also suitable for usage where modifiability would be undesirable, such as keys in a map. If a buffer was allowed to be used as a key, and was later modified, the corresponding item would likely never be found again, as it would no longer be in the right place in the map.
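
The map-key problem can be sketched in plain Java. This is only an illustration: a Java list stands in for a modifiable buffer (an assumption made so the effect is visible), and the names MutableKeyDemo and foundAfterMutation are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MutableKeyDemo {
    // Returns true if the entry can still be found after its key has been modified.
    static boolean foundAfterMutation() {
        Map<List<String>, String> map = new HashMap<>();
        List<String> key = new ArrayList<>();
        key.add("a");
        map.put(key, "value");  // stored under the hash code of ["a"]
        key.add("b");           // the key is now ["a", "b"], so its hash code changes
        // The lookup uses the new hash code, which no longer matches the stored one.
        return map.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());  // false
    }
}
```

The entry is still in the map, but neither the modified key nor a fresh copy of the original key will find it.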
 

Bale

Minion
Thanks, that's interesting. So, when I do this:

Code:
string first = "I am";
string second = "here.";
first = first + " " + second;

It is actually being inefficiently converted to a buffer, then back into a string again?
 

Veracity

Developer
Staff member
No. That is string concatenation passed right through to Java. I suspect that it looks at the lengths of the two strings, allocates memory sufficient to hold the concatenation, and copies the two strings into it at the appropriate places.

Now, imagine if you had this:

string result = "";
result = result + "...";
result = result + "...";
result = result + "...";
result = result + "...";
...
...and on and on.

Each time through, it allocates a string big enough to hold what was there before and what is being added.

If you did it with a buffer, it might, say, allocate a space for 100 characters and keep adding things to the end of it. If you exceeded 100 characters, it would allocate a space for 200, copy in the first 100, and you would continue to fill up the extra space in the 200 characters.

Only at the end would it take however many characters were written into your buffer and make a "string" out of them.

Net result: lots less data copying (i.e., less time) and a lot less allocation and reallocation (also, less time).

Using buffers is faster and, probably, more memory efficient (in the sense of less throw-away temporary memory being allocated) than string concatenation. If you are only appending 2 or 3 strings, just use string concatenation. But if you are adding lots and lots of little pieces to a string, it's more efficient and more readable to use a buffer.
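
The two approaches look like this in plain Java. This is a sketch, not a benchmark: it uses StringBuilder (the unsynchronized sibling of StringBuffer), the names BufferVsConcat, joinWithConcat and joinWithBuffer are hypothetical, and the initial capacity of 100 just mirrors the figure used above. How the buffer actually grows is an implementation detail.

```java
class BufferVsConcat {
    // Repeated concatenation: every pass builds a brand new string
    // holding everything accumulated so far plus the new piece.
    static String joinWithConcat(String[] pieces) {
        String result = "";
        for (String piece : pieces) {
            result = result + piece;
        }
        return result;
    }

    // Buffer approach: pieces are appended into one growing buffer,
    // and a string is made from it only once, at the end.
    static String joinWithBuffer(String[] pieces) {
        StringBuilder buffer = new StringBuilder(100);  // room for 100 characters to start
        for (String piece : pieces) {
            buffer.append(piece);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String[] pieces = { "...", "...", "..." };
        System.out.println(joinWithConcat(pieces).equals(joinWithBuffer(pieces)));  // true
    }
}
```

Both methods produce the same text; the difference is only in how much copying and allocation happens along the way.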
 

Bale

Minion
I see. Then at the end, like in this:
Code:
int queue = 0;
string tosell ="";
foreach i in auto_sell
	if(item_amount(i) > auto_sell[i] && auto_sell[i]>=0) {
		if(tosell.length() != 0)
			tosell= tosell + ", ";
		if(auto_sell[i] != 0)
			tosell = tosell + to_string(item_amount(i)-auto_sell[i])+ " "+ to_string(i);
		else tosell= tosell + "* "+ to_string(i);
		queue = queue + 1;
		if(queue == 11) {
			print("autosell "+tosell, "blue");
			cli_execute("autosell "+tosell);
			tosell ="";
			queue = 0;
		}
	}
if (tosell.length() != 0) {
	print("autosell "+tosell, "blue");
	cli_execute("autosell "+tosell);
}
Since cli_execute takes a string, if I made tosell into a buffer, would I need to use to_string(buffer) to convert it?
 

Alhifar

Member
Buffers are implicitly converted to strings when necessary. Actually, quite a few of the data types can be implicitly cast to strings.
 

mredge73

Member
Alright, the Java link above does not explain how to use this:

matcher create_matcher( string, string )

From what I have looked at on Alhifar's scripts:
The first string is the pattern to look for and the second is where to look.

Alhifar also uses:
(.+?) as a group to capture everything, but for me it sometimes captures only the first letter. The (\\d+) group captures integers, but it does not capture ".", so I don't think it will work on floats. This is what I was able to find through experimentation.
Running a search on the Java site referenced above did not return any results for (\\d+) or for (.+?).


Problem found: a problem can occur if you don't get the first string exactly right. You can still adventure manually, but abort does not stop mafia from continuing to execute the script, so no new scripts can be executed until you restart mafia.

This is terribly confusing, I shouldn't have tried working with strings, I hate strings.
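
On the two group patterns mentioned above, a small Java sketch may help. `.+?` is a reluctant (non-greedy) quantifier: it matches as few characters as it can, so on its own it stops after one character, and it only captures more when something after it in the pattern forces it to. `\d+` matches digits only, so it stops at a decimal point. The doubled backslash is just how a backslash is written inside a string literal; the regex itself is `\d+`, which is presumably why searching the documentation for `(\\d+)` finds nothing. The helper name firstGroup, the sample strings, and the float pattern are illustrations, not something from Alhifar's scripts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupPitfalls {
    // Returns group 1 of the first match, or null if the pattern does not match.
    static String firstGroup(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Reluctant group with nothing after it: one character is enough to match.
        System.out.println(firstGroup("(.+?)", "Meat"));                            // M
        // Anchored by literal text on both sides, it expands only as far as needed.
        System.out.println(firstGroup("gain (.+?) Meat", "You gain 1,500 Meat"));   // 1,500
        // Digits only: the match ends at the decimal point.
        System.out.println(firstGroup("(\\d+)", "3.75"));                           // 3
        // One possible pattern for a number with an optional fractional part.
        System.out.println(firstGroup("(\\d+(?:\\.\\d+)?)", "3.75"));               // 3.75
    }
}
```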
 

Catch-22

Active member
Bale said:
Thanks, that's interesting. So, when I do this:

Code:
string first = "I am";
string second = "here.";
first = first + " " + second;

It is actually being inefficiently converted to a buffer, then back into a string again?

Haha, well, it's a dead thread by now, but you were actually on the right track. Strings are immutable, so each time you concatenate strings, they get converted to StringBuffers and appended to. If you're doing a lot of concatenation, it works out to be much less expensive to convert the string to a StringBuffer once and append to the buffer as many times as needed, then convert back at the end using to_string.
 