the DOM, regex, scalability, and other jargony words

Bale

Minion
HTML:
> test xpath charpane.html //center[3]//td[@valign="center"]/text()

1: The Sonata of Sneakiness (2)
2: Smooth Movements (7)
3: Peeled Eyeballs (7)
4: Pisces in the Skyces (12)
5: Elemental Saucesphere (52)
6: Empathy (815)
7: Fat Leon's Phat Loot Lyric (822)
8: Springy Fusilli (825)
9: Leash of Linguini (825)
10: Polka of Plenty (873)
11: Spirit of Bacon Grease (∞)

Effects.

Whoa. That was cool!! Is there a way to get the familiar block regardless of it being above or below effects?
 

roippi

Developer
Whoa. That was cool!! Is there a way to get the familiar block regardless of it being above or below effects?

Sure.

HTML:
> test xpath charpane.html //tr[td[a[@class="familiarpick"]]]/text()

1: Tron, the 35 pound Rogue Program

> test xpath charpane-fambelow.html //tr[td[a[@class="familiarpick"]]]/text()

1: Tron, the 35 pound Rogue Program
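
The reason that query survives the familiar block moving is that it picks the row by what it contains, not by where it sits. Here is a minimal sketch of the same idea using only Python's standard library. The markup is a simplified, well-formed stand-in for the two charpane layouts, not real charpane HTML, and `familiar_rows` is a hypothetical helper name. ElementTree supports only a subset of XPath, so the nested predicate is expressed as a loop plus a short path.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-ins for the two layouts (assumed, not real charpane markup).
FAMILIAR = (
    '<table><tr>'
    '<td><a class="familiarpick" href="familiar.php"><img src="tron.gif"/></a></td>'
    '<td>Tron, the 35 pound Rogue Program</td>'
    '</tr></table>'
)
EFFECTS = '<table><tr><td>Empathy (815)</td></tr></table>'

FAM_ABOVE = f'<body>{FAMILIAR}{EFFECTS}</body>'
FAM_BELOW = f'<body>{EFFECTS}{FAMILIAR}</body>'


def familiar_rows(markup):
    """Return the text of every <tr> that has a td child holding a.familiarpick."""
    root = ET.fromstring(markup)
    rows = []
    for tr in root.iter('tr'):
        # Same idea as //tr[td[a[@class="familiarpick"]]]: select rows by content.
        if tr.find("td/a[@class='familiarpick']") is not None:
            rows.append(''.join(tr.itertext()).strip())
    return rows


print(familiar_rows(FAM_ABOVE))
print(familiar_rows(FAM_BELOW))
```

Both calls should give the same single row, because nothing in the selection depends on the order of the tables.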
 

roippi

Developer
(missed this post on previous page)

It's a shame that compare won't also work with compact panel! (but I'm sure you can quickly do one that would)

It's the one really annoying thing about the character pane in particular: there are different styles to support, and different classes/avatars with different resources to handle. And the html is not consistent!

Oh. Yeah. I could possibly construct an xpath that would work for both compact/expanded... but it would be quite the hack. Comparing the HTML from both, the DOM looks completely different. (this is why you should do styling in CSS, people!)
 

Bale

Minion
Sure.

HTML:
> test xpath charpane.html //tr[td[a[@class="familiarpick"]]]/text()

1: Tron, the 35 pound Rogue Program

> test xpath charpane-fambelow.html //tr[td[a[@class="familiarpick"]]]/text()

1: Tron, the 35 pound Rogue Program

Now THAT is impressive

I see that I can grab the entire familiar block with

Code:
> test xpath charpane.html //table[tbody[tr[td[a[@class="familiarpick"]]]]]/text()

1: Familiar:(0/1 next @ 43)Dismal Jasper, the 22 pound Grimstone Golem

Now, how do I get that with the html included? Or am I going to have to go without any html and just convert all of ChIT at once? It's just that I am already set up to parse...
Code:
<table width=90%><tr><td colspan=2 align=center><font size=2><b>Familiar:</b><br>(0/1 next @ 43)</font></td></tr><tr><td align=center valign=center><a target=mainpane href="familiar.php" class="familiarpick"><img src="/images/itemimages/grimgolem.gif" width=30 height=30 border=0></a></td><td valign=center align=left><a target=mainpane href="familiar.php" class="familiarpick"><b><font size=2>Dismal Jasper</a></b>, the  <b>22</b> pound Grimstone Golem</font></td></tr></table>

Do I have to change everything when I start to use xpath? I'm kinda hoping to do it a bit at a time. Well, I suppose it is easier if I start changing it from the bottom up instead of the top down.
 

roippi

Developer
Now, how do I get that with the html included?

Well, you can. But that's the thing I haven't figured out how to port to ASH yet.

Normally, an xpath query returns an array of node objects. (TagNode objects, in HtmlCleaner, that have all the methods of the root object, including xpath evaluation) But when you apply the text() function, you're asking the parser to turn each node into its (recursively concatenated) text contents. So really you're dealing with two completely separate behaviors and I've only written the "test" command to work with the simpler one.
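
To make the two behaviors concrete, here is a rough sketch in Python's standard library rather than HtmlCleaner. The markup is a simplified, well-formed stand-in for the familiar cell, and `itertext()` stands in for the recursively concatenated text described above.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the familiar cell (assumed, not real charpane markup).
CELL = (
    '<td><a class="familiarpick"><b>Dismal Jasper</b></a>'
    ', the <b>22</b> pound Grimstone Golem</td>'
)

td = ET.fromstring(CELL)

# Behavior 1: a query without a text step yields node objects,
# each of which can be inspected or queried again.
nodes = td.findall('.//b')
node_tags = [n.tag for n in nodes]

# Behavior 2: asking for text collapses a node into the
# concatenation of all text found anywhere beneath it.
text = ''.join(td.itertext())

print(node_tags)
print(text)
```

The first result is a list of elements, the second is a plain string, which is why one API has trouble returning both.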

So, I have to figure out an ASH API that can handle xpath expressions which can return an array of strings OR an array of node objects. It's a bit icky.
 

Bale

Minion
So, once you figure that out, I'd just leave off the "/text()" at the end to get it with full html?

It's pretty darn amazing and I'm looking forward to getting my hands on this after the next release.
 

roippi

Developer
So, once you figure that out, I'd just leave off the "/text()" at the end to get it with full html?

Couldn't you just walk the nodes and reconstruct the innerHTML?

Huh. I was previously thinking of adding a new type to ASH (blah) but I think I like this option way more. Just always return strings; convert TagNode objects to their innerHTML equivalent when necessary. Yeah, that can work.
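
A minimal sketch of that "always return strings" idea, again in standard-library Python and not the actual ASH implementation. `inner_html` and `query_strings` are hypothetical names, the `want_text` flag is an assumption standing in for a trailing text() step, and the markup is a simplified stand-in.

```python
import xml.etree.ElementTree as ET


def inner_html(node):
    """Serialize a node's contents without the node's own tag."""
    parts = [node.text or '']
    for child in node:
        # tostring() includes the child's tail text, so text between siblings is kept.
        parts.append(ET.tostring(child, encoding='unicode'))
    return ''.join(parts)


def query_strings(markup, path, want_text=False):
    """Hypothetical helper: every result comes back as a string."""
    root = ET.fromstring(markup)
    if want_text:
        # Text mode: recursively concatenated text of each match.
        return [''.join(n.itertext()) for n in root.findall(path)]
    # Node mode: each matched node is converted to its inner markup.
    return [inner_html(n) for n in root.findall(path)]


ROW = '<tr><td><b>Familiar:</b></td><td>the <b>22</b> pound Grimstone Golem</td></tr>'

print(query_strings(ROW, 'td'))
print(query_strings(ROW, 'td', want_text=True))
```

The caller only ever handles lists of strings, so no new node type is needed on the scripting side.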

It's pretty darn amazing and I'm looking forward to getting my hands on this after the next release.

Well, I'm not averse to adding provisional new ASH functions before the point release. I'm just not keen on messing with mafia internals before then.
 

roippi

Developer
I'm sure I'll make changes to it, but you can play around with the xpath( str ) function I just added. Only works in relay scripts after you've invoked visit_url().

(or, well, you will be able to eventually, once I sort out build.xml)
 

Bale

Minion
How is xpath( string ) going to work? Does it simply run xpath commands against the last html visited in the browser? I figured I'd pass it two strings: one would be the html and the other would be the xpath command.

I'm curious about why you do it that way.
 

roippi

Developer
Not sure, really. I had this thought that I would eventually be clever and fetch the pre-parsed tree (since I'm going to use parsers as part of many requests, might as well reuse them) but... I don't know, I'm unable to brain tonight, it's obviously better if you can feed it arbitrary html. Need sleep.
 

Bale

Minion
Just to continue considering alternatives...

You could have both a two-parameter and a one-parameter version, where the one-parameter version always queries the most recently parsed tree. That way I can switch html any time I desire, but still have the speed advantage of not needing to parse the same darn html every single time.
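
One way that pair of versions could look, sketched in standard-library Python. The class name `XPathSession`, the argument order, and the `parse_count` counter are all assumptions made for illustration; the counter exists only to show when a parse actually happens.

```python
import xml.etree.ElementTree as ET


class XPathSession:
    """Hypothetical sketch: remembers the most recently parsed tree."""

    def __init__(self):
        self._root = None
        self.parse_count = 0  # only here to make the reuse visible

    def query(self, path, markup=None):
        # "Two-parameter" form: new markup is parsed and becomes the current tree.
        if markup is not None:
            self._root = ET.fromstring(markup)
            self.parse_count += 1
        # "One-parameter" form: reuse whatever was parsed last.
        if self._root is None:
            raise ValueError('no markup has been parsed yet')
        return [''.join(n.itertext()) for n in self._root.findall(path)]


session = XPathSession()
first = session.query('td', '<tr><td>Empathy (815)</td><td>Polka of Plenty (873)</td></tr>')
second = session.query('td[2]')  # no markup given, so the cached tree is reused
print(first, second, session.parse_count)
```

Passing markup switches documents at any time, and omitting it skips the parse, which is the speed advantage described above.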
 