Bug xpath() does not recognize contains()

philmasterplus

Active member
I tried to do something like this:

Code:
string page = '<p>foo bar</p>';
// Print every <p> node whose text contains "foo"
foreach _, node in xpath(page, '//p[contains(text(),"foo")]') {
  print(node);
}

But this yields an error message: invalid xpath expression (test.ash, line 34)

Possibly related: I noticed a discussion about implementing proper XPath support. Has this actually been implemented?
 

MCroft

Developer
Staff member
This came up again in this thread, where users are talking about Jaxen.

We've also moved from Java 6 to Java 8, so perhaps the built-in javax.xml.xpath is more robust and can solve the problems we have.
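For reference, here is a minimal sketch of how the built-in javax.xml.xpath handles the contains() expression from the first post. The class name XPathContainsDemo and the helper matchText are made up for illustration and are not part of KoLmafia. It also assumes the input is already well-formed XML; real game pages are HTML and would presumably still need tidying (e.g. by HtmlCleaner) before a DocumentBuilder would accept them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathContainsDemo {
    // Hypothetical helper (not a KoLmafia API): returns the text content
    // of every node matched by the XPath expression.
    // Assumes xml is well-formed; malformed input ends up in the catch block.
    static List<String> matchText(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // NODESET results come back as an org.w3c.dom.NodeList
            NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String page = "<div><p>foo bar</p><p>baz</p></div>";
        // Only the first <p> contains "foo"
        System.out.println(matchText(page, "//p[contains(text(),\"foo\")]"));
    }
}
```

Note that the result is an org.w3c.dom.NodeList, so anything built on this would still have to turn nodes into strings for ASH.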

If this is only used from ASH, we might want to add a new function instead, so that scripts that depend on the old, broken behavior don't suddenly start working differently (and better, but hey...).

There was also some talk of replacing HtmlCleaner's source with the HtmlCleaner jar in src/jar. I don't know if that's still being pursued.
 

philmasterplus

Active member
I just dabbled in the org.w3c.dom APIs and they are bare-bones. I haven't tried it yet, but serializing DOM nodes looks like a big pain point as well. Using them to reimplement ASH xpath() would be possible, but not pretty.

While not essential, Jaxen (+ JDOM) would make XPath easier to work with for KoLmafia devs and ASH script authors alike. It would let us easily manipulate XML nodes (for use within KoLmafia) or serialize them to strings (for the ASH xpath() function).

That said, I have very little experience in this area. Perhaps Java 8 has some decent tools for XML DOM serialization hidden away somewhere.
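As a rough sketch of what serialization could look like with only the JDK: an identity Transformer from javax.xml.transform can write an org.w3c.dom node back out as a string. The class DomSerializeDemo and its helpers parse and serialize are hypothetical names for illustration, not existing KoLmafia code, and the input is again assumed to be well-formed XML.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomSerializeDemo {
    // Hypothetical helper: parse a well-formed XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
    }

    // Hypothetical helper: serialize a DOM node (and its children) to a string
    // using an identity transform, without the <?xml ...?> declaration.
    static String serialize(Node node) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize node", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<div><p>foo bar</p></div>");
        Node p = doc.getElementsByTagName("p").item(0);
        System.out.println(serialize(p));
    }
}
```

This is more ceremony than a JDOM-style one-liner, which fits the "possible, but not pretty" assessment above.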

MCroft

Developer
Staff member
philmasterplus said: "I just dabbled in the org.w3c.dom APIs and they are bare-bones. I haven't tried it yet, but serializing DOM nodes looks like a big pain point as well. Using them to reimplement ASH xpath() would be possible, but not pretty."

I was thinking javax.xml.xpath, but that may just use org.w3c.dom.

There's no point in re-implementing it if we don't gain anything, and I'm not experienced here either. I just saw a reference in HtmlCleaner from a while back saying "we should really consider using the new built-in xpath stuff".