Every now and then, I’ll end up having to scrape HTML pages for some content. I know, I know, there’s like a bazillion different ways to do this, but I *really* like doing it in PHP so I can jack right into Symfony. Usually, I just get down and dirty with the PHP DOM and use XPath to select nodes within the document. The problem with this is that XPath sucks and the PHP implementation sucks…a lot.
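To make the pain concrete, here is a minimal sketch of the DOM + XPath approach using only PHP's built-in `DOMDocument` and `DOMXPath`. The markup, the `posts` id, and the `post` class are made up for illustration. Note how much XPath it takes just to match a class name safely.

```php
<?php
// Hypothetical sample markup; a real scraper would fetch this from a page.
$html = '<ul id="posts">'
      . '<li class="post"><a href="/a">First</a></li>'
      . '<li class="post featured"><a href="/b">Second</a></li>'
      . '</ul>';

// Real-world HTML is rarely valid, so collect libxml warnings instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// XPath has no class selector, so matching "li.post" needs this concat/contains dance.
$query = '//ul[@id="posts"]'
       . '/li[contains(concat(" ", normalize-space(@class), " "), " post ")]/a';

// Map each link's href to its text.
$links = [];
foreach ($xpath->query($query) as $a) {
    $links[$a->getAttribute('href')] = trim($a->textContent);
}

print_r($links);
```

The CSS equivalent of that whole query would be `#posts li.post a`, which is pretty much the point of the rest of this post.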
But hold on, we know a selector engine that doesn’t suck! The jQuery selector engine, called Sizzle, is probably one of the best CSS/DOM selector engines to use. Turns out there is a PHP port! Enter phpQuery.
At its root, phpQuery is a port of the jQuery selector syntax to PHP. Additionally, phpQuery includes dozens of the jQuery traversal methods like next(), prev(), find(), and so on. It also implements the jQuery-style filters like :first, :last, :eq, etc. (these are jQuery extensions, not standard CSS3 selectors).
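Here is a rough sketch of what that looks like in practice. It assumes phpQuery is already installed and that the `require_once` path below points at its main file (adjust it for your setup); the sample markup is made up for illustration.

```php
<?php
// Assumption: phpQuery lives at this path; change it to match your install.
require_once 'phpQuery/phpQuery.php';

// Hypothetical sample markup; a real scraper would fetch this from a page.
$html = '<ul id="posts">'
      . '<li class="post"><a href="/a">First</a></li>'
      . '<li class="post featured"><a href="/b">Second</a></li>'
      . '</ul>';

// Load the markup; pq() then queries this document by default.
phpQuery::newDocumentHTML($html);

// A plain CSS selector, jQuery-style.
foreach (pq('#posts li.post a') as $a) {
    // Iterating yields raw DOM nodes, so wrap each one with pq() again.
    $link = pq($a);
    echo $link->attr('href'), ' => ', $link->text(), "\n";
}

// Traversal methods chain the way they do in jQuery.
echo pq('#posts li:first')->next()->find('a')->text(), "\n";

// jQuery-style filters such as :eq work too (zero-based index).
echo pq('#posts li:eq(1) a')->attr('href'), "\n";
```

If you have written any jQuery, nothing here should need explaining, which is exactly why I like it.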
Anyway, if you’re tired of suffering through the PHP XPath implementation and dig jQuery, then you should definitely give phpQuery a whirl.