Zend_Dom_Query::loadHTML() problems with UTF-8

To avoid corrupting the character set when using Zend_Dom_Query, see the hack below. You may get wrong characters even if the document is UTF-8 and you specified this encoding when you created your Zend_Dom_Query instance. The problem is in the Zend_Dom_Query::queryXpath() method, on this line:

 
$domDoc->loadHTML($document);

Depending on how the meta tag in your HTML is written, later DOM queries might return garbage in their output. There is a dirty hack for this, mentioned here: https://ru.php.net/manual/en/domdocument.loadhtml.php#95251. To apply it, change the line above to:

 
$domDoc->loadHTML('<?xml encoding="UTF-8">' . $document);
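
The same prefix trick can be tried outside Zend_Dom_Query with plain DOMDocument and DOMXPath. The following is only a minimal sketch: the sample markup, its HTML5-style meta tag, and the helper name firstParagraphText() are made up for illustration, and the prefix string is the XML encoding declaration commonly used for this workaround. How an unprefixed document is decoded depends on your libxml version, so no output is claimed for that case.

```php
<?php
// Sample UTF-8 markup (hypothetical, for illustration only).
$html = '<html><head><meta charset="utf-8"></head>'
      . '<body><p>Привет, мир</p></body></html>';

/**
 * Parse $html with DOMDocument and return the text of the first <p>.
 * $prefix is prepended to the markup before parsing (hypothetical helper).
 */
function firstParagraphText(string $html, string $prefix = ''): string
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($prefix . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $node = $xpath->query('//p')->item(0);

    return $node === null ? '' : $node->textContent;
}

// The encoding hint tells the parser to read the markup as UTF-8.
$fixed = firstParagraphText($html, '<?xml encoding="UTF-8">');
echo $fixed, PHP_EOL;
```

Calling firstParagraphText($html) without the prefix on your own documents is a quick way to check whether your setup is affected before patching the library.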

This assumes you are running ZF 1.11, the version in which encoding support was introduced in the component. Otherwise, either download the latest version of the component or set the encoding yourself.
