Zend_Dom_Query::loadHTML() problems with UTF-8

February 17, 2011 Zend Framework

To prevent Zend_Dom_Query from damaging the charset, see the hack below. You may get wrong characters even if the document is UTF-8 and you specified this encoding when you created your Zend_Dom_Query instance. The problem is in the Zend_Dom_Query::queryXpath() method, on the line where the document is passed to loadHTML().

If your HTML has a meta tag written like this: <meta charset='UTF-8' />, later DOM queries may return garbage in their output. There is a dirty hack for this, mentioned here: https://ru.php.net/manual/en/domdocument.loadhtml.php#95251. To apply it, change the line mentioned above so that an encoding declaration is prepended to the document before it is passed to loadHTML().
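As a minimal sketch of the hack, here it is with a plain DOMDocument outside of Zend_Dom_Query. The helper name loadHtmlDocument(), the variable names and the sample markup are made up for illustration and are not part of Zend Framework; the only real trick is the `<?xml encoding="...">` prefix, which makes libxml treat the input as UTF-8.

```php
<?php
// Sample UTF-8 document using the short meta tag form (illustrative only).
$document = "<html><head><meta charset='UTF-8' /></head>"
          . "<body><p>Zażółć gęślą jaźń</p></body></html>";

// Hypothetical helper, not a Zend Framework API: loads HTML into a
// DOMDocument, optionally prepending an XML encoding declaration.
function loadHtmlDocument($html, $encoding = null)
{
    $domDoc = new DOMDocument();

    if ($encoding !== null) {
        // The hack: tell libxml which encoding the markup is in.
        $html = '<?xml encoding="' . $encoding . '">' . $html;
    }

    // Collect libxml warnings about imperfect markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $domDoc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $domDoc;
}

// With the prefix, the paragraph text comes back as intact UTF-8.
$fixed = loadHtmlDocument($document, 'UTF-8');
$text  = $fixed->getElementsByTagName('p')->item(0)->textContent;
```

Inside queryXpath() the same prefix would go in front of the document passed to loadHTML(), presumably using the encoding stored on the Zend_Dom_Query instance rather than a hard-coded string.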

This assumes you are running ZF 1.11, where encoding support was introduced in the component. Otherwise, you should either download the latest version of the component or set the encoding yourself.


January 29, 2012 at 5:04 pm

That was close, would have worked immediately if you had placed the quotes correctly, like this: $domDoc->loadHTML(' '.$document); Thanks, I was pulling my hair out trying to figure out why the special characters were getting corrupted.


April 24, 2012 at 8:54 am

Thank u so much :)))


February 4, 2014 at 7:12 pm

Thanks for this - ditto much hair-pulling...
