
Fix incorrectly displayed encoding on an HTML document with PHP

Is there a way to fix the characters that display improperly after running this HTML markup through phpQuery::newDocument? There are slanted (curly) double quotes around "Classics with modern Woman" in the original document that display incorrectly after creating the new document with phpQuery.

    // Original document is UTF-8 encoded
    $raw_html = '<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></head><body><p>Mr. Smith of Bangkok celebrated the “Classics with modern Woman”.</p></body></html>';
    print($raw_html);

    $aNew_document = phpQuery::newDocument($raw_html);
    print($aNew_document);

Original Output: Mr. Smith of Bangkok celebrated the “Classics with modern Woman”.

New Document Output: Mr. Smith of Bangkok celebrated the �Classics with modern Woman.

JMC asked Aug 28 '10 03:08



4 Answers

  1. Save the script file as UTF-8 without a BOM.
  2. Send this header at the top of your script, before any output:

    header("Content-Type: text/html; charset=UTF-8");

[EDIT]: How to save files as UTF-8 without a BOM:

At the OP's request, here is how to do it on Windows:

  1. Download Notepad++. It is an awesome text editor that you should be using.
  2. Install it.
  3. Open the PHP script that contains this code in Notepad++ (the file on your computer where you are doing all the coding).
  4. In Notepad++, from the Encoding menu at the top, select "Convert to UTF-8 without BOM".
  5. Save the file.
  6. Upload it to your web server by FTP or whatever you use.
  7. Now run the script.
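
If an editor with a "without BOM" option is not at hand, the same cleanup can be done from PHP on any operating system. The sketch below is not part of the original answer: `strip_utf8_bom()` and `strip_bom_from_file()` are hypothetical helper names, and the code assumes the file is small enough to read into memory.

```php
<?php
// Hypothetical helper (not from the answer): remove a leading UTF-8 BOM.
function strip_utf8_bom(string $text): string
{
    $bom = "\xEF\xBB\xBF"; // the three bytes of a UTF-8 byte order mark
    if (strncmp($text, $bom, 3) === 0) {
        return substr($text, 3);
    }
    return $text;
}

// Hypothetical helper: rewrite a file in place only if it starts with a BOM.
// Returns true when the file was changed, false otherwise.
function strip_bom_from_file(string $path): bool
{
    $original = file_get_contents($path);
    if ($original === false) {
        return false; // unreadable file: leave it alone
    }
    $clean = strip_utf8_bom($original);
    if ($clean === $original) {
        return false; // no BOM found, nothing written
    }
    return file_put_contents($path, $clean) !== false;
}
```

Running `strip_bom_from_file()` once against the affected script, then re-uploading it, should have the same effect as steps 3 to 5 above.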
shamittomar answered Oct 06 '22 03:10


I had the same problem, but when I added

    ob_start();

as the first line and

    ob_end_flush();

at the end, it seemed to work.
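
The answer does not say why this helped. One plausible reason, stated here as an assumption, is that output buffering keeps PHP from sending the response headers early, so a later `header()` call that declares the charset can still take effect. A minimal sketch of that ordering, where the `$buffered` variable is only there for inspection:

```php
<?php
ob_start(); // first line: output is now collected in a buffer

// This output is held in the buffer instead of going to the client.
echo "<p>\u{201C}Classics with modern Woman\u{201D}</p>";

// Nothing has been sent by this script yet, so the header can still be set.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$buffered = ob_get_contents(); // copy of the buffer, for inspection only
ob_end_flush();                // last line: send the buffer and stop buffering
```

Note that a BOM sitting before the opening `<?php` tag is emitted before `ob_start()` runs, so buffering alone may not cover that case.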

Sujay sreedhar answered Oct 06 '22 03:10


You have this in the <head> element:

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

The next option would be to use HTML entities to display these characters.
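
As a rough illustration of the entity approach, the sketch below converts every non-ASCII character to a numeric entity before the markup is handed to a parser. It assumes the mbstring extension is available; `to_numeric_entities()` is a hypothetical helper name, and the sample string is a shortened version of the markup from the question.

```php
<?php
// Hypothetical helper: replace each non-ASCII character with a numeric entity,
// leaving tags and plain ASCII text untouched.
function to_numeric_entities(string $html): string
{
    // Convert map: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($html, $map, 'UTF-8');
}

$raw = "<p>the \u{201C}Classics with modern Woman\u{201D}.</p>";
echo to_numeric_entities($raw), "\n";
// prints: <p>the &#8220;Classics with modern Woman&#8221;.</p>
```

Because the result is pure ASCII, it should survive whatever charset the parser or browser assumes.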

Cody Snider answered Oct 06 '22 02:10


I had the same problem using the phpQuery class. The problem is, as mentioned in the top-voted answer above, that the script file is saved as UTF-8 with a BOM.

As I had no chance of getting Notepad++ on Mac OS X, I passed every output through utf8_decode() instead.

The BOM is mainly added by Windows editors.
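
For context, `utf8_decode()` converts a UTF-8 string to ISO-8859-1 and is deprecated as of PHP 8.2. The sketch below shows an equivalent conversion with mbstring, assuming that extension is installed; `utf8_to_latin1()` is a hypothetical wrapper name. Characters that do not exist in ISO-8859-1, such as the curly quotes from the question, cannot be represented and are replaced with a substitute character, so this workaround loses them.

```php
<?php
// Hypothetical wrapper: the same conversion utf8_decode() performs,
// written with mb_convert_encoding() (requires the mbstring extension).
function utf8_to_latin1(string $text): string
{
    return mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
}

// "é" is two bytes in UTF-8 but a single byte (0xE9) in ISO-8859-1.
$latin1 = utf8_to_latin1("caf\u{E9}");
echo strlen("caf\u{E9}"), ' -> ', strlen($latin1), "\n"; // prints: 5 -> 4
```

The page then has to be served with a matching charset (ISO-8859-1), otherwise the browser will misread the converted bytes.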

animaacija answered Oct 06 '22 02:10