
Webserver overriding page encoding?

I have pages that I hand-coded in PHP more than 10 years ago.

They are encoded in the old Hebrew encoding, windows-1255.

Recently they all broke: the text is shown as unrecognized UTF-8 characters (a diamond with a question mark inside).

When I manually change the encoding in the browser - any browser - the text is displayed correctly.

I thought that maybe the server at my host was forcing UTF-8 encoding. I changed .htaccess to force windows-1255 encoding, but it didn't work.

I tried validating the page with the W3C validator, and it sees the page as UTF-8.

I downloaded XAMPP and ran the server locally and it's still happening.

Has anything changed in the last few years in the way Apache handles encoding when serving web pages?

Before I go over all my pages and change their encoding, I would like to know if there is a global "switch" I can flip in order for my pages to display correctly.

Thanks.

Hanan Cohen asked Oct 30 '22 14:10



1 Answer

After much struggling, a kind soul helped me realize that it's not Apache overriding the charset in the header but PHP itself.

Adding

header('Content-Type: text/html; charset=windows-1255');

to the top of the PHP file fixed the problem.

As far as I understand, the chain is as follows:

  1. By default, Apache sends the content with a UTF-8 header
  2. If another charset is set in .htaccess, Apache sends that one instead
  3. Then PHP sets the charset in the header itself, as specified by default_charset in php.ini (UTF-8 by default since PHP 5.6)

So any charset defined earlier in the chain, in Apache or .htaccess, is overridden by the one PHP sends.
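
For the global "switch" asked about in the question, one possibility is to change default_charset once instead of adding the header to every page. The following is only a minimal sketch, assuming all pages already include some common file before producing output; that shared file is hypothetical and not something from my setup. Setting default_charset = "windows-1255" in php.ini should be the site-wide equivalent, where the host allows editing it.

```php
<?php
// Hypothetical shared include, loaded by every page before any output.
// Once output has started, the Content-Type header can no longer change.

// Option 1: override PHP's default for this request. PHP uses
// default_charset to build the Content-Type header when the script
// does not send one itself.
ini_set('default_charset', 'windows-1255');

// Option 2: send the header explicitly, as in the fix above.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=windows-1255');
}
```

Either option alone should be enough; both are shown only for comparison. Note that default_charset is also used as a default by some other PHP functions, so the explicit header is the narrower change.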

Hanan Cohen answered Jan 02 '23 19:01
