Using UTF-8 charset with PHP - are mb functions required?

Question

These past few days I've been working toward converting my PHP code base from latin1 to UTF-8. I've read that the two main solutions are either to replace the single-byte functions with the built-in multibyte functions or to set the mbstring.func_overload value in the php.ini file.

But then I came across this thread on Stack Overflow, where the post by thomasrutter seems to indicate that the multibyte functions aren't actually necessary for UTF-8, as long as the script and string literals are encoded in UTF-8.

I haven't found any other evidence whether this is true or not, and if it turns out I don't need to convert my code to the mb_functions then that would be a real time saver! Anyone able to shed some light on this?

Pekka · Accepted Answer

As far as I understand the issue, as long as all your data is 100% UTF-8 - and that means user input, the database, and also the encoding of the PHP files themselves if they contain special characters - this is true for search and comparison operations. It is not true for everything, though: as @ntd points out, a non-multibyte strlen() will produce wrong results when run on a string that contains multibyte characters, because it counts bytes rather than characters.
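To make the distinction concrete, here is a minimal sketch. It assumes the file is saved as UTF-8 and that the mbstring extension is loaded; the sample string and variable names are made up for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8 and ext/mbstring is available.
$s = "naïve café"; // 10 characters; "ï" and "é" take 2 bytes each in UTF-8

// Search and comparison work on bytes, which is safe when both sides are UTF-8.
$found = strpos($s, "café") !== false;   // true
$equal = ($s === "naïve café");          // true

// Length: strlen() counts bytes, mb_strlen() counts characters.
$bytes = strlen($s);                     // 12
$chars = mb_strlen($s, 'UTF-8');         // 10

// Offsets differ as well: strpos() returns a byte offset.
$byteOffset = strpos($s, "café");                  // 7
$charOffset = mb_strpos($s, "café", 0, 'UTF-8');   // 6

// Slicing by bytes can cut a character in half and yield invalid UTF-8.
$cut   = substr($s, 0, 3);               // "na" plus the first byte of "ï"
$valid = mb_check_encoding($cut, 'UTF-8');         // false
$safe  = mb_substr($s, 0, 3, 'UTF-8');   // "naï"
```

So finding or comparing substrings is fine with the plain functions, but anything that counts, indexes, or slices by character needs the mb_ variants.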

This is a great article on the basics of encoding.


Tags: php, utf-8, multibyte-functions

Asked by Spoonface
