Should I use multi-byte overloading (mbstring.func_overload)?

Tags: php, unicode

I'm in the process of making my PHP site Unicode-aware. I'm wondering if anyone has experience with the mbstring.func_overload setting, which replaces the normal string functions (e.g. strlen) with their multi-byte equivalents (mb_strlen). There aren't any comments on the PHP manual page.

Are there any potential problems that I should be aware of? Any cases where calling the multi-byte version is a bad idea?

I suppose one example would be functions that deal with encryption, since they may expect to deal with strings of bytes, rather than strings of characters.
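To make the bytes-versus-characters concern concrete, here is a minimal sketch. It does not enable mbstring.func_overload (the setting was deprecated in PHP 7.2 and removed in PHP 8.0); it calls the byte-oriented and multi-byte functions side by side instead. The sample string and variable names are illustrative only, and the mbstring extension is assumed to be loaded.

```php
<?php
// "é" (U+00E9) is one character but two bytes in UTF-8.
$text = "caf\u{00E9}";

$byteCount = strlen($text);             // counts bytes
$charCount = mb_strlen($text, 'UTF-8'); // counts characters

// A raw SHA-256 digest is a string of 32 arbitrary bytes, not text.
// Only a byte count is meaningful for it.
$digest = hash('sha256', $text, true);
$digestBytes = strlen($digest);

printf("bytes=%d chars=%d digest bytes=%d\n", $byteCount, $charCount, $digestBytes);
```

With overloading enabled, the strlen() call on the digest would silently become a character count, which is the kind of mismatch the question is worried about.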

Also, the manual page includes a note: "It is not recommended to use the function overloading option in the per-directory context, because it's not confirmed yet to be stable enough in a production environment and may lead to undefined behaviour."

Does that mean that it's not stable in a per-directory context, or it's generally not stable? The wording is unclear.

347

asked Oct 21 '08 16:10

JW.

2 Answers

My answer is: definitely not!

The problem is that there is no easy way to "reset" str* functions once they are overloaded.

This may work well in your project for a while, but sooner or later you will almost certainly run into an external library that uses the string functions to, for example, implement a binary protocol. Those calls will break, and you will spend hours trying to find out why.
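As a rough illustration of that failure mode, the sketch below uses a hypothetical length-prefixed frame format (a 4-byte big-endian length followed by the payload). The functions encodeFrame and decodeFrame are made up for this example and do not come from any real library. Passing a character count where a byte count is expected stands in for what an overloaded strlen() would do.

```php
<?php
// Hypothetical frame format: 4-byte big-endian length, then the payload bytes.
function encodeFrame(string $payload, int $length): string
{
    return pack('N', $length) . $payload;
}

function decodeFrame(string $frame): string
{
    $length = unpack('N', substr($frame, 0, 4))[1];
    return substr($frame, 4, $length);
}

$payload = "na\u{00EF}ve"; // 5 characters, 6 bytes in UTF-8

// Byte length: the frame round-trips intact.
$good = decodeFrame(encodeFrame($payload, strlen($payload)));

// Character length (what an overloaded strlen() would report):
// the last byte of the payload is cut off.
$bad = decodeFrame(encodeFrame($payload, mb_strlen($payload, 'UTF-8')));

var_dump($good === $payload, $bad === $payload);
```

Nothing in the library code looks wrong when you read it, which is why this kind of bug takes so long to track down.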

Once you have found out that mbstring.func_overload is the cause, you don't have many options. You can ini_set mbstring.internal_encoding to some one-byte-per-character encoding every time you call the external library and set it back right afterwards, but if the library makes callbacks into your application, those callbacks will run with the wrong encoding and things will get messed up.
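A sketch of that save-and-restore workaround might look like the following. It uses the mb_internal_encoding() function rather than ini_set, and the helper name withInternalEncoding is hypothetical. The closure stands in for a library call; it also shows the callback problem, because anything executed inside it sees the temporary encoding.

```php
<?php
// Hypothetical helper: run $fn with a temporary internal encoding,
// restoring the previous one even if $fn throws.
function withInternalEncoding(string $encoding, callable $fn)
{
    $previous = mb_internal_encoding();
    mb_internal_encoding($encoding);
    try {
        return $fn();
    } finally {
        mb_internal_encoding($previous);
    }
}

mb_internal_encoding('UTF-8');
$text = "caf\u{00E9}"; // 4 characters, 5 bytes

// Inside the wrapper, mb_strlen() without an explicit encoding counts
// one character per byte. Application code called back from here would
// see the same single-byte encoding.
$inside = withInternalEncoding('ISO-8859-1', function () use ($text) {
    return mb_strlen($text);
});

// Afterwards the UTF-8 setting is back in effect.
$after = mb_strlen($text);

printf("inside=%d after=%d\n", $inside, $after);
```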

Another option is to patch the library manually, changing all str* functions to their mb_* counterparts and passing a one-byte-per-character encoding as the encoding parameter. This isn't a great idea either, because you lose the ability to easily update the external library, and you might cause some performance issues as well.

So, again, don't use func_overload. If you work with multi-byte strings, use the appropriate mb_ functions.
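In practice, that recommendation could look something like this sketch: call the mb_ functions with an explicit encoding wherever you handle text, and leave the plain functions alone for byte-oriented work. The sample string and variable names are illustrative only.

```php
<?php
$title = "Sm\u{00F8}rrebr\u{00F8}d med \u{00E6}g";

// Text handling: character-aware, with the encoding stated explicitly.
$short = mb_substr($title, 0, 10, 'UTF-8');
$upper = mb_strtoupper($short, 'UTF-8');

// Byte handling: the plain functions still count bytes, as libraries expect.
$byteSize = strlen($title);

// The byte-oriented substr() cuts through the middle of the second "ø",
// leaving a string that is no longer valid UTF-8.
$cut = substr($title, 0, 10);
$cutIsValid = mb_check_encoding($cut, 'UTF-8');

printf("%s / %s / %d bytes / cut valid: %s\n",
    $short, $upper, $byteSize, $cutIsValid ? 'yes' : 'no');
```

Because each call states which behaviour it wants, third-party code that relies on byte semantics is unaffected.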

199

answered Oct 21 '22 07:10

gphilip

One issue you should definitely watch for is third-party scripts (perhaps a library or PEAR extension) that use the non-mb-aware versions of functions. For example, libraries that call strlen() expecting a byte count could break if you overload it.

Also, this bug report shows that the bleeding of overloaded functions between virtual hosts has been corrected in the 5.2/5.3 CVS versions. The bug is specific to per-directory configurations.

answered Oct 21 '22 08:10

Owen
