Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does WordPress still use addslashes(), register_globals() and magic_quotes?

In order to gain more experience in Wordpress I delved into its code base to study its inner working and its workflow, and I was quite astonished when I saw that:

  1. They implement register_globals (an excerpt from wp-includes/class-wp.php):

     // The query_vars property will be extracted to the GLOBALS. So care should  // be taken when naming global variables that might interfere with the  // WordPress environment.  function register_globals() {     global $wp_query;     // Extract updated query vars back into global namespace.     foreach ( (array) $wp_query->query_vars as $key => $value) {         $GLOBALS[$key] = $value;     } 
  2. They rely on magic quotes (exerpt from wp-includes/functions.php. magic_quotes_gpc is turned off at bootstrapping, before calling this function):

    function add_magic_quotes( $array ) {     foreach ( (array) $array as $k => $v ) {     if ( is_array( $v ) ) {         $array[$k] = add_magic_quotes( $v );     } else {             $array[$k] = addslashes( $v );     } 
  3. They rely on addslashes (but since 2.8.0 they introduced also mysql_real_escape_string, but the _weak_escape() function that uses addslashes() still exists in the wpdb class)
    UPDATE: I see they emulate prepared statements by using sprintf() and custom placedholders, so queries should be safe I think. Still I'm puzzled on why they don't provide at least mysqli, after all the detection of Mysql and PHP version happens early in the bootstrapping sequence.

Now, from the year-long frequentation of SO I learned a lot of things, especially that the above three function are "deprecated" and show security issues, and are watched in horror by many.

But WP must have a reason to use them. I'd like to know from more experienced programmers if there are really security issues, or if sometimes their usage is just too clouded in rumors and false convinctions. I know that magic_quotes is an heritage from the past, and the same could be said for addslashes (at least when used for databases), but while googling before asking this I found many websites talking about using addslashes() over mysql_real_escape_string().

I'm interested in knowing a clear, detailed reason on why those badly depicted functions are used; Wordpress had had many improvements over the years, addressing different aspects, and yet these functions are still used; I'm looking, therefore, to a concrete explanation over the positive aspects that somehow override the negative ones and justify the usage of those functions.

I'm not looking for opinions (I perfectly know they're offtopic here), nor I am ranting about Wordpress, I hope this is clear. I'd just like to know why many php programmers consider these functions "bad", and yet a worldwide giant like Wordpress, who's at the 3rd version now, still uses them.

Is this for compatibility with different servers and php versions? (they check very earl for those, though).
Is there something I miss about this functions, how important they can be in an environment like wordpress (or in general)? I'm quite confused, to be honest.

like image 566
Damien Pirsy Avatar asked Feb 22 '12 22:02

Damien Pirsy


2 Answers

Wordpress Open Tickets over Time
(Wordpress Open Tickets over Time)

Don't rely on the Wordpress codebase to do assumptions about good practice or current standards in PHP coding. I'm saying this as someone who has fiddled with wordpress development over a longer period of time.

Wordpress codebase is about 10 years old, it's full of legacy code[1]. The program can not evolve on the code-level much because of that, so you find a lot of workarounds for problems that are already solved nowadays much better.

Just take this story: PHP had magic quotes. Wordpress developers thought it was useful. So for those hosts that did not have it configured, they added it. Ending up whith code that expects slashed input data often and at various places. The simple thing is, now they just can't change it to proper input processing and sanitization easily because of the usage of (super)globals introducing static global state nearly everywhere.

You can not easily refactor such code.

Same for the database class. It has a long history, originally based on an early version of ezSQL. At that time there was not mysql_real_escape_string and when it was introduced, the WP devs had the problem that not all installation bases support it.

So don't wonder about the coding practice you find inside the Wordpress code. You'll learn how things could have been done years ago and with more or less outdated PHP versions. It's not that long ago that Wordpress switched to PHP 5 for example.

  • Backwards compatibility.
  • Target a large amount of (technically more or less outdated) hosts.
  • Don't break what works with defects.

This might not be your list of priorities (hopefully), projects differ here a lot. But having a legacy code-base alone is a burden regardless how project priorities are set. Wordpress is only one example.


[1] see Milestones of WordPress: Early Project Timeline (ca. 2000 to 2005))

like image 72
hakre Avatar answered Oct 05 '22 13:10

hakre


In complement to @tom answer.

Magic Quotes

Automatically parsing the whole entries and adding magic quotes is both creating bugs and useless.

  • Useless as you cannot rely on magic quotes to secure your input (multy-bytes encoding bugs for SQL injections is an example). So you need to apply a real filter before saving your data to a database
  • Creating bugs: If you need to really escape your data before a save in database you have to check that it's not already escaped (and the simple fact this settings exists and may be enforced by the hosting environment makes that you have to check this setting was set or not).
  • Creating bugs: All the data sent by the user is not always dedicated to a database storage. Escaping it may break the content, think about a json content for example, or even file content with the dangerous magic_quote_runtime
  • Creating bugs: All database storage are not escaping quotes the same way...

So Why?, why do we see such function in a CMS?

  • see that here it's an add_magic_quotes function, that can be used on a dedicated array, maybe not on _GET or _POST. But effectively the fact this function is just using addslashes and not a database dedicated function makes it quite bad.
  • The fact the hosting provider may enforce an automatic magic quotes is a nightmare for a CMS developper. Either you detect it and tell the user you refuse to run, or you have to manage the fact the content may or may have not be magically-addslahed... and to put everyone in the same state, you run the non-addslashed content in this function so that at least everyone is in the same (bad) state.
  • From what I can see on Wordpress, before the save a stripslahes_deep is performed in the wp_insert_post. And add_magic_quotes is usually performed on data pulled from Db before this data is send to the wp_insert_post. This may me think the problem is effectively to add slashes before removing them... maybe because sanitize filters which happen before the save expect content with slashes, or maybe because no one remember why the code is running in this way :-)

register_globals

Seems that this is the way to implement a Registry pattern in wordpress... They wanted to make the code simple to understand, and to allow a simple way to access importants objects like the query or the post. And an object oriented Registry class was not in the simple PHP way, where the $_GLOBALS array is already an existing registry.

Having a Registry is a perfectly valid thing in an application. a register_global thing is dangerous only if you allow some user input to override your valid secure input. And of course only if this secure input is taken from $_GLOBALS elsewhere (or with global keyword).

The dangerous part in the function here is the part of the function you have extracted, the loop on $query->query_vars. You will have to track the calls to see if user injected keys could run throught wp_parse_args and end in that function. But the next part of this function is fixing $_GLOBALS content for several objects:

$GLOBALS['query_string'] = $this->query_string; $GLOBALS['posts'] = & $wp_query->posts; $GLOBALS['post'] = (isset($wp_query->post)) ? $wp_query->post : null; $GLOBALS['request'] = $wp_query->request; 

So at least theses globals cannot be overwritten by user input and are safe.

So, theses functions are bad. But you can use them if you understand what they do and what you need to do to prevent the bad effects. And when you want to implement a simple framework for developpers, available on a very wide environments you sometimes have to use them.

But for sure it's a bad practice, you can certainly find bad wordpress plugins using $_GLOBALS in the wrong way or misusing the add_magic_quotes to data pulled from db wordpress concept. But there will be years before a Zend Framework CMS gained such a big number of contributions.

like image 21
regilero Avatar answered Oct 05 '22 13:10

regilero