
Parse PHP file variables from Python script

Tags: python, regex, php

I need to read some data from PHP (WordPress) config files in my Python script. How can I parse the config data? For example, how can I get the value of $wp_version? Config example:

/**
 * The WordPress version string
 *
 * @global string $wp_version
 */
$wp_version = '3.5.1';

/**
 * Holds the WordPress DB revision, increments when changes are made to the WordPress DB schema.
 *
 * @global int $wp_db_version
 */
$wp_db_version = 22441;

/**
 * Holds the TinyMCE version
 *
 * @global string $tinymce_version
 */
$tinymce_version = '358-23224';

/**
 * Holds the required PHP version
 *
 * @global string $required_php_version
 */
$required_php_version = '5.2.4';

/**
 * Holds the required MySQL version
 *
 * @global string $required_mysql_version
 */
$required_mysql_version = '5.0';

$wp_local_package = 'en_EN';
inlanger asked Feb 16 '23 09:02


2 Answers

A simple variable assignment in PHP looks like $foo = 'bar';. Let's build a regex for that form which does not take into account things like $_GET or $foo['bar']:

  1. Start with $. Note that we need to escape it:
    \$
  2. The first character after $ can't be a digit. PHP allows a letter or an underscore there, but this pattern accepts only a letter, which is what keeps $_GET out:
    \$[a-z]
  3. Any number of letters, digits, or underscores may follow:
    \$[a-z]\w*
  4. Wrap the name in parentheses to capture it:
    \$([a-z]\w*)
  5. Next comes the equals sign. To be more tolerant, make the surrounding spaces optional:
    \$([a-z]\w*)\s*=\s*
  6. After this there should be a value, and the statement ends with a ;:
    \$([a-z]\w*)\s*=\s*(.*?);$
  7. Use the m modifier, which makes ^ and $ match at the start and end of each line respectively.
  8. You can then use a trimming function to get rid of the single and double quotes around the value.


Note 1: This regex will fail when several assignments share one line, e.g. $fail = 'en_EN'; $fail2 = 'en_EN'; because the captured value then runs up to the last ; on the line.
Note 2: Don't forget to use the i modifier to make it case-insensitive.
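
For reference, here is how the steps above might look in Python. This is only a sketch: the names SIMPLE_ASSIGNMENT and parse_simple_assignments are made up for illustration, the m and i modifiers are assumed to map to re.MULTILINE and re.IGNORECASE, and the sample text is a shortened copy of the config from the question.

```python
import re

# The pattern from step 6, compiled with the m and i modifiers.
SIMPLE_ASSIGNMENT = re.compile(
    r"\$([a-z]\w*)\s*=\s*(.*?);$", re.MULTILINE | re.IGNORECASE
)


def parse_simple_assignments(php_source):
    """Return {variable name: value}, with surrounding quotes trimmed (step 8)."""
    result = {}
    for name, raw_value in SIMPLE_ASSIGNMENT.findall(php_source):
        # Every value stays a string; numbers are not converted.
        result[name] = raw_value.strip("'\"")
    return result


sample = """\
/**
 * @global string $wp_version
 */
$wp_version = '3.5.1';
$wp_db_version = 22441;
$wp_local_package = 'en_EN';
"""

config = parse_simple_assignments(sample)
print(config["wp_version"])
```

Because the pattern requires the ; to sit at the very end of the line, a line with trailing spaces or Windows line endings would be skipped unless the text is normalised first.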

HamZa answered Feb 23 '23 17:02


I've written a little Python script that pulls database login information from WordPress's wp-config.php file for automatic site backups.

Here is the relevant part of my code (GitHub's syntax highlighting has trouble with Python's triple quoted strings):

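#!/usr/bin/env python3
import re

# define('NAME', 'value'); with either quote style. Whitespace is allowed
# before the closing parenthesis, as in define( 'NAME', 'value' );
define_pattern = re.compile(r"""\bdefine\(\s*('|")(.*?)\1\s*,\s*('|")(.*?)\3\s*\)\s*;""")
# $name = 'value'; at the start of a line or right after a previous statement.
# Only quoted values are captured; unquoted numbers are skipped.
assign_pattern = re.compile(r"""(^|;)\s*\$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)\s*=\s*('|")(.*?)\3\s*;""")

php_vars = {}
with open("wp-config.php") as config_file:
    for line in config_file:
        # In both patterns, group 2 is the name and group 4 is the value.
        for match in define_pattern.finditer(line):
            php_vars[match.group(2)] = match.group(4)
        for match in assign_pattern.finditer(line):
            php_vars[match.group(2)] = match.group(4)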
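
To try the same idea without a real wp-config.php, the loop can be wrapped in a function that accepts any iterable of lines. The sketch below is an illustration, not part of the original script: parse_php_config and the sample values are hypothetical, and the patterns are variants of the two above, with lazy matching and optional whitespace before the closing parenthesis as assumed additions.

```python
import re

# Variants of the two patterns above (lazy groups, optional space before ")").
DEFINE_PATTERN = re.compile(
    r"""\bdefine\(\s*('|")(.*?)\1\s*,\s*('|")(.*?)\3\s*\)\s*;"""
)
ASSIGN_PATTERN = re.compile(
    r"""(^|;)\s*\$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)\s*=\s*('|")(.*?)\3\s*;"""
)


def parse_php_config(lines):
    """Collect quoted define() constants and $variable assignments."""
    php_vars = {}
    for line in lines:
        for pattern in (DEFINE_PATTERN, ASSIGN_PATTERN):
            for match in pattern.finditer(line):
                # Group 2 is the name, group 4 the value, in both patterns.
                php_vars[match.group(2)] = match.group(4)
    return php_vars


# Made-up sample values, for illustration only.
sample_lines = [
    "define('DB_NAME', 'example_db');",
    'define( "DB_USER", "example_user" );',
    "$table_prefix = 'wp_';",
    "$wp_db_version = 22441;",  # unquoted, so neither pattern captures it
]

settings = parse_php_config(sample_lines)
print(settings)
```

An open file object can be passed in the same way, since iterating over it yields lines.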

pix answered Feb 23 '23 17:02