I would like to know how I could transform the given string into the specified array:
String
all ("hi there \(option\)", (this, that), other) another
Result wanted (Array)
[0] => all,
[1] => Array(
    [0] => "hi there \(option\)",
    [1] => Array(
        [0] => this,
        [1] => that
    ),
    [2] => other
),
[2] => another
This is for a kind of console that I'm making in PHP.
I tried to use preg_match_all, but I don't know how to find parentheses inside parentheses in order to "make arrays inside arrays".
EDIT
All other characters that are not specified in the example should be treated as strings.
EDIT 2
I forgot to mention that all parameters outside the parentheses should be separated by the space character.
You need to do this with a small custom parser: code that takes input of this form and transforms it to the form you want.
In practice I find it useful to group parsing problems like this in one of three categories based on their complexity:
I classify this particular problem as belonging to the second category, which means that you can approach it like this:
To do this, you must first define -- at least informally, with a few quick notes -- the grammar that you want to parse. Keep in mind that most grammars are defined recursively at some point. So let's say our grammar is:
You can see that we have recursion in one place: a sequence can contain arrays, and an array is also defined in terms of a sequence (so it can contain more arrays etc).
Treating the matter informally as above is easier as an introduction, but reasoning about grammars is easier if you do it formally.
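As a rough sketch of what a more formal version could look like, here is one EBNF-style grammar that is consistent with the token types the lexer below produces (word, string, comma, open and close parenthesis). The rule names are illustrative assumptions, and here the recursion runs through `item` and `array`:

```
input  = { item } ;                        (* items separated by whitespace *)
item   = word | string | array ;
array  = "(" [ item { "," item } ] ")" ;   (* arrays may nest *)
word   = letter { letter } ;               (* a-z, A-Z *)
string = '"' { any character except '"' } '"' ;
```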
With the grammar in hand you now need to break the input down into tokens so that it can be processed. The component that takes user input and converts it to individual pieces defined by the grammar is called a lexer. Lexers are dumb; they are only concerned with the "outside appearance" of the input and do not attempt to check that it actually makes sense.
Here's a simple lexer I wrote to parse the above grammar (don't use this for anything important; may contain bugs):
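$input = 'all ("hi there", (this, that) , other) another';
$tokens = array();
$input = trim($input);
// Compare against '' so that an input such as "0" is not treated as empty.
while ($input !== '') {
    switch (substr($input, 0, 1)) {
        case '"':
            // String: everything up to the next double quote.
            // The s modifier lets the remainder contain newlines.
            if (!preg_match('/^"([^"]*)"(.*)$/s', $input, $matches)) {
                die('Error: unterminated string');
            }
            $tokens[] = array('string', $matches[1]);
            $input = $matches[2];
            break;
        case '(':
            $tokens[] = array('open', null);
            $input = substr($input, 1);
            break;
        case ')':
            $tokens[] = array('close', null);
            $input = substr($input, 1);
            break;
        case ',':
            $tokens[] = array('comma', null);
            $input = substr($input, 1);
            break;
        default:
            // Word: a run of alphabetic characters. Anything else is rejected
            // here; otherwise an empty word would be produced and the loop
            // would never advance.
            if (!preg_match('/^[a-zA-Z]+/', $input, $matches)) {
                die('Error: unexpected character ' . substr($input, 0, 1));
            }
            $tokens[] = array('word', $matches[0]);
            $input = substr($input, strlen($matches[0]));
            break;
    }
    // Whitespace only separates tokens, so drop it before the next round.
    $input = trim($input);
}
print_r($tokens);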
Having done this, the next step is to build a parser: a component that inspects the lexed input and converts it to the desired format. A parser is smart; in the process of converting the input it also makes sure that the input is well-formed by the grammar's rules.
Parsers are commonly implemented as state machines (also known as finite state machines or finite automata) and work like this:
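To make this concrete, here is a minimal sketch of a parser for the token list produced by the lexer above. It is written as a small recursive function rather than an explicit state machine, and it assumes that items inside parentheses must be separated by commas while top-level items are simply listed one after another. The names `parseItems`, `$pos` and `$insideArray` are hypothetical, and the token list is typed out by hand so that the example runs on its own.

```php
<?php
// Hypothetical helper: turns a list of array(type, value) tokens into nested arrays.
// $pos is the index of the next token to read; it is advanced as tokens are consumed.
function parseItems(array $tokens, &$pos, $insideArray)
{
    $result = array();
    $expectItem = true; // inside an array: true right after '(' or ','
    while ($pos < count($tokens)) {
        list($type, $value) = $tokens[$pos];
        switch ($type) {
            case 'word':
            case 'string':
                if ($insideArray && !$expectItem) {
                    throw new UnexpectedValueException("Missing comma before token $pos");
                }
                $result[] = $value;
                $pos++;
                $expectItem = false;
                break;
            case 'open':
                if ($insideArray && !$expectItem) {
                    throw new UnexpectedValueException("Missing comma before token $pos");
                }
                $pos++; // consume '('
                $result[] = parseItems($tokens, $pos, true); // nested array
                $expectItem = false;
                break;
            case 'comma':
                if (!$insideArray || $expectItem) {
                    throw new UnexpectedValueException("Unexpected comma at token $pos");
                }
                $pos++;
                $expectItem = true;
                break;
            case 'close':
                // Reject ')' at top level and after a trailing comma; '()' is allowed.
                if (!$insideArray || ($expectItem && $result !== array())) {
                    throw new UnexpectedValueException("Unexpected ')' at token $pos");
                }
                $pos++; // consume ')'
                return $result;
            default:
                throw new UnexpectedValueException("Unknown token type: $type");
        }
    }
    if ($insideArray) {
        throw new UnexpectedValueException('Unclosed parenthesis');
    }
    return $result;
}

// Tokens for: all ("hi there", (this, that) , other) another
$tokens = array(
    array('word', 'all'), array('open', null), array('string', 'hi there'),
    array('comma', null), array('open', null), array('word', 'this'),
    array('comma', null), array('word', 'that'), array('close', null),
    array('comma', null), array('word', 'other'), array('close', null),
    array('word', 'another'),
);
$pos = 0;
$result = parseItems($tokens, $pos, false);
print_r($result);
```

Malformed input such as a missing closing parenthesis or a stray comma raises an exception instead of producing a partial result, which is the "smart" part of the parser described above.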
¹ Parser generators are programs whose input is a formal grammar and whose output is a lexer and a parser you can "just add water" to: just extend the code to perform "take some action" depending on the type of token; everything else is already taken care of. A quick search on this subject leads to the question "PHP Lexer and Parser Generator?"
There's no question that you should write a parser if you are building a syntax tree. But if you just need to parse this sample input, a regex might still be a suitable tool:
<?php
$str = 'all, ("hi there", (these, that) , other), another';
$str = preg_replace('/\, /', ',', $str); // get rid of the space after each comma
/*
 * Wrap bare words in quotes so that PHP does not treat them as undefined constants
 */
$str = preg_replace('/(\w+),/', '\'$1\',', $str);
$str = preg_replace('/(\w+)\)/', '\'$1\')', $str);
$str = preg_replace('/,(\w+)/', ',\'$1\'', $str);
$str = str_replace('(', 'array(', $str);
$str = 'array('.$str.');';
echo '<pre>';
eval('$res = '.$str); //eval is evil.
print_r($res); //print the result
Demo.
Note: If the input is malformed, the regex approach will definitely fail. I am posting this solution just in case you need a quick script. Writing a lexer and parser is time-consuming work that needs a lot of research.
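Since the snippet above relies on eval, here is a hedged variation of the same regex idea that avoids it: the input is rewritten as JSON and handed to json_decode. This is only a sketch under the same assumptions as the sample input (items separated by commas, strings in double quotes without embedded quotes); the helper name `parseViaJson` is hypothetical.

```php
<?php
// Hypothetical helper: rewrites the input as JSON text and lets json_decode()
// build the nested arrays, so no eval() is needed.
function parseViaJson($str)
{
    $json = preg_replace_callback(
        '/"[^"]*"|\w+|[()]/', // a quoted string, a bare word, or a parenthesis
        function ($m) {
            $tok = $m[0];
            if ($tok === '(') {
                return '[';
            }
            if ($tok === ')') {
                return ']';
            }
            if ($tok[0] === '"') {
                // Quoted string: strip the quotes, then re-encode as JSON
                // so that backslashes and similar characters are escaped.
                return json_encode(substr($tok, 1, -1));
            }
            return json_encode($tok); // bare word becomes a JSON string
        },
        $str
    );
    // json_decode() returns null when the rewritten text is not valid JSON.
    return json_decode('[' . $json . ']', true);
}

$res = parseViaJson('all, ("hi there", (these, that) , other), another');
print_r($res);
```

Because parentheses inside quoted strings are matched as part of the string, they are left alone, and malformed input yields null rather than executing anything.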