 

A simpler regular expression to parse quoted strings

The question is simple. I have a string that contains multiple elements, each enclosed in single quotation marks:

var str = "'alice'   'anna marie' 'benjamin' 'christin'     'david' 'muhammad ali'"

And I want to parse it so that I have all those names in an array:

result = [
 'alice',
 'anna marie',
 'benjamin',
 'christin',
 'david',
 'muhammad ali'
]

Currently I'm using this code to do the job:

var result = str.match(/\s*'(.*?)'\s*'(.*?)'\s*'(.*?)'\s*'(.*?)'/);

But this regular expression is too long and it's not flexible, so if I have more elements in the str string, I have to edit the regular expression.

What is the fastest and most efficient way to do this parsing? Performance and flexibility are important in our web application.

I have looked at the following questions, but they don't answer mine:

  • Regular Expression For Quoted String
  • Regular Expression - How To Find Words and Quoted Phrases
AlexStack asked Jun 27 '12 13:06


3 Answers

Define the pattern once and use the global g flag.

var matches = str.match(/'[^']*'/g);

If you want the tokens without the single quotes around them, the normal approach would be to use capture groups in the regex. However, in JavaScript, match() returns only the full matches (not the captured groups) when the g flag is used. The simplest (though not necessarily most efficient) way around this is to remove the quotes afterwards, iteratively:

if (matches)
    for (var i=0, len=matches.length; i<len; i++)
        matches[i] = matches[i].replace(/'/g, '');
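
One way to avoid the second pass, sketched under the assumption that the runtime supports String.prototype.matchAll (ES2020): matchAll yields every match together with its capture groups, so the group itself can leave the quotes out. The helper name extractQuoted is made up for illustration.

```javascript
// Hypothetical helper: returns the text inside each pair of single quotes.
// Assumes an ES2020+ runtime (String.prototype.matchAll).
function extractQuoted(str) {
  // matchAll requires the g flag; each result carries its capture groups.
  return Array.from(str.matchAll(/'([^']*)'/g), function (m) {
    return m[1]; // group 1 = content without the surrounding quotes
  });
}

var names = extractQuoted("'alice'   'anna marie' 'benjamin'");
console.log(names); // [ 'alice', 'anna marie', 'benjamin' ]
```

On older runtimes without matchAll, the loop above remains the safer choice.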

[EDIT] - as the other answers say, you could use split() instead, but only if you can rely on there always being a space (or some common delimiter) between each token in your string.
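
The split() variant could look like the following sketch. It assumes the string is well-formed, every token is quoted, and tokens are always separated by at least one whitespace character; the function name splitQuoted is illustrative only.

```javascript
// Hypothetical helper: split-based variant. Only valid when tokens are
// always separated by whitespace and the string is well-formed.
function splitQuoted(str) {
  var trimmed = str.trim();
  if (trimmed.length < 2) return [];
  // Drop the first and last quote, then split on quote + whitespace + quote.
  return trimmed.slice(1, -1).split(/'\s+'/);
}

console.log(splitQuoted("'alice'   'anna marie' 'benjamin'"));
// [ 'alice', 'anna marie', 'benjamin' ]
```

If two tokens ever touch without whitespace between them, this version will merge them, which is why the regex match is the more robust option.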

Mitya answered Oct 06 '22 14:10


When a regex object has the global flag set, you can execute it multiple times against a string to find all matches. It works by starting the next search after the last character matched in the previous run:

var buf = "'abc' 'def' 'ghi'";
var exp = /'(.*?)'/g;
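// Each exec() call resumes from exp.lastIndex and returns null when no match is left.
for (var match = exp.exec(buf); match != null; match = exp.exec(buf)) {
  alert(match[0]); // full match, quotes included; match[1] holds the text between the quotes
}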

Personally, I find it a really good way to parse strings.

EDIT: the expression /'(.*?)'/g matches any content between single quotes ('). The quantifier *? is non-greedy, which greatly simplifies the pattern.
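
To get the array the question asks for, the same loop can collect the first capture group instead of showing each match. This is a minimal sketch; the function name collectQuoted is an assumption, not part of the original answer.

```javascript
// Hypothetical helper built on the exec() loop: collects group 1 of every match.
function collectQuoted(buf) {
  var exp = /'(.*?)'/g; // created per call so lastIndex always starts at 0
  var result = [];
  var match;
  while ((match = exp.exec(buf)) !== null) {
    result.push(match[1]); // match[0] would include the quotes
  }
  return result;
}

console.log(collectQuoted("'abc' 'def' 'ghi'")); // [ 'abc', 'def', 'ghi' ]
```

Creating the regex inside the function avoids a stale lastIndex carrying over between calls, which is a common pitfall with shared global regex objects.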

Gerardo Lima answered Oct 06 '22 13:10


A different approach

I came here needing to parse a string into quoted and unquoted parts, preserve their order, and then output them wrapped in specific tags for React or React Native. I wasn't sure how to adapt the answers here to that need, so I wrote this instead:

function parseQuotes(str) {
  var openQuote = false;
  var parsed = [];
  var quote = '';
  var text = '';

  for (var i = 0; i < str.length; i++) {
    var item = str[i];
    if (item === '"' && !openQuote) {
      openQuote = true;
      parsed.push({ type: 'text', value: text });
      text = '';
    }
    else if (item === '"' && openQuote) {
      openQuote = false;
      parsed.push({ type: 'quote', value: quote });
      quote = '';
    }
    else if (openQuote) quote += item;
    else text += item;  
  }

  if (openQuote) parsed.push({ type: 'text', value: '"' + quote });
  else parsed.push({ type: 'text', value: text });

  return parsed;
}

When given this:

'Testing this "shhhh" if it "works!" " hahahah!'

it produces this:

[
  {
    "type": "text",
    "value": "Testing this "
  },
  {
    "type": "quote",
    "value": "shhhh"
  },
  {
    "type": "text",
    "value": " if it "
  },
  {
    "type": "quote",
    "value": "works!"
  },
  {
    "type": "text",
    "value": " "
  },
  {
    "type": "text",
    "value": "\" hahahah!"
  }
]

This lets you easily wrap tags around each piece depending on its type.
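
As a rough illustration of that last step, the sketch below wraps the quoted pieces in a tag and leaves plain text untouched. The sample data is copied from the first four entries of the output above; the wrapParts name and the <q> tag are placeholders, not part of the original answer, and a React app would return elements rather than strings.

```javascript
// Sample data copied from the parser output shown above (first four entries).
var parsed = [
  { type: 'text', value: 'Testing this ' },
  { type: 'quote', value: 'shhhh' },
  { type: 'text', value: ' if it ' },
  { type: 'quote', value: 'works!' }
];

// Hypothetical renderer: the <q> tag stands in for whatever
// component or markup the application actually uses.
function wrapParts(parts) {
  return parts.map(function (part) {
    return part.type === 'quote' ? '<q>' + part.value + '</q>' : part.value;
  }).join('');
}

console.log(wrapParts(parsed)); // Testing this <q>shhhh</q> if it <q>works!</q>
```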

https://jsfiddle.net/o6seau4e/4/

King Friday answered Oct 06 '22 14:10