
Javascript split at multiple delimiters while keeping delimiters

Is there a better way than what I have (through regex, for instance) to turn

"div#container.blue"

into this

["div", "#container", ".blue"];

Here's what I have...

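function process(h1) {
    // Build a fresh array on every call instead of appending to a shared global.
    var arr = [];
    // Text before the first "#" is the tag name.
    var first = h1.split("#");
    arr.push(first[0]);
    // The rest is the id followed by any number of classes.
    var secondarr = first[1].split(".");
    arr.push("#" + secondarr[0]);
    // Declare i locally so it does not leak into the global scope.
    for (var i = 1; i < secondarr.length; i++) {
        arr.push("." + secondarr[i]);
    }
    return arr;
}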

natecraft1 asked Mar 01 '14 23:03


2 Answers

Why not something like this?

'div#container.blue'.split(/(?=[#.])/);

The lookahead (?=[#.]) only asserts that the next character is # or a literal .; it consumes no characters, so every match is zero-length. Because split removes only what the pattern matches, nothing is removed and each delimiter stays attached to the piece that follows it.
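
To see the zero-length match in action, here is a minimal sketch that wraps the same pattern in a helper. The function name splitSelector and the second sample string are assumptions made for illustration, not part of the answer.

```javascript
// Hypothetical helper name; the pattern is the one shown in the answer above.
function splitSelector(selector) {
  // (?=[#.]) matches the empty position just before a "#" or ".",
  // so split cuts there without consuming the delimiter itself.
  return selector.split(/(?=[#.])/);
}

console.log(splitSelector("div#container.blue")); // ["div", "#container", ".blue"]

// A string that starts with a delimiter yields no empty leading piece.
console.log(splitSelector("#main.a.b")); // ["#main", ".a", ".b"]
```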

Qantas 94 Heavy answered Sep 29 '22 20:09


As you've probably found, the issue is that split removes the item you're splitting on. You can solve that with regex capturing groups (the parentheses), because split includes whatever the groups capture in its output:

var result = 'div#container.blue'.split(/(#[^#|^.]*)|(\.[^#|^.]*)/);

Now we've got the issue that result contains a lot of falsy values you don't want. A quick filter fixes that:

var result = 'div#container.blue'.split(/(#[^#|^.]*)|(\.[^#|^.]*)/).filter(function(x) {
  return !!x;
});
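
For reference, this sketch prints what split returns before filtering, so it is clear which entries the filter drops. The variable names are illustrative, and filter(Boolean) is used here as a shorthand assumed to be equivalent to the !!x callback above.

```javascript
// Same pattern as in the answer above.
var pattern = /(#[^#|^.]*)|(\.[^#|^.]*)/;

// Each match contributes both capture groups; the group that did not
// participate shows up as undefined, and adjacent matches leave "" between them.
var raw = "div#container.blue".split(pattern);
console.log(raw);
// ["div", "#container", undefined, "", undefined, ".blue", ""]

// Boolean(x) is false for undefined and "", so only the real pieces remain.
var cleaned = raw.filter(Boolean);
console.log(cleaned); // ["div", "#container", ".blue"]
```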

Appendix A: What the heck is that regex?

I'm assuming you're only concerned with # and . as characters. That still gives us this monster: /(#[^#|^.]*)|(\.[^#|^.]*)/

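This means we'll capture either a # or a ., and then all the characters up until the next # or . (remembering that a period is significant in regex, so we need to escape it, unless we're inside the brackets). Note that | and a non-leading ^ are also literal inside brackets, so [^#|^.] additionally stops at those two characters; [^#.] is enough.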
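
A quick way to check how the bracket expression behaves is to run it against a string containing a pipe. This is only a sketch: the sample string is made up, and the simplified pattern [^#.] is a suggested alternative rather than something from the answer.

```javascript
// Hypothetical sample input containing "|" to expose the difference.
var sample = "ab|cd";

// Bracket expression as written in the answer: "|" and the second "^"
// are ordinary members of the negated set, so matching stops at "|".
var asWritten = /[^#|^.]*/.exec(sample)[0];

// Simplified bracket expression that lists only the two delimiters.
var simplified = /[^#.]*/.exec(sample)[0];

console.log(asWritten);  // "ab"
console.log(simplified); // "ab|cd"
```

For selectors that never contain | or ^, the two forms give the same result.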

SomeKittens answered Sep 29 '22 20:09