
Splitting string to array while ignoring content between apostrophes

Tags:

javascript

I need something that takes a string, and divides it into an array. I want to split it after every space, so that this -

"Hello everybody!" turns into ---> ["Hello", "everybody!"]

However, I want it to ignore spaces in between apostrophes. So for example -

"How 'are you' today?" turns into ---> ["How", "'are you'", "today?"]

Now I wrote the following code (which works), but something tells me that what I did is pretty much horrible and that it can be done with probably 50% less code. I'm also pretty new to JS so I guess I still don't adhere to all the idioms of the language.

function getFixedArray(text) {

    var textArray = text.split(' '); // Create an array from the string, splitting by spaces.

    var finalArray = [];
    var bFoundLeadingApostrophe = false;
    var bFoundTrailingApostrophe = false; // Never set to true below, so checks against it always pass.
    var leadingRegExp = /^'/;
    var trailingRegExp = /'$/;
    var concatenatedString = "";

    for (var i = 0; i < textArray.length; i++) {
        var word = textArray[i]; // Named `word` so it does not overwrite the `text` parameter.

        // Found a leading apostrophe
        if (leadingRegExp.test(word) && !bFoundLeadingApostrophe && !trailingRegExp.test(word)) {
            concatenatedString = concatenatedString + word;
            bFoundLeadingApostrophe = true;
        }

        // Found the trailing apostrophe
        else if (trailingRegExp.test(word) && !bFoundTrailingApostrophe) {
            concatenatedString = concatenatedString + ' ' + word;
            finalArray.push(concatenatedString);

            concatenatedString = "";

            bFoundLeadingApostrophe = false;
            bFoundTrailingApostrophe = false;
        }

        // Found no trailing apostrophe even though the leading flag indicates true, so we want this string.
        else if (bFoundLeadingApostrophe && !bFoundTrailingApostrophe) {
            concatenatedString = concatenatedString + ' ' + word;
        }

        // Regular text
        else {
            finalArray.push(word);
        }
    }

    return finalArray;
}

I would deeply appreciate it if somebody could go through this and teach me how this should be rewritten, in a more correct & efficient way (and perhaps a more "JS" way).

Thanks!

Edit -

Well, I just found a few problems. Some of them I fixed, and some I'm not sure how to handle without making this code too complex (for example, the string "hello 'every body'!" doesn't split properly).

thomas asked Aug 10 '14 20:08


1 Answer

You could try matching instead of splitting:

string.match(/(?:['"].+?['"])|\S+/g)

The above regex will match anything in between quotes (including the quotes), or anything that's not a space otherwise.
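To see that pattern at work, here is a minimal sketch that wraps it in a function. The name splitKeepingQuotes is made up for illustration, and the `|| []` fallback is an addition, since match() returns null when nothing matches.

```javascript
// Hypothetical helper (not from the original answer): wraps the regex above.
function splitKeepingQuotes(text) {
  // First alternative: a quote, the shortest run of characters, then a quote.
  // Second alternative: any run of non-whitespace characters.
  // match() returns null on no match, so fall back to an empty array.
  return text.match(/(?:['"].+?['"])|\S+/g) || [];
}

console.log(splitKeepingQuotes("Hello everybody!"));     // ["Hello", "everybody!"]
console.log(splitKeepingQuotes("How 'are you' today?")); // ["How", "'are you'", "today?"]
console.log(splitKeepingQuotes("hello 'every body'!"));  // ["hello", "'every body'", "!"]
```

The last call shows the limitation: punctuation right after the closing quote ends up as its own element.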

If you want to also match characters after the quotes, like ? and ! you can try:

/(?:['"].+?['"]\W?)|\S+/g

For "hello 'every body'!" it will give you this array:

["hello", "'every body'!"]

Note that \W matches a space as well, so a quoted match can end with a trailing space. If you only want to match punctuation, you could be explicit by using a character class in place of \W:

[,.?!]
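A rough sketch of that explicit variant follows. The function name splitWithPunctuation is hypothetical, and the set [,.?!] is only an assumption about which punctuation marks matter for your input.

```javascript
// Hypothetical helper: same idea as above, but with [,.?!] in place of \W,
// so a space after the closing quote is never swallowed.
function splitWithPunctuation(text) {
  return text.match(/(?:['"].+?['"][,.?!]?)|\S+/g) || [];
}

console.log(splitWithPunctuation("hello 'every body'!"));  // ["hello", "'every body'!"]
console.log(splitWithPunctuation("How 'are you' today?")); // ["How", "'are you'", "today?"]
```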

Or simply trim the strings after matching:

string.match(regex).map(function(x){return x.trim()})
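Put together, the trimming approach might look like the sketch below. The function name splitAndTrim and the null fallback are assumptions added for illustration; the regex is the \W? version from above.

```javascript
// Hypothetical helper: match with \W?, then trim each piece.
function splitAndTrim(text) {
  var regex = /(?:['"].+?['"]\W?)|\S+/g;
  // match() returns null when nothing matches, so default to an empty array.
  var matches = text.match(regex) || [];
  // \W? may have captured the space after a closing quote; trim() removes it.
  return matches.map(function (x) { return x.trim(); });
}

console.log(splitAndTrim("How 'are you' today?")); // ["How", "'are you'", "today?"]
console.log(splitAndTrim("hello 'every body'!"));  // ["hello", "'every body'!"]
```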

elclanrs answered Oct 27 '22 17:10