
C++ Tokenize a string with spaces and quotes

Tags: c++, string, token

I would like to write something in C++ that tokenizes a string. For the sake of clarity, consider the following string:

add string "this is a string with spaces!"

This must be split as follows:

add
string
this is a string with spaces!

Is there a quick and standard-library-based approach?

the_candyman asked Sep 07 '13 16:09


3 Answers

No library is needed. A simple loop can do the task (if it is as simple as you describe).

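#include <iostream>
#include <string>
using namespace std;

int main(){
    string str = "add string \"this is a string with space!\"";

    for( size_t i=0; i<str.length(); i++){

        char c = str[i];
        if( c == ' ' ){
            cout << endl;   // a space outside quotes ends the current token
        }else if(c == '\"' ){
            i++;            // skip the opening quote
            // print up to the closing quote; the bounds check guards against a missing one
            while( i<str.length() && str[i] != '\"' ){ cout << str[i]; i++; }
        }else{
            cout << c;
        }
    }
    cout << endl;
}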

That outputs:

add
string
this is a string with space!
Leo Zhuang answered Nov 18 '22 16:11


I wonder why this simple, C++-style solution has not been presented here. It is based on the fact that if we first split the string by \", then each even chunk (counting from 1) is "inside" quotes, and each odd chunk should additionally be split by whitespace.

There is no possibility of out_of_range or anything else.

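// Requires <iostream>, <sstream> and <string>; `input` is the std::string to tokenize.
unsigned counter = 0;   // 1-based index of the current quote-delimited chunk
std::string segment;
std::stringstream stream_input(input);
while(std::getline(stream_input, segment, '\"'))
{
    ++counter;
    if (counter % 2 == 0)
    {
        // even chunk: text that was between a pair of quotes, kept whole
        if (!segment.empty())
            std::cout << segment << std::endl;
    }
    else
    {
        // odd chunk: text outside quotes, split further on spaces
        std::stringstream stream_segment(segment);
        while(std::getline(stream_segment, segment, ' '))
            if (!segment.empty())
                std::cout << segment << std::endl;
    }
}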
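
If the tokens are needed in a container rather than on stdout, the same split-by-quote idea can be wrapped in a function. What follows is only a sketch under a few assumptions: the name tokenize is hypothetical, the chunk parity is tracked with a bool instead of a counter, and the unquoted chunks are split with operator>> (any whitespace) rather than getline on ' '.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the answer's code): same quote-splitting
// idea, but the tokens are returned instead of printed.
std::vector<std::string> tokenize(const std::string& input)
{
    std::vector<std::string> tokens;
    std::istringstream stream_input(input);
    std::string segment;
    bool inside_quotes = false;  // flips after every chunk delimited by '"'

    while (std::getline(stream_input, segment, '"'))
    {
        if (inside_quotes)
        {
            // chunk that sat between two quotes: keep it whole
            if (!segment.empty())
                tokens.push_back(segment);
        }
        else
        {
            // chunk outside quotes: split on whitespace, which also
            // skips runs of consecutive spaces
            std::istringstream stream_segment(segment);
            std::string word;
            while (stream_segment >> word)
                tokens.push_back(word);
        }
        inside_quotes = !inside_quotes;
    }
    return tokens;
}
```

For the string from the question, tokenize should return three elements: add, string, and this is a string with spaces!. Like the snippet above, it drops empty quoted strings ("") and treats an unterminated quote as running to the end of the input.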
Arkady answered Nov 18 '22 16:11


Here is a complete function for it. Modify it as needed; it adds the parts of the string to a vector of strings (qargs) and handles both double and single quotes.

void split_in_args(std::vector<std::string>& qargs, std::string command){
        int len = command.length();
        bool qot = false, sqot = false;
        int arglen;
        for(int i = 0; i < len; i++) {
                int start = i;
                if(command[i] == '\"') {
                        qot = true;
                }
                else if(command[i] == '\'') sqot = true;

                if(qot) {
                        i++;
                        start++;
                        while(i<len && command[i] != '\"')
                                i++;
                        if(i<len)
                                qot = false;
                        arglen = i-start;
                        i++;
                }
                else if(sqot) {
                        i++;
                        start++;
                        while(i<len && command[i] != '\'')
                                i++;
                        if(i<len)
                                sqot = false;
                        arglen = i-start;
                        i++;
                }
                else{
                        while(i<len && command[i]!=' ')
                                i++;
                        arglen = i-start;
                }
                qargs.push_back(command.substr(start, arglen));
        }
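        // print the collected arguments (size_t avoids a signed/unsigned comparison)
        for(std::size_t i = 0; i < qargs.size(); i++){
                std::cout << qargs[i] << std::endl;
        }
        std::cout << qargs.size() << std::endl;
        if(qot || sqot) std::cout << "One of the quotes is open\n";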
}
ayushgp answered Nov 18 '22 14:11
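
As a side note on the "standard-library-based" part of the question, and not taken from any of the answers above: since C++14 the header <iomanip> provides std::quoted, which can be combined with a string stream to read either a plain word or a double-quoted string per extraction. A minimal sketch, assuming a C++14 compiler; the function name tokenize_quoted is hypothetical.

```cpp
#include <iomanip>   // std::quoted (C++14)
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper name, chosen only for this illustration.
std::vector<std::string> tokenize_quoted(const std::string& input)
{
    std::vector<std::string> tokens;
    std::istringstream stream(input);
    std::string token;
    // std::quoted reads a plain word when the next non-space character is
    // not '"'; otherwise it reads up to the closing quote, spaces included.
    while (stream >> std::quoted(token))
        tokens.push_back(token);
    return tokens;
}
```

Two things to keep in mind with this approach: by default std::quoted treats the backslash as an escape character inside quoted text, and it only recognizes the double quote as a delimiter unless another one is passed explicitly, so single-quoted arguments would need separate handling.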