
How do I tokenize a string in C++?

Java has a convenient split method:

    String str = "The quick brown fox";
    String[] results = str.split(" ");

Is there an easy way to do this in C++?

Bill the Lizard asked Sep 10 '08 12:09




1 Answer

The Boost tokenizer class can make this sort of thing quite simple:

    #include <iostream>
    #include <string>
    #include <boost/foreach.hpp>
    #include <boost/tokenizer.hpp>

    using namespace std;
    using namespace boost;

    int main(int, char**)
    {
        string text = "token, test   string";

        char_separator<char> sep(", ");
        tokenizer< char_separator<char> > tokens(text, sep);
        BOOST_FOREACH (const string& t, tokens) {
            cout << t << "." << endl;
        }
    }

Updated for C++11:

    #include <iostream>
    #include <string>
    #include <boost/tokenizer.hpp>

    using namespace std;
    using namespace boost;

    int main(int, char**)
    {
        string text = "token, test   string";

        char_separator<char> sep(", ");
        tokenizer<char_separator<char>> tokens(text, sep);
        for (const auto& t : tokens) {
            cout << t << "." << endl;
        }
    }
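
If Boost is not available, roughly the same behavior can be sketched with the standard library alone. Note that `char_separator<char> sep(", ")` treats each character in `", "` as a separate delimiter and, by default, drops empty tokens. The helper below, `split_any`, is a hypothetical name of my own, not part of Boost or the standard library, and it is a minimal sketch under those two assumptions rather than a drop-in replacement for the tokenizer:

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not a Boost or standard library function).
// Splits `text` on ANY character found in `delims` and drops empty tokens,
// which is assumed to mirror char_separator's default behavior.
std::vector<std::string> split_any(const std::string& text,
                                   const std::string& delims)
{
    std::vector<std::string> tokens;

    // Skip leading delimiters to find the start of the first token.
    std::string::size_type start = text.find_first_not_of(delims);
    while (start != std::string::npos) {
        // The token ends at the next delimiter, or at the end of the string.
        std::string::size_type end = text.find_first_of(delims, start);

        // If end is npos, substr clamps the count and takes the rest.
        tokens.push_back(text.substr(start, end - start));

        // Skip the run of delimiters that follows the token.
        start = text.find_first_not_of(delims, end);
    }
    return tokens;
}
```

Calling `split_any("token, test   string", ", ")` yields the three tokens `token`, `test` and `string`, the same ones the Boost examples print. Returning a `std::vector<std::string>` copies each token; that keeps the sketch simple, but the Boost tokenizer iterates lazily and may be the better choice for large inputs.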
Ferruccio answered Sep 23 '22 08:09