
C: Parse empty tokens from a string with strtok

Tags: c, string

My application produces strings like the one below. I need to split each string on the separator into individual values.

2342|2sd45|dswer|2342||5523|||3654|Pswt

I am using strtok in a loop to do this. For the fifth token I get 5523. However, I need the empty value between two adjacent separators (||) to count as a token as well, so 5523 should be the sixth token.

/* strtok() treats a run of '|' as one separator, so empty fields are skipped */
token = strtok(strAccInfo, "|");

for (iLoop = 1; iLoop <= 106 && token != NULL; iLoop++) {
    token = strtok(NULL, "|");
}

Any suggestions?

asked Jul 30 '10 21:07 by Bash


2 Answers

In that case I often prefer a loop around p2 = strchr(p1, '|') with a memcpy(s, p1, p2 - p1) inside (followed by terminating s with '\0', since memcpy does not do that). It's fast, does not modify the input buffer (so it can be used with const char *), and is really portable (even on embedded systems).
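A minimal sketch of that strchr/memcpy loop might look like the following. The helper names copy_field and get_field, the 0-based field index, and the truncate-on-overflow policy are assumptions made for illustration; they are not part of the original answer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the field starting at p1 into dst (dstsize
   must be at least 1) and returns a pointer to the start of the following
   field, or NULL after the last field. An empty field yields "". */
static const char *copy_field(const char *p1, char sep, char *dst, size_t dstsize)
{
    const char *p2 = strchr(p1, sep);            /* next separator, if any */
    size_t len = p2 ? (size_t)(p2 - p1) : strlen(p1);

    if (len >= dstsize)                          /* truncate rather than overflow */
        len = dstsize - 1;
    memcpy(dst, p1, len);
    dst[len] = '\0';                             /* memcpy does not terminate */

    return p2 ? p2 + 1 : NULL;
}

/* Hypothetical helper: copies field number `index` (0-based) into dst.
   Returns 1 on success, 0 if the string has fewer fields. */
static int get_field(const char *s, char sep, int index, char *dst, size_t dstsize)
{
    const char *p = s;
    int i;

    for (i = 0; p != NULL; i++) {
        p = copy_field(p, sep, dst, dstsize);
        if (i == index)
            return 1;
    }
    return 0;
}
```

With the string from the question, get_field(str, '|', 4, buf, sizeof buf) leaves an empty string in buf, and index 5 yields 5523, which makes it the sixth token as required. The input is only read, never written.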

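The strchr approach is also reentrant; strtok isn't. (BTW: reentrancy is not only a multi-threading concern. strtok already breaks with nested loops in a single thread, because it keeps one hidden position between calls. One can use strtok_r, but that is POSIX rather than standard C, so it's not as portable.)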
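To illustrate the nested-loop problem, here is a small sketch. The function name and the ';' and ',' separators are made up for the demonstration. The outer loop is meant to visit every ';'-separated record, but the inner strtok calls overwrite the single saved position, so the outer loop ends early.

```c
#include <string.h>

/* Hypothetical demo: counts the ';'-separated records the outer loop
   actually visits while an inner strtok loop splits each record on ','.
   s must be a modifiable buffer. */
static int count_records_with_nested_strtok(char *s)
{
    int records = 0;
    char *rec;
    char *item;

    for (rec = strtok(s, ";"); rec != NULL; rec = strtok(NULL, ";")) {
        records++;
        /* The inner calls replace strtok's one hidden position with their
           own, so the outer strtok(NULL, ";") resumes in the wrong place. */
        for (item = strtok(rec, ","); item != NULL; item = strtok(NULL, ","))
            ;
    }
    return records;
}
```

For a buffer holding "a,b;c,d" this returns 1 rather than 2: once the inner loop has run to the end of the first record, the following strtok(NULL, ";") finds nothing left to scan.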

answered Sep 23 '22 21:09 by Patrick Schlüter


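That's a limitation of strtok: it treats a run of consecutive delimiters as a single separator, so it never returns an empty token. The designers had whitespace-separated tokens in mind. strtok doesn't do much anyway; just roll your own parser. The C FAQ has an example.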
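As a rough idea of what rolling your own could look like (this is not the C FAQ's code), here is a sketch of an in-place splitter that keeps empty fields. The names next_field and split_fields are hypothetical; the behavior is similar in spirit to BSD strsep(), and like strtok it writes '\0' into the input buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the next field of *cursor (possibly empty),
   terminates it in place, and advances *cursor past the delimiter.
   Returns NULL once all fields have been consumed. */
static char *next_field(char **cursor, const char *delims)
{
    char *start = *cursor;
    char *end;

    if (start == NULL)
        return NULL;

    end = start + strcspn(start, delims);   /* first delimiter or final '\0' */
    if (*end != '\0') {
        *end = '\0';                        /* cut the field here */
        *cursor = end + 1;                  /* next field may be empty */
    } else {
        *cursor = NULL;                     /* this was the last field */
    }
    return start;
}

/* Hypothetical helper: splits s in place and stores up to max field
   pointers in fields. Returns the number of fields stored. */
static size_t split_fields(char *s, const char *delims, char **fields, size_t max)
{
    size_t n = 0;
    char *cursor = s;
    char *f;

    while (n < max && (f = next_field(&cursor, delims)) != NULL)
        fields[n++] = f;
    return n;
}
```

Applied to a writable copy of the string from the question with "|" as the delimiter, this yields 10 fields, with fields[4] empty and fields[5] pointing at 5523. Because the position lives in the caller's cursor variable rather than in hidden static state, nested use is safe.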

answered Sep 24 '22 21:09 by Gilles 'SO- stop being evil'