I found this sample program which explains the strtok function:

    #include <stdio.h>
    #include <string.h>

    int main ()
    {
      char str[] ="- This, a sample string.";
      char * pch;
      printf ("Splitting string \"%s\" into tokens:\n",str);
      pch = strtok (str," ,.-");
      while (pch != NULL)
      {
        printf ("%s\n",pch);
        pch = strtok (NULL, " ,.-");
      }
      return 0;
    }

However, I don't see how this can work. How is it possible that pch = strtok (NULL, " ,.-"); returns a new token? I mean, we are calling strtok with NULL. This doesn't make a lot of sense to me.
The strtok() function parses the string up to the first instance of the delimiter character, replaces the character in place with a null byte ( '\0' ), and returns the address of the first character in the token. Subsequent calls to strtok() begin parsing immediately after the most recently placed null character.
The strtok() function reads string1 as a series of zero or more tokens, and string2 as the set of characters serving as delimiters of the tokens in string1. The tokens in string1 can be separated by one or more of the delimiters from string2.
When no tokens remain, strtok() returns a NULL pointer. A token ends at the first character that is contained in the string pointed to by string2. If no such character is found, the token ends at the terminating null character.
The function strtok breaks a string into smaller strings, or tokens, using a set of delimiters. The string of delimiters may contain one or more delimiters, and a different delimiter string may be used with each call to strtok.
Two things to know about strtok. As was mentioned, it "maintains internal state". Also, it messes up the string you feed it. Essentially, it writes a '\0' over the delimiter that ends each token and returns a pointer to the start of that token. Internally it remembers where the last token ended, and the next time you call it, it starts from there.
The important corollary is that you cannot use strtok on a string literal such as const char* s = "hello world";, since modifying the contents of a string literal is undefined behavior and will typically give you an access violation.
The "good" thing about strtok is that it doesn't actually copy strings - so you don't need to manage additional memory allocation etc. But unless you understand the above, you will have trouble using it correctly.
Example - if you have "this,is,a,string", successive calls to strtok will generate pointers as follows (the ^ marks the value returned). Note that a '\0' is written over each delimiter that ends a token; this means the source string is modified:

    t  h  i  s  ,  i  s  ,  a  ,  s  t  r  i  n  g  \0        this,is,a,string

    t  h  i  s  \0 i  s  ,  a  ,  s  t  r  i  n  g  \0        this
    ^
    t  h  i  s  \0 i  s  \0 a  ,  s  t  r  i  n  g  \0        is
                   ^
    t  h  i  s  \0 i  s  \0 a  \0 s  t  r  i  n  g  \0        a
                            ^
    t  h  i  s  \0 i  s  \0 a  \0 s  t  r  i  n  g  \0        string
                                  ^

Hope it makes sense.
strtok maintains internal state. When you call it with a non-NULL pointer, it re-initializes itself to use the string you supply. When you call it with NULL, it uses that string, and any other state it currently has, to return the next token.
Because of the way strtok works, you need to ensure that you link with a multithreaded version of the C runtime if you're writing a multithreaded application. This will ensure that each thread gets its own internal state for strtok.