
How does the strtok function in C work? [duplicate]

Tags:

c

strtok

I found this sample program which explains the strtok function:

#include <stdio.h>
#include <string.h>

int main ()
{
    char str[] = "- This, a sample string.";
    char * pch;
    printf ("Splitting string \"%s\" into tokens:\n", str);
    pch = strtok (str, " ,.-");
    while (pch != NULL)
    {
        printf ("%s\n", pch);
        pch = strtok (NULL, " ,.-");
    }
    return 0;
}

However, I don't see how this can work.

How is it possible that pch = strtok (NULL, " ,.-"); returns a new token? I mean, we are calling strtok with NULL. This doesn't make a lot of sense to me.

user2426316 asked Jan 13 '14 17:01


People also ask

How does strtok () work in C?

The strtok() function parses the string up to the first instance of the delimiter character, replaces the character in place with a null byte ( '\0' ), and returns the address of the first character in the token. Subsequent calls to strtok() begin parsing immediately after the most recently placed null character.

What is the purpose of strtok () function?

The strtok() function reads string1 as a series of zero or more tokens, and string2 as the set of characters serving as delimiters of the tokens in string1. The tokens in string1 can be separated by one or more of the delimiters from string2.

What does strtok function return?

strtok() returns a pointer to the next token, or a NULL pointer when there are no more tokens. The token ends with the first character contained in the string pointed to by string2. If such a character is not found, the token ends at the terminating null character.

Can strtok take multiple delimiters?

The function strtok breaks a string into smaller strings, or tokens, using a set of delimiters. The string of delimiters may contain one or more delimiters, and different delimiter strings may be used with each call to strtok.
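To illustrate changing the delimiter set between calls, here is a minimal sketch. The helper name split_key_value and the "key=value;..." input format are assumptions made up for this example; they are not part of the question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits the first "key=value" pair of a
   "key=value;key=value" line in place.
   Returns 1 and sets *key and *value on success, 0 otherwise. */
int split_key_value(char *line, char **key, char **value)
{
    *value = NULL;
    *key = strtok(line, "=");      /* first call: '=' ends the key */
    if (*key == NULL)
        return 0;
    *value = strtok(NULL, ";");    /* same string, different delimiter set */
    return *value != NULL;
}
```

Called on a writable buffer holding "color=dark red;size=10", this should yield the key "color" and the value "dark red". The space inside the value survives because the second call only treats ';' as a delimiter.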


2 Answers

Two things to know about strtok. As was mentioned, it "maintains internal state". Also, it modifies the string you feed it. Essentially, it writes a '\0' where it finds one of the delimiters you supplied, and returns a pointer to the start of the token. Internally it remembers where the last token ended, and the next time you call it, it starts from there.

The important corollary is that you cannot use strtok on a string literal, as in const char *s = "hello world";. Modifying a string literal is undefined behavior, and in practice you will typically get an access violation.
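One common way around this is to tokenize a writable copy instead. The following is a sketch only; count_tokens is a hypothetical helper named for this example, and it reports allocation failure simply by returning 0.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in read-only text.
   strtok only ever sees (and modifies) a heap copy, so passing
   a string literal is safe. Returns 0 if the copy cannot be made. */
size_t count_tokens(const char *text, const char *delims)
{
    size_t len = strlen(text) + 1;          /* include the terminator */
    char *copy = malloc(len);
    if (copy == NULL)
        return 0;
    memcpy(copy, text, len);

    size_t count = 0;
    for (char *tok = strtok(copy, delims); tok != NULL;
         tok = strtok(NULL, delims))
        count++;

    free(copy);
    return count;
}
```

With this, count_tokens("hello world", " ") can be called directly on a literal and returns 2. If the tokens themselves are needed afterwards, the copy has to stay alive for as long as the returned pointers are in use.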

The "good" thing about strtok is that it doesn't actually copy strings - so you don't need to manage additional memory allocation etc. But unless you understand the above, you will have trouble using it correctly.

Example - if you have "this,is,a,string", successive calls to strtok will generate pointers as follows (the ^ is the value returned). Note that the '\0' is written where the delimiters are found; this means the source string is modified:

t  h  i  s  ,  i  s  ,  a  ,  s  t  r  i  n  g \0         this,is,a,string

t  h  i  s  \0 i  s  ,  a  ,  s  t  r  i  n  g \0         this
^
t  h  i  s  \0 i  s  \0 a  ,  s  t  r  i  n  g \0         is
               ^
t  h  i  s  \0 i  s  \0 a  \0 s  t  r  i  n  g \0         a
                        ^
t  h  i  s  \0 i  s  \0 a  \0 s  t  r  i  n  g \0         string
                              ^
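The picture above can be checked with a few lines of code. This is a sketch, and token_offsets is a hypothetical helper written for this example: it records how far each returned pointer is from the start of the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tokenizes buf in place and stores the offset
   of each returned token (relative to buf) in offsets[].
   Stores at most max offsets and returns how many were stored. */
size_t token_offsets(char *buf, const char *delims,
                     size_t *offsets, size_t max)
{
    size_t n = 0;
    for (char *tok = strtok(buf, delims); tok != NULL && n < max;
         tok = strtok(NULL, delims))
        offsets[n++] = (size_t)(tok - buf);   /* pointer into buf itself */
    return n;
}
```

For a writable buffer holding "this,is,a,string" and the delimiter string ",", the stored offsets are 0, 5, 8 and 10, which are the positions of the ^ markers. Afterwards buf[4], buf[7] and buf[9] hold '\0' instead of ',', so printing buf shows only "this".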

Hope it makes sense.

Floris answered Oct 04 '22 06:10


strtok maintains internal state. When you call it with a non-NULL pointer, it re-initializes itself to use the string you supply. When you call it with NULL, it keeps using that string, together with whatever other state it currently holds, to return the next token.
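To make the "internal state" concrete, here is a rough sketch of how such a function could be written. my_strtok is a hypothetical re-implementation for illustration only; real library code differs in detail, but the idea of a static pointer that survives between calls is the same.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical re-implementation, for illustration only. */
char *my_strtok(char *str, const char *delims)
{
    static char *next;               /* the internal state: where to resume */

    assert(delims != NULL);
    if (str != NULL)
        next = str;                  /* non-NULL argument: start on a new string */
    if (next == NULL)
        return NULL;                 /* previous scan already reached the end */

    next += strspn(next, delims);    /* skip leading delimiters */
    if (*next == '\0') {
        next = NULL;
        return NULL;                 /* nothing but delimiters was left */
    }

    char *token = next;
    next += strcspn(next, delims);   /* advance to the end of this token */
    if (*next != '\0')
        *next++ = '\0';              /* overwrite the delimiter, resume after it */
    else
        next = NULL;                 /* token ran up to the end of the string */
    return token;
}
```

Passing NULL therefore does not mean "tokenize nothing"; it means "carry on from the position saved in next". Running the loop from the question with my_strtok in place of strtok yields the same four tokens: This, a, sample, string.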

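Because of the way strtok works, it is not generally safe to call from several threads at once. Some C runtimes (for example, the multithreaded version of Microsoft's) give each thread its own internal state for strtok, so if you're writing a multithreaded application you need to ensure that you link with such a runtime. Where that is not guaranteed, use a reentrant variant such as POSIX strtok_r instead.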
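As a sketch of the reentrant alternative, the example below uses POSIX strtok_r, which takes the saved position as an explicit third argument instead of hiding it inside the function. count_fields and the "a=1;b=2" input format are assumptions made up for this example, and strtok_r is only available on POSIX systems.

```c
#define _POSIX_C_SOURCE 200809L   /* expose strtok_r on POSIX systems */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the fields in text such as "a=1;b=2".
   The outer loop splits on ';', the inner loop splits each piece on '='.
   Each loop keeps its own saved position, so they do not interfere. */
size_t count_fields(char *text)
{
    size_t fields = 0;
    char *outer_state;
    char *inner_state;

    for (char *pair = strtok_r(text, ";", &outer_state); pair != NULL;
         pair = strtok_r(NULL, ";", &outer_state)) {
        for (char *field = strtok_r(pair, "=", &inner_state); field != NULL;
             field = strtok_r(NULL, "=", &inner_state))
            fields++;
    }
    return fields;
}
```

For a writable buffer holding "a=1;b=2;c=3" this returns 6. The same nesting with plain strtok would not work, because the inner loop would overwrite the single saved position that the outer loop relies on. The same property is what makes strtok_r usable from several threads: each caller owns its state variable.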

Sean answered Oct 04 '22 04:10