id   string   
0    31672;0           
1    31965;0
2    0;78464
3      51462
4    31931;0
Hi, I have the table above. I would like to split the string column on ';' and store the number of resulting pieces in a new column. The final table should look like this:
 id   string   word_count
0    31672;0    2       
1    31965;0    2
2    0;78464    2
3      51462    1
4    31931;0    2
It would be nice if someone knows how to do this with Python.
Option 1
The basic solution using str.split + str.len - 
df['word_count'] = df['string'].str.split(';').str.len()
df
     string  word_count
id                     
0   31672;0           2
1   31965;0           2
2   0;78464           2
3     51462           1
4   31931;0           2
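For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the question's table is a DataFrame indexed by id with a single string column; the construction of df below is my guess at the setup, since the question does not show it.

```python
import pandas as pd

# Assumed setup: rebuild the question's table, indexed by "id".
df = pd.DataFrame(
    {"string": ["31672;0", "31965;0", "0;78464", "51462", "31931;0"]},
    index=pd.Index(range(5), name="id"),
)

# Split each value on ';' into a list, then take the length of each list.
df["word_count"] = df["string"].str.split(";").str.len()
print(df)
```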
Option 2
The more efficient solution with str.count, which counts separators directly instead of building a list for every row - 
df['word_count'] = df['string'].str.count(';') + 1
df
     string  word_count
id                     
0   31672;0           2
1   31965;0           2
2   0;78464           2
3     51462           1
4   31931;0           2
Caveat - both options report a word count of 1 for an empty string, because ''.split(';') returns [''] (and both return NaN for a missing value). If empty strings should count as 0, mask them explicitly.
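To check the edge cases yourself, here is a small sketch comparing the two options on an empty string and a missing value. The sample series and the rule "empty or missing means zero words" are assumptions for illustration, not something the question specifies.

```python
import pandas as pd

# Hypothetical sample with an empty string and a missing value.
s = pd.Series(["31672;0", "51462", "", None], name="string")

# Option 1 and option 2 from above, applied to the same data.
by_split = s.str.split(";").str.len()
by_count = s.str.count(";") + 1

# Assumed policy: rows that are empty or missing get a word count of 0.
has_text = s.fillna("").ne("")
word_count = by_count.where(has_text, 0).astype(int)

print(pd.DataFrame({"by_split": by_split, "by_count": by_count, "word_count": word_count}))
```

The where call keeps the computed count only for rows that actually contain text and substitutes 0 everywhere else.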
If you want each word occupying a new column, there's a quick and simple way using tolist, loading the splits into a new dataframe, and concatenating the new dataframe with the original using concat - 
v = pd.DataFrame(df['string'].str.split(';').tolist())\
        .rename(columns=lambda x: x + 1)\
        .add_prefix('string_')
pd.concat([df, v], axis=1)
     string  word_count string_1 string_2
id                                       
0   31672;0           2    31672        0
1   31965;0           2    31965        0
2   0;78464           2        0    78464
3     51462           1    51462     None
4   31931;0           2    31931        0
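As an alternative sketch, str.split also accepts expand=True, which returns the pieces as a DataFrame that keeps the original index, so the result can be joined back even when the index is not 0..n-1. The df setup and the string_1, string_2 column names below mirror the output above and are assumptions on my part.

```python
import pandas as pd

# Assumed setup: the question's table, indexed by "id".
df = pd.DataFrame(
    {"string": ["31672;0", "31965;0", "0;78464", "51462", "31931;0"]},
    index=pd.Index(range(5), name="id"),
)

# expand=True spreads the pieces over columns 0, 1, ... and keeps df's index.
parts = df["string"].str.split(";", expand=True)

# Rename 0, 1, ... to string_1, string_2, ... to match the output above.
parts.columns = [f"string_{i + 1}" for i in parts.columns]

# Join on the shared index; rows with fewer pieces get a missing value.
result = df.join(parts)
print(result)
```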