Split strings in tuples into columns, in Pandas

I have the following DataFrame, where Track ID is the row index. How can I split the string in the stats column into 5 columns of numbers?

Track ID    stats
14.0        (-0.00924175824176, 0.41, -0.742016492568, 0.0036830094242, 0.00251748449963)
28.0        (0.0411538461538, 0.318230769231, 0.758717081514, 0.00264000622468, 0.0106535783677)
42.0        (-0.0144351648352, 0.168438461538, -0.80870348637, 0.000816872566404, 0.00316572586742)
56.0        (0.0343461538462, 0.288730769231, 0.950844962874, 6.1608706775e-07, 0.00337262030771)
70.0        (0.00905164835165, 0.151030769231, 0.670257006716, 0.0121790506745, 0.00302182567957)
84.0        (-0.0047967032967, 0.171615384615, -0.552879463981, 0.0500316517755, 0.00217970256969)
t_n asked Mar 31 '15 13:03



1 Answer

If the values are strings that look like tuples, strip the parentheses, split on the commas, and convert the pieces to float:

In [74]: df['stats'].str[1:-1].str.split(',', expand=True).astype(float)
Out[74]:
          0         1         2         3         4
0 -0.009242  0.410000 -0.742016  0.003683  0.002517
1  0.041154  0.318231  0.758717  0.002640  0.010654
2 -0.014435  0.168438 -0.808703  0.000817  0.003166
3  0.034346  0.288731  0.950845  0.000001  0.003373
4  0.009052  0.151031  0.670257  0.012179  0.003022
5 -0.004797  0.171615 -0.552879  0.050032  0.002180

(Note: in older versions of pandas (< 0.16.1), use return_type='frame' instead of the expand keyword.)
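
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the string case. It rebuilds only the first three rows of the question's data, and the column names s0 to s4 are hypothetical placeholders, not something from the question.

```python
import pandas as pd

# Sample data modelled on the question (first three rows only).
df = pd.DataFrame(
    {
        "stats": [
            "(-0.00924175824176, 0.41, -0.742016492568, 0.0036830094242, 0.00251748449963)",
            "(0.0411538461538, 0.318230769231, 0.758717081514, 0.00264000622468, 0.0106535783677)",
            "(-0.0144351648352, 0.168438461538, -0.80870348637, 0.000816872566404, 0.00316572586742)",
        ]
    },
    index=pd.Index([14.0, 28.0, 42.0], name="Track ID"),
)

# Drop the surrounding parentheses, split on commas, and convert to float.
parts = df["stats"].str[1:-1].str.split(",", expand=True).astype(float)

# Hypothetical column names, purely for illustration.
parts.columns = ["s0", "s1", "s2", "s3", "s4"]

# Swap the original string column for the five numeric columns.
# join aligns on the index, so the Track ID labels are preserved.
result = df.drop(columns="stats").join(parts)
print(result)
```

Because the split is done on the Series itself, the result keeps the Track ID index of the original frame, so it can be joined back without any re-indexing.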

By the way, if the values are actual tuples rather than strings, you can simply do the following:

pd.DataFrame(df['stats'].tolist(), index=df.index) 
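
A small sketch of the tuple case, assuming the stats column really holds Python tuples. The values below are rounded, made-up stand-ins for the question's data, and df_tuples is a hypothetical name.

```python
import pandas as pd

# Illustrative frame whose 'stats' column holds real tuples (rounded sample values).
df_tuples = pd.DataFrame(
    {
        "stats": [
            (-0.0092, 0.41, -0.742, 0.0037, 0.0025),
            (0.0412, 0.3182, 0.7587, 0.0026, 0.0107),
        ]
    },
    index=pd.Index([14.0, 28.0], name="Track ID"),
)

# tolist() yields a list of tuples; the DataFrame constructor turns each
# tuple into one row. Passing index= keeps the original Track ID labels.
expanded = pd.DataFrame(df_tuples["stats"].tolist(), index=df_tuples.index)
print(expanded)
```

Passing index=df.index matters here: without it the new frame gets a fresh 0..n-1 index and no longer lines up with the original rows.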
joris answered Sep 21 '22 15:09