pandas Series.str.split(' ', expand=True) with column names

Question

I have a pandas DataFrame with two string columns that I would like to split on spaces, like this:

 df =
        A                                   B
        0.1  0.5  0.01 ...                    0.3  0.1  0.4 ...

I would like to split both columns and create one new column for each value produced by the split.

So, the result:

df =
       A1      A2     A3  ...               B1        B2        B3
       0.1     0.5   0.01 ...               0.3       0.1       0.4

Currently, I am doing:

 df = df.join(df['A'].str.split(' ', expand=True))
 df = df.join(df['B'].str.split(' ', expand=True))

But, I get the following error:

 columns overlap but no suffix specified

I guess this is because the column names produced by the first and second split overlap?

So, my question is: how can I split multiple columns while providing column names or suffixes for each split?

jezrael · Accepted Answer

Use DataFrame.add_prefix to prefix the new column names with the name of the column that was split:

# split() with no separator splits on any whitespace; add_prefix names the pieces A0, A1, ...
df = df.join(df['A'].str.split(expand=True).add_prefix('A'))
df = df.join(df['B'].str.split(expand=True).add_prefix('B'))
print(df)
              A            B   A0   A1    A2   B0   B1   B2
0  0.1 0.5 0.01  0.3 0.1 0.4  0.1  0.5  0.01  0.3  0.1  0.4
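
To see why the prefix matters, it may help to look at the labels that str.split(expand=True) produces. The sketch below assumes a one-row frame that mirrors the sample data in the question; the variable names are only illustrative. Each split returns columns labelled 0, 1, 2, so the first join succeeds and the second one collides with the labels already added.

```python
import pandas as pd

# Assumed sample data mirroring the question.
df = pd.DataFrame({"A": ["0.1 0.5 0.01"], "B": ["0.3 0.1 0.4"]})

split_a = df["A"].str.split(expand=True)
split_b = df["B"].str.split(expand=True)

# Both results carry the same integer labels.
labels_a = split_a.columns.tolist()
labels_b = split_b.columns.tolist()

# The first join is fine: 'A', 'B' do not clash with 0, 1, 2.
joined_once = df.join(split_a)

# The second join clashes with the 0, 1, 2 columns added above.
try:
    joined_once.join(split_b)
    overlap_error = None
except ValueError as exc:
    overlap_error = str(exc)

print(labels_a, labels_b)
print(overlap_error)
```

Giving each split its own prefix makes the labels distinct before the join, which is why the error goes away.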

Another option is a list comprehension:

cols = ['A','B']
df1 = pd.concat([df[c].str.split(expand=True).add_prefix(c) for c in cols], axis=1)
print (df1)
    A0   A1    A2   B0   B1   B2
0  0.1  0.5  0.01  0.3  0.1  0.4
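
The question asked for names starting at 1 (A1, A2, A3), while add_prefix keeps the zero-based labels from the split. One possible way to get one-based names is to assign the column names explicitly. This is only a sketch: split_numbered is a hypothetical helper, not a pandas function, and the sample data is assumed from the question.

```python
import pandas as pd

# Assumed sample data mirroring the question.
df = pd.DataFrame({"A": ["0.1 0.5 0.01"], "B": ["0.3 0.1 0.4"]})


def split_numbered(series, prefix):
    """Hypothetical helper: split on whitespace and name the pieces prefix1, prefix2, ..."""
    parts = series.str.split(expand=True)
    parts.columns = [f"{prefix}{i + 1}" for i in range(parts.shape[1])]
    return parts


cols = ["A", "B"]
numbered = pd.concat([split_numbered(df[c], c) for c in cols], axis=1)
print(numbered.columns.tolist())
```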

To keep all the original columns as well, join the result back:

df = df.join(df1)
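
Put together, the whole flow might look like the sketch below. The two-row sample data is an assumption, and the final step (dropping the original string columns and converting the pieces to float) is an optional extra that the question did not ask for.

```python
import pandas as pd

# Assumed sample data; two rows to show the split is applied row by row.
df = pd.DataFrame({
    "A": ["0.1 0.5 0.01", "0.2 0.6 0.02"],
    "B": ["0.3 0.1 0.4", "0.7 0.8 0.9"],
})

cols = ["A", "B"]

# Split every listed column and prefix the pieces with the source column name.
df1 = pd.concat([df[c].str.split(expand=True).add_prefix(c) for c in cols], axis=1)

# Original columns plus the split columns.
combined = df.join(df1)

# Optional: keep only the split columns and convert the strings to floats.
only_split = combined.drop(columns=cols).astype(float)
print(only_split)
```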

Tags: split, python-3.x, pandas

learner
