Split a Pandas column of lists into multiple columns

I have a Pandas DataFrame with one column:

import pandas as pd

df = pd.DataFrame({"teams": [["SF", "NYG"] for _ in range(7)]})

       teams
0  [SF, NYG]
1  [SF, NYG]
2  [SF, NYG]
3  [SF, NYG]
4  [SF, NYG]
5  [SF, NYG]
6  [SF, NYG]

How can I split this column of lists into two columns?

Desired result:

  team1 team2
0    SF   NYG
1    SF   NYG
2    SF   NYG
3    SF   NYG
4    SF   NYG
5    SF   NYG
6    SF   NYG

asked Oct 09 '22 19:10

bgame2498

2 Answers

You can use the DataFrame constructor with lists created by to_list:

import pandas as pd

d1 = {'teams': [['SF', 'NYG'],['SF', 'NYG'],['SF', 'NYG'],
                ['SF', 'NYG'],['SF', 'NYG'],['SF', 'NYG'],['SF', 'NYG']]}
df2 = pd.DataFrame(d1)
print (df2)
       teams
0  [SF, NYG]
1  [SF, NYG]
2  [SF, NYG]
3  [SF, NYG]
4  [SF, NYG]
5  [SF, NYG]
6  [SF, NYG]

df2[['team1', 'team2']] = pd.DataFrame(df2['teams'].to_list(), index=df2.index)
print(df2)
       teams team1 team2
0  [SF, NYG]    SF   NYG
1  [SF, NYG]    SF   NYG
2  [SF, NYG]    SF   NYG
3  [SF, NYG]    SF   NYG
4  [SF, NYG]    SF   NYG
5  [SF, NYG]    SF   NYG
6  [SF, NYG]    SF   NYG
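
The index=df2.index argument matters when the frame does not have the default 0..n-1 index. pandas aligns on index labels during assignment, so a helper frame with a fresh index may not line up with the original rows. A minimal sketch, using made-up data and index labels (10, 20, 30) that are not from the question:

```python
import pandas as pd

# Hypothetical data with a non-default index (labels 10, 20, 30).
df = pd.DataFrame(
    {"teams": [["SF", "NYG"], ["LA", "DAL"], ["KC", "DEN"]]},
    index=[10, 20, 30],
)

# Reusing df.index keeps the new columns lined up with the original rows.
df[["team1", "team2"]] = pd.DataFrame(df["teams"].to_list(), index=df.index)
print(df)
```

If the list column is no longer needed afterwards, df.drop(columns="teams") should remove it.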

And for a new DataFrame:

df3 = pd.DataFrame(df2['teams'].to_list(), columns=['team1', 'team2'])
print(df3)
  team1 team2
0    SF   NYG
1    SF   NYG
2    SF   NYG
3    SF   NYG
4    SF   NYG
5    SF   NYG
6    SF   NYG
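
The examples here assume every list has exactly two items. If the lists can differ in length, passing a fixed columns= list may raise an error, so one option is to let the constructor decide the width and name the columns afterwards. A sketch under that assumption, with invented ragged data:

```python
import pandas as pd

# Hypothetical ragged data: the lists have 2, 1 and 3 items.
ragged = pd.DataFrame({"teams": [["SF", "NYG"], ["LA"], ["KC", "DEN", "LV"]]})

# The constructor pads shorter rows with missing values.
wide = pd.DataFrame(ragged["teams"].to_list(), index=ragged.index)

# Name the columns team1, team2, ... based on the widest row.
wide.columns = [f"team{i + 1}" for i in range(wide.shape[1])]
print(wide)
```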

A solution with apply(pd.Series) also works, but it is very slow (roughly a thousand times slower in the timings below):

# 7k rows
df2 = pd.concat([df2]*1000).reset_index(drop=True)

In [121]: %timeit df2['teams'].apply(pd.Series)
1.79 s ± 52.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [122]: %timeit pd.DataFrame(df2['teams'].to_list(), columns=['team1','team2'])
1.63 ms ± 54.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
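
%timeit only works inside IPython or Jupyter. To repeat the comparison in a plain script, something like the following sketch with the standard timeit module should do; the function names are made up for illustration, and the absolute numbers will differ by machine and pandas version:

```python
import timeit

import pandas as pd


def split_with_apply(frame):
    # Builds one Series per row, which is where the time goes.
    return frame["teams"].apply(pd.Series)


def split_with_constructor(frame):
    # Hands the whole list of lists to the DataFrame constructor at once.
    return pd.DataFrame(
        frame["teams"].to_list(), columns=["team1", "team2"], index=frame.index
    )


# 7k rows, matching the size used in the timings above.
big = pd.DataFrame({"teams": [["SF", "NYG"] for _ in range(7000)]})

t_apply = timeit.timeit(lambda: split_with_apply(big), number=1)
t_ctor = timeit.timeit(lambda: split_with_constructor(big), number=10) / 10
print(f"apply(pd.Series): {t_apply:.4f} s per call")
print(f"DataFrame(to_list()): {t_ctor:.4f} s per call")
```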

answered Oct 12 '22 10:10

jezrael

Much simpler solution:

pd.DataFrame(df2["teams"].to_list(), columns=['team1', 'team2'])

This yields:

  team1 team2
0    SF   NYG
1    SF   NYG
2    SF   NYG
3    SF   NYG
4    SF   NYG
5    SF   NYG
6    SF   NYG

If you want to split a column of delimited strings rather than lists, you can do something similar (replace '<delim>' with your actual delimiter):

pd.DataFrame(df["teams"].str.split('<delim>', expand=True).values,
             columns=['team1', 'team2'])
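
For a concrete picture of the string case, here is a small sketch that assumes the teams are stored as strings joined by a hyphen; the data and the "-" delimiter are invented for illustration. Passing n=1 caps the result at two columns even if a later value contains extra delimiters.

```python
import pandas as pd

# Hypothetical column of "-" delimited strings.
df = pd.DataFrame({"teams": ["SF-NYG", "LA-DAL", "KC-DEN"]})

# expand=True returns a DataFrame; n=1 splits at most once per string.
parts = df["teams"].str.split("-", n=1, expand=True)
parts.columns = ["team1", "team2"]
print(parts)
```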

answered Oct 12 '22 09:10

Joe Davison
