Split lists within dataframe column into multiple columns [duplicate]

Tags: python, pandas

I have a Pandas DataFrame column with multiple lists within a list. Something like this:

df
     col1
0    [[1,2], [2,3]]
1    [[a,b], [4,5], [x,y]] 
2    [[6,7]]

I want to split the list over multiple columns so the output should be something like:

    col1    col2     col3
0   [1,2]   [2,3]   
1   [a,b]   [4,5]    [x,y]
2   [6,7]

Please help me with this. Thanks in advance


asked May 22 '18 08:05

Ronnie

2 Answers

You can use pd.Series.apply:

import pandas as pd

df = pd.DataFrame({'col1': [[[1, 2], [2, 3]],
                            [['a', 'b'], [4, 5], ['x', 'y']],
                            [[6, 7]]]})

res = df['col1'].apply(pd.Series)

print(res)

        0       1       2
0  [1, 2]  [2, 3]     NaN
1  [a, b]  [4, 5]  [x, y]
2  [6, 7]     NaN     NaN
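
The result above has positional column labels (0, 1, 2), while the question asks for col1, col2, col3. A minimal sketch of one way to rename them, assuming 1-based names are wanted; the name `res` is reused from the snippet above and the naming scheme is only an illustration:

```python
import pandas as pd

df = pd.DataFrame({'col1': [[[1, 2], [2, 3]],
                            [['a', 'b'], [4, 5], ['x', 'y']],
                            [[6, 7]]]})

# Expand each row's outer list so every inner list lands in its own column.
res = df['col1'].apply(pd.Series)

# Replace the positional labels 0, 1, 2 with col1, col2, col3.
res.columns = [f'col{i + 1}' for i in range(res.shape[1])]

print(res)
```

Rows with fewer inner lists are still padded with missing values in the trailing columns.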

answered Nov 15 '22 00:11

jpp

I think you need the DataFrame constructor if performance is important:

df = pd.DataFrame(df['col1'].values.tolist())
print(df)
        0       1       2
0  [1, 2]  [2, 3]    None
1  [a, b]  [4, 5]  [x, y]
2  [6, 7]    None    None
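
Note that the snippet above overwrites df. A hedged variant of the same constructor approach that keeps the original frame and carries its index over; the names `out` and the custom index labels are assumptions chosen for illustration:

```python
import pandas as pd

# Hypothetical non-default index, to show that the labels survive the split.
df = pd.DataFrame({'col1': [[[1, 2], [2, 3]],
                            [['a', 'b'], [4, 5], ['x', 'y']],
                            [[6, 7]]]},
                  index=['r1', 'r2', 'r3'])

# Build a new frame from the list of rows; shorter rows are padded
# with missing values. Passing index=df.index keeps the row labels.
out = pd.DataFrame(df['col1'].tolist(), index=df.index)

print(out)
```

Because `out` shares the index of `df`, it can be joined back with `df.join(out)` if the other columns are still needed.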

If col1 contains missing values (NaN), remove them first with dropna:

df = pd.DataFrame(df['col1'].dropna().values.tolist())
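
A small sketch of the dropna case, assuming one row of col1 holds NaN instead of a list (the question's sample data has no such row, so the input here is made up for illustration). Keeping the filtered Series in a variable lets the surviving row labels be reused as the index:

```python
import numpy as np
import pandas as pd

# Hypothetical input: the middle row is missing entirely.
df = pd.DataFrame({'col1': [[[1, 2], [2, 3]],
                            np.nan,
                            [[6, 7]]]})

# Drop the rows where col1 is missing, then split what is left.
s = df['col1'].dropna()
out = pd.DataFrame(s.tolist(), index=s.index)

print(out)
```

The result keeps row labels 0 and 2, so it still lines up with the original frame.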

answered Nov 14 '22 23:11

jezrael
