
"unstack" a pandas column containing lists into multiple rows [duplicate]

Say I have the following pandas DataFrame:

df = pd.DataFrame({"a" : [1,2,3], "b" : [[1,2],[2,3,4],[5]]})

   a          b
0  1     [1, 2]
1  2  [2, 3, 4]
2  3        [5]

How would I "unstack" the lists in the "b" column in order to transform it into the dataframe:

   a  b
0  1  1
1  1  2
2  2  2
3  2  3
4  2  4
5  3  5

360

asked Feb 02 '17 20:02

Alex

1 Answer

UPDATE: a generic vectorized approach that also works for DataFrames with multiple columns.

Assuming we have the following DataFrame:

In [159]: df
Out[159]:
   a          b  c
0  1     [1, 2]  5
1  2  [2, 3, 4]  6
2  3        [5]  7

Solution:

In [160]: lst_col = 'b'

In [161]: pd.DataFrame({
     ...:     col:np.repeat(df[col].values, df[lst_col].str.len())
     ...:     for col in df.columns.difference([lst_col])
     ...: }).assign(**{lst_col:np.concatenate(df[lst_col].values)})[df.columns.tolist()]
     ...:
Out[161]:
   a  b  c
0  1  1  5
1  1  2  5
2  2  2  6
3  2  3  6
4  2  4  6
5  3  5  7
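The one-liner above can be easier to reuse when wrapped in a function. The following is a minimal sketch under the same assumptions as the answer (every cell of the list column holds a list); the name `explode_list_column` is hypothetical and does not come from pandas.

```python
import numpy as np
import pandas as pd


def explode_list_column(df, lst_col):
    """Hypothetical helper: one output row per element of df[lst_col]."""
    lengths = df[lst_col].str.len()                # elements per list
    other_cols = df.columns.difference([lst_col])  # all non-list columns
    out = pd.DataFrame({
        # repeat each scalar value once per element of its row's list
        col: np.repeat(df[col].values, lengths)
        for col in other_cols
    })
    # flatten the lists into a single 1-D array
    out[lst_col] = np.concatenate(df[lst_col].values)
    return out[df.columns.tolist()]                # restore column order


df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [[1, 2], [2, 3, 4], [5]],
    "c": [5, 6, 7],
})
result = explode_list_column(df, "b")
print(result)
```

The result has a fresh 0..5 index, matching the output shown above.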

Setup:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a" : [1,2,3],
    "b" : [[1,2],[2,3,4],[5]],
    "c" : [5,6,7]
})

Vectorized NumPy approach:

In [124]: pd.DataFrame({'a':np.repeat(df.a.values, df.b.str.len()),
                        'b':np.concatenate(df.b.values)})
Out[124]:
   a  b
0  1  1
1  1  2
2  2  2
3  2  3
4  2  4
5  3  5
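To see why this works, it may help to look at the intermediate arrays separately. The sketch below uses the two-column DataFrame from the question; the variable names `lengths`, `repeated_a` and `flat_b` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [[1, 2], [2, 3, 4], [5]]})

# Length of each list in column "b"
lengths = df.b.str.len().values

# Each value of "a" repeated once per element of its row's list
repeated_a = np.repeat(df.a.values, lengths)

# All lists joined end to end into one flat array
flat_b = np.concatenate(df.b.values)

print(lengths.tolist())     # [2, 3, 1]
print(repeated_a.tolist())  # [1, 1, 2, 2, 2, 3]
print(flat_b.tolist())      # [1, 2, 2, 3, 4, 5]
```

Because `repeated_a` and `flat_b` have the same length and line up element by element, they can be passed straight to the `pd.DataFrame` constructor.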

OLD answer:

Try this:

In [89]: df.set_index('a', append=True).b.apply(pd.Series).stack().reset_index(level=[0, 2], drop=True).reset_index()
Out[89]:
   a    0
0  1  1.0
1  1  2.0
2  2  2.0
3  2  3.0
4  2  4.0
5  3  5.0
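The float values in that output come from the intermediate wide frame. As a small illustration (the variable name `wide` is only for this sketch), `apply(pd.Series)` creates one column per list position and pads shorter lists with NaN:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [[1, 2], [2, 3, 4], [5]]})

# One column per list position; shorter lists are padded with NaN,
# which cannot be stored in an integer column.
wide = df.set_index("a").b.apply(pd.Series)
print(wide)
```

In the session above, `stack()` removed the NaN padding, but the values stayed floats, which is why the variant by @Boud adds `astype(int)`.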

Or a slightly nicer solution provided by @Boud:

In [110]: df.set_index('a').b.apply(pd.Series).stack().reset_index(level=-1, drop=True).astype(int).reset_index()
Out[110]:
   a  0
0  1  1
1  1  2
2  2  2
3  2  3
4  2  4
5  3  5
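On pandas 0.25 or later, the built-in `DataFrame.explode` method should cover the same use case. It is not part of the approaches above, so treat the following as a sketch that assumes a sufficiently recent pandas version:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [[1, 2], [2, 3, 4], [5]]})

# explode() keeps the original index (0, 0, 1, 1, 1, 2),
# so reset it to get a clean 0..5 index
exploded = df.explode("b").reset_index(drop=True)

# The exploded column has object dtype; cast it back to integers
exploded["b"] = exploded["b"].astype(int)
print(exploded)
```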

199

answered Sep 20 '22 02:09

MaxU - stop WAR against UA
