
Pandas explode function not working for list of string column

Tags: python, pandas

To explode a list-like column into rows, we can use the pandas explode() function. My pandas version is 0.25.3.

The given example worked for me, and another answer on Stackoverflow.com works as expected, but neither works for my dataset.

    city        nested_city
0   soto        ['Soto']
1   tera-kora   ['Daniel']
2   jan-thiel   ['Jan Thiel']
3   westpunt    ['Westpunt']
4   nieuwpoort  ['Nieuwpoort', 'Santa Barbara Plantation']

What I have tried:

test_data['nested_city'].explode()

and

test_data.set_index(['nested_city']).apply(pd.Series.explode).reset_index()

Output

0    ['Soto']                                  
1    ['Daniel']                                
2    ['Jan Thiel']                             
3    ['Westpunt']                              
4    ['Nieuwpoort', 'Santa Barbara Plantation']
Name: neighbors, dtype: object

Always Sunny asked Aug 18 '20 16:08



1 Answer

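pandas' explode() only splits cells that hold actual list-like objects. In your dataset the column appears to hold strings that merely look like lists (for example, the string "['Soto']"), so explode() returns them unchanged. Convert the strings to real lists first. Here is a working solution: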

from ast import literal_eval

test_data['nested_city'] = test_data['nested_city'].apply(literal_eval) #convert to list type
test_data['nested_city'].explode()
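
As a minimal, self-contained sketch of the fix above: the small DataFrame below is a hypothetical stand-in for the question's data, built on the assumption that each nested_city cell is a string rather than a list. It checks the cell type, shows that explode() leaves such strings untouched, and then explodes after parsing with literal_eval.

```python
from ast import literal_eval

import pandas as pd

# Hypothetical stand-in for the question's data: each nested_city cell
# is a string that looks like a list, not an actual list.
test_data = pd.DataFrame({
    "city": ["soto", "nieuwpoort"],
    "nested_city": ["['Soto']", "['Nieuwpoort', 'Santa Barbara Plantation']"],
})

# Inspect the Python type of the cells; here every cell is a str.
cell_types = test_data["nested_city"].map(type).unique()
print(cell_types)

# explode() has nothing to split in a str cell, so the row count is unchanged.
unchanged = test_data["nested_city"].explode()

# Parse each string into a real list, then explode into one row per element.
test_data["nested_city"] = test_data["nested_city"].apply(literal_eval)
exploded = test_data.explode("nested_city")
print(exploded)
```

literal_eval only evaluates Python literals, so it is a safer choice than eval() for this kind of parsing. It raises an error on cells that are not valid literals (NaN, for instance), so such rows may need to be handled separately.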

To explode multiple columns at a time, you can do the following:

# Columns you are NOT exploding (this assumes col1 and col2 are the ones being exploded).
not_list_cols = [col for col in test_data.columns if col not in ['col1', 'col2']]
# Move those columns into the index, explode every remaining column, then restore them.
test_data = test_data.set_index(not_list_cols).apply(pd.Series.explode).reset_index()
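
The multi-column approach can be tried on a small example. The DataFrame below is hypothetical (the column names id, col1 and col2 are made up for illustration), and the sketch assumes that, within each row, the lists in col1 and col2 have the same length; rows with mismatched lengths are likely to fail.

```python
import pandas as pd

# Hypothetical data: col1 and col2 hold real lists of equal length per row.
df = pd.DataFrame({
    "id": [1, 2],
    "col1": [["a", "b"], ["c"]],
    "col2": [[10, 20], [30]],
})

# Every column except the ones being exploded.
not_list_cols = [col for col in df.columns if col not in ["col1", "col2"]]

# Park the non-list columns in the index, explode each remaining column,
# then bring the index back as ordinary columns.
result = df.set_index(not_list_cols).apply(pd.Series.explode).reset_index()
print(result)
```

On pandas 1.3 and later, DataFrame.explode also accepts a list of columns, so df.explode(["col1", "col2"]) should give the same rows without the index round trip. That option is not available in 0.25.3.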

Akanksha Atrey answered Oct 02 '22 20:10