Pandas Dataframe split multiple key values to different columns

I have a dataframe column with the following format:

col1    col2   
 A     [{'Id':42,'prices':['30','78']},{'Id': 44,'prices':['20','47','89']}]
 B     [{'Id':47,'prices':['30','78']},{'Id':94,'prices':['20']},{'Id':84,'prices':['20','98']}]

How can I transform it into the following?

col1    Id            price   
  A     42         ['30','78']
  A     44         ['20','47','89']
  B     47         ['30','78']
  B     94         ['20']
  B     84         ['20','98']

I was thinking of using apply with a lambda, but I am not sure how.

Edit: To recreate this dataframe, I use the following code:

import pandas as pd

data = [['A', "[{'Id':42,'prices':['30','78']},{'Id': 44,'prices':['20','47','89']}]"],
        ['B', "[{'Id':47,'prices':['30','78']},{'Id':94,'prices':['20']},{'Id':84,'prices':['20','98']}]"]]

df = pd.DataFrame(data, columns=['col1', 'col2'])
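Note that in the reproduction code each col2 value is a string that only looks like a list of dictionaries. A minimal check using only the standard library (the variable names raw and parsed are illustrative, not from the question) shows the difference and how ast.literal_eval turns the text into real Python objects:

```python
import ast

# One cell of col2, copied from the reproduction code: it is a str, not a list.
raw = "[{'Id':42,'prices':['30','78']},{'Id': 44,'prices':['20','47','89']}]"
print(type(raw).__name__)      # str

# ast.literal_eval parses Python literals only; it does not run arbitrary code.
parsed = ast.literal_eval(raw)
print(type(parsed).__name__)   # list
print(parsed[0]["Id"], parsed[0]["prices"])   # 42 ['30', '78']
```

This is why the answers distinguish between a col2 that holds lists and a col2 that holds strings.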

Solution if there are lists in column col2:

print (type(df['col2'].iat[0]))
<class 'list'>

L = [{**{'col1': a}, **x} for a, b in df[['col1','col2']].to_numpy() for x in b]

df = pd.DataFrame(L)
print (df)
  col1  Id        prices
0    A  42      [30, 78]
1    A  44  [20, 47, 89]
2    B  47      [30, 78]
3    B  94          [20]
4    B  84      [20, 98]

If there are strings:

print (type(df['col2'].iat[0]))
<class 'str'>

import ast

L = [{**{'col1': a}, **x} for a, b in df[['col1','col2']].to_numpy() for x in ast.literal_eval(b)]
df = pd.DataFrame(L)
print (df)
  col1  Id        prices
0    A  42      [30, 78]
1    A  44  [20, 47, 89]
2    B  47      [30, 78]
3    B  94          [20]
4    B  84      [20, 98]

For better understanding, the same logic can be written as an explicit loop:

import ast

L = []
for a, b in df[['col1','col2']].to_numpy():
    for x in ast.literal_eval(b):
        d = {'col1': a}
        out = {**d, **x}
        L.append(out)

df = pd.DataFrame(L)
print (df)
  col1  Id        prices
0    A  42      [30, 78]
1    A  44  [20, 47, 89]
2    B  47      [30, 78]
3    B  94          [20]
4    B  84      [20, 98]
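The key step in the snippets above is the dictionary merge `{**{'col1': a}, **x}`. As a small sketch in plain Python (label, record, merged and same are illustrative names, with values taken from the example data), it combines the row label with one record into a single flat dictionary, which pd.DataFrame then turns into one row:

```python
# A single row's label and one of its records (values from the example data).
label = "A"
record = {"Id": 42, "prices": ["30", "78"]}

# Unpacking both dicts into a new literal merges them; on a key clash the later one wins.
merged = {**{"col1": label}, **record}
print(merged)  # {'col1': 'A', 'Id': 42, 'prices': ['30', '78']}

# An equivalent, slightly shorter spelling of the same merge.
same = {"col1": label, **record}
print(merged == same)  # True
```

Because 'col1' is unpacked first, it ends up as the first key and therefore the first column of the resulting frame.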

If the second element of each row in data is already a list rather than a string:

import pandas as pd

data = [
    ['A', [{'Id': 42, 'prices': ['30', '78']}, {'Id': 44, 'prices': ['20', '47', '89']}]],
    ['B', [{'Id': 47, 'prices': ['30', '78']}, {'Id': 94, 'prices': ['20']},
           {'Id': 84, 'prices': ['20', '98']}]],
]

# Collect one (col1, id, price) tuple per dictionary.
t_list = []

for col1, records in data:
    for record in records:
        t_list.append((col1, record['Id'], record['prices']))

df = pd.DataFrame(t_list, columns=['col1', 'id', 'price'])
print(df)

  col1  id         price
0    A  42      [30, 78]
1    A  44  [20, 47, 89]
2    B  47      [30, 78]
3    B  94          [20]
4    B  84      [20, 98]

If col2 already holds lists, you can use df.explode together with df.set_index, pd.Series.apply, and df.reset_index:

df.set_index('col1').explode('col2')['col2'].apply(pd.Series).reset_index()

  col1  Id        prices
0    A  42      [30, 78]
1    A  44  [20, 47, 89]
2    B  47      [30, 78]
3    B  94          [20]
4    B  84      [20, 98]
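To make the one-liner easier to follow, here is a sketch that splits the chain into its individual steps. The intermediate names (indexed, exploded, expanded, expanded_alt, result) are illustrative, and the frame is built with real lists in col2. The expanded_alt variant is an assumption on my part rather than part of the answer: it builds the columns from the list of dictionaries in one call instead of row by row.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["A", "B"],
    "col2": [
        [{"Id": 42, "prices": ["30", "78"]}, {"Id": 44, "prices": ["20", "47", "89"]}],
        [{"Id": 47, "prices": ["30", "78"]}, {"Id": 94, "prices": ["20"]},
         {"Id": 84, "prices": ["20", "98"]}],
    ],
})

# Step 1: move col1 into the index so it travels with every exploded row.
indexed = df.set_index("col1")

# Step 2: explode yields one row per dictionary; the index label is repeated.
exploded = indexed.explode("col2")["col2"]

# Step 3: expand each dictionary into columns, one row at a time.
expanded = exploded.apply(pd.Series)

# Alternative for step 3 (not from the answer): construct the frame from the
# list of dictionaries in a single call, reusing the exploded index.
expanded_alt = pd.DataFrame(exploded.tolist(), index=exploded.index)

# Step 4: turn the col1 index back into an ordinary column.
result = expanded.reset_index()
print(result)
```

Both variants should give the same rows. The single-call version avoids creating one Series per row, which may matter on larger frames.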

When col2 holds strings, convert them to lists first with ast.literal_eval:

import ast

import pandas as pd

data = [['A', "[{'Id':42,'prices':['30','78']},{'Id': 44,'prices':['20','47','89']}]"], 
        ['B', "[{'Id':47,'prices':['30','78']},{'Id':94,'prices':['20']},{'Id':84,'prices':['20','98']}]"]] 

df = pd.DataFrame(data, columns = ['col1', 'col2'])
df['col2'] = df['col2'].map(ast.literal_eval)

df.set_index('col1').explode('col2')['col2'].apply(pd.Series).reset_index()

  col1  Id        prices
0    A  42      [30, 78]
1    A  44  [20, 47, 89]
2    B  47      [30, 78]
3    B  94          [20]
4    B  84      [20, 98]

Tags: python, python-3.x, pandas, dataframe, colla

3 Answers: jezrael, Dhananjay kumar Singh, Ch3steR
