Splitting DataFrame rows with pandas

Question

I am currently trying to figure out efficient way to split single panadas DataFrame rows into multiple, slightly changed rows. Imagine such structure:

    A  C1  C2  C3  C4
1   a   b   c   a
2   b   a   e   b   a
3   g   c
4   d   e

and I want to end up with structure like that:

    A   C
1   a   b
2   a   c
3   a   a
4   b   a
5   b   e
6   b   b
7   b   a
8   g   c
9   d   e
10  d   e

So far I've been using for loops and create dictionaries like that (df is my DataFrame):

rows = []
for i, r in df.iterrows():
  tmp = r[1:].dropna()
  for c in tmp.values:
    dict = {'A': r[0], 'C': c}
    rows.append(dict)

Unfortunately this approach is extremly slow. So far after my work with pandas I see that when using only it execution time can be significantly improved, but I don't have so much experience to figure out how to make this case faster.

Can someone advice, what can be done to speed this up?

MaxU - stop WAR against UA · Accepted Answer

try this:

In [10]: pd.melt(df, id_vars='A', value_vars=['C1','C2','C3','C4'])
Out[10]:
    A variable value
0   a       C1     b
1   b       C1     a
2   g       C1     c
3   d       C1     e
4   a       C2     c
5   b       C2     e
6   g       C2   NaN
7   d       C2   NaN
8   a       C3     a
9   b       C3     b
10  g       C3   NaN
11  d       C3   NaN
12  a       C4   NaN
13  b       C4     a
14  g       C4   NaN
15  d       C4   NaN

if you want to get rid of NaN's:

In [15]: pd.melt(df, id_vars='A', value_vars=['C1','C2','C3','C4'], value_name='C')[['A','C']].dropna()
Out[15]:
    A  C
0   a  b
1   b  a
2   g  c
3   d  e
4   a  c
5   b  e
8   a  a
9   b  b
13  b  a

the same, but selecting C* columns dynamically:

In [21]: (pd.melt(df, id_vars='A',
   ....:          value_vars=df.filter(like='C').columns.tolist(),
   ....:          value_name='C')[['A','C']]
   ....:    .dropna()
   ....: )
Out[21]:
    A  C
0   a  b
1   b  a
2   g  c
3   d  e
4   a  c
5   b  e
8   a  a
9   b  b
13  b  a

Splitting DataFrame rows with pandas

Tags:

python

pandas

sebap123

1 Answers

MaxU - stop WAR against UA

Recent Activity

Donate For Us

Splitting DataFrame rows with pandas

Tags:

python

pandas

sebap123

1 Answers

MaxU - stop WAR against UA

Related questions

Recent Activity

Donate For Us