Rearranging a Pandas dataframe

Question

I have this input worksheet:

raw = pd.read_excel("raw.xlsx", header=None)

>Out[4]: 
        0     1     2     3     4
0      48  59.0  28.0   6.0  36.0
1      41  36.0  52.0   3.0  22.0
2      32  86.0  66.0  68.0   9.0
3      71  23.0   6.0  98.0  19.0
4      18  92.0  66.0   6.0  54.0
5    Andy   NaN   NaN   NaN   NaN
6      56  89.0   6.0  32.0  50.0
7       3  68.0  49.0  93.0  15.0
8      27  65.0  94.0  96.0  66.0
9      40  96.0  71.0  22.0  83.0
10     96  23.0   5.0  49.0  14.0
11    Bob   NaN   NaN   NaN   NaN
12     43  34.0  42.0  11.0  73.0
13     42  41.0  17.0  91.0  35.0
14     81  74.0  24.0  95.0  95.0
15     89  57.0  35.0  66.0  56.0
16     54  76.0  55.0  72.0  63.0
17  David   NaN   NaN   NaN   NaN
18     58   8.0  62.0  63.0   8.0
19     15  93.0  97.0  38.0   5.0
20     13  96.0  42.0  51.0  48.0
21     23  88.0  20.0  91.0  39.0
22      9  67.0  45.0  58.0  92.0
23   Bill   NaN   NaN   NaN   NaN
24      2   3.0  80.0  28.0  38.0
25    100  68.0  83.0  26.0  45.0
26     79  57.0  40.0  76.0  83.0
27     12  98.0  76.0  63.0  53.0
28     60  88.0  70.0  13.0  50.0
29   Luke   NaN   NaN   NaN   NaN

I have to rearrange the "block' of data on a single line, followed by the name. The format is fixed, and the output should look like this

>Out[6]: 
   0   1   2   3   4    5   6   7   8   9   10  11  12  13  14  15  16  17  18  19  20  21  22  23  24     25
0  48  59  28   6  36   41  36  52   3  22  32  86  66  68   9  71  23   6  98  19  18  92  66   6  54   Andy
1  56  89   6  32  50    3  68  49  93  15  27  65  94  96  66  40  96  71  22  83  96  23   5  49  14    Bob
2  43  34  42  11  73   42  41  17  91  35  81  74  24  95  95  89  57  35  66  56  54  76  55  72  63  David
3  58   8  62  63   8   15  93  97  38   5  13  96  42  51  48  23  88  20  91  39   9  67  45  58  92   Bill
4   2   3  80  28  38  100  68  83  26  45  79  57  40  76  83  12  98  76  63  53  60  88  70  13  50   Luke

Which is the most pythonic way to do this?

Thanks

Arty · Accepted Answer

First part of code is many lines that read-in data from string lines, this is just for example, you need to do your pd.read_excel() instead.

The real part of code is conversion which is in few next lines:

a = df.values
a = a.reshape([a.size // 30, 30])
a = a[:, :-4]
df = pd.DataFrame(a)

This above can be even shortened to one-liner:

df = pd.DataFrame(df.values.reshape((-1, 30))[:, :-4])

Full code down below:

Try it online!

import pandas as pd, numpy as np

# Next is just reading-in my data in a fancy way,
# you do pd.read_excel(file) instead like you did before
df = pd.DataFrame([line.split() for line in """
   48  59.0  28.0   6.0  36.0
   41  36.0  52.0   3.0  22.0
   32  86.0  66.0  68.0   9.0
   71  23.0   6.0  98.0  19.0
   18  92.0  66.0   6.0  54.0
 Andy   NaN   NaN   NaN   NaN
   56  89.0   6.0  32.0  50.0
    3  68.0  49.0  93.0  15.0
   27  65.0  94.0  96.0  66.0
   40  96.0  71.0  22.0  83.0
   96  23.0   5.0  49.0  14.0
  Bob   NaN   NaN   NaN   NaN
   43  34.0  42.0  11.0  73.0
   42  41.0  17.0  91.0  35.0
   81  74.0  24.0  95.0  95.0
   89  57.0  35.0  66.0  56.0
   54  76.0  55.0  72.0  63.0
David   NaN   NaN   NaN   NaN
   58   8.0  62.0  63.0   8.0
   15  93.0  97.0  38.0   5.0
   13  96.0  42.0  51.0  48.0
   23  88.0  20.0  91.0  39.0
    9  67.0  45.0  58.0  92.0
 Bill   NaN   NaN   NaN   NaN
    2   3.0  80.0  28.0  38.0
  100  68.0  83.0  26.0  45.0
   79  57.0  40.0  76.0  83.0
   12  98.0  76.0  63.0  53.0
   60  88.0  70.0  13.0  50.0
 Luke   NaN   NaN   NaN   NaN
""".splitlines() if line.strip()])

a = df.values
a = a.reshape([a.size // 30, 30])
a = a[:, :-4]
df = pd.DataFrame(a)

print(df)

Outputs:

    0     1     2     3     4    5     6     7     8     9  10    11    12    13    14  15    16    17    18    19  20    21    22    23    24     25
0  48  59.0  28.0   6.0  36.0   41  36.0  52.0   3.0  22.0  32  86.0  66.0  68.0   9.0  71  23.0   6.0  98.0  19.0  18  92.0  66.0   6.0  54.0   Andy
1  56  89.0   6.0  32.0  50.0    3  68.0  49.0  93.0  15.0  27  65.0  94.0  96.0  66.0  40  96.0  71.0  22.0  83.0  96  23.0   5.0  49.0  14.0    Bob
2  43  34.0  42.0  11.0  73.0   42  41.0  17.0  91.0  35.0  81  74.0  24.0  95.0  95.0  89  57.0  35.0  66.0  56.0  54  76.0  55.0  72.0  63.0  David
3  58   8.0  62.0  63.0   8.0   15  93.0  97.0  38.0   5.0  13  96.0  42.0  51.0  48.0  23  88.0  20.0  91.0  39.0   9  67.0  45.0  58.0  92.0   Bill
4   2   3.0  80.0  28.0  38.0  100  68.0  83.0  26.0  45.0  79  57.0  40.0  76.0  83.0  12  98.0  76.0  63.0  53.0  60  88.0  70.0  13.0  50.0   Luke

Rearranging a Pandas dataframe

Tags:

python

pandas

dataframe

l3g10n3

1 Answers

Arty

Recent Activity

Donate For Us

Rearranging a Pandas dataframe

Tags:

python

pandas

dataframe

l3g10n3

1 Answers

Arty

Related questions

Recent Activity

Donate For Us