
How to add a row in a special form

Tags: python, pandas

I have a pandas.DataFrame of the form

index     df      df1

0         0       111
1         1       111
2         2       111
3         3       111
4         0       111
5         2       111
6         3       111
7         0       111
8         2       111
9         3       111
10        0       111
11        1       111
12        2       111
13        3       111
14        0       111
15        1       111
16        2       111
17        3       111
18        1       111
19        2       111
20        3       111

I want column df to repeat the cycle 0, 1, 2, 3, but some values are missing from the data. I'd like to insert a row for each missing value, with df1 set to 0. Here is my expected result:

index     df      df1

0         0       111
1         1       111
2         2       111
3         3       111
4         0       111
5         1       0
6         2       111
7         3       111
8         0       111
9         1       0
10        2       111
11        3       111
12        0       111
13        1       111
14        2       111
15        3       111
16        0       111
17        1       111
18        2       111
19        3       111
20        0       0
21        1       111
22        2       111
23        3       111

How can I achieve this?

edit:

What should I do if my input is as below?

index     df1      df2

0          0       111
1          1       111
2          2       111
3          3       111
4          0       111
5          3       111
6          1       111
7          2       111

Here is my expected result:

index     df1     df2

0         0       111
1         1       111
2         2       111
3         3       111
4         0       111
5         1       0
6         2       0
7         3       111
8         0       0
9         1       111
10        2       111
11        3       0
김수환 asked Sep 15 '21 07:09



3 Answers

Building on @mozway's idea, pyjanitor's helper functions can make the missing rows explicit so they can then be filled. This is just another option:

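# pip install pyjanitor
import pandas as pd
import janitor  # registers .complete() and .select_columns() on DataFrame

(df.assign(temp=df['df'].diff().le(0).cumsum())  # run id: increments whenever "df" resets
   .complete('df', 'temp')                       # helper function: add every missing (df, temp) pair
   .fillna(0)                                    # new rows get 0 (df1 becomes float)
   .sort_values('temp', kind='mergesort')        # stable sort; relevant if you care about the order
   .select_columns('df*')                        # helper function; or .drop(columns='temp')
)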
 
    df    df1
0    0  111.0
6    1  111.0
12   2  111.0
18   3  111.0
1    0  111.0
7    1    0.0
13   2  111.0
19   3  111.0
2    0  111.0
8    1    0.0
14   2  111.0
20   3  111.0
3    0  111.0
9    1  111.0
15   2  111.0
21   3  111.0
4    0  111.0
10   1  111.0
16   2  111.0
22   3  111.0
5    0    0.0
11   1  111.0
17   2  111.0
23   3  111.0
sammywemmy answered Oct 21 '22 07:10


You can set a custom grouping to detect when the increasing numbers in "df" reset to a lower (or equal) value.

Then reindex using the product of the unique values in "df" and the unique groups.

Finally, rework the output with a combination of fillna/reset_index/rename_axis:

import pandas as pd

# uncomment below if "index" is not the index
# df = df.set_index('index')
# (for the second example, replace 'df' with 'df1' throughout)

# find positions where "df" resets and make groups
groups = df['df'].diff().le(0).cumsum()

(df.set_index([groups, 'df'], drop=True) # set custom groups and "df" as index
   .reindex(pd.MultiIndex.from_product([groups.unique(),   # reindex with all
                                        range(4),          # combinations
                                       ], names=['group', 'df']))
   # set missing values as zero; `downcast` is deprecated in recent pandas,
   # where .fillna(0).astype(int) is an alternative
   .fillna(0, downcast='infer')
   .reset_index('df')           # all below to restore a range index
   .reset_index(drop=True)
   .rename_axis('index')
)

output:

       df  df1
index         
0       0  111
1       1  111
2       2  111
3       3  111
4       0  111
5       1    0
6       2  111
7       3  111
8       0  111
9       1    0
10      2  111
11      3  111
12      0  111
13      1  111
14      2  111
15      3  111
16      0  111
17      1  111
18      2  111
19      3  111
20      0    0
21      1  111
22      2  111
23      3  111

output on second example:

       df1  df2
index          
0        0  111
1        1  111
2        2  111
3        3  111
4        0  111
5        1    0
6        2    0
7        3  111
8        0    0
9        1  111
10       2  111
11       3    0
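
The same reindexing idea can be wrapped in a function so that the key column and the cycle are parameters rather than hardcoded. This is only a sketch: the name `fill_cycle_gaps` and its arguments are made up for illustration, and it passes `fill_value` to `reindex` instead of calling `fillna`, which is a swapped-in variation on the code above. It assumes every run of the key column is strictly increasing.

```python
import pandas as pd

def fill_cycle_gaps(frame, key, cycle, fill_value=0):
    """Hypothetical helper: add rows so `key` walks through all of `cycle` in every run."""
    # Run id: increments whenever `key` fails to increase.
    groups = frame[key].diff().le(0).cumsum().rename("group")
    # Every (run, cycle value) combination that should exist.
    full = pd.MultiIndex.from_product(
        [groups.unique(), list(cycle)], names=["group", key]
    )
    return (
        frame.set_index([groups, key])
             .reindex(full, fill_value=fill_value)  # absent rows are created with fill_value
             .reset_index(key)                      # bring `key` back as a column
             .reset_index(drop=True)                # plain 0..n-1 index
    )

# Second input from the question.
df2_input = pd.DataFrame({"df1": [0, 1, 2, 3, 0, 3, 1, 2], "df2": 111})
result = fill_cycle_gaps(df2_input, key="df1", cycle=range(4))
print(result)
```

On the second input this should reproduce the twelve rows shown above, with `df2` equal to 0 in the inserted rows.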
mozway answered Oct 21 '22 07:10


You can assign a group to each increasing run of column df, then use .unstack() and .stack(), as follows:

group = df['df'].le(df['df'].shift()).cumsum()   # new group whenever `df` <= the previous entry of `df`

df_out = (df.set_index([group, 'df'])    # set `group` and column `df` as index
            .unstack(fill_value=0)       # unstack `df` and fill missing entry of `df` in [0,1,2,3] as 0 for `df1`
            .stack()                     # stack back to original shape
            .droplevel(0)                # drop `group` from index
            .reset_index()               # restore `df` from index back to data column
         )

Result:

print(df_out)


    df  df1
0    0  111
1    1  111
2    2  111
3    3  111
4    0  111
5    1    0
6    2  111
7    3  111
8    0  111
9    1    0
10   2  111
11   3  111
12   0  111
13   1  111
14   2  111
15   3  111
16   0  111
17   1  111
18   2  111
19   3  111
20   0    0
21   1  111
22   2  111
23   3  111

For the edited input, the same code works with the column names changed:

group = df['df1'].le(df['df1'].shift()).cumsum()

df_out2 = (df.set_index([group, 'df1'])
             .unstack(fill_value=0)
             .stack()
             .droplevel(0)
             .reset_index()
         )

Result:

print(df_out2)


    df1  df2
0     0  111
1     1  111
2     2  111
3     3  111
4     0  111
5     1    0
6     2    0
7     3  111
8     0    0
9     1  111
10    2  111
11    3    0
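
One caveat worth noting: `.unstack()` only creates columns for values that occur somewhere in the data, so a cycle value that is missing from every run would not be restored. The sketch below, on a made-up input where 1 never appears, reindexes the unstacked columns against the full cycle to cover that case. The cycle `range(4)` is an assumption taken from the question.

```python
import pandas as pd

# Made-up input: the value 1 is absent from every run.
df = pd.DataFrame({"df": [0, 2, 3, 0, 2, 3], "df1": 111})

group = df["df"].le(df["df"].shift()).cumsum().rename("group")

wide = df.set_index([group, "df"])["df1"].unstack(fill_value=0)
print(wide.columns.tolist())   # [0, 2, 3] -> no column for 1

# Force the full cycle 0..3 (assumed from the question) onto the columns.
wide = wide.reindex(columns=pd.Index(range(4), name="df"), fill_value=0)

out = (wide.stack()                          # back to long form, indexed by (group, df)
           .rename("df1")
           .reset_index("group", drop=True)  # drop the helper level
           .reset_index())                   # restore `df` as a column
print(out)
```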
SeaBean answered Oct 21 '22 07:10