Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas equivalent of dplyr everything()

Tags:

pandas

r

dplyr

In R I frequently use dplyr's select in combination with everything()

df %>% select(var4, var17, everything())

The above for example would reorder the columns of the dataframe, such that var4 is the first, var17 is the second and subsequently all remaining columns are listed. What is the most pandathonic way of doing this? Working with many columns makes explicitly spelling them out a pain as well as keeping track of their position.

The ideal solution is short, readable and can be used in pandas chaining.

like image 580
safex Avatar asked Jun 05 '20 07:06

safex


People also ask

Is there a dplyr equivalent in Python?

Dplython. Package dplython is dplyr for Python users. It provide infinite functionality for data preprocessing.

Is pandas similar to dplyr?

Both Pandas and dplyr can connect to virtually any data source, and read from any file format. That's why we won't spend any time exploring connection options but will use a build-in dataset instead. There's no winner in this Pandas vs. dplyr comparison, as both libraries are near identical with the syntax.

What does DF All () do?

Pandas DataFrame all() Method The all() method returns one value for each column, True if ALL values in that column are True, otherwise False. By specifying the column axis ( axis='columns' ), the all() method returns True if ALL values in that axis are True.

Does Python have Tidyverse?

One of the great things about the R world has been a collection of R packages called tidyverse that are easy for beginners to learn and provide a consistent data manipulation and visualisation space. The value of these tools has been so great that many of them have been ported to Python.


2 Answers

Use Index.difference for all values without specified in list and join together:

df = pd.DataFrame({
        'G':list('abcdef'),
         'var17':[4,5,4,5,5,4],
         'A':[7,8,9,4,2,3],
         'var4':[1,3,5,7,1,0],
         'E':[5,3,6,9,2,4],
         'F':list('aaabbb')
})

cols = ['var4','var17']
another = df.columns.difference(cols, sort=False).tolist()
df = df[cols + another]
print (df)
   var4  var17  G  A  E  F
0     1      4  a  7  5  a
1     3      5  b  8  3  a
2     5      4  c  9  6  a
3     7      5  d  4  9  b
4     1      5  e  2  2  b
5     0      4  f  3  4  b

EDIT: For chaining is possible use DataFrame.pipe with passed DataFrame:

def everything_after(df, cols):
    another = df.columns.difference(cols, sort=False).tolist()
    return df[cols + another]

df = df.pipe(everything_after, ['var4','var17']))
print (df)
   var4  var17  G  A  E  F
0     1      4  a  7  5  a
1     3      5  b  8  3  a
2     5      4  c  9  6  a
3     7      5  d  4  9  b
4     1      5  e  2  2  b
5     0      4  f  3  4  b
like image 107
jezrael Avatar answered Oct 14 '22 07:10

jezrael


Now how smoothly you can do it with datar!

>>> from datar import f
>>> from datar.datasets import iris
>>> from datar.dplyr import select, everything, slice_head
>>> iris >> slice_head(5)
   Sepal_Length  Sepal_Width  Petal_Length  Petal_Width Species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa
>>> iris >> select(f.Species, everything()) >> slice_head(5)
  Species  Sepal_Length  Sepal_Width  Petal_Length  Petal_Width
0  setosa           5.1          3.5           1.4          0.2
1  setosa           4.9          3.0           1.4          0.2
2  setosa           4.7          3.2           1.3          0.2
3  setosa           4.6          3.1           1.5          0.2
4  setosa           5.0          3.6           1.4          0.2

I am the author of the package. Feel free to submit issues if you have any questions.

like image 26
Panwen Wang Avatar answered Oct 14 '22 08:10

Panwen Wang