Rearranging columns with pandas: Is there an equivalent to dplyr's select(..., everything())?

Tags: python, pandas, dataframe, r, dplyr

I'm trying to rearrange columns in a DataFrame, by putting a few columns first, and then all the others after.

With R's dplyr, this would look like:

library(dplyr)

df = tibble(col1 = c("a", "b", "c"),
            id = c(1, 2, 3),
            col2 = c(2, 4, 6),
            date = c("1 Feb", "2 Feb", "3 Feb"))

df2 = select(df,
             id, date, everything())

Easy. With Python's pandas, here's what I've tried:

import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "b", "c"],
    "id": [1, 2, 3],
    "col2": [2, 4, 6],
    "date": ["1 Feb", "2 Feb", "3 Feb"]
    })

# using sets
cols = df.columns.tolist()
cols_1st = {"id", "date"}
cols = set(cols) - cols_1st
cols = list(cols_1st) + list(cols)

# wrong column order
df2 = df[cols]

# using lists
cols = df.columns.tolist()
cols_1st = ["id", "date"]
cols = [c for c in cols if c not in cols_1st]
cols = cols_1st + cols

# right column order, but is there a better way?
df3 = df[cols]

The pandas way is more tedious, but I'm fairly new to this. Is there a better way?

asked Mar 01 '20 18:03

ardaar

3 Answers

You can use df.drop:

>>> df = pd.DataFrame({
    "col1": ["a", "b", "c"],
    "id": [1, 2, 3],
    "col2": [2, 4, 6],
    "date": ["1 Feb", "2 Feb", "3 Feb"]
    })

>>> df

  col1  id  col2   date
0    a   1     2  1 Feb
1    b   2     4  2 Feb
2    c   3     6  3 Feb

>>> cols_1st = ["id", "date"]

>>> df[cols_1st + list(df.drop(columns=cols_1st))]

   id   date col1  col2
0   1  1 Feb    a     2
1   2  2 Feb    b     4
2   3  3 Feb    c     6

answered Nov 10 '22 00:11

Sayandip Dutta
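
The df.drop approach above can be wrapped in a small helper so the reordering reads more like dplyr's select(..., everything()). This is only a sketch: move_to_front is a hypothetical name, not a pandas function, and it assumes every name in first exists in the DataFrame (drop raises a KeyError otherwise).

```python
import pandas as pd


def move_to_front(df, first):
    """Return df with the columns in `first` leading and the rest in their original order.

    Hypothetical helper for illustration; not part of pandas.
    """
    # drop(columns=...) leaves the remaining columns in their original order
    rest = df.drop(columns=first).columns.tolist()
    return df[list(first) + rest]


df = pd.DataFrame({
    "col1": ["a", "b", "c"],
    "id": [1, 2, 3],
    "col2": [2, 4, 6],
    "date": ["1 Feb", "2 Feb", "3 Feb"],
})

df2 = move_to_front(df, ["id", "date"])
print(df2.columns.tolist())  # ['id', 'date', 'col1', 'col2']
```

The keyword form drop(columns=...) is used here because passing the axis as a second positional argument no longer works in recent pandas releases.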

Usually, the best way to translate between R and pandas is to go through base R, which follows the same semantics, such as logical indexing on a vector (here, the column names). Notice the similarity below between the negation and membership operations (! with %in% in R, ~ with isin in pandas):

# R 
mycols <- c("id", "date")
df2 <- df[c(mycols, colnames(df)[!colnames(df) %in% c(mycols)])]


# PANDAS (OLDER, NON-RECOMMENDED WAY)
mycols = ["id", "date"]
df2 = df[mycols + df.columns[~df.columns.isin(mycols)].tolist()]

# PANDAS (CURRENT, RECOMMENDED WAY WITH reindex)
df2 = df.reindex(mycols + df.columns[~df.columns.isin(mycols)].tolist(),
                 axis='columns')

answered Nov 09 '22 23:11

Parfait
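
To make the logical indexing in the answer above concrete, the sketch below prints the boolean mask and then shows one practical difference between plain indexing and reindex. The misspelled column name "dat" is an assumption made purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["a", "b", "c"],
    "id": [1, 2, 3],
    "col2": [2, 4, 6],
    "date": ["1 Feb", "2 Feb", "3 Feb"],
})

mycols = ["id", "date"]

# Boolean mask over the column names: True where the name is in mycols
mask = df.columns.isin(mycols)
print(mask.tolist())  # [False, True, False, True]

# Negating the mask keeps the remaining columns in their original order
rest = df.columns[~mask].tolist()
print(rest)  # ['col1', 'col2']

# With a misspelled name, plain indexing fails loudly...
order = ["id", "dat", "col1", "col2"]
try:
    df[order]
    raised = False
except KeyError:
    raised = True

# ...while reindex silently adds the unknown name as an all-NaN column
filled = df.reindex(order, axis="columns")
print(raised, filled["dat"].isna().all())
```

Which behaviour is preferable depends on whether a missing column should be treated as an error or tolerated.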

With datar, this is as easy as it is in R:

>>> from datar.all import c, f, tibble, select, everything
>>> df = tibble(col1 = c("a", "b", "c"),
...             id = c(1, 2, 3),
...             col2 = c(2, 4, 6),
...             date = c("1 Feb", "2 Feb", "3 Feb"))
>>>
>>> df2 = select(df,
...              f.id, f.date, everything())
>>>
>>> df2
       id     date     col1    col2
  <int64> <object> <object> <int64>
0       1    1 Feb        a       2
1       2    2 Feb        b       4
2       3    3 Feb        c       6

I am the author of the package. Feel free to submit issues if you have any questions.

answered Nov 10 '22 00:11

Panwen Wang
