Why is [- subsetting (i.e. deletion) of columns not possible with names?

Tags: dataframe, r, subset

I fear greatly that this has been asked and will be downvoted, but I have not found the answer in the docs (?"["), and discovered that it is hard to search for.

data(wines)
# This is allowed:
alcoholic <- wines[, 1]
alcoholic <- wines[, "alcohol"]
nonalcoholic <- wines[, -1]
# But this is not:
fail <- wines[, -"alcohol"]

I know of two solutions, but am frustrated for need of them.

win <- wines[, !colnames(wines) %in% "alcohol"]  # snappy
win <- wines[, -which(colnames(wines) %in% "alcohol")]  # snappier!


asked Sep 05 '13 10:09

a different ben

2 Answers

When you do

wines[, -1]

-1 is evaluated before it is used by [. As you know, the unary - operator does not work on objects of class character, so doing the same with "alcohol" leads to:

Error in -"alcohol" : invalid argument to unary operator
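To see that the failure happens before [ is ever involved, the index expression can be evaluated on its own. This is only a small sketch; the variable names idx and msg are made up for illustration.

```r
# The index expression is evaluated first, independently of `[`.
idx <- -1
is.numeric(idx)  # `[` later interprets this plain number as "drop column 1"

# Negating a character string fails by itself, with no subsetting in sight.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(-"alcohol", error = function(e) conditionMessage(e))
print(msg)
```

The message captured in msg is the same one reported when the expression appears inside wines[, ...].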

You can add the following to your alternatives:

wines[, -match("alcohol", colnames(wines))]
wines[, setdiff(colnames(wines), "alcohol")]

but you should know about the risks of negative indexing: see, for example, what happens if you mistype the name as "alcool". With -which(...), a name that matches nothing produces an empty index, and an empty index selects no columns at all rather than all of them. So your first suggestion and the last one here (@Ananda's) should be preferred. You might also want to write a function that errors out if you provide a name that is not part of your data.
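A minimal sketch of both points, using a made-up data frame in place of wines. The data frame df, its values, and the helper name drop_cols are assumptions for illustration, not part of any package.

```r
# Toy data frame standing in for `wines` (hypothetical values).
df <- data.frame(alcohol = c(12.1, 13.4), ash = c(2.1, 2.4), hue = c(1.0, 0.9))

# A typo with the -which() idiom: which() returns integer(0), negating it
# still gives integer(0), and an empty index selects zero columns.
typo <- df[, -which(colnames(df) %in% "alcool"), drop = FALSE]
ncol(typo)

# Hypothetical helper: drop columns by name, but stop on unknown names.
drop_cols <- function(x, drop) {
  unknown <- setdiff(drop, colnames(x))
  if (length(unknown) > 0) {
    stop("unknown column(s): ", paste(unknown, collapse = ", "))
  }
  x[, setdiff(colnames(x), drop), drop = FALSE]
}

res <- drop_cols(df, "alcohol")
colnames(res)
```

Calling drop_cols(df, "alcool") stops with an error instead of quietly returning the wrong columns. The drop = FALSE argument keeps the result two-dimensional even when a single column remains.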

answered Oct 15 '22 21:10

flodel

Another possibility:

subset(wines, select = -alcohol)

You can even do

subset(wines, select = -c(alcohol, other_drop))

In fact, if you have a contiguous set of columns you want to drop, you can even do

subset(wines, select = -(first_drop:last_drop))

which can be handy, although IMO it depends dangerously on the order of the columns, which might be fragile. I might prefer a grep-based solution if there were some way to identify the columns by name, or a more explicit, separate definition of column groups.
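A grep-based variant might look like the sketch below. It assumes the columns to drop share a recognisable name prefix; the data frame df, its column names, and the pattern "^ash" are invented for illustration.

```r
# Toy data frame (hypothetical columns and values).
df <- data.frame(
  alcohol        = c(12.1, 13.4),
  ash            = c(2.1, 2.4),
  ash_alkalinity = c(15, 18),
  hue            = c(1.0, 0.9)
)

# Keep every column whose name does NOT start with "ash".
# grepl() returns a logical vector, so a pattern that matches nothing
# simply keeps all columns instead of dropping them all.
keep <- !grepl("^ash", colnames(df))
res <- df[, keep, drop = FALSE]
colnames(res)
```

Because the selection is made by name rather than by position, reordering the columns does not change the result.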

In this case subset is using non-standard evaluation, which, as has been discussed elsewhere, can be dangerous in some contexts. But I still like it for simple, top-level data manipulation because of its readability.
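One way the non-standard evaluation can surprise is when a variable shares its name with a column. The sketch below uses a made-up data frame; df and the variable ash are assumptions for illustration.

```r
# Toy data frame (hypothetical values).
df <- data.frame(alcohol = 1:2, ash = 3:4, hue = 5:6)

# A variable that happens to have the same name as a column.
ash <- "alcohol"

# Inside `select`, names are looked up among the columns first, so -ash
# means "drop the column called ash", not "drop the column named by `ash`".
nse <- subset(df, select = -ash)
colnames(nse)

# Standard evaluation uses the variable's value, so "alcohol" is dropped.
se <- df[, setdiff(colnames(df), ash), drop = FALSE]
colnames(se)
```

The two calls drop different columns, which is the kind of ambiguity that makes subset less suitable inside functions than at the top level.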

answered Oct 15 '22 20:10

Ben Bolker
