
How do I extract a single column from a data.frame as a data.frame?

Say I have a data.frame:

    df <- data.frame(A=c(10,20,30), B=c(11,22,33), C=c(111,222,333))
       A  B   C
    1 10 11 111
    2 20 22 222
    3 30 33 333

If I select two (or more) columns I get a data.frame:

    x <- df[,1:2]
       A  B
    1 10 11
    2 20 22
    3 30 33

This is what I want. However, if I select only one column I get a numeric vector:

    x <- df[,1]
    [1] 10 20 30

I have tried to use as.data.frame(), which does not change the results for two or more columns. It does return a data.frame in the case of one column, but does not retain the column name:

    x <- as.data.frame(df[,1])
      df[, 1]
    1      10
    2      20
    3      30

I don't understand why it behaves like this. In my mind, it should not make a difference whether I extract one, two, or ten columns. It should either always return a vector (or matrix) or always return a data.frame (with the correct names). What am I missing? Thanks!

Note: This is not a duplicate of the question about matrices, as matrix and data.frame are fundamentally different data types in R, and can work differently with dplyr. There are several answers that work with data.frame but not matrix.

rs028 asked Jan 09 '14 16:01



2 Answers

Use drop=FALSE

    > x <- df[,1, drop=FALSE]
    > x
       A
    1 10
    2 20
    3 30

From the documentation (see ?"[") you can find:

If drop=TRUE the result is coerced to the lowest possible dimension.
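To see the effect of drop side by side, here is a minimal sketch using the data frame from the question. The variable names v, x and y are only illustrative, and the class checks are one way to confirm what comes back:

```r
# Same data frame as in the question.
df <- data.frame(A = c(10, 20, 30), B = c(11, 22, 33), C = c(111, 222, 333))

# Default behaviour: a single selected column collapses to a plain vector.
v <- df[, 1]

# drop = FALSE keeps the data.frame structure and the column name.
x <- df[, 1, drop = FALSE]

# Selecting the column by name works the same way.
y <- df[, "A", drop = FALSE]

print(class(v))   # "numeric"
print(class(x))   # "data.frame"
print(names(y))   # "A"
```

Passing drop=FALSE every time makes the result type independent of how many columns happen to be selected, which is useful when the column index is computed at run time.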

Jilber Urbina answered Oct 10 '22 09:10


Omit the ,:

    x <- df[1]
       A
    1 10
    2 20
    3 30

From the help page of ?"[":

Indexing by [ is similar to atomic vectors and selects a list of the specified element(s).

A data frame is a list. The columns are its elements.
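A short sketch of what list-style indexing means in practice, again with the question's data frame. The names by_pos, by_name and vec are just for illustration:

```r
# Same data frame as in the question.
df <- data.frame(A = c(10, 20, 30), B = c(11, 22, 33), C = c(111, 222, 333))

# A data frame is stored as a list of columns.
print(is.list(df))      # TRUE

# Single brackets without a comma select a sub-list, which is still a data.frame.
by_pos  <- df[1]
by_name <- df["A"]

# Double brackets extract the element itself, i.e. the bare column vector.
vec <- df[["A"]]

print(class(by_pos))    # "data.frame"
print(names(by_name))   # "A"
print(class(vec))       # "numeric"
```

So df[1] and df["A"] keep the column name and the data.frame class, while df[["A"]] (like df[,1] with the default drop) gives back only the values.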

Sven Hohenstein answered Oct 10 '22 09:10