Extract Column from data.frame as a Vector

I'm new to R.

I have a data.frame with a column called "Symbol".

   Symbol
1   "IDEA"
2   "PFC"
3   "RPL"
4   "SOBHA"

I need to store its values as a vector, i.e. x = c("IDEA","PFC","RPL","SOBHA"). What is the most concise way of doing this?

st0le asked Oct 13 '10 09:10

2 Answers

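# stringsAsFactors = TRUE makes Symbol a factor (this was the default before R 4.0.0)
your.data <- data.frame(Symbol = c("IDEA","PFC","RPL","SOBHA"), stringsAsFactors = TRUE)
new.variable <- as.vector(your.data$Symbol) # this will create a character vector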

VitoshKa suggested using the following code instead.

new.variable.v <- your.data$Symbol # this keeps the column as stored, so a factor column stays a factor

Which one you want depends on what you do next. If you are using this vector for further analysis or plotting, retaining the factor nature of the vector is a sensible solution.

How these two methods differ:

cat(new.variable.v)
# 1 2 3 4   (cat() prints a factor as its underlying integer codes)

cat(new.variable)
# IDEA PFC RPL SOBHA
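The difference can also be checked directly with class(). The following is only a minimal sketch: it assumes the column is a factor (forced here with stringsAsFactors = TRUE so the result does not depend on the R version), and the names sym.factor and sym.char are made up for illustration.

```r
# Build the column as a factor explicitly (assumption: this mirrors the
# pre-R-4.0.0 default that the answer above relies on).
your.data <- data.frame(Symbol = c("IDEA", "PFC", "RPL", "SOBHA"),
                        stringsAsFactors = TRUE)

sym.factor <- your.data$Symbol                # hypothetical name: column as stored (factor)
sym.char   <- as.character(your.data$Symbol)  # hypothetical name: plain character vector

class(sym.factor)       # "factor"
class(sym.char)         # "character"
as.integer(sym.factor)  # the integer codes that cat() prints for a factor
levels(sym.factor)      # the labels those codes refer to
```

If the column is already a character vector, both approaches return the same character vector.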
Roman Luštrik answered Nov 19 '22 06:11


Roman Luštrik provided an excellent answer; however, the $ notation is awkward to use in a pipe. In a pipe, use the dplyr function pull() instead.

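# setting up
library(tidyverse) # loads dplyr, tibble, and the %>% pipe

# stringsAsFactors = TRUE makes Symbol a factor; since R 4.0.0 the default
# is FALSE, which would keep Symbol as a character column
df <- data.frame(Symbol = c("IDEA","PFC","RPL","SOBHA"), stringsAsFactors = TRUE)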
> df
  Symbol
1   IDEA
2    PFC
3    RPL
4  SOBHA

Now that the data frame is set up, we first apply a few arbitrary mutate() and arrange() steps, just to show that this works in the middle of a pipe, and then call pull() at the end.

myvector <- df %>%
  mutate(example_column_1 = 1:4, example_column_2 = example_column_1^2) %>% #random example function
  arrange(example_column_1) %>% #random example function
  pull(Symbol) # finally, the pull() function; make sure to give just the column name as an argument

You can keep manipulating the vector in the same pipe after the pull() call.

> myvector
[1] IDEA  PFC   RPL   SOBHA
Levels: IDEA PFC RPL SOBHA
> typeof(myvector)
[1] "integer"

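typeof(myvector) returns "integer" because that is how factors are stored: a factor is an integer vector of codes with a levels attribute that maps each code to its label. This output assumes Symbol is a factor; if the column is a character vector (the data.frame() default since R 4.0.0), pull() returns a character vector instead. To convert a factor to a character vector, just use as.character(myvector).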
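As a rough sketch of both points, the conversion can happen inside the same pipe, right after pull(). The variable name symbols and the trailing tolower() step are illustrative assumptions only, not part of the original answer.

```r
library(dplyr)  # provides pull() and re-exports the %>% pipe

# Factor column forced explicitly so the example behaves the same on any R version.
df <- data.frame(Symbol = c("IDEA", "PFC", "RPL", "SOBHA"),
                 stringsAsFactors = TRUE)

symbols <- df %>%
  pull(Symbol) %>%     # extract the column as a (factor) vector
  as.character() %>%   # convert the factor to a character vector
  tolower()            # hypothetical further step on the plain vector

typeof(symbols)        # "character"
```

Converting with as.character() before any string manipulation avoids accidentally working on the factor's integer codes.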

In conclusion, use dplyr's pull() function, passing just the name of the column you want, whenever you need to extract a vector from a data frame or tibble inside a pipe.

Phillip Long answered Nov 19 '22 07:11