
String as factor in R

Tags: r

When creating a data frame in R, strings are converted to factors by default (which I don't mind). But when I want to add a new row to my data frame, I can't find a way to encode a string as a factor. If I use factor(), the string is converted to a number but is still not a factor. In either case, I can't append my new row to the data frame because the new row is not a factor. What I want is for my new row to behave just like my data frame, that is, with the string converted to a factor.

> data.frame(c("Name one", "Name two")) -> my.data
> colnames(my.data) <- "Names"
> is.factor(my.data$Names)
[1] TRUE
> new.row1 <- c("Name three")
> is.factor(new.row1)[1]
[1] FALSE
> new.row2 <- c(factor("Name three"))
> new.row2
[1] 1
> is.factor(new.row2)[1]
[1] FALSE
> rbind(my.data, new.row1)
     Names
1 Name one
2 Name two
3     <NA>
Warning message:
In `[<-.factor`(`*tmp*`, ri, value = "Name three") :
  invalid factor level, NA generated
> rbind(my.data, new.row2)
     Names
1 Name one
2 Name two
3     <NA>
Warning message:
In `[<-.factor`(`*tmp*`, ri, value = 1L) :
  invalid factor level, NA generated
> 
Sverre asked Jun 03 '13 11:06



2 Answers

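The trick is to rbind a data.frame only with another data.frame, not with a plain vector as you have done. With a vector, rbind tries to assign the value into the existing factor column, and a value that is not one of that column's levels becomes NA (hence the "invalid factor level" warning):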

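# stringsAsFactors = TRUE reproduces the default from before R 4.0.0, so Names is a factor
my.data <- data.frame(Names = c("Name one", "Name two"), stringsAsFactors = TRUE)
new.row1 <- data.frame(Names = "Name three", stringsAsFactors = TRUE)
rbind(my.data, new.row1)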

##        Names
## 1   Name one
## 2   Name two
## 3 Name three
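
To check that the combined column really is still a factor, here is a minimal sketch. It assumes stringsAsFactors = TRUE is passed explicitly (since R 4.0.0, data.frame() no longer converts strings to factors by default), and the name combined is only illustrative:

```r
# Build both pieces as data frames with factor columns
my.data <- data.frame(Names = c("Name one", "Name two"), stringsAsFactors = TRUE)
new.row <- data.frame(Names = "Name three", stringsAsFactors = TRUE)

# rbind on two data frames merges the factor levels of matching columns
combined <- rbind(my.data, new.row)

# Inspect the result: the column class, its levels, and the values
print(is.factor(combined$Names))
print(levels(combined$Names))
print(as.character(combined$Names))
```

The new value is added as an extra level rather than being turned into NA, which is the behaviour asked for in the question.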
Henrik answered Sep 21 '22 15:09


Maybe something like this will help?

data.frame(c("Name one", "Name two")) -> my.data
colnames(my.data) <- "Names"

rbind(my.data, data.frame(Names="name three"))
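
As an alternative to wrapping the new value in a data.frame, one option (not from the answers here, just a sketch) is to add the new level to the factor first and then assign the new row directly. It again assumes stringsAsFactors = TRUE is set explicitly, and the variable new.name is hypothetical:

```r
my.data <- data.frame(Names = c("Name one", "Name two"), stringsAsFactors = TRUE)

new.name <- "Name three"  # hypothetical value to append

# Register the new value as a valid level before assigning it;
# without this step the assignment would produce NA with a warning
levels(my.data$Names) <- c(levels(my.data$Names), new.name)

# Assign into the row just past the current last row
my.data[nrow(my.data) + 1, "Names"] <- new.name

print(my.data)
print(is.factor(my.data$Names))
```

This keeps the column a factor throughout, at the cost of managing the levels by hand; the rbind approach above does the same bookkeeping automatically.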
storaged answered Sep 20 '22 15:09