
R: losing column names when adding rows to an empty data frame

I am just starting with R and encountered a strange behaviour: when inserting the first row in an empty data frame, the original column names get lost.

example:

    a <- data.frame(one = numeric(0), two = numeric(0))
    a
    #[1] one two
    #<0 rows> (or 0-length row.names)
    names(a)
    #[1] "one" "two"
    a <- rbind(a, c(5,6))
    a
    #  X5 X6
    #1  5  6
    names(a)
    #[1] "X5" "X6"

As you can see, the column names one and two were replaced by X5 and X6.

Could somebody please tell me why this happens, and whether there is a right way to do this without losing the column names?

A shotgun solution would be to save the names in an auxiliary vector and then add them back when finished working on the data frame.

Thanks

Context:

I created a function which gathers some data and adds them as a new row to a data frame received as a parameter. I create the data frame, iterate through my data sources, passing the data.frame to each function call to be filled up with its results.

cdmihai asked Mar 08 '11 11:03



2 Answers

The rbind help page specifies that:

For ‘cbind’ (‘rbind’), vectors of zero length (including ‘NULL’) are ignored unless the result would have zero rows (columns), for S compatibility. (Zero-extent matrices do not occur in S3 and are not ignored in R.)

So a is effectively ignored in your rbind call. Not entirely, it seems: because a is a data frame, the call is still dispatched to rbind.data.frame:

    rbind.data.frame(c(5,6))
    #  X5 X6
    #1  5  6
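The X5 and X6 labels look like what make.names() produces when the values 5 and 6 themselves end up being used as column names. The snippet below only illustrates where names of that shape can come from; it is not a claim about the exact internals of rbind.data.frame.

```r
# make.names() converts arbitrary values into syntactically valid names.
# Values that start with a digit are not valid names, so they get an "X" prefix.
vals <- c(5, 6)
auto_names <- make.names(vals)
print(auto_names)
```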

One way to insert the row could be:

    a[nrow(a)+1,] <- c(5,6)
    a
    #  one two
    #1   5   6

But there may be a better way to do it depending on your code.
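For the use case described in the question (a function called once per data source), one possible sketch is to have each call return a one-row data frame with explicitly named columns, and to bind everything in a single step at the end. The function name gather_row, the sources vector and the values returned are all hypothetical; only the column names one and two come from the question.

```r
# Hypothetical stand-in for the function that gathers data from one source.
# It returns a one-row data frame whose column names are set explicitly.
gather_row <- function(source_id) {
  data.frame(one = source_id, two = source_id * 10)
}

sources <- c(1, 2, 3)                # hypothetical data sources
rows <- lapply(sources, gather_row)  # one small data frame per source
result <- do.call(rbind, rows)       # bind all rows in a single call

print(names(result))
print(nrow(result))
```

Since every piece passed to rbind already carries the same column names, there is nothing for rbind to invent, and the data frame is not regrown on every iteration.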

juba answered Oct 03 '22 08:10


I almost gave up on this issue.

1) Create the data frame with stringsAsFactors set to FALSE, or you run straight into the next issue.

2) Don't use rbind. I have no idea why it messes up the column names. Simply do it this way:

    df[nrow(df)+1,] <- c("d","gsgsgd",4)

    # Character columns created as factors (the default before R 4.0.0):
    df <- data.frame(a = character(0), b = character(0), c = numeric(0))
    df[nrow(df)+1,] <- c("d","gsgsgd",4)
    #Warning messages:
    #1: In `[<-.factor`(`*tmp*`, iseq, value = "d") :
    #  invalid factor level, NAs generated
    #2: In `[<-.factor`(`*tmp*`, iseq, value = "gsgsgd") :
    #  invalid factor level, NAs generated

    # Same data frame with stringsAsFactors = FALSE:
    df <- data.frame(a = character(0), b = character(0), c = numeric(0), stringsAsFactors = FALSE)
    df[nrow(df)+1,] <- c("d","gsgsgd",4)
    df
    #  a      b c
    #1 d gsgsgd 4
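One thing to watch with the assignment above: c("d","gsgsgd",4) is a character vector, so the 4 is passed along as the string "4" rather than as a number. A possible variant, assuming you want column c to stay numeric, is to assign a list, which keeps each element's own type:

```r
df <- data.frame(a = character(0), b = character(0), c = numeric(0),
                 stringsAsFactors = FALSE)

# c() coerces mixed values to a single common type, here character
print(class(c("d", "gsgsgd", 4)))

# A list keeps one type per element, so column c remains numeric
df[nrow(df) + 1, ] <- list("d", "gsgsgd", 4)
str(df)
```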
Raffael answered Oct 03 '22 07:10