
Error in dataframe *tmp* replacement has x data has y

Tags: r, regression, lm

I'm a beginner in R. Here is some very simple code where I'm trying to save the residual term:

# Create variables for child's EA:

dat$cldeacdi <- rowMeans(dat[,c('cdcresp', 'cdcinv')],na.rm=T)
dat$cldeacu <- rowMeans(dat[,c('cucresp', 'cucinv')],na.rm=T)

# Create a residual score for child EA:

dat$cldearesid <- resid(lm(cldeacu ~ cldeacdi, data = dat))

I'm getting the following message:

Error in `$<-.data.frame`(`*tmp*`, cldearesid, value = c(-0.18608488908881,  : 
  replacement has 366 rows, data has 367

I searched for this error but couldn't find anything that could resolve this. Additionally, I've created the exact same code for mom's EA, and it saved the residual just fine, with no errors. I'd be grateful if someone could help me resolve this.

Marishka Usacheva asked Nov 10 '17 23:11




1 Answer

I have a feeling you have NAs in your data. Look at this example:

#mtcars data set
test <- mtcars
#adding just one NA in the cyl column
test[2, 2] <- NA

#running linear model and adding the residuals to the data.frame
test$residuals <- resid(lm(mpg ~ cyl, test))
Error in `$<-.data.frame`(`*tmp*`, "residuals", value = c(0.382245430809409,  : 
  replacement has 31 rows, data has 32

As you can see this results in a similar error to yours.

As a validation:

length(resid(lm(mpg ~ cyl, test)))
#31
nrow(test)
#32

This happens because lm applies na.omit by default (its default na.action) before running the regression. Any row with an NA in a variable used in the model is dropped, so resid() returns one value per remaining row rather than one per row of the data frame.
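To see which rows were dropped, something like the following should work. It is a minimal sketch that rebuilds the mtcars example above; the names fit, dropped and ok are introduced here only for illustration.

```r
# Rebuild the example: mtcars with one NA in the cyl column
test <- mtcars
test[2, 2] <- NA

fit <- lm(mpg ~ cyl, data = test)

# The fitted model records which rows the default na.action removed
dropped <- na.action(fit)
names(dropped)

# Equivalent check done directly on the variables used in the model
ok <- complete.cases(test[, c("mpg", "cyl")])
sum(!ok)
```

With your own data, the same check on dat[, c("cldeacu", "cldeacdi")] should point to the one row that is missing from the residuals.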

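If you run na.omit on your dat data set at the very beginning of your code (i.e. dat <- na.omit(dat)), then your code should work. Note that na.omit(dat) removes every row that has an NA in any column, not only in the columns used in the model.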
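If you would rather keep every row of the data frame, one alternative (not part of the approach above, just a sketch on the same mtcars example) is to pass na.action = na.exclude to lm. The incomplete rows are still left out of the fit, but resid() pads the result with NA in those positions, so its length matches the data frame.

```r
# Rebuild the example: mtcars with one NA in the cyl column
test <- mtcars
test[2, 2] <- NA

# na.exclude omits incomplete rows when fitting, like na.omit,
# but resid() and fitted() return NA at the omitted positions
fit <- lm(mpg ~ cyl, data = test, na.action = na.exclude)
test$residuals <- resid(fit)

# The residual vector now lines up with the rows of the data frame
length(resid(fit)) == nrow(test)
is.na(test$residuals[2])
```

Applied to the code in the question, that would presumably be dat$cldearesid <- resid(lm(cldeacu ~ cldeacdi, data = dat, na.action = na.exclude)), leaving NA in cldearesid for the row that could not be used.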

LyzandeR answered Oct 20 '22 01:10