Error in dataframe *tmp* replacement has x data has y

Q: How do I add a row to a DataFrame in R?

To add a row to a data frame in R, assign a list or vector representing the row to the position just past the last row. nrow(df) returns the number of rows in the data frame, so nrow(df) + 1 is the index of the next row after the end. Use a list when the columns have different types, because a plain vector coerces all of its values to a single type.
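
As a minimal sketch of the row-append idiom described above: the data frame `df`, its columns, and the values are hypothetical and only serve as illustration.

```r
# Hypothetical example data (not from the original post)
df <- data.frame(name = c("a", "b"),
                 score = c(1.5, 2.5),
                 stringsAsFactors = FALSE)

# A list keeps each column's type; c("c", 3.5) would turn 3.5 into a string
new_row <- list("c", 3.5)

# Assign the new row to the position just past the last row
df[nrow(df) + 1, ] <- new_row

print(df)
```

For appending many rows, building the rows separately and combining them once with rbind() is usually a better fit than growing the data frame one row at a time.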

Q: How do I add a column to an empty DataFrame in R?

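One way to add an empty column to a data frame in R is to use the add_column() function: dataf %>% add_column(new_col = NA). Note that add_column() comes from the tibble package, so this requires installing tibble or the tidyverse (which includes it), and the %>% pipe is available once dplyr or the tidyverse is loaded.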
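
If installing extra packages is not an option, the same result can be sketched in base R. The data frame contents and the column name `new_num` below are hypothetical; `dataf` and `new_col` follow the naming used above.

```r
# Hypothetical example data (not from the original post)
dataf <- data.frame(id = 1:3)

# Assigning a single NA recycles it down the whole column (type: logical)
dataf$new_col <- NA

# Use a typed NA if the column should later hold numbers
dataf$new_num <- NA_real_

str(dataf)
```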

Tags: r, regression, lm

I'm a beginner in R. Here is some very simple code where I'm trying to save the residual term:

# Create variables for child's EA:
dat$cldeacdi <- rowMeans(dat[, c('cdcresp', 'cdcinv')], na.rm = TRUE)
dat$cldeacu <- rowMeans(dat[, c('cucresp', 'cucinv')], na.rm = TRUE)

# Create a residual score for child EA:
dat$cldearesid <- resid(lm(cldeacu ~ cldeacdi, data = dat))

I'm getting the following message:

Error in `$<-.data.frame`(`*tmp*`, cldearesid, value = c(-0.18608488908881,  : 
  replacement has 366 rows, data has 367

I searched for this error but couldn't find anything that could resolve this. Additionally, I've created the exact same code for mom's EA, and it saved the residual just fine, with no errors. I'd be grateful if someone could help me resolve this.

asked Nov 10 '17 23:11

Marishka Usacheva

1 Answer

I have a feeling you have NAs in your data. Look at this example:

#mtcars data set
test <- mtcars
#adding just one NA in the cyl column
test[2, 2] <- NA

#running linear model and adding the residuals to the data.frame
test$residuals <- resid(lm(mpg ~ cyl, test))
# Error in `$<-.data.frame`(`*tmp*`, "residuals", value = c(0.382245430809409,  : 
#   replacement has 31 rows, data has 32

As you can see this results in a similar error to yours.

As a validation:

length(resid(lm(mpg ~ cyl, test)))
#31
nrow(test)
#32

This happens because lm, by default, applies na.omit to the variables used in the model before running the regression (this is its default na.action). Any rows with an NA in those variables are dropped, so resid returns fewer values than the data frame has rows.
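
To see which rows are being dropped, one option is to check complete.cases on the model variables. The sketch below reuses the mtcars setup from the example above. The second part illustrates one possible (unconfirmed) source of the missing value in the question: rowMeans with na.rm = TRUE returns NaN for a row whose inputs are all missing, and lm treats NaN like NA. The names `dropped`, `m`, and `row_means` are made up for this illustration.

```r
# Same setup as the example above: mtcars with one NA in cyl
test <- mtcars
test[2, 2] <- NA

# Rows lm() would drop: any NA in the variables used in the formula
dropped <- !complete.cases(test[, c("mpg", "cyl")])
rownames(test)[dropped]
sum(dropped)

# rowMeans(..., na.rm = TRUE) yields NaN when every value in a row is missing
m <- rbind(c(1, 2), c(NA, NA))
row_means <- rowMeans(m, na.rm = TRUE)
is.nan(row_means)
```

Applied to the question's data, the analogous check would be sum(!complete.cases(dat[, c("cldeacu", "cldeacdi")])), assuming those columns exist as shown in the question.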

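If you run na.omit on your dat data set (i.e. dat <- na.omit(dat)) at the very beginning of your code, then your code should work. Keep in mind that na.omit drops every row that has an NA in any column, not only in the columns used in the model.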
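
An alternative that keeps all rows, offered here as a sketch and not as part of the original answer, is to fit the model with na.action = na.exclude. The rows with NA are still left out of the fit, but resid pads the result with NA in their positions, so its length matches the data frame. The object name `fit` is made up for this illustration.

```r
# Same setup as the example above: mtcars with one NA in cyl
test <- mtcars
test[2, 2] <- NA

# na.exclude omits NA rows when fitting but pads residuals with NA afterwards
fit <- lm(mpg ~ cyl, data = test, na.action = na.exclude)

# The residual vector now has one entry per row, so the assignment works
test$residuals <- resid(fit)

nrow(test)
sum(is.na(test$residuals))
```

For the question's code, the equivalent change would be adding na.action = na.exclude to the lm call, which leaves dat with all 367 rows and an NA residual for the incomplete one.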

answered Oct 20 '22 01:10

LyzandeR
