How can I replace half of the values in a column with NA?
# Generate 500 values with a skewed distribution
x1 <- round(rbeta(500,0.5,3)*100,0)
# Assign variable to a data frame
df <- data.frame(x1)
# Replace 250 random values in a column 'x1' to NA
df[sample(x1,250)] <- NA
The following error is shown:
Error in `[<-.data.frame`(`*tmp*`, sample(x1, 250), value = NA) :
new columns would leave holes after existing columns
I understand why the error is shown, but I would like to force the replacement. How can I do that?
In pandas, the replace() method replaces one value with another, either in a single column or, by default, across all columns of a DataFrame. It takes to_replace, value, inplace, limit, regex and method as parameters and returns a new DataFrame.
Suppose that you want to replace multiple values with multiple new values in an individual DataFrame column. In that case, you may use this template: df['column name'] = df['column name'].replace(['1st old value', '2nd old value', ...], ['1st new value', '2nd new value', ...])
By default, the pandas replace method returns a new DataFrame, because the inplace parameter defaults to inplace=False. If you set inplace=True, the method returns None and instead directly modifies the DataFrame that is being operated on.
Within pandas, a missing value is denoted by NaN. In most cases, the terms missing and null are interchangeable, but to abide by the standards of pandas, we'll continue using missing throughout this tutorial.
It seems like you need
df$x1[sample(nrow(df),250)] <- NA
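To see why this works where the original line did not: a single index on a data frame, as in df[i], selects columns, so df[sample(x1,250)] tried to assign NA to columns numbered by values drawn from x1. Indexing the vector df$x1 by sampled row positions avoids that. Below is a minimal sketch of the full round trip; the set.seed() call and the idx variable are my own additions for illustration, not part of the original code.

```r
# Assumption: a fixed seed, used only so the sketch is reproducible
set.seed(42)

# Same setup as in the question
x1 <- round(rbeta(500, 0.5, 3) * 100, 0)
df <- data.frame(x1)

# Draw 250 distinct row positions from 1..nrow(df) (without replacement)
idx <- sample(nrow(df), 250)

# Index the column vector, not the data frame, so rows are targeted
df$x1[idx] <- NA

# Check how many values are now missing
sum(is.na(df$x1))
```

Because sample() draws without replacement by default, the 250 positions are distinct, so exactly half of the 500 values end up as NA.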