I have a problem manipulating a data frame in R. This is a basic task in R, but I can't find the best command for it.
Dummy example:
Var1 20 300 39
Var2 49 23 91
Var3 0 239 210
How can I replace a value in column 2 with 10 if it is smaller than 10? And how can I replace all values in the data frame with 100 if they are greater than 200?
In pandas (Python), you can replace values in all or selected columns of a DataFrame based on a condition by using the DataFrame.loc[] property. loc[] accesses a group of rows and columns by label(s) or a boolean array, and it can be used both to read and to modify the values of a DataFrame.
Suppose that you want to replace multiple values with multiple new values in a single pandas DataFrame column. In that case, you can use this template: df['column name'] = df['column name'].replace(['1st old value', '2nd old value', ...], ['1st new value', '2nd new value', ...])
You can use apply to replace all values greater than, for example, 200 in a whole data.frame:
apply(df, 2, function(x) ifelse(x > 200, 100, x))
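As a minimal sketch, here is that call applied to the dummy data from the question. The column names V1 to V3 are assumed for illustration, since the original post does not name the columns. Note that apply() returns a matrix, so the sketch converts the result back with as.data.frame().

```r
# Dummy data from the question; the column names V1..V3 are assumed
df <- data.frame(
  V1 = c(20, 49, 0),
  V2 = c(300, 23, 239),
  V3 = c(39, 91, 210),
  row.names = c("Var1", "Var2", "Var3")
)

# apply() with MARGIN = 2 works column by column and returns a matrix
res <- apply(df, 2, function(x) ifelse(x > 200, 100, x))

# Convert back if a data.frame is needed afterwards
res_df <- as.data.frame(res)
print(res_df)
```

Values above 200 (300 and 239 in V2, 210 in V3) become 100; everything else is left unchanged.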
Note: apply first coerces the data.frame to a matrix, so if any column is not numeric, all values will be converted to character (and the result is a matrix, not a data.frame). To avoid this, you can do the following. If you have a data.frame df with two numeric columns, columns 1 and 2, which you want to operate on, and two non-numeric columns, which you don't want to operate on, you could do this:
df <- cbind(apply(df[,1:2], 2, function(x) ifelse(x > 200, 100, x)), df[,3:4])
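The coercion issue can be seen in a small sketch. The data frame below is hypothetical (the column names a, b, label, and group are made up for illustration); it has two numeric and two character columns, matching the layout described above.

```r
# Hypothetical mixed data frame: two numeric and two character columns
df <- data.frame(
  a = c(20, 300, 39),
  b = c(49, 23, 210),
  label = c("x", "y", "z"),
  group = c("g1", "g2", "g1"),
  stringsAsFactors = FALSE
)

# apply() operates on as.matrix(df); with mixed columns that is a
# character matrix, so numeric comparisons no longer behave as intended
m <- as.matrix(df)
print(is.character(m))

# Restrict apply() to the numeric columns, then bind the others back on.
# cbind() uses its data.frame method here, so the result is a data.frame.
fixed <- cbind(
  apply(df[, 1:2], 2, function(x) ifelse(x > 200, 100, x)),
  df[, 3:4]
)
str(fixed)
```

In the result, a and b stay numeric and label and group are carried over untouched.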
Edit after comment by @GregSnow:
It may be more useful to use lapply in this situation:
df[] <- lapply(df, function(x) ifelse(x>200, 100, x))
For anyone who didn't know before (myself included): by assigning to df[] instead of only df, the structure of df is kept as it was before (thanks to @GregSnow for the valuable information).
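Putting this together on the dummy data from the question, a sketch might look like the following. The column names V1 to V3 are assumed, and the is.numeric() guard is an addition for safety that the answer above does not include. The single-column replacement at the end uses plain logical indexing, which is a different technique from the lapply call; it is shown on column 1 only because that is the one column in the dummy data with a value below 10, whereas the question asks about column 2.

```r
# Dummy data from the question; the column names V1..V3 are assumed
df <- data.frame(
  V1 = c(20, 49, 0),
  V2 = c(300, 23, 239),
  V3 = c(39, 91, 210)
)

# df[] <- ... keeps df a data.frame; the guard skips non-numeric columns
df[] <- lapply(df, function(x) {
  if (is.numeric(x)) ifelse(x > 200, 100, x) else x
})

# Single column: set values below 10 to 10 via logical indexing.
# Change col to 2 for the case described in the question.
col <- 1
df[[col]][df[[col]] < 10] <- 10

print(df)
```

After both steps, df is still a data.frame, no value exceeds 200, and the 0 in the first column has been raised to 10.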