
Changing values on one dataframe based on data in another dataframe

Tags: dataframe, r

I have two data frames as follows:

test1 <- structure(list(`0.62m` = c(0.011, 0.043, 0.057, 0.067, 0.095, 
0.121, 0.098, 0.086, 0.103, 0.12, 0.104), `0.87m` = c(0.017, 
0.018, 0.052, 0.062, 0.111, 0.101, 0.112, 0.096, 0.104, 0.108, 
0.111), `1.12m` = c(0.009, 0.016, 0.048, 0.03, 0.085, 0.07, 0.108, 
0.076, 0.078, 0.092, 0.107), `1.37m` = c(0.025, 0.035, 0.035, 
0.048, 0.067, 0.074, 0.095, 0.08, 0.082, 0.091, 0.094)), .Names = c("0.62m", 
"0.87m", "1.12m", "1.37m"), row.names = 25:35, class = c("tbl_df", 
"tbl", "data.frame"))

#Test 2
test2 <- structure(list(`0.62m` = c(235.15, 230.95, 251.95, 261.25, 254.55, 
251.75, 259.85, 257.65, 252.55, 255.55, 254.15), `0.87m` = c(287.95, 
196.35, 275.05, 245.85, 253.35, 259.75, 254.95, 261.75, 253.05, 
264.45, 264.25), `1.12m` = c(36.35, 242.95, 266.65, 266.45, 248.85, 
256.95, 253.75, 268.25, 251.05, 268.85, 259.65), `1.37m` = c(20.65, 
287.95, 260.25, 260.55, 255.25, 258.45, 248.45, 261.95, 253.45, 
263.25, 252.05)), .Names = c("0.62m", "0.87m", "1.12m", "1.37m"
), row.names = 25:35, class = c("tbl_df", "tbl", "data.frame"
))

test1 looks like this:

   0.62m 0.87m 1.12m 1.37m
25 0.011 0.017 0.009 0.025
26 0.043 0.018 0.016 0.035
27 0.057 0.052 0.048 0.035
28 0.067 0.062 0.030 0.048
29 0.095 0.111 0.085 0.067
30 0.121 0.101 0.070 0.074
31 0.098 0.112 0.108 0.095
32 0.086 0.096 0.076 0.080
33 0.103 0.104 0.078 0.082
34 0.120 0.108 0.092 0.091
35 0.104 0.111 0.107 0.094

test2 looks like this:

    0.62m  0.87m  1.12m  1.37m
25 235.15 287.95  36.35  20.65
26 230.95 196.35 242.95 287.95
27 251.95 275.05 266.65 260.25
28 261.25 245.85 266.45 260.55
29 254.55 253.35 248.85 255.25
30 251.75 259.75 256.95 258.45
31 259.85 254.95 253.75 248.45
32 257.65 261.75 268.25 261.95
33 252.55 253.05 251.05 253.45
34 255.55 264.45 268.85 263.25
35 254.15 264.25 259.65 252.05

Now I want to create a new data frame, test3, by checking whether each value in test2 is greater than 180. Where the value is greater than 180, the element should be the same as in test1; otherwise it should be -1 * test1.

The desired output would look like the following:

   0.62m 0.87m  1.12m  1.37m
25 0.011 0.017 -0.009 -0.025
26 0.043 0.018  0.016  0.035

I tried the following:

test3 <- ifelse(test2 > 180, test1, -1 * test1)

Another method:

test3 <- data.frame(sapply(X = test1, FUN = function(x) (ifelse(test2>180, x, - 1*x))))

I think I need to use apply, but I am not sure how to use it properly.

Jd Baba asked May 20 '15 20:05



1 Answer

You just need some as.matrix() calls (note that only the two rightmost cells in the top row are actually less than 180):

ifelse(as.matrix(test2) > 180, as.matrix(test1), -as.matrix(test1))
##    0.62m 0.87m  1.12m  1.37m
## 25 0.011 0.017 -0.009 -0.025
## 26 0.043 0.018  0.016  0.035
## 27 0.057 0.052  0.048  0.035
## 28 0.067 0.062  0.030  0.048
## 29 0.095 0.111  0.085  0.067
## 30 0.121 0.101  0.070  0.074
## 31 0.098 0.112  0.108  0.095
## 32 0.086 0.096  0.076  0.080
## 33 0.103 0.104  0.078  0.082
## 34 0.120 0.108  0.092  0.091
## 35 0.104 0.111  0.107  0.094

The return value will be a matrix, but you can get back to a data.frame (if you want) via as.data.frame().
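
As a minimal sketch of that round trip, the snippet below uses only the first two rows of the question's data, rebuilt here as plain data frames (an assumption made for brevity; the original objects are tbl_df). It also shows one possible alternative that is not part of the answer above: copying test1 and negating cells through a logical matrix index, which keeps the result a data frame throughout. The names m, test3b and low are made up for illustration.

```r
# First two rows of the question's data, as plain data frames.
test1 <- data.frame(
  `0.62m` = c(0.011, 0.043), `0.87m` = c(0.017, 0.018),
  `1.12m` = c(0.009, 0.016), `1.37m` = c(0.025, 0.035),
  check.names = FALSE, row.names = c("25", "26")
)
test2 <- data.frame(
  `0.62m` = c(235.15, 230.95), `0.87m` = c(287.95, 196.35),
  `1.12m` = c(36.35, 242.95),  `1.37m` = c(20.65, 287.95),
  check.names = FALSE, row.names = c("25", "26")
)

# Element-wise selection on matrices, then back to a data frame.
m <- ifelse(as.matrix(test2) > 180, as.matrix(test1), -as.matrix(test1))
test3 <- as.data.frame(m)

# Alternative (hypothetical names): start from test1 and flip the sign
# of every cell whose test2 counterpart is not above 180.
test3b <- test1
low <- test2 <= 180          # logical matrix, same shape as test1
test3b[low] <- -test3b[low]
```

Both routes should give the same values; the second one avoids the matrix conversion if the columns are not all numeric-compatible or if you would rather not leave data-frame land.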

bgoldst answered Nov 08 '22 23:11