I have two data frames as follows:
test1 <- structure(list(`0.62m` = c(0.011, 0.043, 0.057, 0.067, 0.095,
0.121, 0.098, 0.086, 0.103, 0.12, 0.104), `0.87m` = c(0.017,
0.018, 0.052, 0.062, 0.111, 0.101, 0.112, 0.096, 0.104, 0.108,
0.111), `1.12m` = c(0.009, 0.016, 0.048, 0.03, 0.085, 0.07, 0.108,
0.076, 0.078, 0.092, 0.107), `1.37m` = c(0.025, 0.035, 0.035,
0.048, 0.067, 0.074, 0.095, 0.08, 0.082, 0.091, 0.094)), .Names = c("0.62m",
"0.87m", "1.12m", "1.37m"), row.names = 25:35, class = c("tbl_df",
"tbl", "data.frame"))
#Test 2
test2 <- structure(list(`0.62m` = c(235.15, 230.95, 251.95, 261.25, 254.55,
251.75, 259.85, 257.65, 252.55, 255.55, 254.15), `0.87m` = c(287.95,
196.35, 275.05, 245.85, 253.35, 259.75, 254.95, 261.75, 253.05,
264.45, 264.25), `1.12m` = c(36.35, 242.95, 266.65, 266.45, 248.85,
256.95, 253.75, 268.25, 251.05, 268.85, 259.65), `1.37m` = c(20.65,
287.95, 260.25, 260.55, 255.25, 258.45, 248.45, 261.95, 253.45,
263.25, 252.05)), .Names = c("0.62m", "0.87m", "1.12m", "1.37m"
), row.names = 25:35, class = c("tbl_df", "tbl", "data.frame"
))
test1 looks like the following:
0.62m 0.87m 1.12m 1.37m
25 0.011 0.017 0.009 0.025
26 0.043 0.018 0.016 0.035
27 0.057 0.052 0.048 0.035
28 0.067 0.062 0.030 0.048
29 0.095 0.111 0.085 0.067
30 0.121 0.101 0.070 0.074
31 0.098 0.112 0.108 0.095
32 0.086 0.096 0.076 0.080
33 0.103 0.104 0.078 0.082
34 0.120 0.108 0.092 0.091
35 0.104 0.111 0.107 0.094
test2 looks like the following:
0.62m 0.87m 1.12m 1.37m
25 235.15 287.95 36.35 20.65
26 230.95 196.35 242.95 287.95
27 251.95 275.05 266.65 260.25
28 261.25 245.85 266.45 260.55
29 254.55 253.35 248.85 255.25
30 251.75 259.75 256.95 258.45
31 259.85 254.95 253.75 248.45
32 257.65 261.75 268.25 261.95
33 252.55 253.05 251.05 253.45
34 255.55 264.45 268.85 263.25
35 254.15 264.25 259.65 252.05
Now I want to create a new data frame, test3, by checking whether each value in test2 is greater than 180. If the value is greater than 180, the corresponding element should be the same as in test1; otherwise it should be -1 * test1.
The desired output would look like the following:
0.62m 0.87m 1.12m 1.37m
25 0.011 0.017 -0.009 -0.025
26 0.043 0.018 0.016 0.035
I tried the following:
test3 <- ifelse(test2 > 180, test1, -1 * test1)
Another method:
test3 <- data.frame(sapply(X = test1, FUN = function(x) (ifelse(test2>180, x, - 1*x))))
I think I need to use apply, but I'm not sure how to use it properly.
You just need some as.matrix() calls (note that only the two rightmost cells in the top row are actually less than 180):
ifelse(as.matrix(test2) > 180, as.matrix(test1), -as.matrix(test1))
## 0.62m 0.87m 1.12m 1.37m
## 25 0.011 0.017 -0.009 -0.025
## 26 0.043 0.018 0.016 0.035
## 27 0.057 0.052 0.048 0.035
## 28 0.067 0.062 0.030 0.048
## 29 0.095 0.111 0.085 0.067
## 30 0.121 0.101 0.070 0.074
## 31 0.098 0.112 0.108 0.095
## 32 0.086 0.096 0.076 0.080
## 33 0.103 0.104 0.078 0.082
## 34 0.120 0.108 0.092 0.091
## 35 0.104 0.111 0.107 0.094
The return value will be a matrix, but you can convert it back to a data.frame (if you want) via as.data.frame().
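As a rough sketch of that last step, the snippet below applies the same ifelse() call to a two-row, two-column stand-in for test1 and test2 (values taken from the first two rows of the question's data) and then converts the matrix back to a data frame. The cut-down inputs and the name test3 are only for illustration; with the full data frames the call is the same.

```r
# Two-row stand-ins for test1 and test2 (first two rows, two of the columns).
# check.names = FALSE keeps the non-syntactic column names such as "0.62m".
test1 <- data.frame(`0.62m` = c(0.011, 0.043), `1.12m` = c(0.009, 0.016),
                    check.names = FALSE)
test2 <- data.frame(`0.62m` = c(235.15, 230.95), `1.12m` = c(36.35, 242.95),
                    check.names = FALSE)

# With matrix inputs, ifelse() works element-wise and returns a matrix
# shaped like the logical test.
m <- ifelse(as.matrix(test2) > 180, as.matrix(test1), -as.matrix(test1))

# Convert the matrix back to a data frame.
test3 <- as.data.frame(m)
print(test3)
```

Only the cell whose test2 value is 36.35 falls at or below 180, so it is the only one whose sign is flipped in this small example.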