I have a very strange problem concerning the ifelse function: it does not return a factor (as I want) but something like the position of the factor.
The dataset I use can be downloaded here.
What I want is to make a new column in df that contains the name of the country if that country belongs to the top 12 most frequent countries (in the column "answer"). Otherwise it should contain "Other".
What I did is shown in the code below.
What happens is that R returns something really strange: it returns the position of the factor level (between 1 and 181) for the top countries, and "Other" for the others (which is ok). It is this line that returns the wrong value:
aDDs$answer, ## then it should be named as aDDs$answer **THIS IS THE PROBLEM**
## create a list with most frequent country names
temp <- row.names(as.data.frame(summary(aDDs$answer, maxsum = 12))) # character vector with the names of the most frequent levels (the last entry is "(Other)")
"India" %in% temp #check if it works (yes)
## create new column that filters top results
aDDs$top <- ifelse(
aDDs$answer %in% temp, ## condition: match aDDs$answer with row.names in summary df
aDDs$answer, ## then it should be named as aDDs$answer **THIS IS THE PROBLEM**
"Other" ## else it should be named "Other"
)
View(aDDs)
PS. This is a follow-up question to this one, because it is somewhat different, and may need a separate question.
In R, the ifelse() function is a shorthand vectorized alternative to the standard if...else statement. Most of the functions in R take a vector as input and return a vectorized output.
ifelse returns a value with the same shape as test which is filled with elements selected from either yes or no depending on whether the element of test is TRUE or FALSE.
Because ifelse() is vectorized, it takes the whole test vector as its argument in a single call instead of being called once per value, which is usually faster than looping over the elements one by one.
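As a minimal sketch of that vectorized behaviour (the vector x and the labels are made up for illustration), one call to ifelse() tests every element and returns a result of the same length:

```r
# Hypothetical input vector, for illustration only
x <- c(5, 12, 8, 20)

# The test x > 10 is evaluated for all elements at once;
# each result element comes from "big" (TRUE) or "small" (FALSE)
size <- ifelse(x > 10, "big", "small")
print(size)
# [1] "small" "big"   "small" "big"
```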
The field answer is a factor, hence your function returns a number (the integer code of the factor level) instead of the label.
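A small sketch of what that means, using a made-up factor f rather than the real data: a factor stores integer codes that index into its levels, and those codes are what leak out when the class attribute is lost.

```r
# Hypothetical factor, for illustration only
f <- factor(c("India", "Brazil", "India"))

levels(f)        # the labels, sorted: "Brazil" "India"
as.integer(f)    # the underlying codes: 2 1 2
as.character(f)  # the labels per element: "India" "Brazil" "India"
```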
What you need to do is:
aDDs$answer <- as.character(aDDs$answer)
and then it works.
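Here is a sketch of the whole fix on a tiny stand-in data frame. The data and the name top_names are assumptions for illustration, and table() with sort() is used to find the most frequent countries instead of summary(); the real data set has far more levels.

```r
# Hypothetical stand-in for the real data set
aDDs <- data.frame(
  answer = factor(c("India", "India", "India", "Brazil", "Brazil", "Chile", "Peru"))
)

# Names of the 2 most frequent countries (12 in the original question)
top_names <- names(sort(table(aDDs$answer), decreasing = TRUE))[1:2]

# Convert the factor to character so ifelse() selects labels, not integer codes
aDDs$answer <- as.character(aDDs$answer)
aDDs$top <- ifelse(aDDs$answer %in% top_names, aDDs$answer, "Other")

print(aDDs$top)
# [1] "India"  "India"  "India"  "Brazil" "Brazil" "Other"  "Other"
```

If a factor is needed afterwards, the new column can be converted back with factor(aDDs$top).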
That's because you have a factor:
ifelse(c(T, F), factor(c("a", "b")), "other")
#[1] "1" "other"
Read the warning in help("ifelse"):
The mode of the result may depend on the value of test (see the examples), and the class attribute (see oldClass) of the result is taken from test and may be inappropriate for the values selected from yes and no.
Sometimes it is better to use a construction such as (tmp <- yes; tmp[!test] <- no[!test]; tmp), possibly extended to handle missing values in test.
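The construction from the help page can be sketched on a factor like this. The vectors test, yes and no are made up for illustration, and adding "Other" to the levels first is an extra step assumed here, because a value that is not an existing level cannot be assigned into a factor without producing NA.

```r
# Hypothetical inputs, for illustration only
test <- c(TRUE, FALSE, TRUE)
yes  <- factor(c("India", "Chile", "Brazil"))
no   <- rep("Other", length(test))

tmp <- yes
# "Other" must be a level of the factor before it can be assigned
levels(tmp) <- c(levels(tmp), "Other")
tmp[!test] <- no[!test]

print(tmp)
# [1] India  Other  Brazil
# Levels: Brazil Chile India Other
```

Unlike ifelse(), this keeps the result a factor, because the class attribute comes from yes rather than from test.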