I've been trying to create a very simple function. Essentially I want every element in t$c changed according to the if/else statement in my code, with all other elements staying the same. Here's my code:
set.seed(20)
x1=rnorm(100)
x2=rnorm(100)
x3=rnorm(100)
t=data.frame(a=x1,b=x1+x2,c=x1+x2+x3)
fun1=function(multi1,multi2)
{
  v=t$c
  s=c()
  for (i in v)
  {
    if (i<0)
    {
      s[i]=i*multi1
    }
    else if(i>0)
    {
      s[i]=i*multi2
    }
  }
  return(s)
}
fun1(multi1=0.5,multi2=2)
But it gave me just a few numbers. I suspect I made a silly mistake, but I couldn't figure out what it was.
In R, a function is an object, so the interpreter can pass control to it along with any arguments it needs. The function performs its task and then returns control to the interpreter, together with any result, which can be stored in another object.
Generally speaking, the $ operator extracts a named part of a data object in R, such as a column of a data frame or an element of a list. Here, t$c extracts column c of the data frame t as a numeric vector.
tl;dr This operation can be vectorized. You can use the following method, assuming you want to leave values that are 0 or NA alone.
with(t, c * ifelse(c < 0, 0.5, ifelse(c > 0, 2, 1)))
If you want to group zeros with one side (e.g. the positive side), it's even simpler.
with(t, c * ifelse(c < 0, 0.5, 2))
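To see how the two forms compare, here is a small sketch on a hand-made vector. The names v, nested and simple are made up for illustration and are not part of the question's data. For plain multiplication the two forms agree, because 0 times any multiplier is still 0; the choice would only matter for an operation that does not map 0 to 0.

```r
# Toy vector (assumed values): one negative, one zero, one positive, one NA
v <- c(-2, 0, 3, NA)

# Nested form: zeros are multiplied by 1
nested <- v * ifelse(v < 0, 0.5, ifelse(v > 0, 2, 1))

# Simple form: zeros are grouped with the positives and multiplied by 2
simple <- v * ifelse(v < 0, 0.5, 2)

print(nested)             # -1  0  6 NA
print(simple)             # -1  0  6 NA
identical(nested, simple) # TRUE
```

In both forms the NA stays NA, since ifelse() returns NA wherever its test is NA.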
As far as your loop goes, you've got a few issues there.
First, you were indexing s by the values of v themselves, which are decimals, rather than by position. R truncates a fractional index toward zero, so many different values collapsed onto the same integer index and overwrote each other. This is why your result vector was so short. Negative values make it worse, since a negative index means "every element except that one", and anything that truncates to 0 is silently ignored.
The number of distinct integer indices was something like this:
length(unique(as.integer(t$c)))
# [1] 9
So the loop was effectively doing something like this, as a simple example:
s[c(1, 2, 1, 1)] <- something
Since 1 is repeated, only indices 1 and 2 were changed. This is what was happening in your loop. The truncation of fractional indices looks like this:
x <- 1:5
x[1.2]
# [1] 1
x[1.99]
# [1] 1
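The same truncation applies when assigning, which is what the original loop did. A minimal sketch with made-up values (s and w here are throwaway vectors, not the question's data):

```r
# Replay the original loop's indexing on three assumed positive values
s <- c()
for (i in c(1.2, 1.9, 2.5)) {
  s[i] <- i * 2   # index is truncated: 1.2 -> 1, 1.9 -> 1, 2.5 -> 2
}
print(s)          # 3.8 5.0: three inputs, but only two slots were filled

# Negative and near-zero values behave differently again
w <- c(10, 20, 30)
w[-1.7] <- 0      # truncated to -1: every element except the first is overwritten
print(w)          # 10  0  0
w[0.4] <- 99      # truncated to 0: the assignment is silently ignored
print(w)          # 10  0  0
```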
Next, notice below that we have preallocated the vector s. We can do that because we know the length of the resulting vector will be the same as v. This is the recommended, more efficient way, rather than growing the vector inside the loop.
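As a rough illustration of the difference, the sketch below builds the same result both ways. The names n, grown and prealloc are assumptions for this example only; the results are identical, and the difference is in how the memory is managed.

```r
n <- 5

# Growing: the vector is extended every time a new index is assigned
grown <- c()
for (i in seq_len(n)) grown[i] <- i^2

# Preallocating: the full-length vector exists before the loop starts
prealloc <- numeric(n)   # equivalent to vector("numeric", n)
for (i in seq_len(n)) prealloc[i] <- i^2

identical(grown, prealloc)  # TRUE
```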
Moving on, I changed for(i in v) to for(i in seq_along(v)) to correct this. Now i runs over a sequence of positions, so we also need to index v in the same manner. Finally, we can assign s[i] <- if(... once, instead of assigning to the same index inside each branch of the if() statement.
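The difference between looping over values and looping over positions can be seen in a short sketch. The vectors v and empty below are made-up examples, not the question's data.

```r
v <- c(-1.5, 0.7, 2.3)   # assumed values for illustration

for (i in v) print(i)              # i takes the values: -1.5, 0.7, 2.3
for (i in seq_along(v)) print(i)   # i takes the positions: 1, 2, 3

# seq_along() is also safe for an empty vector, unlike 1:length()
empty <- numeric(0)
seq_along(empty)     # integer(0), so a loop body would never run
1:length(empty)      # 1 0, so a loop would run twice with invalid indices
```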
Also note that you haven't accounted for 0 or any other values that may appear in v (like NA). I added a final else where we just leave zeros alone. Be aware that if() cannot evaluate an NA condition, so missing values need an explicit is.na() check before the comparisons. Change that as you see necessary. Furthermore, instead of going to the global environment to get t$c, we can pass it as an argument and make this function more general (credit to @ShawnMehan for that suggestion). Here's the revised version:
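fun1 <- function(vec, multi1, multi2) {
  # preallocate the result; it has the same length as the input
  s <- vector("numeric", length(vec))
  for (i in seq_along(vec)) {
    s[i] <- if (is.na(vec[i])) {
      NA_real_          # if() errors on an NA condition, so pass NA through
    } else if (vec[i] < 0) {
      vec[i] * multi1
    } else if (vec[i] > 0) {
      vec[i] * multi2
    } else {
      vec[i]            # zero is left unchanged
    }
  }
  return(s)
}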
So now we have a result of length 100:
x <- fun1(t$c, 0.5, 2)
str(x)
# num [1:100] 2.657 -0.949 7.423 -0.749 5.664 ...
I wrote this long explanation because I figure you are learning how to write a loop. In R, though, we can vectorize this entire operation and put it into one line of code. The following line gives the same result as fun1(t$c, 0.5, 2).
with(t, c * ifelse(c < 0, 0.5, ifelse(c > 0, 2, 1)))
Thanks to @Frank for catching my calculation oversight.
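If nested ifelse() calls are hard to read, logical indexing is another vectorized option. This is only a sketch of an alternative, not the method used above, and scale_by_sign is a hypothetical name chosen for this example.

```r
# scale_by_sign is a hypothetical helper, not part of the original answer
scale_by_sign <- function(vec, multi1, multi2) {
  out <- vec
  neg <- !is.na(vec) & vec < 0   # TRUE only for non-missing negatives
  pos <- !is.na(vec) & vec > 0   # TRUE only for non-missing positives
  out[neg] <- vec[neg] * multi1
  out[pos] <- vec[pos] * multi2
  out                            # zeros and NAs are left untouched
}

scale_by_sign(c(-2, 0, 3, NA), 0.5, 2)  # -1  0  6 NA
```

Calling scale_by_sign(t$c, 0.5, 2) should then play the same role as the one-liner, under the assumption that zeros and NAs are to be left alone.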
Hopefully this all makes sense. Sometimes I don't do well with explanations and technical jargon. If there are any questions, please comment.