I want to apply a statistical test to each column of a data frame in a loop.
Column 1, 'A', holds the tags that I want to use to split the rows into two groups:
for (i in names(dataframe)) {
i <- as.name(i)
group1 <- i[A=="locationX"]
group2 <- i[A!="locationX"]
p <- wilcox.test(group1,group2,na.action(na.omit))$p.value
}
The as.name() call is my attempt to remove the quotation marks from the column names returned by names(dataframe).
Unfortunately it gives me the error:
Error in i[A == "locationX"] : object of type 'symbol' is not subsettable
I think as.name() is not the right way to do it.
Any clues?
The only way this makes sense is for "A" to be a vector with multiple instances of "locationX" and multiple instances of other values, and for the length of "A" to equal the number of rows in "dataframe". If that is the case, then something like this might work:
p <- list()
for (i in names(dataframe)) {
  # as.name() is not needed and possibly harmful; index the column by its name string
  group1 <- dataframe[[i]][A == "locationX"]
  group2 <- dataframe[[i]][A != "locationX"]
  # wilcox.test() on two vectors drops missing values itself, so no na.action is passed
  p[[i]] <- wilcox.test(group1, group2)$p.value
}
Note that even if your code had not raised an error, you would still have been overwriting "p" on every pass through the loop.
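For a self-contained illustration, here is a minimal sketch of the same idea on made-up data. It assumes "A" is a column of the data frame itself (as the question describes), so that column is skipped when testing. The toy values and the names value_cols and is_x are hypothetical and only there to make the example runnable.

```r
# Toy data (hypothetical): a tag column A plus two numeric columns.
set.seed(1)
dataframe <- data.frame(
  A = rep(c("locationX", "locationY"), each = 10),
  x = c(rnorm(10, mean = 0), rnorm(10, mean = 3)),
  y = rnorm(20),
  stringsAsFactors = FALSE
)

# Test every column except the tag column itself.
value_cols <- setdiff(names(dataframe), "A")

# Logical index of the rows that belong to the first group.
is_x <- dataframe$A == "locationX"

# sapply() over the column names returns a named numeric vector,
# one p-value per column; dataframe[[col]] looks a column up by its name string.
p <- sapply(value_cols, function(col) {
  wilcox.test(dataframe[[col]][is_x], dataframe[[col]][!is_x])$p.value
})
print(p)
```

Using sapply() instead of a for loop avoids having to pre-allocate "p" and makes it harder to overwrite the result by accident, but the for loop above works just as well.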