
How can I do SQL like operations on a R data frame?

For example, I have a data frame with data across categories and subcategories, and I want to be able to get the row with the maximum value in a particular column, and so on.

SQL is what comes to mind first. But since I am not interested in joins, indices, etc., Python's list comprehensions would do the same thing better, with a more modern syntax.

What's best practice in R for such operations?

EDIT: For now I think I am fine with which.max. The reason I asked the question the way I did is simply that I have come to learn that in R there are many libraries doing pretty much the same thing. Just by reading the documentation it's very hard to evaluate how popular a library is (i.e. how well it fulfills its purpose). My personal experience with Python is that the day you figure out how to use list comprehensions (with itertools as a bonus), you are pretty much covered. Over time this has evolved into best practice: you don't see lambda and filter that often in the general Python debate these days, as list comprehensions do the same thing more easily and more uniformly.

c00kiemonster asked Mar 17 '26 21:03


1 Answer

If you really mean SQL, a pretty straightforward answer is the 'sqldf' package:

http://cran.at.r-project.org/web/packages/sqldf/index.html

From the help page (?sqldf):

library(sqldf)
# Query the built-in warpbreaks data frame with SQL; the result is a data frame
a1s <- sqldf("select * from warpbreaks limit 6")
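
For the specific case in the question (the row with the maximum value in a column, overall or per category), base R may already be enough without any SQL layer. The following is only a sketch, assuming the built-in warpbreaks data set from the example above, with `wool` and `tension` standing in for the category and subcategory; the names `top_row`, `groups` and `top_per_group` are made up for illustration.

```r
# warpbreaks ships with base R: columns are breaks, wool, tension
data(warpbreaks)

# Row holding the overall maximum of one column
top_row <- warpbreaks[which.max(warpbreaks$breaks), ]

# Split the data frame into one piece per wool/tension combination
groups <- split(warpbreaks, list(warpbreaks$wool, warpbreaks$tension))

# Take the max-breaks row from each piece and stack the rows back together
top_per_group <- do.call(
  rbind,
  lapply(groups, function(g) g[which.max(g$breaks), ])
)

print(top_row)
print(top_per_group)
```

Note that which.max returns only the first maximum, so ties within a group are dropped. With sqldf, a similar result should be reachable with a `group by` query using `max(breaks)`.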
mdsumner answered Mar 20 '26 10:03


