There are multiple posts on the internet about the differences and similarities between [ and $. Some posts recommend $ only for interactive use and not for programming. However, I am not sure whether this is just a preference or whether there is a reason behind it.
Now let's say I am writing a package or function. If I am extracting an element by name (e.g., mtcars[["mpg"]]), why should I avoid using mtcars$mpg?
There are two differences that really matter between [[ and $:
1. [[ works with strings (i.e., it supports variable substitution); $ doesn't. If you have my_var = "mpg", you can use mtcars[[my_var]], but there isn't a good way to use my_var with $.
2. $ auto-completes (partial matching) if a partial column name is unambiguous. mtcars$m will return the mpg column, while mtcars[["m"]] will return NULL. mtcars$d will return NULL because multiple columns start with a "d".
#1 makes [[ more flexible for programming - it's extremely common in programmatic use to be working with column names stored as strings.
#2 makes $ more dangerous - you should not use abbreviated column names in programming, although in interactive use it can be nice and quick. (Though this is largely moot with RStudio's auto-completion features, if you use that IDE.)
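Both differences can be checked directly in a fresh R session. The sketch below uses the built-in mtcars data; col_mean is a hypothetical helper added only to illustrate the programmatic case, not something from base R.

```r
# Difference 1: [[ accepts a column name stored in a variable.
my_var <- "mpg"
by_string <- mtcars[[my_var]]            # same vector as mtcars$mpg

# $ takes its right-hand side literally, so this looks for a column
# named "my_var" (not "mpg") and finds nothing.
by_dollar <- mtcars$my_var               # NULL

# Difference 2: $ partially matches names; [[ matches exactly by default.
partial_m <- mtcars$m                    # "m" uniquely matches "mpg"
exact_m   <- mtcars[["m"]]               # NULL: no column is named "m"
ambiguous <- mtcars$d                    # NULL: "disp" and "drat" both match

# Hypothetical helper: the column name arrives as a string argument,
# which is the typical situation inside a function or package.
col_mean <- function(df, col) {
  mean(df[[col]])
}
mpg_mean <- col_mean(mtcars, "mpg")
```

Writing the helper with $ would not work here, because df$col would look for a column literally named "col".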
$ does partial matching: if you have a column named xxx in a data frame dat, then dat$xx will return the xxx column (unless you also have an xx column). This can be dangerous.
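A minimal sketch of why this is dangerous, using a made-up data frame dat: the same expression dat$xx silently changes meaning once a column named xx is added. The warnPartialMatchDollar option shown here is a base R option that can act as a safety net; whether to enable it is a matter of preference.

```r
dat <- data.frame(xxx = 1:3)

partial <- dat$xx                        # silently returns the xxx column
exact   <- dat[["xx"]]                   # NULL: no column is named exactly "xx"
opt_in  <- dat[["xx", exact = FALSE]]    # partial matching, requested explicitly

# Optional safety net: ask R to warn whenever $ relies on a partial match.
old_opts <- options(warnPartialMatchDollar = TRUE)
warned <- tryCatch({ dat$xx; FALSE }, warning = function(w) TRUE)
options(old_opts)                        # restore the previous setting

# Assignment with $ does not partially match, so this adds a new column...
dat$xx <- c("a", "b", "c")
# ...and the same dat$xx expression now returns different data.
after <- dat$xx
```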
I always use [["..."]] for another reason: I use RStudio, and there is a nice highlighting for strings, whereas there's no highlighting with $.