I have a data.frame x with the following format:
species site count
1: A 1.1 25
2: A 1.2 152
3: A 2.1 26
4: A 3.5 1
5: A 3.7 98
---
101: B 1.2 6
102: B 1.3 10
103: B 2.1 8
104: B 2.2 8
105: B 2.3 5
I also have another data.frame area with the following format:
species area
1: A 59.7
2: B 34.4
3: C 37.7
4: D 22.8
I would like to divide the count column of data.frame x by the values in the area column of data.frame area wherever the values in the species column of the two data.frames match.
I have been trying to make it work with a ddply function:
density = ddply(x, "species", mutate, density = x$count/area[,2])
But I can't figure out the proper index syntax for the area[] call so that it selects only the row matching the values found in x$species. I am also very new to the plyr package (and to the apply* functions as a whole), so this may be completely the wrong approach.
I'm hoping to return a data.frame of the following format:
species site count density
1: A 1.1 25 0.419
2: A 1.2 152 2.546
3: A 2.1 26 0.436
4: A 3.5 1 0.017
5: A 3.7 98 1.641
---
101: B 1.2 6 0.174
102: B 1.3 10 0.291
103: B 2.1 8 0.233
104: B 2.2 8 0.233
105: B 2.3 5 0.145
This is easy with data.table:
library(data.table)
#converting your data to the native type for the package (by reference)
setDT(x); setDT(area)
x[area, density:=count/i.area, on="species"]
:= is the natural way to add columns in data.table (by reference; see this vignette, particularly point b), for more about this and why it's important), so x:=y adds a column named x to your data.table and assigns it the value y.
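As a minimal sketch of what "by reference" means here, the following uses a small made-up table (the name dt, the column double_count, and the values are assumptions for illustration, not the asker's data). Note that no reassignment of dt is needed after the := call:

```r
library(data.table)

# Toy data; names and values are hypothetical
dt <- data.table(species = c("A", "A", "B"), count = c(25, 26, 6))

# := modifies dt in place, so there is no `dt <- ...` on the left
dt[, double_count := count * 2]

# The new column is now part of dt
names(dt)
```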
When merging in the form X[Y,], we can think of Y as selecting the rows of X to operate on. Further, when Y is a data.table, all objects in both X and Y are available in j (i.e., what comes after the comma), so we could have said density:=count/area. When we want to be sure that we're referring to one of Y's columns, we prefix its name with i. to make clear that it is one of the columns in i, i.e., what precedes the comma. There should be a vignette on merges forthcoming.
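To make the X[Y,] form concrete, here is a self-contained sketch that rebuilds a few of the rows shown in the question (only four rows of x are included, and the final rounding step is added just to match the three decimals in the desired output):

```r
library(data.table)

# A subset of the rows shown in the question
x <- data.table(
  species = c("A", "A", "B", "B"),
  site    = c(1.1, 2.1, 1.2, 1.3),
  count   = c(25, 26, 6, 10)
)
area <- data.table(
  species = c("A", "B", "C", "D"),
  area    = c(59.7, 34.4, 37.7, 22.8)
)

# For each row of `area` (the i table), update the matching rows of x.
# `count` comes from x; `i.area` is explicitly the area column of the i table.
x[area, density := count / i.area, on = "species"]

# Round only for display, to mirror the format requested in the question
x[, density := round(density, 3)]
print(x)
```

Species C and D have no rows in this x, so they simply update nothing.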
In general, as soon as you think "match across different data sets", your instinct should be to merge. For more on data.table, see here.
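The same merge instinct also works without data.table. As a rough base R sketch (again using only a few of the question's rows, with merged as a made-up name for the intermediate result):

```r
# A subset of the rows shown in the question, as plain data.frames
x <- data.frame(
  species = c("A", "A", "B", "B"),
  site    = c(1.1, 2.1, 1.2, 1.3),
  count   = c(25, 26, 6, 10)
)
area <- data.frame(
  species = c("A", "B", "C", "D"),
  area    = c(59.7, 34.4, 37.7, 22.8)
)

# merge() defaults to an inner join, keeping species present in both tables
merged <- merge(x, area, by = "species")

# Compute the density, then drop the helper column
merged$density <- merged$count / merged$area
merged$area <- NULL
print(merged)
```

Unlike the := approach, this builds a new data.frame rather than modifying x in place.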