error in ddply function sum?

Question

First time posting here! I am having a problem using the ddply function. I have this table that I would like to summarize by the column "LC", adding up the values in the column "Area" for each group:

  ID LC  per     Area
1  1  7 0.29  62428.3
2  1  7 0.79 170063.3
3  1  4 0.40  86108.0
4  1  7 0.43  92566.1
5  1  6 1.00 215270.0
6  1  7 0.61 131314.7

Based on this dataframe I would expect exactly this:

LC   Area
4  86108.0
6 215270.0
7 456372.4

Applying the ddply function I get these results:

> ddply(x, 'LC', sum)
  LC       V1
1  4  86113.4
2  6 215278.0
3  7 456406.5

The formatting is perfect, but there are some discrepancies in the values. For example, class 7 should have a value of 456372.4, but ddply reports 456406.5, a difference of 34.1. All the values are miscalculated.

Can someone explain to me why I am having this problem? Am I missing something here? Is my code wrong?

Thank you!

Sven Hohenstein · Accepted Answer

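There are two problems with your approach:

1. You need to tell ddply what to sum (Area). If you don't specify the column, sum is applied to the whole data frame of each group, so it adds up the values of all columns (ID, LC, per, and Area). For class 4, for example, that is 1 + 4 + 0.40 + 86108.0 = 86113.4.
2. To sum a single column within each group, pass summarise as the function and give it the expression to compute.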
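
The discrepancy can be checked by hand. The following is a minimal sketch in base R only; the data frame is rebuilt from the table in the question, and the names all_cols and area_only are chosen here purely for illustration. It compares summing every column per group with summing Area alone:

```r
# Data copied from the table in the question.
x <- data.frame(
  ID   = c(1, 1, 1, 1, 1, 1),
  LC   = c(7, 7, 4, 7, 6, 7),
  per  = c(0.29, 0.79, 0.40, 0.43, 1.00, 0.61),
  Area = c(62428.3, 170063.3, 86108.0, 92566.1, 215270.0, 131314.7)
)

# sum() on a data frame adds every numeric cell, so each group total
# includes ID, LC and per as well as Area.
all_cols <- sapply(split(x, x$LC), sum)

# Summing only the Area column within each group gives the expected totals.
area_only <- sapply(split(x$Area, x$LC), sum)

print(round(all_cols, 1))
print(round(area_only, 1))
```

The first vector reproduces the values reported in the question (86113.4, 215278.0, 456406.5); the second gives the totals that were expected (86108.0, 215270.0, 456372.4).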

This code works:

x <- read.table(text="  ID LC  per     Area
1  1  7 0.29  62428.3
2  1  7 0.79 170063.3
3  1  4 0.40  86108.0
4  1  7 0.43  92566.1
5  1  6 1.00 215270.0
6  1  7 0.61 131314.7", header = TRUE)


library(plyr)

ddply(x, .(LC), summarise, sum(Area))

The result:

  LC      ..1
1  4  86108.0
2  6 215270.0
3  7 456372.4
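
If a readable column name is preferred over ..1, the summary expression can be given a name. The sketch below is one possible variant, not part of the original answer: it shows a base R equivalent with aggregate, followed by the same ddply call with a named column. The plyr call is guarded so that the snippet still runs where plyr is not installed, and the name totals is illustrative only.

```r
# Data copied from the table in the question.
x <- data.frame(
  ID   = c(1, 1, 1, 1, 1, 1),
  LC   = c(7, 7, 4, 7, 6, 7),
  per  = c(0.29, 0.79, 0.40, 0.43, 1.00, 0.61),
  Area = c(62428.3, 170063.3, 86108.0, 92566.1, 215270.0, 131314.7)
)

# Base R alternative: one row per LC, with Area summed within each group.
totals <- aggregate(Area ~ LC, data = x, FUN = sum)
print(totals)

# Same summary with plyr, naming the output column "Area".
# Runs only if the plyr package is available.
if (requireNamespace("plyr", quietly = TRUE)) {
  print(plyr::ddply(x, "LC", plyr::summarise, Area = sum(Area)))
}
```

Both calls should return three rows, one per value of LC, with a column called Area holding the group totals.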

Tags: r, plyr

user1896882
