Aggregate() example in R

I was looking at the help page for the aggregate function in R. I had never used this convenience function but I have a process it should help me speed up. However, I've been totally unable to walk through the example and understand what is going on.

One example is the following:

1> aggregate(state.x77, list(Region = state.region), mean)
         Region Population Income Illiteracy Life Exp Murder HS Grad  Frost   Area
1     Northeast       5495   4570      1.000    71.26  4.722   53.97 132.78  18141
2         South       4208   4012      1.738    69.71 10.581   44.34  64.62  54605
3 North Central       4803   4611      0.700    71.77  5.275   54.52 138.83  62652
4          West       2915   4703      1.023    71.23  7.215   62.00 102.15 134463

The output here is exactly what I would expect, so I tried to work out how it is produced. First I looked at state.x77:

1> head(state.x77)
           Population Income Illiteracy Life Exp Murder HS Grad Frost   Area
Alabama          3615   3624        2.1    69.05   15.1    41.3    20  50708
Alaska            365   6315        1.5    69.31   11.3    66.7   152 566432
Arizona          2212   4530        1.8    70.55    7.8    58.1    15 113417
Arkansas         2110   3378        1.9    70.66   10.1    39.9    65  51945
California      21198   5114        1.1    71.71   10.3    62.6    20 156361
Colorado         2541   4884        0.7    72.06    6.8    63.9   166 103766

OK, that's odd to me. I would expect to see a column in state.x77 named state.region or something. So state.region must be its own object. So I do a str() on it:

1> str(state.region)
 Factor w/ 4 levels "Northeast","South",..: 2 4 4 2 4 4 1 2 2 2 ...

It looks like state.region is just a factor. Somehow there HAS to be a connection between state.region and state.x77 in order for aggregate() to group state.x77 by state.region. But that connection is a mystery to me. Can you help me fill in my obvious misunderstandings?


asked Jan 20 '11 18:01

JD Long

3 Answers

From an old tampon (was it tampons?) commercial: "Proof, not only promises!"

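# Attach the region to the data as an ordinary column (row i gets region i)
state.x777 <- as.data.frame(state.x77)
state.x777 <- cbind(state.x777, stejt.ridzn = state.region)
# Grouping by that column gives the same result as grouping by state.region
aggregate(state.x77, list(Region = state.x777$stejt.ridzn), mean)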
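
To make the positional link explicit, here is a small sketch. The names `correct`, `shuffled` and `scrambled` and the seed are only illustrative, not part of the original example. As far as I can tell, aggregate() only needs the grouping vector to have one element per row of state.x77, and it pairs element i with row i, so a shuffled factor still runs but no longer describes the right states.

```r
# The grouping vector must have one element per row of the data.
stopifnot(length(state.region) == nrow(state.x77))

correct <- aggregate(state.x77, list(Region = state.region), mean)

# Illustrative check: shuffle the factor so it no longer lines up with the rows.
set.seed(1)
shuffled <- sample(state.region)
scrambled <- aggregate(state.x77, list(Region = shuffled), mean)

# Same shape and region labels, but the group means are computed
# over the wrong sets of states.
correct$Population
scrambled$Population
```

Nothing in the call warns about the mismatch, which is why keeping the grouping variable as a column of the data frame is the safer habit.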

answered Oct 17 '22 07:10

Roman Luštrik

They are likely in the correct order as these objects are documented on the same help page ?state.x77, which has:

Details:

     R currently contains the following “state” data sets.  Note that
     all data are arranged according to alphabetical order of the state
     names.
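
The alignment described in that note can be checked directly. This is a minimal sketch, assuming the standard datasets package is loaded (it is by default); state.name is another of the "state" data sets documented on the same help page.

```r
# state.name is a character vector of state names in alphabetical order.
# If the data sets are aligned, it should match the row names of state.x77.
identical(rownames(state.x77), state.name)

# One region per row of the matrix.
length(state.region) == nrow(state.x77)

# Putting names and regions side by side makes the pairing visible.
head(data.frame(state = state.name, region = state.region))
```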

answered Oct 17 '22 07:10

Gavin Simpson

Try help(state.region) etc --- they are all aligned:

Details:

 R currently contains the following “state” data sets.  Note that
 all data are arranged according to alphabetical order of the state
 names.
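
Because the alignment is purely positional, the grouping can be reproduced by hand. The sketch below is one way to do it, not how aggregate() is implemented internally; `groups`, `by_hand` and `agg` are illustrative names.

```r
# Split the rows by the factor (matched by position), then average each column.
groups <- split(as.data.frame(state.x77), state.region)
by_hand <- t(sapply(groups, colMeans))

agg <- aggregate(state.x77, list(Region = state.region), mean)

# Compare the numbers only, ignoring row and column names.
all.equal(unname(by_hand), unname(as.matrix(agg[, -1])))
```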

answered Oct 17 '22 09:10

Dirk Eddelbuettel
