
Expand ranges defined by "from" and "to" columns


I have a data frame containing the "name" of each U.S. President and the years in which they started and ended their time in office (the "from" and "to" columns). Here is a sample:

    name           from   to
    Bill Clinton   1993 2001
    George W. Bush 2001 2009
    Barack Obama   2009 2012

...and the output from dput:

    dput(tail(presidents, 3))
    structure(list(name = c("Bill Clinton", "George W. Bush", "Barack Obama"
    ), from = c(1993, 2001, 2009), to = c(2001, 2009, 2012)), .Names = c("name",
    "from", "to"), row.names = 42:44, class = "data.frame")

I want to create a data frame with two columns ("name" and "year"), with a row for each year that a president was in office. Thus, I need to create a regular sequence of years running from "from" to "to". Here's my expected output:

    name           year
    Bill Clinton   1993
    Bill Clinton   1994
    ...
    Bill Clinton   2000
    Bill Clinton   2001
    George W. Bush 2001
    George W. Bush 2002
    ...
    George W. Bush 2008
    George W. Bush 2009
    Barack Obama   2009
    Barack Obama   2010
    Barack Obama   2011
    Barack Obama   2012

I know that I can use data.frame(name = "Bill Clinton", year = seq(1993, 2001)) to expand things for a single president, but I can't figure out how to iterate for each president.

How do I do this? I feel that I should know this, but I'm drawing a blank.

Update 1

OK, I've tried both solutions, and I'm getting an error:

    library(plyr)  # provides ddply() and summarise()
    foo <- structure(list(name = c("Grover Cleveland", "Benjamin Harrison",
        "Grover Cleveland"), from = c(1885, 1889, 1893), to = c(1889, 1893, 1897)),
        .Names = c("name", "from", "to"), row.names = 22:24, class = "data.frame")
    ddply(foo, "name", summarise, year = seq(from, to))
    Error in seq.default(from, to) : 'from' must be of length 1
edgester asked Jul 15 '12 18:07



1 Answer

Here's a data.table solution. It has the nice (if minor) feature of leaving the presidents in their supplied order:

    library(data.table)
    dt <- data.table(presidents)
    dt[, list(year = seq(from, to)), by = name]
    #               name year
    #  1:   Bill Clinton 1993
    #  2:   Bill Clinton 1994
    #  ...
    #  ...
    # 21:   Barack Obama 2011
    # 22:   Barack Obama 2012

Edit: To handle presidents with non-consecutive terms, use this instead:

    dt[, list(year = seq(from, to)), by = c("name", "from")]
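The error in the question's update arises because grouping by name alone puts both of Grover Cleveland's terms in one group, so seq() is handed two "from" values at once. Adding "from" to the grouping makes each group a single row. The same row-by-row idea can be sketched in base R, with no extra packages. This is only an illustrative alternative, not part of the data.table approach above; the names `pieces` and `expanded` are made up for the example, and `foo` is rebuilt from the data in the question:

```r
# Data from the question's update: Grover Cleveland served two
# non-consecutive terms, so his name appears in two rows.
foo <- data.frame(
  name = c("Grover Cleveland", "Benjamin Harrison", "Grover Cleveland"),
  from = c(1885, 1889, 1893),
  to   = c(1889, 1893, 1897),
  stringsAsFactors = FALSE
)

# Expand one row at a time, so seq() always receives a single
# 'from' and a single 'to', however often a name repeats.
pieces <- lapply(seq_len(nrow(foo)), function(i) {
  data.frame(
    name = foo$name[i],
    year = seq(foo$from[i], foo$to[i]),
    stringsAsFactors = FALSE
  )
})

# Stack the per-row pieces into one data frame with columns name and year.
expanded <- do.call(rbind, pieces)
```

Because the expansion works per row rather than per name, the rows come out in the supplied order, with Cleveland's second term after Harrison's. For large inputs, repeated calls to data.frame() are likely slower than the data.table version.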
Josh O'Brien answered Oct 13 '22 00:10
