Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get the name of a data.frame within a list?

How can I get a data frame's name from a list? Sure, get() gets the object itself, but I want to have its name for use within another function. Here's the use case, in case you would rather suggest a work around:

lapply(somelistOfDataframes, function(X) {
    ddply(X, .(idx, bynameofX), summarise, checkSum = sum(value))
})

There is a column in each data frame that goes by the same name as the data frame within the list. How can I get this name bynameofX? names(X) would return the whole vector.

EDIT: Here's a reproducible example:

df1 <- data.frame(value = rnorm(100), cat = c(rep(1,50),
    rep(2,50)), idx = rep(letters[1:4],25))
df2 <- data.frame(value = rnorm(100,8), cat2 = c(rep(1,50), 
    rep(2,50)), idx = rep(letters[1:4],25))

mylist <- list(cat = df1, cat2 = df2)
lapply(mylist, head, 5)
like image 270
Matt Bannert Avatar asked Jan 25 '12 11:01

Matt Bannert


People also ask

How do I find the name of a DataFrame in a list?

You can get the Pandas DataFrame Column Names by using DataFrame. columns. values method and to get it as a list use tolist(). Each column in a Pandas DataFrame has a label/name that specifies what type of value it holds/represents.

How do I get the DataFrame name in R?

If you've got more than one dataframe that you want to retrieve the name of, you can use ls. str(mode = "list") . We use list because dataframes are stored as lists. Note that this method will also include the names of other list objects in your global environment.

Can a list contain a DataFrame?

DataFrames are generic data objects of R which are used to store the tabular data. They are two-dimensional, heterogeneous data structures. A list in R, however, comprises of elements, vectors, data frames, variables, or lists that may belong to different data types.

What is data Frame name?

A data frame is a list of variables of the same number of rows with unique row names, given class "data. frame" . If no variables are included, the row names determine the number of rows. The column names should be non-empty, and attempts to use empty names will have unsupported results.


3 Answers

I'd use the names of the list in this fashion:

dat1 = data.frame()
dat2 = data.frame()
l = list(dat1 = dat1, dat2 = dat2)
> str(l)
List of 2
 $ dat1:'data.frame':   0 obs. of  0 variables
 $ dat2:'data.frame':   0 obs. of  0 variables

and then use lapply + ddply like:

lapply(names(l), function(x) {
    ddply(l[[x]], c("idx", x), summarise,checkSum = sum(value))
  })

This remains untested without a reproducible answer. But it should help you in the right direction.

EDIT (ran2): Here's the code using the reproducible example.

l <- lapply(names(mylist), function(x) {
ddply(mylist[[x]], c("idx", x), summarise,checkSum = sum(value))
})
names(l) <- names(mylist); l
like image 141
Paul Hiemstra Avatar answered Oct 23 '22 05:10

Paul Hiemstra


Here is the dplyr equivalent

library(dplyr)

catalog = 
  data_frame(
    data = someListOfDataframes,
    cat = names(someListOfDataframes)) %>%
  rowwise %>%
  mutate(
    renamed = 
      data %>%
      rename_(.dots = 
                cat %>%
                as.name %>% 
                list %>%
                setNames("cat")) %>%
      list)

catalog$renamed %>%
  bind_rows(.id = "number") %>%
  group_by(number, idx, cat) %>%
  summarize(checkSum = sum(value))
like image 22
bramtayl Avatar answered Oct 23 '22 04:10

bramtayl


you could just firstly use names(list)->list_name and then use list_name[1] , list_name[2] etc. to get each list name. (you may also need as.numeric(list_name[x]) if your list names are numbers.

like image 43
flyboyleo Avatar answered Oct 23 '22 06:10

flyboyleo