How can I get a data frame's name from a list?
Sure, get() gets the object itself, but I want to have its name for use within another function. Here's the use case, in case you would rather suggest a workaround:
lapply(somelistOfDataframes, function(X) {
ddply(X, .(idx, bynameofX), summarise, checkSum = sum(value))
})
There is a column in each data frame that goes by the same name as the data frame within the list. How can I get this name, bynameofX? names(X) would return the whole vector of column names.
EDIT: Here's a reproducible example:
df1 <- data.frame(value = rnorm(100), cat = c(rep(1,50),
rep(2,50)), idx = rep(letters[1:4],25))
df2 <- data.frame(value = rnorm(100,8), cat2 = c(rep(1,50),
rep(2,50)), idx = rep(letters[1:4],25))
mylist <- list(cat = df1, cat2 = df2)
lapply(mylist, head, 5)
In pandas, you can get a DataFrame's column names from the DataFrame.columns.values attribute, and convert them to a list with tolist(). Each column in a pandas DataFrame has a label/name that identifies the values it holds.
If you've got more than one data frame whose name you want to retrieve, you can use ls.str(mode = "list"). We use "list" because data frames are stored as lists. Note that this method will also include the names of any other list objects in your global environment.
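As a rough illustration of the point above, the sketch below lists list-like objects with ls.str() and then narrows the result to data frames only. It uses a scratch environment instead of the global one so the outcome is predictable; the object names (df_a, df_b, plain_list, number) are made up for the example.

```r
# Scratch environment with hypothetical objects (not from the original post)
env <- new.env()
assign("df_a", data.frame(x = 1:3), envir = env)
assign("df_b", data.frame(y = 4:6), envir = env)
assign("plain_list", list(1, 2), envir = env)
assign("number", 42, envir = env)

# Everything stored as a list: data frames AND ordinary lists
list_like <- as.character(ls.str(envir = env, mode = "list"))

# Only the data frames: fetch the objects, then filter on is.data.frame()
objs <- mget(ls(env), envir = env)
df_names <- names(Filter(is.data.frame, objs))
```

Filtering with is.data.frame() avoids picking up unrelated lists, at the cost of fetching every object first.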
Data frames are general-purpose R objects used to store tabular data. They are two-dimensional, heterogeneous data structures. A list in R, by contrast, can hold elements of different types: vectors, data frames, functions, or other lists.
A data frame is a list of variables of the same number of rows with unique row names, given class "data.frame". If no variables are included, the row names determine the number of rows. The column names should be non-empty, and attempts to use empty names will have unsupported results.
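To make the "a data frame is a list" statement concrete, here is a small sketch with made-up data. It shows that names() on a data frame gives its column names, while names() on a list that wraps data frames gives the outer names, which is the distinction the question runs into.

```r
# Hypothetical example data, not from the original post
df <- data.frame(value = c(1.5, 2.5, 3.5), idx = c("a", "b", "c"))

is_list   <- is.list(df)   # a data frame is a list of equal-length columns
col_names <- names(df)     # names of the columns (the list's elements)
n_rows    <- nrow(df)

# Wrapping data frames in a named list adds a second, outer level of names
wrapped     <- list(first = df, second = df)
outer_names <- names(wrapped)
```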
I'd use the names of the list in this fashion:
dat1 = data.frame()
dat2 = data.frame()
l = list(dat1 = dat1, dat2 = dat2)
> str(l)
List of 2
$ dat1:'data.frame': 0 obs. of 0 variables
$ dat2:'data.frame': 0 obs. of 0 variables
and then use lapply + ddply like:
lapply(names(l), function(x) {
  ddply(l[[x]], c("idx", x), summarise, checkSum = sum(value))
})
This remains untested, since there is no reproducible example, but it should point you in the right direction.
EDIT (ran2): Here's the code using the reproducible example.
library(plyr)
l <- lapply(names(mylist), function(x) {
  ddply(mylist[[x]], c("idx", x), summarise, checkSum = sum(value))
})
# lapply() over names() returns an unnamed list, so restore the names
names(l) <- names(mylist)
l
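For readers without plyr installed, the same idea (iterate over the data frames and their list names together) can be sketched in base R with Map() and aggregate(). This is a swapped-in alternative, not the answer's method; the fixed seed and the use of rep(1:2, each = 50) are choices made here to keep the example short and repeatable.

```r
set.seed(1)  # seed chosen here only for repeatability
df1 <- data.frame(value = rnorm(100), cat = rep(1:2, each = 50),
                  idx = rep(letters[1:4], 25))
df2 <- data.frame(value = rnorm(100, 8), cat2 = rep(1:2, each = 50),
                  idx = rep(letters[1:4], 25))
mylist <- list(cat = df1, cat2 = df2)

# Map() walks the data frames and their names in parallel, so each
# name is available as a grouping column inside the function
sums <- Map(function(d, nm) {
  out <- aggregate(d["value"], by = d[c("idx", nm)], FUN = sum)
  names(out)[names(out) == "value"] <- "checkSum"
  out
}, mylist, names(mylist))
```

Because mylist is named, Map() carries those names over to the result, so no separate names(l) <- step is needed.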
Here is the dplyr equivalent (note that data_frame() and rename_() have since been deprecated):
library(dplyr)
catalog =
  data_frame(
    data = someListOfDataframes,
    cat = names(someListOfDataframes)) %>%
  rowwise %>%
  mutate(
    renamed =
      data %>%
      rename_(.dots =
        cat %>%
        as.name %>%
        list %>%
        setNames("cat")) %>%
      list)
catalog$renamed %>%
  bind_rows(.id = "number") %>%
  group_by(number, idx, cat) %>%
  summarize(checkSum = sum(value))
You could first store the names with names(list) -> list_name and then use list_name[1], list_name[2], etc. to get each name. (You may also need as.numeric(list_name[x]) if your list names are numbers.)
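A minimal sketch of that suggestion, using a made-up list called scores whose names happen to be numeric strings:

```r
# Hypothetical list with numeric-looking names (not from the original post)
scores <- list("10" = c(1, 2), "20" = c(3, 4))

list_name  <- names(scores)        # character vector of element names
first_name <- list_name[1]         # a single name, still a string
as_numbers <- as.numeric(list_name)  # convert when the names encode numbers

# Use each name to look up its element in the list
lens <- vapply(list_name, function(nm) length(scores[[nm]]), integer(1))
```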