I am trying to use the dplyr package to calculate the number of entries for each card number in a dataset, using the following function:
freq<- function(data){
data <- complete.dupremoved[order(-complete.dupremoved$SUMMA),]
aggregate(count ~., data=transform(complete.dupremoved,count=1), length)
complete.dupremoved$count <-complete.dupremoved[complete.dupremoved$KLIENDIKAARDINR,]
sample <- count(complete.dupremoved, vars = "KLIENDIKAARDINR")
complete.dupremoved<- merge(complete.dupremoved,sample, by ="KLIENDIKAARDINR")
return(complete.dupremoved)
}
The error shown is Error: data_frames can only contain 1d atomic vectors and lists.
When I run lapply(complete.dupremoved, class), it shows that the columns are numeric, factor, character, and integer. How can I solve this? The debugger also shows the following:
function (x)
{
stopifnot(is.list(x))
if (length(x) == 0) {
x <- list()
class(x) <- c("tbl_df", "tbl", "data.frame")
attr(x, "row.names") <- .set_row_names(0)
return(x)
}
names_x <- names2(x)
if (any(is.na(names_x) | names_x == "")) {
stop("All columns must be named", call. = FALSE)
}
ok <- vapply(x, is_1d, logical(1))
if (any(!ok)) {
stop("data_frames can only contain 1d atomic vectors and lists",
call. = FALSE)
}
n <- unique(vapply(x, NROW, integer(1)))
if (length(n) != 1) {
stop("Columns are not all same length", call. = FALSE)
}
class(x) <- c("tbl_df", "tbl", "data.frame")
attr(x, "row.names") <- .set_row_names(n)
x
}
A list is actually still a vector in R, but it's not an atomic vector. We construct a list explicitly with list() but, like atomic vectors, most lists are created some other way in real life.
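There are two main types of vectors in R: atomic vectors and lists.
A list can hold elements of different types at once, such as numeric, character, and logical. An atomic vector stores elements of a single type and implicitly coerces mixed input to a common type. Lists are recursive (a list can contain other lists), whereas atomic vectors are not. Both are one-dimensional; a data frame is a list of equal-length columns, and each of those columns is expected to be a 1d atomic vector or a list.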
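As a small illustration of the difference between atomic vectors and lists, here is a sketch in base R. The object names (v, l, nested) are made up for this example and do not come from the question.

```r
# Atomic vector: every element has the same type, so mixed input is coerced.
v <- c(1, "a", TRUE)
class(v)              # "character"

# List: each element keeps its own type.
l <- list(1, "a", TRUE)
sapply(l, class)      # "numeric" "character" "logical"

# Lists are recursive: a list can contain another list.
nested <- list(1, list(2, "b"))
is.list(nested[[2]])  # TRUE

# A list is still a vector, just not an atomic one.
is.vector(l)          # TRUE
is.atomic(l)          # FALSE
is.atomic(v)          # TRUE
```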
The reason for this error is that the function is creating a dataframe as a variable within the original dataframe. This is the line that does that:
complete.dupremoved$count <-complete.dupremoved[complete.dupremoved$KLIENDIKAARDINR,]
In future you can check your dataframe with this to identify the class of each variable:
sapply(your_df_here, class)
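To see how a data frame ends up nested inside another one, here is a minimal reproduction in base R. The data frame df and its values are invented for illustration; only the column names KLIENDIKAARDINR and SUMMA are taken from the question.

```r
# Hypothetical toy data; the real complete.dupremoved is not shown in the question.
df <- data.frame(
  KLIENDIKAARDINR = c(1, 2, 2, 3),
  SUMMA = c(10, 20, 30, 40)
)

# Same pattern as the problematic line: row-indexing returns a whole
# data frame, which is then stored as a single column.
df$count <- df[df$KLIENDIKAARDINR, ]

# The class check exposes the nested data frame.
col_classes <- sapply(df, class)
col_classes

# Names of the columns that are themselves data frames.
bad_cols <- names(df)[sapply(df, is.data.frame)]
bad_cols
```

Any column reported here as a data.frame is not a 1d atomic vector or list, which is what the check in the debugger output rejects.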
The main question aside, I hope you were able to calculate entries by factor. There are several existing options out there.
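One possible way to get the count per card number without nesting a data frame is sketched below in base R. The cards data frame and its values are hypothetical; ave() is used here as one option among several, not as the approach the asker has to take. If dplyr is loaded, count(cards, KLIENDIKAARDINR) should give a per-card summary as well; the vars = argument used in the question looks like the plyr version of count() rather than the dplyr one.

```r
# Hypothetical toy data using the column names from the question.
cards <- data.frame(
  KLIENDIKAARDINR = c("A1", "A2", "A2", "A3", "A2"),
  SUMMA = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# ave() applies length() within each card-number group and returns
# a plain vector aligned with the rows, so count is an ordinary column.
cards$count <- ave(
  seq_along(cards$KLIENDIKAARDINR),
  cards$KLIENDIKAARDINR,
  FUN = length
)

# Sort by SUMMA in descending order, as the original function intended.
cards <- cards[order(-cards$SUMMA), ]
cards
```

Because the count is added directly as a vector, no merge step is needed afterwards.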