I want to create a frequency table from a data frame and save it in Excel. Using the table()
function I can only get the frequencies of one particular column. But I want a frequency table for all the columns together, and the levels or variable types may differ from column to column. It would be like a summary of the data frame, but without the mean or other measures, only frequencies.
I was trying something like this:
for(i in 1:230){
  rm(tb)
  tb<-data.frame(table(mydata[i]))
  tb2<-cbind(tb2,tb)
}
But it shows the following error:
Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 15, 12
I also tried data.frame() in place of cbind(), but the error didn't change.
To create a frequency table in R, we can simply use the table() function, but for a single vector its output prints as a horizontal table. To work with the result in data frame format, convert it with the as.data.frame() function.
The table() function in R computes the frequency counts of the values appearing in the specified column of the data frame. The result prints as a two-row structure, where the first row shows the values of the column and the second row shows their corresponding frequencies.
There are several ways to check data types in R. We can use the typeof() and class() functions on a single column, and the str() function to see the types of all columns of a data frame at once.
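As a small sketch of those three functions, assuming a hypothetical data frame df (the name and values are for illustration only, not from the question):

```r
# Hypothetical example data (not from the original post)
df <- data.frame(x = c("A", "A", "B", "C"),
                 y = c(1, 1, 2, 1))

y_class <- class(df$y)            # class of a single column
y_type  <- typeof(df$y)           # low-level storage type of that column
col_classes <- sapply(df, class)  # class of every column at once
col_classes
str(df)                           # compact overview of the whole data frame
```

Checking the column classes first can help explain why the number of levels differs from column to column.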
To get the frequency count of all values in a vector as an R data frame, use the as.data.frame() function. It takes the result of the table() function as input and returns a data frame.
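A minimal sketch of that conversion, assuming a hypothetical vector v; the column names Value and Freq are chosen here for readability:

```r
# Hypothetical example vector (not from the original post)
v <- c("A", "A", "B", "C")

tab  <- table(v)                                      # counts per distinct value
freq <- as.data.frame(tab, stringsAsFactors = FALSE)  # one row per value
names(freq) <- c("Value", "Freq")
freq
```

Each distinct value becomes one row, which is why tables built from different columns can end up with different numbers of rows.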
Maybe an rbind solution is better as it allows you to handle variables with different levels:
dt = data.frame(x = c("A","A","B","C"),
                y = c(1,1,2,1))
dt
#   x y
# 1 A 1
# 2 A 1
# 3 B 2
# 4 C 1
dt_res = data.frame()
for (i in seq_len(ncol(dt))){
  # one row per level of column i
  dt_temp = data.frame(t(table(dt[,i])))
  # overwrite the placeholder column with the variable name
  dt_temp$Var1 = names(dt)[i]
  dt_res = rbind(dt_res, dt_temp)
}
names(dt_res) = c("Variable","Levels","Freq")
dt_res
#   Variable Levels Freq
# 1        x      A    2
# 2        x      B    1
# 3        x      C    1
# 4        y      1    3
# 5        y      2    1
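Since the goal is to open the result in Excel, one option that needs no extra packages is to write the long table to a CSV file, which Excel can open directly. This is only a sketch: the table is rebuilt here by hand so the snippet runs on its own, and the file name and location are assumptions.

```r
# Long-format frequency table, typed in by hand to match the output above
dt_res <- data.frame(Variable = c("x", "x", "x", "y", "y"),
                     Levels   = c("A", "B", "C", "1", "2"),
                     Freq     = c(2L, 1L, 1L, 3L, 1L))

# Hypothetical output path; replace with wherever the file should go
out_file <- file.path(tempdir(), "frequency_table.csv")

# row.names = FALSE keeps the 1..n row labels out of the file
write.csv(dt_res, out_file, row.names = FALSE)
```

Because every variable is stacked in the same three columns, the differing number of levels per column is no longer a problem when saving.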
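And an alternative (probably faster) process using apply. Note that apply first converts the data frame to a matrix, so with mixed column types all values are counted as character strings: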
dt = data.frame(x = c("A","A","B","C"),
                y = c(1,1,2,1))
dt
ff = function(x){
  y = data.frame(t(table(x)))
  y$Var1 = NULL
  names(y) = c("Levels","Freq")
  return(y)
}
dd = do.call(rbind, apply(dt, 2, ff))
dd
#     Levels Freq
# x.1      A    2
# x.2      B    1
# x.3      C    1
# y.1      1    3
# y.2      2    1
# extract variable names from row names
dd$Variable = sapply(row.names(dd), function(x) unlist(strsplit(x,"[.]"))[1])
dd
#     Levels Freq Variable
# x.1      A    2        x
# x.2      B    1        x
# x.3      C    1        x
# y.1      1    3        y
# y.2      2    1        y
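A possible variation on the same idea uses lapply over the column names instead of apply. This is a sketch rather than a tested replacement: it avoids the matrix conversion and does not need to parse row names, which would go wrong for column names that contain a dot. The names freq_one and dd2 are made up for this example.

```r
dt <- data.frame(x = c("A", "A", "B", "C"),
                 y = c(1, 1, 2, 1))

# Build one small frequency table for the column called nm,
# tagged with the column name so the pieces can be stacked
freq_one <- function(nm) {
  tab <- table(dt[[nm]])
  data.frame(Variable = nm,
             Levels   = names(tab),      # distinct values as text
             Freq     = as.vector(tab),  # their counts
             stringsAsFactors = FALSE)
}

# One table per column, stacked row-wise into a single data frame
dd2 <- do.call(rbind, lapply(names(dt), freq_one))
dd2
```

The result has the same three columns as the rbind loop, so it can be saved in the same way.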