I tried to wrap the clustering method in a function in the following way:
mydata <- mtcars
# Here I construct hclust as a function
hclustfunc <- function(x) hclust(as.matrix(x),method="complete")
# Define distance metric
distfunc <- function(x) as.dist((1-cor(t(x)))/2)
# Obtain distance
d <- distfunc(mydata)
# Call that hclust function
fit<-hclustfunc(d)
# Later I'd do
# plot(fit)
But why does it give the following error:
Error in if (is.na(n) || n > 65536L) stop("size cannot be NA nor exceed 65536") :
missing value where TRUE/FALSE needed
What's the right way to do it?
Agglomerative hierarchical clustering: the 'hclust' function requires distance values, which can be computed in R with the 'dist' function. The default measure for dist is 'euclidean', but you can change it with the method argument.
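As a minimal sketch of that point, the following computes two distance objects from the built-in mtcars data and clusters each one. The variable names (d_euc, d_man, fit_euc, fit_man) are made up for illustration, and "manhattan" is just one of the alternatives dist() accepts.

```r
# Euclidean distance (the default) between the rows of mtcars
d_euc <- dist(mtcars)

# Same data, but with the Manhattan metric selected via `method`
d_man <- dist(mtcars, method = "manhattan")

# hclust() takes the dist object directly
fit_euc <- hclust(d_euc, method = "complete")
fit_man <- hclust(d_man, method = "complete")
```

Both results can then be inspected with plot(fit_euc) or plot(fit_man).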
cutree() is the R function that cuts a hierarchical clustering tree. Its h and k arguments let you cut the tree at a certain height h or into a certain number of clusters k.
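A short sketch of both ways of cutting, assuming a complete-linkage fit on Euclidean distances of mtcars. The values k = 3 and h = 200 are arbitrary choices for illustration, not recommendations.

```r
# Fit a tree on the built-in mtcars data
fit <- hclust(dist(mtcars), method = "complete")

# Cut into a fixed number of clusters
groups_k <- cutree(fit, k = 3)

# Cut at a fixed height on the dendrogram
groups_h <- cutree(fit, h = 200)

# Each result is an integer vector of cluster labels, named by row
table(groups_k)
```

In practice a suitable h is usually picked by looking at plot(fit) first.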
hclust() is a function that belongs to the stats package. You do not have to install it, as it comes 'bundled' with R.
Do read the help for the functions you use. ?hclust is pretty clear that the first argument, d, is a dissimilarity object, not a matrix:
Arguments:
d: a dissimilarity structure as produced by ‘dist’.
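To see the difference concretely, here is a small check using the question's distance definition on mtcars (m is a name introduced here for illustration). It suggests why the call fails: as.matrix() drops the "dist" class and the "Size" attribute, which hclust() appears to rely on to determine the number of observations.

```r
# Distance object built the same way as in the question
d <- as.dist((1 - cor(t(mtcars))) / 2)

# What the question's hclustfunc actually hands to hclust()
m <- as.matrix(d)

inherits(d, "dist")        # TRUE
attr(d, "Size")            # 32, one per row of mtcars
inherits(m, "dist")        # FALSE
is.null(attr(m, "Size"))   # TRUE: the size information is gone
```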
As the OP has now updated their question, what is needed is
hclustfunc <- function(x) hclust(x, method="complete")
distfunc <- function(x) as.dist((1-cor(t(x)))/2)
d <- distfunc(mydata)
fit <- hclustfunc(d)
Alternatively, if you want the function to compute the distances itself, what you want is
hclustfunc <- function(x, method = "complete", dmeth = "euclidean") {
    hclust(dist(x, method = dmeth), method = method)
}
and then
fit <- hclustfunc(mydata)
works as expected. Note that you can now pass in the dissimilarity coefficient method as dmeth and the clustering method as method.
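For example, a call that overrides both defaults might look like the sketch below. The choice of "average" linkage and "manhattan" distance is arbitrary, and fit_default and fit_alt are names introduced here.

```r
# Wrapper from the answer, repeated so the example runs on its own
hclustfunc <- function(x, method = "complete", dmeth = "euclidean") {
    hclust(dist(x, method = dmeth), method = method)
}

# Defaults: Euclidean distance, complete linkage
fit_default <- hclustfunc(mtcars)

# Override both the distance metric and the linkage method
fit_alt <- hclustfunc(mtcars, method = "average", dmeth = "manhattan")

# The fitted object records which settings were used
fit_alt$method
fit_alt$dist.method
```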