Let's say I have a list of numbers in a vector. I'm trying to come up with a script that will divide the list into sets (not necessarily of equal size) whose numbers are fairly close to each other relative to the other numbers in the vector. You can assume that the numbers in the vector are in ascending order.
my_list<- c(795, 798, 1190, 1191, 2587, 2693, 2796, 3483, 3668)
That is, I need help coming up with a script that will divide these numbers and assign them to sets like this:
set_1<- c(795, 798) # these 2 numbers are fairly close to each other
set_2<- c(1190, 1191) # these numbers would be another set
set_3<- c(2587, 2693, 2796) # these numbers would be another set relative to the other numbers
set_4<- c(3483, 3668) # the last set
Any help or suggestions are greatly appreciated.
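In general, what you are asking for is called cluster analysis. There are many possible methods and algorithms for it, many of which are already available in the R packages listed here: http://cran.r-project.org/web/views/Cluster.html.
For example, here is how you can cluster your data using hierarchical clustering. hclust uses complete linkage by default, so cutting the tree at h = 300 keeps numbers in the same set only when the whole set spans at most 300. The value 300 is a threshold that happens to suit this data; adjust it for yours.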
tree <- hclust(dist(my_list))
groups <- cutree(tree, h = 300)
groups
# [1] 1 1 2 2 3 3 3 4 4
split(my_list, groups)
# $`1`
# [1] 795 798
#
# $`2`
# [1] 1190 1191
#
# $`3`
# [1] 2587 2693 2796
#
# $`4`
# [1] 3483 3668
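Since the vector is already sorted, a simpler alternative is to start a new set wherever the gap between neighbouring numbers exceeds a threshold. This is only a sketch, not part of the hclust approach above: the name max_gap and its value of 300 are assumptions chosen to match the cut height used earlier. Note that it compares neighbours only, so a long chain of closely spaced numbers stays in one set however wide it becomes, whereas the complete-linkage cut limits the total span of each set.

```r
my_list <- c(795, 798, 1190, 1191, 2587, 2693, 2796, 3483, 3668)

# Hypothetical threshold: a gap larger than this starts a new set.
max_gap <- 300

# diff() gives the gaps between neighbours; the leading TRUE opens the
# first set, and cumsum() turns each "new set" flag into a running set id.
set_id <- cumsum(c(TRUE, diff(my_list) > max_gap))

sets <- split(my_list, set_id)
sets
```

For this data the two approaches produce the same four sets, but they can differ on other inputs, so it is worth checking both against what "fairly close" means for your numbers.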