How to sort numbers into sets based on proximity in R

Tags: sorting, r, vector

Let's say I have a list of numbers in a vector. I'm trying to write a script that divides the list into (not necessarily equal-sized) sets whose numbers are fairly close to each other relative to the other numbers in the vector. You can assume that the numbers in the vector are in ascending order.

my_list <- c(795, 798, 1190, 1191, 2587, 2693, 2796, 3483, 3668)

That is, I need a script that divides these numbers into sets like the following:

set_1 <- c(795, 798)          # these 2 numbers are fairly close to each other
set_2 <- c(1190, 1191)        # these numbers would be another set
set_3 <- c(2587, 2693, 2796)  # another set, relative to the other numbers
set_4 <- c(3483, 3668)        # the last set

Any help or suggestions are greatly appreciated.

user1313954 asked Feb 18 '23 09:02


1 Answer

In general, what you are asking for is called cluster analysis. There are many possible methods and algorithms, and many of them are already available in the R packages listed in the CRAN Cluster task view: http://cran.r-project.org/web/views/Cluster.html.

For example, here is how you can cluster your data using hierarchical clustering:

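# Build the cluster tree from pairwise distances (hclust uses complete linkage by default)
tree <- hclust(dist(my_list))
# Cut the tree at height 300, so numbers in a set are at most 300 apart
groups <- cutree(tree, h = 300)
groups
# [1] 1 1 2 2 3 3 3 4 4
split(my_list, groups)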
# $`1`
# [1] 795 798
# 
# $`2`
# [1] 1190 1191
# 
# $`3`
# [1] 2587 2693 2796
# 
# $`4`
# [1] 3483 3668
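
Since the question says the vector is already in ascending order, a simpler alternative may be enough: start a new set whenever the gap to the previous number exceeds some threshold. The sketch below is only one way to do it. The names `max_gap`, `starts_new_set`, `gap_groups` and `sets` are made up for illustration, and the threshold of 300 is an assumption that happens to suit this data rather than a recommended value. It also shows `cutree()` with `k`, in case you would rather fix the number of sets than pick a cut height.

```r
my_list <- c(795, 798, 1190, 1191, 2587, 2693, 2796, 3483, 3668)

# Assumption: a gap larger than max_gap starts a new set (300 is illustrative)
max_gap <- 300

# diff() gives each number's gap to the previous one;
# the first number always starts a set
starts_new_set <- c(TRUE, diff(my_list) > max_gap)

# A running count of the "starts a new set" flags gives a set id per number
gap_groups <- cumsum(starts_new_set)

sets <- split(my_list, gap_groups)

# Alternative: ask for a fixed number of sets instead of a cut height
k_groups <- cutree(hclust(dist(my_list)), k = 4)
```

On this particular vector both approaches should give the same four sets as the `h = 300` cut above. They can disagree on other data: the gap rule only looks at neighbouring numbers, so a long chain of small gaps stays in one set, while complete linkage limits the spread of each set.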
flodel answered Feb 21 '23 23:02