Why are all labels_ -1? Generated by DBSCAN in Python

Question

from sklearn.cluster import DBSCAN
dbscan = DBSCAN(eps=0.001, min_samples=10) 
clustering = dbscan.fit(X)

Example vectors:

array([[ 0.05811029, -1.089355  , -1.9143777 , ...,  1.235167  ,
    -0.6473859 ,  1.5684978 ],
   [-0.7117326 , -0.31876346, -0.45949244, ...,  0.17786546,
     1.9377285 ,  2.190525  ],
   [ 1.1685177 , -0.18201494,  0.19475089, ...,  0.7026453 ,
     0.3937522 , -0.78675956],
   ...,
   [ 1.4172379 ,  0.01070347, -1.3984257 , ..., -0.70529956,
     0.19471683, -0.6201791 ],
   [ 0.6171041 , -0.8058429 ,  0.44837445, ...,  1.216958  ,
    -0.10003573, -0.19012968],
   [ 0.6433722 ,  1.1571665 , -1.2123466 , ...,  0.592805  ,
     0.23889546,  1.6207514 ]], dtype=float32)

X is model.wv.vectors, where the model was trained with model = word2vec.Word2Vec(sent, min_count=1, size=50, workers=3, window=3, sg=1)

Results are as follows:

array([-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1])

PV8 · Accepted Answer

Based on the docs:

labels_ : array, shape = [n_samples]

Cluster labels for each point in the dataset given to fit(). Noisy samples are given the label -1.

The answer can be found here: What are noisy samples in Scikit's DBSCAN clustering algorithm?

In short: noisy samples are not part of any cluster. They are simply points that do not belong to any cluster and can be "ignored" to some extent. Since every point is labelled -1, it seems that your data is very spread out and, with these parameters, has no dense regions that could form clusters.
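To make the meaning of the -1 label concrete, here is a minimal sketch that counts noise points and clusters from labels_. The real word vectors are not available here, so X_demo is a hypothetical stand-in (random 50-dimensional float32 vectors); only the eps and min_samples values come from the question.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical stand-in for model.wv.vectors: 200 random 50-dimensional vectors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 50)).astype(np.float32)

# Same parameters as in the question.
labels = DBSCAN(eps=0.001, min_samples=10).fit(X_demo).labels_

# Noise points carry the label -1; every other label is a cluster id.
n_noise = int(np.sum(labels == -1))
n_clusters = len(set(labels.tolist())) - (1 if -1 in labels else 0)

print("noise points:", n_noise)
print("clusters:", n_clusters)
```

With such a small eps no point has enough neighbours within range, so every point is reported as noise and the cluster count is zero.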

What can you try?

The full signature with its default values is:

DBSCAN(eps=0.5, min_samples=5, metric='euclidean', metric_params=None, algorithm='auto', leaf_size=30, p=None, n_jobs=None)

You can play with the parameters (eps and min_samples in particular) or change the clustering algorithm. Did you try k-means?
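As a rough sketch of the k-means suggestion: unlike DBSCAN, KMeans assigns every point to some cluster, so there is no -1 label. The data below is a hypothetical stand-in for the word vectors, n_clusters=5 is an arbitrary value chosen for illustration, and scaling the vectors to unit length first is an assumption (a common choice for word embeddings), not something the answer prescribes.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

# Hypothetical stand-in for model.wv.vectors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 50)).astype(np.float32)

# Assumption: scale each vector to unit length so that Euclidean distance
# behaves similarly to cosine distance.
X_unit = normalize(X_demo)

# n_clusters=5 is arbitrary here; it has to be chosen for the real data.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels_km = kmeans.fit_predict(X_unit)

# Every point receives a cluster id between 0 and n_clusters - 1.
print(np.bincount(labels_km))
```

The trade-off is that k-means needs the number of clusters up front and will place outliers in a cluster too, whereas DBSCAN can mark them as noise.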

ron_g · Answer

Your eps value is 0.001; try increasing it so that clusters can form. Otherwise every point will be considered an outlier and labelled -1, because it is not in a cluster.
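One way to pick a larger eps, sketched below, is to look at the distance from each point to its min_samples-th nearest neighbour and start from a value in that range. This k-distance heuristic is not part of the answer above; the synthetic blobs and the 90th-percentile cut-off are assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.neighbors import NearestNeighbors

# Hypothetical stand-in for the word vectors: 3 blobs in 50 dimensions.
X_demo, _ = make_blobs(n_samples=300, centers=3, n_features=50, random_state=0)

min_samples = 10

# Distance from each point to its min_samples-th nearest neighbour.
# Querying the training data returns the point itself first (distance 0),
# which lines up with DBSCAN's min_samples, which also counts the point itself.
nn = NearestNeighbors(n_neighbors=min_samples).fit(X_demo)
distances, _ = nn.kneighbors(X_demo)
k_dist = np.sort(distances[:, -1])

# Assumption: use the 90th percentile of the k-distances as a starting eps.
eps_guess = float(np.percentile(k_dist, 90))

labels_tiny = DBSCAN(eps=0.001, min_samples=min_samples).fit(X_demo).labels_
labels_guess = DBSCAN(eps=eps_guess, min_samples=min_samples).fit(X_demo).labels_

print("eps_guess:", eps_guess)
print("noise with eps=0.001:", int(np.sum(labels_tiny == -1)))
print("noise with eps_guess:", int(np.sum(labels_guess == -1)))
```

On real word vectors the resulting eps is only a starting point; plotting k_dist and looking for a bend in the curve, then adjusting by hand, is usually still needed.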

Tags:

python

cluster-analysis

scikit-learn

word2vec

dbscan

Jing