 

Why is t-SNE's learning rate so large? [closed]

In most other machine learning settings, the learning rate is a very small value, such as 0.01. Why does t-SNE use a very large value, between 10 and 1000? I don't understand it even after looking at the formula, so I'm asking here. Thank you.

Insung Lee asked Oct 26 '25

1 Answer

Scikit-Learn provides this explanation:

The learning rate for t-SNE is usually in the range [10.0, 1000.0]. If the learning rate is too high, the data may look like a ‘ball’ with any point approximately equidistant from its nearest neighbours. If the learning rate is too low, most points may look compressed in a dense cloud with few outliers. If the cost function gets stuck in a bad local minimum increasing the learning rate may help.
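As a minimal sketch of how that parameter is set in practice (toy random data, values chosen only for illustration), scikit-learn's `TSNE` exposes it directly as `learning_rate`; recent versions also accept `learning_rate="auto"`:

```python
import numpy as np
from sklearn.manifold import TSNE

X = np.random.RandomState(0).rand(20, 5)   # 20 toy samples, 5 features

tsne = TSNE(
    n_components=2,
    perplexity=5,          # must be smaller than the number of samples
    learning_rate=200.0,   # typical range [10, 1000]; too high -> "ball",
                           # too low -> dense compressed cloud
    init="random",
    random_state=0,
)
embedding = tsne.fit_transform(X)
print(embedding.shape)     # (20, 2)
```

If the embedding looks like the degenerate "ball" or "dense cloud" described above, adjusting `learning_rate` within that range is the first thing to try.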

The t-SNE algorithm begins by computing similarity probabilities between points in the high-dimensional space and between the corresponding points in the low-dimensional space. The similarity of points X and Y is the conditional probability that point X would choose point Y as its neighbour if neighbours were picked in proportion to their probability density under a normal distribution centred at X. The objective of the algorithm is to minimize the difference between these conditional probabilities in the high- and low-dimensional spaces, yielding the best representation of the data points in the low-dimensional space.
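The conditional probabilities described above can be sketched in NumPy (with the Gaussian bandwidth fixed to 1 for simplicity; real t-SNE tunes a per-point sigma to match a target perplexity):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))   # 6 toy points in a 3-D "high-dimensional" space

# p_{j|i}: probability that point i would pick point j as its neighbour,
# proportional to a Gaussian density centred at X[i].
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
affinities = np.exp(-sq_dists / 2.0)
np.fill_diagonal(affinities, 0.0)   # a point is not its own neighbour
P_cond = affinities / affinities.sum(axis=1, keepdims=True)

# Each row is a probability distribution over the other points.
print(np.allclose(P_cond.sum(axis=1), 1.0))  # True
```

t-SNE builds the same kind of distribution in the low-dimensional space (using a Student-t kernel there) and minimizes the KL divergence between the two.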

A small learning rate can make the gradient steps so small that the cost function gets stuck in an undesirable local minimum.
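A rough back-of-the-envelope illustration of why the usual 0.01 is too small here (the numbers are only order-of-magnitude assumptions, not t-SNE's actual gradients): the KL gradient is driven by differences of probabilities, p_ij - q_ij, and with n points both distributions sum to 1, so a typical entry is on the order of 1/n²:

```python
n = 1000
typical_grad = 1.0 / n**2            # rough scale of a p_ij - q_ij term

step_small = 0.01 * typical_grad     # conventional learning rate: ~1e-8
step_tsne = 200.0 * typical_grad     # t-SNE-scale learning rate: ~2e-4

print(step_small)   # points barely move per iteration
print(step_tsne)    # meaningful movement per iteration
```

With steps of order 1e-8 the embedding effectively freezes near its initialization, which is why learning rates in the hundreds are the norm for t-SNE.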

Read:

Kobak, Dmitry, and Philipp Berens. “The art of using t-SNE for single-cell transcriptomics.” bioRxiv (2018): 453449.

Prayson W. Daniel answered Oct 28 '25


