Algorithm help needed

Question

I have a set of sequences (e.g. 10000 sequences) and have generated a 10000x10000 matrix holding the pairwise similarity between every two sequences.

Now the goal is to retrieve a subset (for example 1000 sequences) from the large set such that the pairwise similarity between every two sequences in this subset falls within a given range (e.g. 50%~85%).

Is there any fast algorithm to do that?

Peter Popov · Accepted Answer

You can transform this into a graph theory problem:

Each sequence is a node.
If the similarity of two nodes is in the given range, then there is an edge between them.
Your goal is to find the largest connected component (if your similarity relation is transitive, since every connected component is then already a clique) or the largest clique (if it is not).
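
The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the answerer's own code: the function names `build_adjacency` and `greedy_clique`, the toy 5x5 matrix, and the 0.50 to 0.85 thresholds are all made up for the example. Finding a maximum clique is NP-hard in general, so the sketch swaps in a simple greedy heuristic that returns a valid clique quickly but does not guarantee the largest one.

```python
def build_adjacency(sim, low, high):
    """Link i and j when low <= sim[i][j] <= high (hypothetical helper)."""
    n = len(sim)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if low <= sim[i][j] <= high:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def greedy_clique(adj, target_size=None):
    """Grow a clique greedily. Heuristic only: the result is always a
    clique, but not necessarily a maximum one."""
    if not adj:
        return []
    # Start from a node with the most in-range neighbours.
    start = max(range(len(adj)), key=lambda v: len(adj[v]))
    clique = [start]
    candidates = set(adj[start])  # nodes adjacent to every clique member
    while candidates and (target_size is None or len(clique) < target_size):
        # Prefer the candidate that keeps the most other candidates alive;
        # break ties by the lowest index so the result is deterministic.
        best = max(candidates, key=lambda v: (len(adj[v] & candidates), -v))
        clique.append(best)
        candidates &= adj[best]
    return clique


# Toy similarity matrix (made-up values, symmetric, 1.0 on the diagonal).
sim = [
    [1.00, 0.60, 0.70, 0.95, 0.10],
    [0.60, 1.00, 0.80, 0.55, 0.20],
    [0.70, 0.80, 1.00, 0.65, 0.30],
    [0.95, 0.55, 0.65, 1.00, 0.40],
    [0.10, 0.20, 0.30, 0.40, 1.00],
]
adj = build_adjacency(sim, 0.50, 0.85)
clique = greedy_clique(adj)
print(sorted(clique))
```

For the original problem you would pass `target_size=1000` and stop as soon as a subset of that size is found. If the heuristic falls short, restarting from different seed nodes or switching to an exact maximum clique solver are the usual next steps, at a higher running cost.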

Tags:

algorithm

Mavershang
