I am looking for the differences between Dynamic Time Warping and the Needleman-Wunsch algorithm. Both of them compute an alignment score. I need to calculate an alignment (similarity) score between short strings (fewer than 20 characters each), and there are a couple of thousand of them. I haven't been able to work out how the two algorithms differ, or which one to choose for my work. Can anyone please explain the differences?

What are the differences between Dynamic Time Warping and Needleman-Wunsch algorithm?

2 Answers

Both of these algorithms use dynamic programming to determine an alignment of sequential data. The major difference here is how the score for i,j is determined.

In Dynamic Time Warping, a cost (determined by a function of the elements at positions i and j) is added to the minimum of the values at (i-1, j), (i-1, j-1), (i, j-1).
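As a rough illustration of that recurrence, here is a minimal Python sketch. The function name `dtw_distance` and the default cost (absolute difference, which only makes sense for numeric data) are my own assumptions, not part of any library.

```python
def dtw_distance(a, b, cost=lambda x, y: abs(x - y)):
    """Return the DTW distance between sequences a and b (illustrative sketch)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # d[i][j] holds the cheapest warping of a[:i] onto b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost plus the minimum of the three neighbouring cells.
            d[i][j] = cost(a[i - 1], b[j - 1]) + min(
                d[i - 1][j],      # b[j-1] is reused for another element of a
                d[i - 1][j - 1],  # one-to-one step
                d[i][j - 1],      # a[i-1] is reused for another element of b
            )
    return d[n][m]


# The second series is the first with its middle value held for one extra step.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Note that there is no gap term anywhere: every element has to be paired with something, and the only thing you can tune is the cost function.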

In NW, the maximum of the set (i-1, j) + weight, (i-1, j-1) + S(Ai, Bj), (i, j-1) + weight is taken, where weight is the gap penalty and S(Ai, Bj) is determined by a lookup in the similarity matrix.
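The same kind of sketch for the NW recurrence. The names `nw_score` and `match_mismatch`, the gap weight of -1, and the +1/-1 scoring are assumptions chosen for illustration. For real data you would replace `match_mismatch` with a lookup in your own similarity matrix.

```python
def nw_score(a, b, similarity, gap=-1):
    """Return the Needleman-Wunsch global alignment score (illustrative sketch)."""
    n, m = len(a), len(b)
    f = [[0] * (m + 1) for _ in range(n + 1)]
    # Aligning a prefix against nothing costs one gap per element.
    for i in range(1, n + 1):
        f[i][0] = i * gap
    for j in range(1, m + 1):
        f[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            f[i][j] = max(
                f[i - 1][j] + gap,                                # gap in b
                f[i - 1][j - 1] + similarity(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                f[i][j - 1] + gap,                                # gap in a
            )
    return f[n][m]


def match_mismatch(x, y):
    # Hypothetical stand-in for a similarity matrix: +1 for a match, -1 otherwise.
    return 1 if x == y else -1


print(nw_score("wow", "wow", match_mismatch))
print(nw_score("wow", "wow!!!", match_mismatch))
```

With these settings the trailing "!!!" is paid for with three gap penalties, independent of which characters are being skipped.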

If you want to align sequences drawn from an enumerable space for which you can create a similarity matrix (such as protein sequences or words), use NW. If you are aligning data for which you can't make a similarity matrix (like a time series) and need to use a cost function instead, go with DTW.

Alignments can be a tricky thing, and you may have to tweak parameters to get things right.

answered Nov 11 '22 04:11

wespiserA

The fundamental difference between Dynamic Time Warping (DTW) and the Needleman-Wunsch algorithm (NW) is in the way the sequence elements are accounted for in the alignment.

A basic assumption of DTW is that one sequence is a "time-warped" version of the other, in the sense that the target sequence is either stretched (one-to-many alignment), condensed (many-to-one alignment), or non-warped (one-to-one alignment) with respect to the source sequence.

Thus, DTW is not compatible with the notion of gaps, where one or more elements in one sequence are not matched by any elements in the other sequence (one-to-none or none-to-one alignment). By contrast, NW accounts for gaps explicitly with a penalty that is not a function of the elements to be inserted/deleted.

If you need to align character sequences, DTW is only appropriate in the unlikely case that the sequences are strictly "time warped" versions of each other, such as "wow" and "wwooowww". As soon as one sequence contains elements that cannot be construed as the result of stretching the other sequence, such as the exclamation marks in "wow" vs "wwooowww!!!", DTW is not appropriate, since it forces you to define the cost of inserting a "!" in terms of the distance with respect to a "w" or an "o".
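To make the "wow" example concrete, here is a small Python sketch that runs DTW on characters and also recovers the warping path. The function name `dtw_align` and the 0/1 character cost are my own assumptions for illustration.

```python
def dtw_align(a, b, cost):
    """Return (total cost, warping path) for sequences a and b (illustrative sketch)."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = cost(a[i - 1], b[j - 1]) + min(
                d[i - 1][j], d[i - 1][j - 1], d[i][j - 1]
            )
    # Walk back from the last cell, always moving to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((a[i - 1], b[j - 1]))
        _, i, j = min(
            (d[i - 1][j - 1], i - 1, j - 1),
            (d[i - 1][j], i - 1, j),
            (d[i][j - 1], i, j - 1),
        )
    path.reverse()
    return d[n][m], path


def char_cost(x, y):
    # Assumed cost: 0 for identical characters, 1 otherwise.
    return 0 if x == y else 1


for target in ("wwooowww", "wwooowww!!!"):
    total, path = dtw_align("wow", target, char_cost)
    print(target, total, path)
```

For "wwooowww" the total cost is 0, because the target really is a stretched "wow". For "wwooowww!!!" the total is 3, and the path shows each "!" paired with a character of "wow": there is no way to leave it unmatched.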

answered Nov 11 '22 04:11

maarten

Tags:

alignment

time-series

data-mining

bioinformatics

iinception
