
Perl: Text::Fuzzy, different strings giving the same edit distance?

Tags:

perl

I am checking the edit distance between $barcode and two strings. The first string starts with the same 12 characters as the barcode, while the second starts with a completely different sequence, yet both give the same distance. Why?

#!/usr/bin/perl
use warnings;
use strict;
use Text::Fuzzy;
my $barcode =  "TCCCTTGTCTCC";

foreach my $line1 (<DATA>) {
    print "New string\n";
    print "Barcode length:", length $barcode, "\nSequence length:",
    length $line1, "\n";
    my $tf = Text::Fuzzy->new($barcode);
    my $ed = $tf->distance($line1);
    print "Edit distance: ", $ed ,"\n\n";
}

__DATA__
TCCCTTGTCTCCCCTGATATCCTGTAAAATCCTTTTCTTCTGATGGGTGCCATTTGCCACTAGAGGAAGCTGAACAGACCTGACTACCTGGA
GACGAGACTGATCACCTGATATCCTGTAAAATCCTTTTCTTCTGATGGGTGCCATTTGCCACTAGAGGAAGCTGCAGACCTGACTACCTGGA

Outputs:

New string
Barcode length:12
Sequence length:93
Edit distance: 81

New string
Barcode length:12
Sequence length:93
Edit distance: 81

asked Jun 16 '26 21:06 by SSh


1 Answer

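That seems right. The Levenshtein edit distance between a short string and a longer one can never be less than the difference in their lengths, and it equals that difference exactly when every character of the short string appears, in order, somewhere in the longer one (that is, when the short string is a subsequence of it). In that case deletions alone are enough to transform the longer sequence into the shorter one. Your barcode is a subsequence of both lines, so both give 93 - 12 = 81, regardless of where in the line the matching characters sit.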

Example:

artic => arc: edit distance 2 (two deletions)
arche => arc: the same edit distance, 2 (two deletions)
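
To make the point concrete, here is a minimal sketch in plain Perl rather than Text::Fuzzy. The helpers `levenshtein` and `is_subsequence` are hypothetical names written for this illustration, and the two reads are shortened versions of the lines in the question. It shows that the distance collapses to the length difference whenever the shorter string is a subsequence of the longer one. It also shows one possible workaround, assuming the goal is to test whether a read starts with the barcode: compare the barcode against a prefix of the same length instead of the whole line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(min);

# Plain Levenshtein distance: insertions, deletions and substitutions
# each cost 1. Computed row by row with dynamic programming.
sub levenshtein {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;
    my @prev = (0 .. scalar @t);    # distances from the empty prefix of $s
    for my $i (1 .. @s) {
        my @cur = ($i);
        for my $j (1 .. @t) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            $cur[$j] = min(
                $prev[$j] + 1,             # deletion
                $cur[$j - 1] + 1,          # insertion
                $prev[$j - 1] + $cost,     # match or substitution
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# True if every character of $short occurs in $long in the same order.
sub is_subsequence {
    my ($short, $long) = @_;
    my $pos = 0;
    for my $ch (split //, $short) {
        my $found = index($long, $ch, $pos);
        return 0 if $found < 0;
        $pos = $found + 1;
    }
    return 1;
}

# Distance versus length difference for a few small pairs.
for my $pair (['arc', 'artic'], ['arc', 'arche'], ['kitten', 'sitting']) {
    my ($short, $long) = @$pair;
    printf "%-7s %-8s distance=%d length_diff=%d subsequence=%s\n",
        $short, $long,
        levenshtein($short, $long),
        length($long) - length($short),
        is_subsequence($short, $long) ? 'yes' : 'no';
}

# Possible workaround (an assumption about the intent): compare the
# barcode only with the first length($barcode) characters of each read.
my $barcode = 'TCCCTTGTCTCC';
for my $read ('TCCCTTGTCTCCCCTGATATCC', 'GACGAGACTGATCACCTGATATCC') {
    my $prefix = substr($read, 0, length $barcode);
    printf "prefix %s distance=%d\n", $prefix, levenshtein($barcode, $prefix);
}
```

With the prefix comparison, the read that begins with the barcode scores 0 and the other one scores higher, so the two cases become distinguishable. Also note that the script in the question does not chomp its input, so the reported length of 93 includes the trailing newline.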

answered Jun 22 '26 21:06 by keety


