How do I make inexact string comparisons with Perl?

Question

Given two strings, I want to find all common substrings of a specified length, but allowing one character to be different.

For example, if s1 is 'ATCAGC', s2 is 'ATAATCGAC', and the specified length is 3, then I'd want output along these lines:

ATC from s1 matches ATA, ATC from s2
TCA from s1 matches TAA, TCG from s2

Questions

Can I do so with a simple regex?
If not, is there a module for this in Perl?

Jeff Burdges · Accepted Answer

First, a Google search for "perl hamming distance" turns up a PerlMonks thread that mentions Text::LevenshteinXS, various typical implementations, and a cute XOR trick:

# Hamming distance of two equal-length strings: XOR gives \0 wherever they agree, and tr counts those.
sub hd { length( $_[0] ) - ( ( $_[0] ^ $_[1] ) =~ tr[\0][\0] ) }

You should skim the Wikipedia article on string metrics if Levenshtein distance or Hamming distance isn't familiar.
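To tie this back to the question, here is a minimal sketch of how the Hamming distance function could be applied to every pair of fixed-length windows from the two strings. The helper names windows and near_matches are made up for illustration and are not from the PerlMonks thread. It assumes plain single-byte strings and that the 'bitwise' feature is not enabled, so ^ acts as a string XOR.

```perl
use strict;
use warnings;

# Hamming distance via the XOR trick: XOR-ing two strings yields a NUL
# byte at every position where they agree, and tr counts those bytes.
# Assumes both arguments have the same length.
sub hd {
    my ($x, $y) = @_;
    return length($x) - (($x ^ $y) =~ tr[\0][\0]);
}

# Hypothetical helper: all substrings of length $len, in order of position.
# Returns an empty list when $str is shorter than $len.
sub windows {
    my ($str, $len) = @_;
    return map { substr($str, $_, $len) } 0 .. length($str) - $len;
}

# Hypothetical helper: for each window of $s1, collect the windows of $s2
# that differ from it in at most $max_diff positions. Each result row is
# [ window_from_s1, matching_windows_from_s2... ].
sub near_matches {
    my ($s1, $s2, $len, $max_diff) = @_;
    my @w2 = windows($s2, $len);
    my @out;
    for my $w1 (windows($s1, $len)) {
        my @hits = grep { hd($w1, $_) <= $max_diff } @w2;
        push @out, [$w1, @hits] if @hits;
    }
    return @out;
}

for my $row (near_matches('ATCAGC', 'ATAATCGAC', 3, 1)) {
    my ($w1, @hits) = @$row;
    print "$w1 from s1 matches ", join(', ', @hits), " from s2\n";
}
```

On the strings from the question this prints the two lines shown there, plus a third line for AGC, which is one substitution away from ATC in s2. The pairwise comparison does work proportional to the product of the two string lengths, which is fine for short inputs but may be too slow for long sequences.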

Tags:

string

regex

pattern-matching

match

perl

Mariya
