Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C# code or algorithm to quickly calculate distance between large strings? [closed]

Hi and thanks for looking!

Background

I have an XML file that contains 1900 nodes which themselves contain a string of encoded data of about 3400 characters.

As part of a use case for an application I am developing, I need to be able to take a "benchmark" string at runtime, and find the closest match from the XML file.

Please note that XML is not germane to the app and that I may go with SQL moving forward, but for today, I just needed an easy place to store the data and prove the concept.

I am using .NET 4.0, C#, forms app, LINQ, etc.

Question

How do I find the closest match? Hamming? Levenshtein? There are plenty of code samples online, but most are geared towards small string comparisons ("ant" vs. "aunt") or exact match. I will rarely have exact matches; I just need closest match.

Thanks in advance!

Matt

like image 254
Matt Cashatt Avatar asked May 15 '26 04:05

Matt Cashatt


1 Answers

You mentioned using Levenhstein's Edit Distance and that your strings were about 3400 characters long.

I did a quick try and using the dynamic programming version of Levenhstein's Edit Distance it seems to be quite fast and cause no issue.

I did this:

        final StringBuilder sb1 = new StringBuilder();
        final StringBuilder sb2 = new StringBuilder();
        final Random r = new Random(42);
        final int n = 3400;
        for (int i = 0; i < n; i++) {
            sb1.append( (char) ('a' + r.nextInt(26)) );
            sb2.append( (char) ('a' + r.nextInt(26)) );
        }
        final long t0 = System.currentTimeMillis();
        System.out.println("LED: " + getLevenshteinDistance(sb1.toString(), sb2.toString()) );
        final long te = System.currentTimeMillis() - t0;
        System.out.println("Took: " + te + " ms");

And it's finding the distance in 215 ms on a Core 2 Duo from 2006 or so.

Would that work for you?

(btw I'm not sure I can paste the code for the DP LED implementation I've got here so you probably should search the Internet for one Java implementation)

like image 59
TacticalCoder Avatar answered May 17 '26 18:05

TacticalCoder



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!