Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java characters alignment algorithm

Tags:

java

algorithm

I have two arrays of 100 characters(maximum, could be less or not the same size) that I want to align. I want to add an "-" when there is a character different than the other. I found the Needleman–Wunsch algorithm, which is based on dynamic programming, and the Smith–Waterman algorithm which is a general local alignment method also based on dynamic programming but they seems too complex for what I want to do. I just need a simple algorithm in Java perhaps about less than 50 lines, this code will be translated to assembly language after, so that why I need a simple algorithm.

Is there a way do this kind of alignment with a diff algorithm ? If yes can someone point me how to do this ? I searched on the biostar section, but it seems pretty much that I need to use the two algorithm I mentioned.

English is not my native language, so perhaps I searched the wrong keywords.

My program already works with the Needleman algorithm and its about 200 (ish) lines of code.

Example of desired input/output:

Input
Array 1 : MKNLASREVNIYVNGKLV
Array 2 : QMASREVNIYVNGKL


Output
Array 1 (or a simple print) : -MKNLASREVNIYVNGKLV 
Array 2 (or a simple print) : QM---ASREVNIYVNGKL-
like image 581
metraon Avatar asked Feb 23 '13 16:02

metraon


1 Answers

Using a variation of Levenshtein distance that does exactly what you want:

Output

-MKNLASREVNIYVNGKLV
QM---ASREVNIYVNGKL-

Code:

public class Main {
    public static void main(String[] args) {
        String[] aligned = align("MKNLASREVNIYVNGKLV", "QMASREVNIYVNGKL");
        System.out.println(aligned[0]);
        System.out.println(aligned[1]);
    }

    public static String[] align(String a, String b) {
        int[][] T = new int[a.length() + 1][b.length() + 1];

        for (int i = 0; i <= a.length(); i++)
            T[i][0] = i;

        for (int i = 0; i <= b.length(); i++)
            T[0][i] = i;

        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1))
                    T[i][j] = T[i - 1][j - 1];
                else
                    T[i][j] = Math.min(T[i - 1][j], T[i][j - 1]) + 1;
            }
        }

        StringBuilder aa = new StringBuilder(), bb = new StringBuilder();

        for (int i = a.length(), j = b.length(); i > 0 || j > 0; ) {
            if (i > 0 && T[i][j] == T[i - 1][j] + 1) {
                aa.append(a.charAt(--i));
                bb.append("-");
            } else if (j > 0 && T[i][j] == T[i][j - 1] + 1) {
                bb.append(b.charAt(--j));
                aa.append("-");
            } else if (i > 0 && j > 0 && T[i][j] == T[i - 1][j - 1]) {
                aa.append(a.charAt(--i));
                bb.append(b.charAt(--j));
            }
        }

        return new String[]{aa.reverse().toString(), bb.reverse().toString()};
    }
}
like image 174
Juan Lopes Avatar answered Oct 22 '22 17:10

Juan Lopes