
interviewstreet.com - String similarity

Tags:

java

I'm trying to solve the string similarity question on interviewstreet.com. My code passes 7 of the 10 test cases and exceeds the time limit on the other 3.

Here's my code -

import java.util.Scanner;

public class Solution {

    public static void main(String[] args) {

        Scanner user_input = new Scanner(System.in);

        String v1 = user_input.next();
        int number_cases = Integer.parseInt(v1);

        String[] cases = new String[number_cases];
        for(int i=0;i<number_cases;i++)
            cases[i] = user_input.next();

        for(int k=0;k<number_cases;k++){
            int similarity = solve(cases[k]);   
            System.out.println(similarity);
        }
    }

    static int solve(String sample){

        int len=sample.length();
        int sim=0;
        for(int i=0;i<len;i++){
            for(int j=i;j<len;j++){
                if(sample.charAt(j-i)==sample.charAt(j))
                    sim++;
                else
                    break;
            }
        }
        return sim;
    }
}

Here's the question -

For two strings A and B, we define the similarity of the strings to be the length of the longest prefix common to both strings. For example, the similarity of strings "abc" and "abd" is 2, while the similarity of strings "aaa" and "aaab" is 3.

Calculate the sum of similarities of a string S with each of its suffixes.

Input:
The first line contains the number of test cases T. Each of the next T lines contains a string each.

Output:
Output T lines containing the answer for the corresponding test case.

Constraints:
1 <= T <= 10
The length of each string is at most 100000 and contains only lower case characters.

Sample Input:
2
ababaa
aa

Sample Output:
11
3

Explanation:
For the first case, the suffixes of the string are "ababaa", "babaa", "abaa", "baa", "aa" and "a". The similarities of each of these strings with the string "ababaa" are 6,0,3,0,1,1 respectively. Thus the answer is 6 + 0 + 3 + 0 + 1 + 1 = 11.

For the second case, the answer is 2 + 1 = 3.
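
To make the definition concrete, here is a minimal brute-force sketch that reproduces the per-suffix similarities from the explanation above. The class name SimilarityExample and its method names are hypothetical, chosen only for illustration. It is meant to check the sample by hand, not to meet the time limit.

```java
import java.util.Arrays;

class SimilarityExample {

    // Length of the longest common prefix of s and its suffix starting at index start.
    static int similarityWithSuffix(String s, int start) {
        int k = 0;
        while (start + k < s.length() && s.charAt(k) == s.charAt(start + k)) {
            k++;
        }
        return k;
    }

    // Similarity of s with each of its suffixes, ordered by suffix start index.
    static int[] similarities(String s) {
        int[] result = new int[s.length()];
        for (int i = 0; i < s.length(); i++) {
            result[i] = similarityWithSuffix(s, i);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] sims = similarities("ababaa");
        System.out.println(Arrays.toString(sims));      // [6, 0, 3, 0, 1, 1]
        System.out.println(Arrays.stream(sims).sum());  // 11
    }
}
```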

How can I improve the running speed of the code? It is harder to diagnose because the website does not publish the test cases it uses.

Ashish Agarwal asked Jul 13 '12 22:07


2 Answers

I used a char[] instead of calling charAt on the String. That reduced the running time from 5.3 seconds to 4.7 seconds, and it passed the remaining test cases. Here's the code -

static int solve(String sample){
    int len=sample.length();
    char[] letters = sample.toCharArray();
    int sim=0;
    for(int i=0;i<len;i++){
        for(int j=i;j<len;j++){
            if(letters[j-i]==letters[j])
                sim++;
            else
                break;
        }
    }
    return sim;
}
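
A note that is not part of the original answer: the char[] version still compares characters in a nested loop, so a string of 100000 identical characters costs roughly n^2/2 comparisons. One linear-time alternative is the Z-algorithm, which computes for every index the length of the longest common prefix of the string and the suffix starting there. The sketch below is a minimal version under that approach. The names ZSimilarity and similaritySum are hypothetical, and the total is kept in a long because the sum for 100000 identical characters is 5,000,050,000, which does not fit in an int.

```java
class ZSimilarity {

    // Sum of similarities of s with all of its suffixes, using the Z-array.
    // z[i] is the length of the longest common prefix of s and s.substring(i).
    static long similaritySum(String s) {
        char[] c = s.toCharArray();
        int n = c.length;
        if (n == 0) {
            return 0;
        }
        int[] z = new int[n];
        long total = n;           // the suffix starting at index 0 is s itself
        int left = 0, right = 0;  // [left, right) is the rightmost window known to match a prefix
        for (int i = 1; i < n; i++) {
            if (i < right) {
                // Reuse the value computed for the mirrored position inside the window.
                z[i] = Math.min(right - i, z[i - left]);
            }
            // Extend the match by direct comparison past what is already known.
            while (i + z[i] < n && c[z[i]] == c[i + z[i]]) {
                z[i]++;
            }
            if (i + z[i] > right) {
                left = i;
                right = i + z[i];
            }
            total += z[i];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(similaritySum("ababaa")); // 11
        System.out.println(similaritySum("aa"));     // 3
    }
}
```

Each character is compared successfully at most once while the window's right edge advances, which is why the total work stays linear in the string length.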
Ashish Agarwal answered Nov 15 '22 11:11


I used a slightly different approach. Run a loop n times, where n is the length of the main string. On iteration i, take the suffix starting at index i and compare it character by character with the original string. At the first mismatch, break out of the inner loop and add j (the number of characters matched) to the counter c.

import java.io.BufferedReader;
import java.io.InputStreamReader;

class Solution {

    public static void main(String args[]) throws Exception {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        int T = Integer.parseInt(in.readLine());
        for (int i = 0; i < T; i++) {
            String line = in.readLine();
            System.out.println(count(line));
        }
    }

    // Sums, over every suffix start i, the length of the common prefix
    // of the suffix and the whole string.
    private static long count(String input) {
        long c = 0; // long: the sum can exceed the int range for long repetitive inputs
        int j;
        char[] array = input.toCharArray();
        int n = array.length;
        for (int i = 0; i < n; i++) {
            // j counts matched characters; stop at the first mismatch or the end of the suffix
            for (j = 0; i + j < n; j++)
                if (array[i + j] != array[j])
                    break;
            c += j;
        }
        return c;
    }
}
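
A side note that is not part of the original answer: the size of the result is worth checking against the constraints. For a string of n identical characters, the suffix starting at index i matches in all n - i positions, so the sum is n(n+1)/2. The sketch below, with the hypothetical names WorstCaseSum and worstCaseSum, works through that arithmetic for n = 100000 and suggests the running total is safer held in a long.

```java
class WorstCaseSum {

    // Sum of similarities for a string of n identical characters: n(n+1)/2.
    // The cast to long happens before the multiplication to avoid int overflow.
    static long worstCaseSum(int n) {
        return (long) n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        long sum = worstCaseSum(100000);
        System.out.println(sum);                      // 5000050000
        System.out.println(sum > Integer.MAX_VALUE);  // true
    }
}
```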
LOGAN answered Nov 15 '22 09:11