Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding shortest possible substring that contains a String

This was a question asked in a recent programming interview.

Given a random string S and another string T with unique elements, find the minimum consecutive sub-string of S such that it contains all the elements in T. Say,

S='adobecodebanc' 
T='abc' 
Answer='banc'

I've come up with a solution,

public static String completeSubstring(String T, String S){

        String minSub = T;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i <T.length()-1; i++) {
            for (int j = i + 1; j <= T.length() ; j++) {
                String sub = T.substring(i,j);
                if(stringContains(sub, S)){
                    if(sub.length() < minSub.length()) minSub = sub;
                }
            }
        }
        return minSub;

    }
    private static boolean stringContains(String t, String s){
        //if(t.length() <= s.length()) return false;

        int[] arr = new int[256];

        for (int i = 0; i <t.length() ; i++) {
            char c = t.charAt(i);
            arr[c -'a'] = 1;
        }
        boolean found = true;
        for (int i = 0; i <s.length() ; i++) {
            char c = s.charAt(i);
            if(arr[c - 'a'] != 1){
                found = false;
                break;
            }else continue;
        }
        return found;
    }

This algorithm has a O(n3) complexity, which but naturally isn't great. Can someone suggest a better algorithm.

like image 467
Zeus Avatar asked Jan 06 '23 09:01

Zeus


2 Answers

Here's the O(N) solution.

The important thing to note re: complexity is that each unit of work involves incrementing either start or end, they don't decrease, and the algorithm stops before they both get to the end.

public static String findSubString(String s, String t)
{
    //algorithm moves a sliding "current substring" through s
    //in this map, we keep track of the number of occurrences of
    //each target character there are in the current substring

    Map<Character,int[]> counts = new HashMap<>();
    for (char c : t.toCharArray())
    {
        counts.put(c,new int[1]);
    }

    //how many target characters are missing from the current substring
    //current substring is initially empty, so all of them
    int missing = counts.size();

    //don't waste my time
    if (missing<1)
    {
        return "";
    }

    //best substring found
    int bestStart = -1, bestEnd = -1;

    //current substring
    int start=0, end=0;
    while (end<s.length())
    {
        //expand the current substring at the end
        int[] cnt = counts.get(s.charAt(end++));
        if (cnt!=null)
        {
            if (cnt[0]==0)
            {
                --missing;
            }
            cnt[0]+=1;
        }
        //while the current substring is valid, remove characters
        //at the start to see if a shorter substring that ends at the
        //same place is also valid 
        while(start<end && missing<=0)
        {
            //current substring is valid
            if (end-start < bestEnd-bestStart || bestEnd<0)
            {
                bestStart = start;
                bestEnd = end;
            }
            cnt = counts.get(s.charAt(start++));
            if (cnt != null)
            {
                cnt[0]-=1;
                if (cnt[0]==0)
                {
                    ++missing;
                }
            }
        }
        //current substring is no longer valid.  we'll add characters
        //at the end until we get another valid one
        //note that we don't need to add back any start character that
        //we just removed, since we already tried the shortest valid string
        //that starts at start-1

    }
    return(bestStart<=bestEnd ? s.substring(bestStart,bestEnd) : null);
}
like image 109
Matt Timmermans Avatar answered Feb 04 '23 16:02

Matt Timmermans


I know that there already is an adequate O(N) complexity answer, but I tried to figure it out on my own without looking it up, just because it's a fun problem to solve and thought I would share. Here's the O(N) solution that I came up with:

public static String completeSubstring(String S, String T){
    int min = S.length()+1, index1 = -1, index2 = -1;
    ArrayList<ArrayList<Integer>> index = new ArrayList<ArrayList<Integer>>(); 
    HashSet<Character> targetChars = new HashSet<Character>();
    for(char c : T.toCharArray()) targetChars.add(c);

    //reduce initial sequence to only target chars and keep track of index
    //Note that the resultant string does not allow the same char to be consecutive

    StringBuilder filterS = new StringBuilder();
    for(int i = 0, s = 0 ; i < S.length() ; i++) {
        char c = S.charAt(i);
        if(targetChars.contains(c)) {
            if(s > 0 && filterS.charAt(s-1) == c) {
                index.get(s-1).add(i);
            } else {
                filterS.append(c);
                index.add(new ArrayList<Integer>());
                index.get(s).add(i);
                s++;
            }
        }
    }

    //Not necessary to use regex, loops are fine, but for readability sake
    String regex = "([abc])((?!\\1)[abc])((?!\\1)(?!\\2)[abc])";
    Matcher m = Pattern.compile(regex).matcher(filterS.toString());

    for(int i = 0, start = -1, p1, p2, tempMin, charSize = targetChars.size() ; m.find(i) ; i = start+1) {
        start = m.start();
        ArrayList<Integer> first = index.get(start);
        p1 = first.get(first.size()-1);
        p2 = index.get(start+charSize-1).get(0);
        tempMin = p2-p1;

        if(tempMin < min) {
            min = tempMin;
            index1 = p1;
            index2 = p2;
        }
    }

    return S.substring(index1, index2+1);   
}

I'm pretty sure the complexity is O(N), please correct if I'm wrong

like image 45
Maljam Avatar answered Feb 04 '23 15:02

Maljam