Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find longest substring without repeating characters

People also ask

How do you find the longest string without repeating characters?

Run a loop from i = 0 till N – 1 and consider a visited array. Run a nested loop from j = i + 1 to N – 1 and check whether the current character S[j] has already been visited. If true, break from the loop and consider the next window. Else, mark the current character as visited and maximize the length as j – i + 1.

How do I find the longest substring in Python?

Use a While Loop to Get the Longest Substring in Python We will go through the while loop until we have found the longest substring from the given string. As you can see from the above example, the longest substring possible has a length of 12, the same as the ABCGHIJKLMNO substring from the original string.

How many different substring exist in it that have no repeating characters?

Explanation: There are 4 unique substrings.


  1. You are going to need a start and an end locator(/pointer) for the string and an array where you store information for each character: did it occour at least once?

  2. Start at the beginning of the string, both locators point to the start of the string.

  3. Move the end locator to the right till you find a repetition (or reach the end of the string). For each processed character, store it in the array. When stopped store the position if this is the largest substring. Also remember the repeated character.

  4. Now do the same thing with the start locator, when processing each character, remove its flags from the array. Move the locator till you find the earlier occurrence of the repeated character.

  5. Go back to step 3 if you haven't reached the end of string.

Overall: O(N)


import java.util.HashSet;

public class SubString {
    public static String subString(String input){

        HashSet<Character> set = new HashSet<Character>();

        String longestOverAll = "";
        String longestTillNow = "";

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);

            if (set.contains(c)) {
                longestTillNow = "";
                set.clear();
            }
            longestTillNow += c;
            set.add(c);
            if (longestTillNow.length() > longestOverAll.length()) {
                longestOverAll = longestTillNow;
            }
        }

        return longestOverAll;
    }

    public static void main(String[] args) {
        String input = "substringfindout";
        System.out.println(subString(input));
    }
}

You keep an array indicating the position at which a certain character occurred last. For convenience all characters occurred at position -1. You iterate on the string keeping a window, if a character is repeated in that window, you chop off the prefix that ends with the first occurrence of this character. Throughout, you maintain the longest length. Here's a python implementation:

def longest_unique_substr(S):
  # This should be replaced by an array (size = alphabet size).
  last_occurrence = {} 
  longest_len_so_far = 0
  longest_pos_so_far = 0
  curr_starting_pos = 0
  curr_length = 0

  for k, c in enumerate(S):
    l = last_occurrence.get(c, -1)
    # If no repetition within window, no problems.
    if l < curr_starting_pos: 
        curr_length += 1
    else:
        # Check if it is the longest so far
        if curr_length > longest_len_so_far: 
            longest_pos_so_far = curr_starting_pos
            longest_len_so_far = curr_length
        # Cut the prefix that has repetition
        curr_length -= l - curr_starting_pos
        curr_starting_pos = l + 1
    # In any case, update last_occurrence
    last_occurrence[c] = k

  # Maybe the longest substring is a suffix
  if curr_length > longest_len_so_far:
    longest_pos_so_far = curr_starting_pos
    longest_len_so_far = curr_length

  return S[longest_pos_so_far:longest_pos_so_far + longest_len_so_far]

EDITED:

following is an implementation of the concesus. It occured to me after my original publication. so as not to delete original, it is presented following:

public static String longestUniqueString(String S) {
    int start = 0, end = 0, length = 0;
    boolean bits[] = new boolean[256];
    int x = 0, y = 0;
    for (; x < S.length() && y < S.length() && length < S.length() - x; x++) {
        bits[S.charAt(x)] = true;
        for (y++; y < S.length() && !bits[S.charAt(y)]; y++) {
            bits[S.charAt(y)] = true;
        }
        if (length < y - x) {
            start = x;
            end = y;
            length = y - x;
        }
        while(y<S.length() && x<y && S.charAt(x) != S.charAt(y))
            bits[S.charAt(x++)]=false;
    }
    return S.substring(start, end);
}//

ORIGINAL POST:

Here is my two cents. Test strings included. boolean bits[] = new boolean[256] may be larger to encompass some larger charset.

public static String longestUniqueString(String S) {
    int start=0, end=0, length=0;
    boolean bits[] = new boolean[256];
    int x=0, y=0;
    for(;x<S.length() && y<S.length() && length < S.length()-x;x++) {
        Arrays.fill(bits, false);
        bits[S.charAt(x)]=true;
        for(y=x+1;y<S.length() && !bits[S.charAt(y)];y++) {
            bits[S.charAt(y)]=true;
        }           
        if(length<y-x) {
            start=x;
            end=y;
            length=y-x;
        }
    }
    return S.substring(start,end);
}//

public static void main(String... args) {
    String input[][] = { { "" }, { "a" }, { "ab" }, { "aab" }, { "abb" },
            { "aabc" }, { "abbc" }, { "aabbccdefgbc" },
            { "abcdeafghicabcdefghijklmnop" },
            { "abcdeafghicabcdefghijklmnopqrabcdx" },
            { "zxxaabcdeafghicabcdefghijklmnopqrabcdx" },
            {"aaabcdefgaaa"}};
    for (String[] a : input) {
        System.out.format("%s  *** GIVES ***  {%s}%n", Arrays.toString(a),
                longestUniqueString(a[0]));
    }
}

Here is one more solution with only 2 string variables:

public static String getLongestNonRepeatingString(String inputStr){
    if(inputStr == null){
        return null;
    }

    String maxStr = "";
    String tempStr = "";
    for(int i=0; i < inputStr.length(); i++){
        // 1. if tempStr contains new character, then change tempStr  
        if(tempStr.contains("" + inputStr.charAt(i))){
            tempStr = tempStr.substring(tempStr.lastIndexOf(inputStr.charAt(i)) + 1);
        }
        // 2. add new character
        tempStr = tempStr + inputStr.charAt(i);
        // 3. replace maxStr with tempStr if tempStr is longer
        if(maxStr.length() < tempStr.length()){
            maxStr = tempStr;
        }
    }

    return maxStr;
}

Algorithm in JavaScript (w/ lots of comments)..

/**
 Given a string S find longest substring without repeating characters.
 Example:

 Input: "stackoverflow"
 Output: "stackoverfl"

 Input: "stackoverflowabcdefghijklmn"
 Output: "owabcdefghijklmn"
 */
function findLongestNonRepeatingSubStr(input) {
    var chars = input.split('');
    var currChar;
    var str = "";
    var longestStr = "";
    var hash = {};
    for (var i = 0; i < chars.length; i++) {
        currChar = chars[i];
        if (!hash[chars[i]]) { // if hash doesn't have the char,
            str += currChar; //add it to str
        hash[chars[i]] = {index:i};//store the index of the char
    } else {// if a duplicate char found..
        //store the current longest non-repeating chars. until now
        //In case of equal-length, <= right-most str, < will result in left most str
        if(longestStr.length <= str.length) {
            longestStr = str;
        }
        //Get the previous duplicate char's index
        var prevDupeIndex = hash[currChar].index;

        //Find all the chars AFTER previous duplicate char and current one
        var strFromPrevDupe = input.substring(prevDupeIndex + 1, i);
        //*NEW* longest string will be chars AFTER prevDupe till current char
        str = strFromPrevDupe + currChar;
        //console.log(str);
        //Also, Reset hash to letters AFTER duplicate letter till current char
        hash = {};
        for (var j = prevDupeIndex + 1; j <= i; j++) {
            hash[input.charAt(j)] = {index:j};
        }
    }
  }
  return longestStr.length > str.length ? longestStr : str;
}

//console.log("stackoverflow => " + findLongestNonRepeatingSubStr("stackoverflow"));      
//returns stackoverfl

//console.log("stackoverflowabcdefghijklmn => " + 
findLongestNonRepeatingSubStr("stackoverflowabcdefghijklmn")); //returns owabcdefghijklmn

//console.log("1230123450101 => " + findLongestNonRepeatingSubStr("1230123450101")); //    
returns 234501