Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

detect duplicate character in string

I am doing the exercises in the Cracking The Coding Interview book and I am trying to determine if there is a duplicate character in a string. I am using the ArrayList data structure. My method is of return type Boolean, which returns true if there is a duplicate and returns false if there is no duplicate character. I added a third return statement so the program would compile but it always returns false.

import java.util.*;

public class QuestionOneCrackingCode {

    public static void main(String[] args) {
        String s = "abcdefga";
        ArrayList<String> c = new ArrayList<String>();
        c.add(s);

        System.out.print(check(c));
    }

    public static boolean check(ArrayList<String> g) {
        for (int i = 0; i < g.size(); i++) {
            for (int j = i + 1; j < g.size(); j++) {
                if (g.get(i) == g.get(j)) {
                    return true;
                } else {
                    return false;
                }
            }
        }
        return false;
    }
}
like image 488
Daniel Semel Avatar asked Jul 21 '15 22:07

Daniel Semel


4 Answers

In Java 8, you could do it like this:

public static boolean check(CharSequence checkString)
{
  return checkString.length() != checkString.chars().distinct().count();
}

I.e. if the number of distinct characters in the string is not the same as the total number of characters, you know there's a duplicate. It's not necessarily the most efficient way, but it's succinct.

like image 117
CupawnTae Avatar answered Oct 17 '22 00:10

CupawnTae


Here are the results of my version of your code.

abcdefga true
abcdefgh false
abcdefdh true
  1. I modified the check parameter to take a single String. There's no need for a List of Strings.

  2. In the check method, you can exit once one pair of characters match. You have to test the entire string before you can say there are no matching characters.

  3. The first for loop can stop at the second to last character. The second for loop will get the last character.

  4. Since I'm comparing char values, I use the ==. If I were comparing String values, I would use the .equals method.

Here's the code.

package com.ggl.testing;

public class QuestionOneCrackingCode {

    public static void main(String[] args) {
        String s = "abcdefga";
        System.out.println(s + " " + check(s));
        s = "abcdefgh";
        System.out.println(s + " " + check(s));
        s = "abcdefdh";
        System.out.println(s + " " + check(s));
    }

    public static boolean check(String s) {
        for (int i = 0; i < (s.length() - 1); i++) {
            for (int j = i + 1; j < s.length(); j++) {
                if (s.charAt(i) == s.charAt(j)) {
                    return true;
                }
            }
        }
        return false;
    }
}
like image 28
Gilbert Le Blanc Avatar answered Oct 16 '22 22:10

Gilbert Le Blanc


Your solution compares string references in a list. The list itself contains only one String.

Try the following:

// check one string for duplicate chars
public static boolean check(String checkString)
{
    // result flag
    boolean foundDuplicate = false;
    // get string length
    int stringLength = checkString.length();
    // create a set for all found characters (max size is number
    // of characters in the string to check
    Set<Character> characters = new HashSet<>(stringLength);
    // loop all characters in string
    for (int i = 0; i < stringLength; i++)
    {
        // construct a object (may be use internal JDK cache)
        Character c = Character.valueOf(checkString.charAt(i));
        // check if character is already found
        if (characters.contains(c))
        {
            // yes, set result to TRUE
            foundDuplicate = true;
            // break the loop
            break;
        }
        else
        {
            // not found, add char to set
            characters.add(c);
        }
    }
    return foundDuplicate;
}

This is limited by the string length and the heap size. But I assume that all UTF-8 characters may fit into heap.

@Maarten Bodewes You are right. The check can be simplified to:

        // add character to set and check result
        if (!characters.add(c))
        {
            // returned false: character already exists
            foundDuplicate = true;
            // break the loop
            break;
        }
        // no else necessary
like image 3
Konrad Avatar answered Oct 16 '22 22:10

Konrad


My participation:

    public static void main(String[] args) {
        System.out.println(check("abcdefga"));                    // true
        System.out.println(check("noduplicate"));                 // false
        System.out.println(check("withduplicate"));               // true
        System.out.println(check("abcdefghijklmnopqrstuvwxyz"));  // false
        System.out.println(check("abcdefghijklmnopqrstuvwxyzz")); // true
    }

    /**@brief Check if a String contains duplicated characters.
     * Strong expectation for the string: The String must only contains
     * lowercase alpha characters (ie. in [a-z])
     * @returns true if a char is present more than once */
    public static boolean check(String str) {
        int presentChars = 0; // will store the table of already found characters
        int l = str.length();
        for (int i = 0; i < l; ++i) {
            char c = str.charAt(i);
            int offset = c - 'a';             // a=0, b=1, ... z=25
            int mask = 1 << offset;
            if ((presentChars& mask) != 0) {  // Oh! Char already tagged as found
                return true;                  // No need to process further, bye!
            }
            presentChars|= mask;              // Tag the current char as present
        }
        return false;                         // No duplicate
    }

}

My goal with this code is to minimize the complexity. This algorithm is in O(N) on the worst case. Also, the memory footprint of the function is very low: only one int is really necessary (presentChars) even if I use more in order to improve readability =)

Downside of this code: there is a big prerequisite on the inputed string. I've detailed this on the comments, but it works only with characters in the [a-z] range.

I hope it's helps!

like image 1
johan d Avatar answered Oct 17 '22 00:10

johan d