Java - Removing duplicates in an ArrayList

Question

I'm working on a program that uses an ArrayList to store Strings. The program prompts the user with a menu and allows the user to choose an operation to perform. Such operations are adding Strings to the List, printing the entries etc. What I want to be able to do is create a method called removeDuplicates(). This method will search the ArrayList and remove any duplicated values. I want to leave one instance of the duplicated value(s) within the list. I also want this method to return the total number of duplicates removed.

I've been trying to use nested loops to accomplish this but I've been running into trouble because when entries get deleted, the indexing of the ArrayList gets altered and things don't work as they should. I know conceptually what I need to do but I'm having trouble implementing this idea in code.

Here is some pseudo code:

start with first entry; check each subsequent entry in the list and see if it matches the first entry; remove each subsequent entry in the list that matches the first entry;

after all entries have been examined, move on to the second entry; check each entry in the list and see if it matches the second entry; remove each entry in the list that matches the second entry;

repeat for entry in the list

Here's the code I have so far:

public int removeDuplicates()
{
  int duplicates = 0;

  for ( int i = 0; i < strings.size(); i++ )
  {
     for ( int j = 0; j < strings.size(); j++ )
     {
        if ( i == j )
        {
          // i & j refer to same entry so do nothing
        }

        else if ( strings.get( j ).equals( strings.get( i ) ) )
        {
           strings.remove( j );
           duplicates++;
        }
     }
 }

   return duplicates;
}

UPDATE: It appears that Will is looking for a homework solution that involves developing the algorithm to remove duplicates, rather than a pragmatic solution using Sets. See his comment:

Thx for the suggestions. This is part of an assignment and I believe the teacher had intended for the solution to not include sets. In other words, I am to come up with a solution that will search for and remove duplicates without implementing a HashSet. The teacher suggested using nested loops which is what I'm trying to do but I've been having some problems with the indexing of the ArrayList after certain entries are removed.

matt b · Accepted Answer

Why not use a collection such as Set (and an implementation like HashSet) which naturally prevents duplicates?

Azder · Answer

You can use nested loops without any problem:

public static int removeDuplicates(ArrayList<String> strings) {

    int size = strings.size();
    int duplicates = 0;

    // not using a method in the check also speeds up the execution
    // also i must be less that size-1 so that j doesn't
    // throw IndexOutOfBoundsException
    for (int i = 0; i < size - 1; i++) {
        // start from the next item after strings[i]
        // since the ones before are checked
        for (int j = i + 1; j < size; j++) {
            // no need for if ( i == j ) here
            if (!strings.get(j).equals(strings.get(i)))
                continue;
            duplicates++;
            strings.remove(j);
            // decrease j because the array got re-indexed
            j--;
            // decrease the size of the array
            size--;
        } // for j
    } // for i

    return duplicates;

}

Peter Lawrey · Answer

You could try this one liner to take a copy of the String preserving order.

List<String> list;
List<String> dedupped = new ArrayList<String>(new LinkedHashSet<String>(list));

This approach is also O(n) amortized instead of O(n^2)

Java - Removing duplicates in an ArrayList

Tags:

Will

3 Answers

matt b

Azder

Peter Lawrey

Recent Activity

Donate For Us

Java - Removing duplicates in an ArrayList

Tags:

Will

3 Answers

matt b

Azder

Peter Lawrey

Related questions

Recent Activity

Donate For Us