I'm working on a program that uses an ArrayList to store Strings. The program prompts the user with a menu and lets the user choose an operation to perform, such as adding Strings to the list or printing the entries. What I want to do is create a method called removeDuplicates(). This method will search the ArrayList and remove any duplicated values, leaving one instance of each duplicated value in the list. I also want this method to return the total number of duplicates removed.
I've been trying to use nested loops to accomplish this, but I've been running into trouble because when entries get deleted, the indexing of the ArrayList gets altered and things don't work as they should. I know conceptually what I need to do, but I'm having trouble implementing this idea in code.
Here is some pseudo code:
start with first entry; check each subsequent entry in the list and see if it matches the first entry; remove each subsequent entry in the list that matches the first entry;
after all entries have been examined, move on to the second entry; check each entry in the list and see if it matches the second entry; remove each entry in the list that matches the second entry;
repeat for each entry in the list
Here's the code I have so far:
public int removeDuplicates()
{
    int duplicates = 0;
    for ( int i = 0; i < strings.size(); i++ )
    {
        for ( int j = 0; j < strings.size(); j++ )
        {
            if ( i == j )
            {
                // i & j refer to same entry so do nothing
            }
            else if ( strings.get( j ).equals( strings.get( i ) ) )
            {
                strings.remove( j );
                duplicates++;
            }
        }
    }
    return duplicates;
}
UPDATE: It appears that Will is looking for a homework solution that involves developing the algorithm to remove duplicates, rather than a pragmatic solution using Sets. See his comment:
Thx for the suggestions. This is part of an assignment and I believe the teacher had intended for the solution to not include sets. In other words, I am to come up with a solution that will search for and remove duplicates without implementing a HashSet. The teacher suggested using nested loops which is what I'm trying to do but I've been having some problems with the indexing of the ArrayList after certain entries are removed.
Why not use a collection such as Set (and an implementation like HashSet) which naturally prevents duplicates?
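As a rough sketch of what the Set suggestion could look like while still returning the number of duplicates removed: the class name SetDedupSketch and the sample data are made up for illustration, and the method takes the list as a parameter rather than using the asker's strings field. It relies on Set.add returning false for a value that is already present, and on Iterator.remove for safe removal while iterating.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical class name, used only for this illustration.
class SetDedupSketch {
    // Removes repeated values in place, keeping the first occurrence of each,
    // and returns how many entries were removed.
    static int removeDuplicates(List<String> strings) {
        Set<String> seen = new HashSet<>();
        int duplicates = 0;
        Iterator<String> it = strings.iterator();
        while (it.hasNext()) {
            // Set.add returns false when the value is already in the set
            if (!seen.add(it.next())) {
                it.remove(); // removing through the iterator avoids index bookkeeping
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>(Arrays.asList("a", "b", "a", "c", "b", "a"));
        int removed = removeDuplicates(strings);
        System.out.println(removed + " removed, remaining: " + strings);
    }
}
```

Because each value is looked up in a hash set instead of being compared against every other entry, this avoids the nested loops entirely, though it does not fit an assignment that rules out sets.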
You can use nested loops without any problem:
public static int removeDuplicates(ArrayList<String> strings) {
    int size = strings.size();
    int duplicates = 0;
    // keeping the size in a local variable avoids calling size() on every check
    // also, i must be less than size - 1 so that j doesn't
    // throw IndexOutOfBoundsException
    for (int i = 0; i < size - 1; i++) {
        // start from the next item after strings[i]
        // since the ones before are checked
        for (int j = i + 1; j < size; j++) {
            // no need for if ( i == j ) here
            if (!strings.get(j).equals(strings.get(i)))
                continue;
            duplicates++;
            strings.remove(j);
            // decrease j because the list got re-indexed
            j--;
            // decrease the tracked size of the list
            size--;
        } // for j
    } // for i
    return duplicates;
}
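A variation on the nested loops, offered only as a sketch: if the inner loop walks backward from the end of the list, removing the entry at index j only shifts entries that have already been visited, so no j-- or size-- adjustment is needed. The class name BackwardLoopSketch and the sample data are invented for this example, and like the code above it assumes the list contains no null entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this illustration.
class BackwardLoopSketch {
    // Nested-loop removal without sets; returns the number of entries removed.
    static int removeDuplicates(List<String> strings) {
        int duplicates = 0;
        // size() is re-read on every pass because the list shrinks as we remove
        for (int i = 0; i < strings.size() - 1; i++) {
            String current = strings.get(i);
            // walk from the last index down to i + 1: removing index j only
            // shifts the entries after j, and those were already examined
            for (int j = strings.size() - 1; j > i; j--) {
                if (strings.get(j).equals(current)) {
                    strings.remove(j); // int argument, so this removes by index
                    duplicates++;
                }
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>(Arrays.asList("x", "y", "x", "x", "z", "y"));
        int removed = removeDuplicates(strings);
        System.out.println(removed + " removed, remaining: " + strings);
    }
}
```

The first occurrence of each value is kept and the original order is preserved, the same as with the forward loop.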
You could try this one-liner to take a copy of the list with duplicates removed, preserving order.
List<String> list;
List<String> dedupped = new ArrayList<String>(new LinkedHashSet<String>(list));
This approach is also O(n) amortized, instead of the O(n^2) of the nested loops.
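To also get the count the question asks for, one possible sketch is to compare the list size before and after the LinkedHashSet round trip. The class name LinkedHashSetSketch and the sample data are assumptions made for this example, and the list passed in must be modifiable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class name, used only for this illustration.
class LinkedHashSetSketch {
    // Rebuilds the list from a LinkedHashSet, which drops repeats while
    // keeping first-seen order, and returns how many entries were dropped.
    static int removeDuplicates(List<String> strings) {
        int before = strings.size();
        Set<String> unique = new LinkedHashSet<>(strings);
        strings.clear();
        strings.addAll(unique);
        return before - strings.size();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>(Arrays.asList("b", "a", "b", "c", "a"));
        int removed = removeDuplicates(strings);
        System.out.println(removed + " removed, remaining: " + strings);
    }
}
```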