
Collection removeAll ignoring case?

OK, so here is my issue. I have two HashSets, and I use the removeAll method to delete values that exist in one set from the other.

Prior to calling the method, I add the values to the Sets. I call .toUpperCase() on each String before adding it, because the values are in different cases in the two lists. There is no rhyme or reason to the case.

Once I call removeAll, I need the original cases back for the values that are left in the Set. Is there an efficient way of doing this without running through the original list and using compareToIgnoreCase?

Example:

List1:

"BOB"
"Joe"
"john"
"MARK"
"dave"
"Bill"

List2:

"JOE"
"MARK"
"DAVE"

I then create a separate HashSet for each list, calling toUpperCase() on each String, and call removeAll:

set1.removeAll(set2);

set1:
    "BOB"
    "JOHN"
    "BILL"

I need to get the list to look like this again:

"BOB"
"john"
"Bill"

Any ideas would be much appreciated. I know this is poor; there should be a standard case for the original list, but that is not for me to decide.

user84786 asked Aug 06 '09 21:08



1 Answer

In my original answer, I unthinkingly suggested using a Comparator, but this causes the TreeSet to violate the equals contract and is a bug waiting to happen:

// Don't do this:
Set<String> setA = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
setA.add("hello");
setA.add("Hello");
System.out.println(setA);

Set<String> setB = new HashSet<String>();
setB.add("HELLO");
// Prints false: setA.equals(setB) is true but setB.equals(setA) is false,
// which violates the symmetry requirement of equals
System.out.println(setB.equals(setA) == setA.equals(setB));

It is better to use a dedicated type:

import java.util.Locale;

public final class CaselessString {
  private final String string;
  private final String normalized;

  private CaselessString(String string, Locale locale) {
    this.string = string;
    normalized = string.toUpperCase(locale);
  }

  @Override public String toString() { return string; }

  @Override public int hashCode() { return normalized.hashCode(); }

  @Override public boolean equals(Object obj) {
    if (obj instanceof CaselessString) {
      return ((CaselessString) obj).normalized.equals(normalized);
    }
    return false;
  }

  public static CaselessString as(String s, Locale locale) {
    return new CaselessString(s, locale);
  }

  public static CaselessString as(String s) {
    return as(s, Locale.ENGLISH);
  }

  // TODO: probably best to implement CharSequence for convenience
}

This code is less likely to cause bugs:

Set<CaselessString> set1 = new HashSet<CaselessString>();
set1.add(CaselessString.as("Hello"));
set1.add(CaselessString.as("HELLO"));

Set<CaselessString> set2 = new HashSet<CaselessString>();
set2.add(CaselessString.as("hello"));

System.out.println("1: " + set1);
System.out.println("2: " + set2);
System.out.println("equals: " + set1.equals(set2));

This is, unfortunately, more verbose.
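Applied to the lists from the question, the same idea might look like the sketch below. It is only an illustration, not a definitive implementation: the class name CaselessRemoveAllDemo and the helpers wrap and removeAllIgnoringCase are made up for this example, the nested wrapper is a compact copy of CaselessString so the file compiles on its own, and LinkedHashSet is an assumption made here only to keep the original insertion order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class CaselessRemoveAllDemo {

  // Compact copy of the wrapper above so this sketch compiles on its own.
  static final class CaselessString {
    private final String string;      // original spelling, returned by toString()
    private final String normalized;  // upper-cased form used for equals/hashCode

    CaselessString(String string) {
      this.string = string;
      this.normalized = string.toUpperCase(Locale.ENGLISH);
    }

    @Override public String toString() { return string; }

    @Override public int hashCode() { return normalized.hashCode(); }

    @Override public boolean equals(Object obj) {
      return obj instanceof CaselessString
          && ((CaselessString) obj).normalized.equals(normalized);
    }
  }

  // Hypothetical helper: wraps each value, keeping insertion order.
  private static Set<CaselessString> wrap(List<String> values) {
    Set<CaselessString> set = new LinkedHashSet<CaselessString>();
    for (String value : values) {
      set.add(new CaselessString(value));
    }
    return set;
  }

  // Hypothetical helper: removes every value of toRemove from source,
  // comparing case-insensitively, and returns the survivors in their original case.
  public static List<String> removeAllIgnoringCase(List<String> source,
                                                   List<String> toRemove) {
    Set<CaselessString> remaining = wrap(source);
    remaining.removeAll(wrap(toRemove));
    List<String> result = new ArrayList<String>();
    for (CaselessString value : remaining) {
      result.add(value.toString());
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> list1 = Arrays.asList("BOB", "Joe", "john", "MARK", "dave", "Bill");
    List<String> list2 = Arrays.asList("JOE", "MARK", "DAVE");
    System.out.println(removeAllIgnoringCase(list1, list2)); // [BOB, john, Bill]
  }
}
```

Because the wrapper keeps the original String next to the normalized one, there is no need to go back through the original list after removeAll.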

McDowell answered Oct 23 '22 13:10