Count unique chars and validate String in some cases using Java Stream

I'm trying to write a method that validates a String. A string is valid if every char occurs the same number of times, like "aabb", "abcabc", or "abc". It is also valid if it contains exactly one extra symbol, like "ababa" or "aab". All other cases are invalid. Update: sorry, I forgot to mention cases like abcabcab -> a-3, b-3, c-2 -> 2 extra symbols (a, b) -> invalid. My code doesn't cover such cases. A space counts as a symbol, and uppercase letters are different from lowercase letters. This is what I have now, but it looks awkward (especially the last two methods):

public boolean validate(String line) {
    List<Long> keys = countMatches(countChars(line));
    int matchNum = keys.size();
    if (matchNum < 2) return true;
    return matchNum == 2 && Math.abs(keys.get(0) - keys.get(1)) == 1;
}

To count how often each unique symbol occurs, I'd like to get a List<Long> directly, but I don't know how:

private Map<Character, Long> countChars(String line) { 
    return line.chars()
               .mapToObj(c -> (char) c)
               .collect(groupingBy(Function.identity(), HashMap::new, counting()));
}


private List<Long> countMatches(Map<Character, Long> countedEntries) {
    return new ArrayList<>(countedEntries.values()
            .stream()
            .collect(groupingBy(Function.identity(), HashMap::new, counting()))
            .keySet());
}

How can I optimize the methods above? I only need a List<Long>, but I have to create a map first.

Zorg asked Mar 03 '23 13:03


2 Answers

As far as I can tell, you are computing the distinct frequencies with those two methods. You can merge them into one method that uses a single stream pipeline:

private List<Long> distinctFrequencies(String line) {
    return line.chars().mapToObj(c -> (char) c)
            .collect(Collectors.groupingBy(Function.identity(),
                    Collectors.counting()))
            .values().stream()
            .distinct()
            .collect(Collectors.toList());
}

Of course, all you need to change in your validate method now is the assignment

List<Long> keys = distinctFrequencies(line);

On further thought, if you want to reuse the method Map<Character, Long> countChars somewhere else as well, you can have distinctFrequencies call it:

private List<Long> distinctFrequencies(String line) {
    return countChars(line).values()
            .stream()
            .distinct()
            .collect(Collectors.toList());
}
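To see why the list of distinct frequencies alone cannot cover the case from the question's update, here is a minimal, self-contained sketch. The class name DistinctFrequenciesDemo and the extra sorted() call (added only to make the output order deterministic) are my own assumptions, not part of the code above.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class; only distinctFrequencies mirrors the method above.
class DistinctFrequenciesDemo {

    // Counts each char, then keeps every distinct count exactly once.
    static List<Long> distinctFrequencies(String line) {
        return line.chars()
                .mapToObj(c -> (char) c)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .values().stream()
                .distinct()
                .sorted() // assumption: sorted only for a stable output order
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "ababa" should be valid, "abcabcab" should be invalid,
        // yet both produce the same list of distinct frequencies.
        System.out.println(distinctFrequencies("ababa"));    // [2, 3]
        System.out.println(distinctFrequencies("abcabcab")); // [2, 3]
    }
}
```

Both strings yield the same list, so a check based only on this list cannot tell them apart. You would also need to know how many chars share each frequency.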
Naman answered Mar 05 '23 15:03


You can check whether every char in a string has the same occurrence count using the Stream API like this:

boolean valid = "aabbccded".chars()
      .boxed()
      .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
      .values().stream()
      // compare boxed Long values with equals(); == would compare references
      .reduce((a, b) -> a.equals(b) ? a : -1L)
      .map(v -> v > 0)
      .get();

EDIT:

After reading the comments, I believe I now understand the requirement:

  1. A string is considered valid if all chars in it have the same occurrence count, like aabb,
  2. or if there is a single extra character, like abb.
  3. The string abcabcab is invalid because it has 3 a, 3 b and 2 c. It therefore has 1 extra a and 1 extra b, which is one too many.

Hence, you can't perform the validation with a list of frequencies alone. You also need to know how many chars share each occurrence count, which calls for a Map.

Here is a new attempt:

TreeMap<Long, Long> map = "abcabcab".chars()
                .boxed()
                .collect(groupingBy(Function.identity(), counting()))
                .values().stream()
                .collect(groupingBy(Function.identity(), TreeMap::new, counting()));

boolean valid = map.size() == 1 ||        // there is only a single char length
        ( map.size() == 2 &&              // there are two and there is only 1 extra char
        ((map.lastKey() - map.firstKey()) * map.lastEntry().getValue() <= 1));
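
As a rough sketch, the same check can be wrapped in a method and run against the examples from the question. The class name FrequencyValidator is hypothetical, and treating the empty string as valid is my assumption (it matches the matchNum < 2 shortcut in the question's code, whereas the snippet above would report it as invalid).

```java
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical wrapper class around the TreeMap-based check.
class FrequencyValidator {

    static boolean validate(String line) {
        // key = occurrence count, value = number of distinct chars with that count
        TreeMap<Long, Long> map = line.chars()
                .boxed()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .values().stream()
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));

        // assumption: an empty string (no counts at all) is treated as valid
        if (map.size() <= 1) {
            return true;
        }
        // exactly two counts, and the higher one contributes at most 1 extra char
        return map.size() == 2
                && (map.lastKey() - map.firstKey()) * map.lastEntry().getValue() <= 1;
    }

    public static void main(String[] args) {
        System.out.println(validate("aabb"));     // true
        System.out.println(validate("ababa"));    // true
        System.out.println(validate("aab"));      // true
        System.out.println(validate("abcabcab")); // false
    }
}
```

For abcabcab the map is {2=1, 3=2}, so the product is (3 - 2) * 2 = 2 and the string is rejected. For ababa the map is {2=1, 3=1}, the product is 1, and the string is accepted.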

The whole validation can be executed in a single statement by using the Collectors.collectingAndThen method that @Nikolas used in his answer, or you can use a reduction instead:

boolean valid = "aabcc".chars()
    .boxed()
    .collect(groupingBy(Function.identity(), counting()))
    .values().stream()
    .collect(groupingBy(Function.identity(), TreeMap::new, counting()))
    .entrySet().stream()
    .reduce((min, high) -> {
         min.setValue((min.getKey() - high.getKey()) * high.getValue()); // min.getKey is the min char length
         return min;                                                     // high.getKey is a higher char length
                                                                         // high.getValue is occurrence count of higher char length
        })                                                               // this is always negative
    .map(min -> min.getValue() >= -1)
    .get();
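
For comparison, a single-statement variant based on Collectors.collectingAndThen might look roughly like the sketch below. It is not @Nikolas's code; the class name SingleStatementValidator is hypothetical, and accepting the empty string is my assumption. Unlike the reduction, it does not modify any map entries and does not call get() on an Optional.

```java
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical class; a sketch of the collectingAndThen idea, not a definitive version.
class SingleStatementValidator {

    static boolean validate(String line) {
        return line.chars()
                .boxed()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .values().stream()
                .collect(Collectors.collectingAndThen(
                        // key = occurrence count, value = number of chars with that count
                        Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()),
                        // explicit parameter type keeps type inference straightforward
                        (TreeMap<Long, Long> m) -> m.size() <= 1 // assumption: empty input is valid
                                || (m.size() == 2
                                && (m.lastKey() - m.firstKey()) * m.lastEntry().getValue() <= 1)));
    }

    public static void main(String[] args) {
        System.out.println(validate("aabcc"));    // false
        System.out.println(validate("ababa"));    // true
        System.out.println(validate("abcabcab")); // false
    }
}
```

For aabcc the second map is {1=1, 2=2}, so (2 - 1) * 2 = 2 and the string is rejected, which agrees with the reduction above.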
pero_hero answered Mar 05 '23 16:03