I am trying to get best-match string matching to work using existing Java data structures. It is quite slow, though, so any suggestions to improve its performance are welcome.
The sample data looks like this:
Key           | V
--------------+---------
0060175559138 | VIP
006017555     | National
006017        | Local
0060          | X
So a best-match search on the key 0060175552020 should return 006017555.
One way I can think of is to use multiple TreeMaps, with hashing to divert the data into different maps and so make the search area smaller.
private final TreeMap<String, V> index;

public Set<V> syncBestMatch(String key) {
    Entry<String, V> entry = index.headMap(key, true)
            .descendingMap().entrySet().stream()
            .filter(e -> isPartiallyOrFullyMatching(key, e.getKey()))
            .findFirst()
            .orElseThrow(() -> new NoMatchException("No match found"));
    Set<V> results = new HashSet<>();
    results.add(entry.getValue());
    return results;
}
Use a TreeMap and the floorEntry(K key) method:
"Returns a key-value mapping associated with the greatest key less than or equal to the given key, or null if there is no such key."
The following is simplified. Real code would need to keep searching if an invalid entry is found, e.g. if the map had a key 0060175551000. In that case you'd need to find the common prefix between the search key and the found key, then do the lookup again. Rinse and repeat.
TreeMap<String, String> map = new TreeMap<>();
map.put("0060175559138", "VIP");
map.put("006017555" , "National");
map.put("006017" , "Local");
map.put("0060" , "X");
String key = "0060175552020";
Entry<String, String> entry = map.floorEntry(key);
if (entry == null)
    System.out.println("Not found: " + key);
else {
    System.out.println(key);
    System.out.println(entry);
}
Output
0060175552020
006017555=National
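To illustrate the caveat mentioned above, here is a small sketch of what happens when the map also holds 0060175551000. The class name FloorEntryPitfall and the helpers sampleMap and floorIsPrefix are made up for this illustration; only TreeMap.floorEntry and String.startsWith are real API.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class FloorEntryPitfall {
    // Sample data from the question plus the extra key that causes trouble.
    static NavigableMap<String, String> sampleMap() {
        NavigableMap<String, String> map = new TreeMap<>();
        map.put("0060175559138", "VIP");
        map.put("0060175551000", "Other");
        map.put("006017555", "National");
        map.put("006017", "Local");
        map.put("0060", "X");
        return map;
    }

    // Hypothetical helper: true only if the floor entry's key is
    // actually a prefix of (or equal to) the search key.
    static boolean floorIsPrefix(NavigableMap<String, String> map, String key) {
        Map.Entry<String, String> entry = map.floorEntry(key);
        return entry != null && key.startsWith(entry.getKey());
    }

    public static void main(String[] args) {
        NavigableMap<String, String> map = sampleMap();
        String key = "0060175552020";
        // floorEntry only guarantees "greatest key <= search key",
        // which here is 0060175551000, not a prefix of the search key.
        System.out.println(map.floorEntry(key));
        System.out.println(floorIsPrefix(map, key));
    }
}
```

With this data the single floorEntry call lands on 0060175551000=Other, so a prefix check and a retry with a shorter key are needed.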
UPDATE: Here is the full code, with a loop for the extended search.
private static Entry<String, String> lookup(NavigableMap<String, String> map, String key) {
    String keyToFind = key;
    for (;;) {
        Entry<String, String> entry = map.floorEntry(keyToFind);
        if (entry == null)
            return null;
        String foundKey = entry.getKey();
        // Length of the common prefix of the search key and the found key
        int prefixLen = 0;
        while (prefixLen < keyToFind.length() && prefixLen < foundKey.length() &&
               keyToFind.charAt(prefixLen) == foundKey.charAt(prefixLen))
            prefixLen++;
        if (prefixLen == 0)
            return null;
        // The found key is a prefix of the search key: this is the best match
        if (prefixLen == foundKey.length())
            return entry;
        // Otherwise retry with the common prefix as the new search key
        keyToFind = key.substring(0, prefixLen);
    }
}
Test
TreeMap<String, String> map = new TreeMap<>();
map.put("0060175559138", "VIP");
map.put("0060175551000", "Other");
map.put("006017555" , "National");
map.put("006017" , "Local");
map.put("0060" , "X");
System.out.println(lookup(map, "0060175559138"));
System.out.println(lookup(map, "0060175552020"));
System.out.println(lookup(map, "0055708570068"));
System.out.println(lookup(map, "8684064893870"));
Output
0060175559138=VIP
006017555=National
null
null
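If it helps, the same loop could be adapted to the generic TreeMap<String, V> index and Set<V> return type from the question. This is only a sketch under assumptions: the class name BestMatchIndex and the methods put and bestMatch are invented here, and NoSuchElementException stands in for the question's NoMatchException, whose definition was not shown.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical wrapper; names are illustrative, not from the question.
class BestMatchIndex<V> {
    private final NavigableMap<String, V> index = new TreeMap<>();

    void put(String prefix, V value) {
        index.put(prefix, value);
    }

    // Same floorEntry loop as lookup above, made generic over V.
    Map.Entry<String, V> lookup(String key) {
        String keyToFind = key;
        for (;;) {
            Map.Entry<String, V> entry = index.floorEntry(keyToFind);
            if (entry == null)
                return null;
            String foundKey = entry.getKey();
            int prefixLen = 0;
            while (prefixLen < keyToFind.length() && prefixLen < foundKey.length()
                    && keyToFind.charAt(prefixLen) == foundKey.charAt(prefixLen))
                prefixLen++;
            if (prefixLen == 0)
                return null;
            if (prefixLen == foundKey.length())
                return entry;
            // Shrink the search key to the common prefix and try again.
            keyToFind = key.substring(0, prefixLen);
        }
    }

    // Mirrors the shape of syncBestMatch: a one-element set, or an exception.
    Set<V> bestMatch(String key) {
        Map.Entry<String, V> entry = lookup(key);
        if (entry == null)
            throw new NoSuchElementException("No match found"); // stand-in for NoMatchException
        Set<V> results = new HashSet<>();
        results.add(entry.getValue());
        return results;
    }
}
```

Each pass is one floorEntry call on the tree, and the search key gets strictly shorter every time, so the loop runs at most once per character of the key rather than streaming over the head map.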