What algorithm would you suggest to find out the longest common prefixes of a list of strings?
I might have strings such as:
Call Mike and schedule meeting.
Call Lisa
Call Adam and ask for quote.
Implement new class for iPhone project
Implement new class for Rails controller
Buy groceries
I want to find out the following prefixes:
"Call "
"Implement new class "
I'll be using Objective C, so a ready made cocoa solution would be a plus (though not a must).
Edit: for the clarified question:
Actually, step (3) only requires that you remove any that's a dupe/prefix of another, which you could do with a trie or whatever instead of sorting. In fact it may be that the whole thing can be done faster with a suitably annotated trie - if you include a "count" at each node then you're looking precisely for nodes with a count of 2+, that have no children with a count of 2+.
But sorting is built in, and once you've sorted you can detect prefixes by looking at adjacent items, so it's probably less effort.
[Original answer:
Just a one-off operation, find the longest common prefix between all the strings?
I'd probably do it in terms of the length of the prefix. In pseudo-code, and assuming nul-terminated strings:
prefixlen = strlen(first_string);
foreach string in the list {
for (i = 0; i < prefixlen; ++i) {
if (string[i] != first_string[i]) {
prefixlen = i;
break;
}
}
if (prefixlen == 0) break;
}
common_prefix = substring(firststring, 0, prefixlen);
]
You could insert all your strings into a trie (aka prefix tree). Then traverse the trie from the root until you find a node with more than one child (or just stop inserting strings when you would have to append a second child to a node).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With