This is an interview question: design a distributed back-end for auto-complete.
I would answer it as follows:
Auto-complete is a search in a dictionary by a given suffix. The dictionary should be probably organized as a trie. The dictionary is built from the most frequent queries but it's another story.
Now I assume the dictionary is not changed frequently (e.g. once a day rather than every millisecond). Thus we can just replicate the dictionary across a number of servers that handle auto-complete queries (e.g. with a load balancer and round-robin policy).
We should also think about dictionary but this is also another story.
Does it make sense? Am I missing something?
It looks like the right question. The trie idea is really nice and would help you to search in log(n)
. The change frequency depends on the information, so I wouldn't say exactly time, but I would tune it dynamically.. Let's assume that you change once a day, it would be nice how much the tree has changed. And you can give a boundary (for example 10%). If the boundary is exceeded you can update more often the trie. It also depends how important is to be up to date, because in the most cases it isn't. The load balancer idea is also good.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With