Imagining there is a firewall, and the system administrator blocked many subnets, perhaps all subnets of a specific country.
For example:
192.168.2.0 / 255.255.255.0
223.201.0.0 / 255.255.0.0
223.202.0.0 / 255.254.0.0
223.208.0.0 / 255.252.0.0
....
To determine whether a IP address have been blocked, the firewall may use the algorithm below.
func blocked(ip)
foreach subnet in blocked_subnets
if in_subnet(subnet, ip)
return true
return false
But, the algorithm needs too much time to run, the time complexity is O(n). If the route table contains too many entries, the network will become almost unavailable.
Is there a more efficient way to match the IP addresses to huge route entries? It is based on some kinds of trees/graphs (Trie?) I guess. I have read something about Longest prefix match and Trie but didn't get the point.
All you really need is a trie with four levels. Each non-leaf node contains an array of up to 256 child nodes. Each node also contains a subnet mask. So, given your example:
192.168.2.0 / 255.255.255.0
223.201.0.0 / 255.255.0.0
223.202.0.0 / 255.254.0.0
223.208.0.0 / 255.252.0.0
Your tree would look something like that below. The two numbers for each node are the IP segment followed by the subnet mask.
root
/ \
192,255 223,255
| -------------------------
168,255 | | |
| 201,255 202,255 208,255
2,255
When you get an IP address, you break it into segments. You search for the first segment at the root level. For speed, you'll probably want to use an array at the root level so that you can do a direct lookup.
Say the first segment of the IP address is 223. You'd grab the node from root[223]
, and now you're working with just that one subtree. You probably don't want a full array at the other levels, unless your data is really dense. A dictionary of some kind for the subsequent levels is probably what you'll want. If the next segment is 201, you look up 201
in the dictionary for the 223
node, and now your possible list of candidates is just 64K items (i.e. all IP addresses that are 223,201.x.x). You can do the same thing with the other two levels. The result is that you can resolve an IP address in just four lookups: one lookup in an array, and three dictionary lookups.
This structure is also very easy to maintain. Inserting a new address or range requires at most four lookups and adds. Same with deleting. Updates can be done in-place, without having to rebuild the entire tree. You just have to make sure that you're not trying to read while you're updating, and you're not trying to do concurrent updates. But any number of readers can be accessing the thing concurrently.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With