ARP Timeouts. Why fixed periodic?

Tags: udp, real-time, arp

This one's been bugging me for years.

Basic question: Is there some reason ARP has to be implemented with fixed timeouts on ARP cache entries?

I do a lot of work in real-time circles. We do most of our inter-system communications these days on dedicated UDP/IP links. For the most part this works reliably in real time, but for one nit: ARP entry timeouts.

The way typical implementations do ARP is the following:

  • When a client asks to send an IP packet to an IP address with an unknown MAC address, the stack sends out an ARP request instead of sending that IP packet. If an upper layer (TCP) does resends, that's no problem. But since we use UDP, the original message is lost. At startup time this is OK, but in the middle of operation this is a Bad Thing™.
  • (Dynamic) ARP table entries are removed from the ARP table periodically, even if we just got a packet from that system a millisecond ago. This means the Bad Thing™ happens to our system regularly.
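
To make the two bullets concrete, here is a toy model in Python. It is only a sketch of the behavior described above, not any vendor's stack: the class name `FixedTimeoutArpCache`, the function `send_udp`, and the 60-second timeout in the usage note are all made up for illustration, and time is passed in explicitly so the behavior is easy to follow.

```python
class FixedTimeoutArpCache:
    """Toy ARP cache: an entry expires a fixed time after it was learned."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self._entries = {}  # ip -> (mac, time the entry was learned)

    def learn(self, ip, mac, now):
        """Record a mapping, e.g. from an ARP reply."""
        self._entries[ip] = (mac, now)

    def lookup(self, ip, now):
        """Return the cached MAC, or None if unknown or expired."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if now - learned_at >= self.timeout_s:
            # Expired, no matter how recently the peer was heard from.
            del self._entries[ip]
            return None
        return mac


def send_udp(cache, ip, payload, now):
    """Return what the toy stack would put on the wire for one datagram."""
    mac = cache.lookup(ip, now)
    if mac is None:
        # The payload is discarded; only an ARP request goes out.
        return ("ARP_REQUEST", ip)
    return ("UDP", mac, payload)
```

With a 60-second timeout and an entry learned at t=0, a datagram sent at t=59 goes out as UDP, while one sent at t=60 is replaced by an ARP request and its payload is lost.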

The obvious solution (which we use religiously) is to make all the ARP entries static. However, that's a royal PITA (particularly on RTOSes, where finding an interface's MAC address is not always a matter of a couple of easy GUI clicks).

Back when we wrote our own IP stack, I solved this problem by never (ever) timing out ARP table entries. That has obvious drawbacks. A more robust and perfectly reasonable solution might be to refresh the entry timeout whenever a packet from the same MAC/IP combo is seen. That way an entry would only get timed out if that peer hadn't communicated with the stack for the full timeout period.
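
A minimal sketch of that refresh-on-traffic idea, again in Python and again with invented names (`RefreshingArpCache`, `on_packet_received`). It assumes the stack can hand the cache the source IP and source MAC of every inbound frame. Only an existing, matching MAC/IP pair is refreshed; learning new mappings is still left to ARP itself.

```python
class RefreshingArpCache:
    """Toy ARP cache: an entry expires only after a period of silence."""

    def __init__(self, idle_timeout_s):
        self.idle_timeout_s = idle_timeout_s
        self._entries = {}  # ip -> (mac, time the peer was last seen)

    def learn(self, ip, mac, now):
        """Record a mapping, e.g. from an ARP reply."""
        self._entries[ip] = (mac, now)

    def on_packet_received(self, src_ip, src_mac, now):
        """Refresh the timer if the frame matches a cached MAC/IP pair."""
        entry = self._entries.get(src_ip)
        if entry is not None and entry[0] == src_mac:
            self._entries[src_ip] = (src_mac, now)
        # A mismatch is ignored here, so a stale entry still ages out.

    def lookup(self, ip, now):
        """Return the cached MAC, or None if unknown or idle too long."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, last_seen = entry
        if now - last_seen >= self.idle_timeout_s:
            del self._entries[ip]
            return None
        return mac
```

The design choice is that a peer that keeps talking never causes an ARP round trip in mid-operation, while a peer that goes silent or changes its MAC address is still flushed after `idle_timeout_s`.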

But now we're using our vendor's IP stack, and we're back to the stupid ARP timeouts. We have enough leverage with this vendor that I could perhaps get them to use a less inconvenient scheme. However, the universality of this brain-dead timeout algorithm leads me to believe it might be a required part of the implementation.

So that's the question. Is this behavior somehow required?

T.E.D. asked Mar 15 '13 19:03


2 Answers

RFC1122 Requirements for Internet Hosts discusses this.

     2.3.2.1  ARP Cache Validation

        An implementation of the Address Resolution Protocol (ARP)
        [LINK:2] MUST provide a mechanism to flush out-of-date cache
        entries.  If this mechanism involves a timeout, it SHOULD be
        possible to configure the timeout value.

      ...

       DISCUSSION:
             The ARP specification [LINK:2] suggests but does not
             require a timeout mechanism to invalidate cache entries
             when hosts change their Ethernet addresses.  The
             prevalence of proxy ARP (see Section 2.4 of [INTRO:2])
             has significantly increased the likelihood that cache
             entries in hosts will become invalid, and therefore
             some ARP-cache invalidation mechanism is now required
             for hosts.  Even in the absence of proxy ARP, a long-
             period cache timeout is useful in order to
             automatically correct any bad ARP data that might have
             been cached.

Networks can be very dynamic. DHCP servers can assign the same IP address to a different computer once an old lease expires (making the cached ARP data invalid), and there can be IP conflicts that will never be noticed unless ARP requests are made periodically.

It also provides a mechanism for checking if a host is still on the network. Imagine you're streaming a video over UDP to some IP address 192.168.0.5. If you cache the MAC address of that machine forever, you'll just keep spamming out UDP packets even if the host goes down. Doing an ARP request every now and then will stop the stream with a destination unreachable error because no one responded with a MAC for that IP.

PherricOxide answered Sep 21 '22 08:09


It originated in distrust of routing protocols, especially in the non-Ethernet world (especially MIT's Chaosnet). David Moon, one of the early "ARPAnauts", is credited specifically about this in the original ARP RFC (RFC 826).

You can, of course, keep the other guys' ARP caches from timing out by proactively broadcasting your own ARP announcements. Most Ethernet layers will accept gratuitous ARP responses into their caches without trying to correlate them to ARP requests they have previously sent.
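
As a rough illustration of what such an announcement looks like on the wire, here is a Python sketch that only builds the Ethernet frame for a gratuitous ARP. The function names and the addresses in the usage note are made up. The layout follows the standard Ethernet/IPv4 ARP packet format: sender and target IP are both the announcing host's own address, and the target hardware address is left as zeros, which is an assumption of this sketch. Pass `op=2` for the reply form mentioned above.

```python
import socket
import struct

ETH_P_ARP = 0x0806           # EtherType for ARP
ETH_BROADCAST = b"\xff" * 6  # Ethernet broadcast address


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_gratuitous_arp(own_mac, own_ip, op=1):
    """Build a broadcast Ethernet frame announcing own_ip at own_mac.

    op=1 is the ARP request form, op=2 the ARP reply form.
    """
    sha = mac_to_bytes(own_mac)      # sender hardware address
    spa = socket.inet_aton(own_ip)   # sender protocol (IPv4) address
    eth_header = ETH_BROADCAST + sha + struct.pack("!H", ETH_P_ARP)
    arp_fixed = struct.pack(
        "!HHBBH",
        1,       # hardware type: Ethernet
        0x0800,  # protocol type: IPv4
        6,       # hardware address length
        4,       # protocol address length
        op,      # opcode
    )
    tha = b"\x00" * 6                # target hardware address (assumed zeros)
    tpa = spa                        # target IP == sender IP makes it gratuitous
    return eth_header + arp_fixed + sha + spa + tha + tpa
```

For example, `build_gratuitous_arp("02:00:00:00:00:01", "192.168.0.5")` yields a 42-byte frame. Actually transmitting it is platform specific; on Linux it would typically go out through a raw `AF_PACKET` socket bound to the interface, which requires elevated privileges, and an RTOS stack would need its own raw-send hook.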

Ross Patterson answered Sep 19 '22 08:09