It is possible for routing-update algorithms based on the distance-vector idea to eliminate routing loops – and thus the slow-convergence problem – entirely. We present brief descriptions of two such algorithms.
DSDV, or Destination-Sequenced Distance Vector, was proposed in [PB94]. It avoids routing loops by the introduction of sequence numbers: each router will always prefer routes with the most recent sequence number, and bad-news information will always have a lower sequence number than the next cycle of corrected information.
DSDV was originally proposed for MANETs (3.7.8 MANETs) and has some additional features for traffic minimization that, for simplicity, we ignore here. It is perhaps best suited for wired networks and for small, relatively stable MANETs.
DSDV forwarding tables contain entries for every other reachable node in the system. One successor of DSDV, Ad Hoc On-Demand Distance Vector routing or AODV (9.4.2 AODV), allows forwarding tables to contain only those destinations in active use; a mechanism is provided for discovery of routes to newly active destinations.
Under DSDV, each forwarding table entry contains, in addition to the destination, cost and next_hop, the current sequence number for that destination. When neighboring nodes exchange their distance-vector reachability reports, the reports include these per-destination sequence numbers.
When a router R receives a report from neighbor N for destination D, and the report contains a sequence number larger than the sequence number for D currently in R’s forwarding table, then R always updates to use the new information. The three cost-minimization rules of 9.1.1 Distance-Vector Update Rules above are used only when the incoming and existing sequence numbers are equal.
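The update rule just described can be sketched in a few lines of Python. This is only an illustration, not a DSDV implementation: the names `Entry` and `dsdv_update`, the dictionary-based table, and the explicit `link_cost` parameter are all assumptions made for the example.

```python
from dataclasses import dataclass

INFINITY = float("inf")

@dataclass
class Entry:
    cost: float
    next_hop: str
    seq: int

def dsdv_update(table, neighbor, link_cost, dest, reported_cost, reported_seq):
    """Apply one report line from `neighbor` to `table` (dest -> Entry).

    Returns True if the table changed.
    """
    new = Entry(reported_cost + link_cost, neighbor, reported_seq)
    old = table.get(dest)
    # New destination, or a strictly newer sequence number: always accept.
    if old is None or reported_seq > old.seq:
        table[dest] = new
        return True
    # Older sequence number: always ignore.
    if reported_seq < old.seq:
        return False
    # Equal sequence numbers: fall back to the ordinary distance-vector rules.
    if new.cost < old.cost:                        # lower-cost rule
        table[dest] = new
        return True
    if old.next_hop == neighbor and new.cost != old.cost:   # next_hop-increase rule
        table[dest] = new
        return True
    return False

table = {}
dsdv_update(table, "N", 1, "D", 2, 10)   # first route to D, cost 3 via N
dsdv_update(table, "M", 1, "D", 5, 12)   # newer sequence number wins despite higher cost
print(table["D"])                        # Entry(cost=6, next_hop='M', seq=12)
```

Note that the second report replaces the first even though it is more expensive; cost is consulted only when the sequence numbers tie.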
Each time a router R sends a report to its neighbors, it includes a new value for its own sequence number, which it always increments by 2. This number is then entered into each neighbor’s forwarding-table entry for R, and is then propagated throughout the network via continuing report exchanges. Any sequence number originating this way will be even, and whenever another node’s forwarding-table sequence number for R is even, then its cost for R will be finite.
Infinite-cost reports are generated in the usual way when former neighbors discover they can no longer reach one another; however, in this case each node increments the sequence number for its former neighbor by 1, thus generating an odd value. Any forwarding-table entry with infinite cost will thus always have an odd sequence number. If A and B are neighbors, and A’s current sequence number is s, and the A–B link breaks, then B will start reporting A at cost ∞ with sequence number s+1 while A will start reporting its own new sequence number s+2. Any other node now receiving a report originating with B (with sequence number s+1) will mark A as having cost ∞, but will obtain a valid route to A upon receiving a report originating from A with new (and larger) sequence number s+2.
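As a small illustration of the A–B example, the sketch below shows that a third node ends up with the valid route whichever report reaches it first. The helper names and the finite cost of 3 are assumptions chosen for the example.

```python
INFINITY = float("inf")

def next_own_seq(seq):
    """A node advances its own sequence number by 2, so it stays even."""
    return seq + 2

def break_seq(seq):
    """Sequence number used when declaring a former neighbor unreachable (odd)."""
    return seq + 1

def newer(current, incoming):
    """Choose between two (cost, seq) reports for one destination by sequence number."""
    return incoming if incoming[1] > current[1] else current

s = 100                                # A's sequence number when the A-B link breaks
stale = (3, s)                         # what a third node currently holds for A
bad = (INFINITY, break_seq(s))         # originates with B: cost infinity, seq s+1
good = (3, next_own_seq(s))            # originates with A: finite cost, seq s+2

bad_first = newer(newer(stale, bad), good)
good_first = newer(newer(stale, good), bad)
print(bad_first, good_first)           # (3, 102) (3, 102)
```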
The triggered-update mechanism is used: if a node receives a report with some destinations newly marked with infinite cost, it will in turn forward this information immediately to its other neighbors, and so on. This is, however, not essential; “bad” and “good” reports are distinguished by sequence number, not by relative arrival time.
It is now straightforward to verify that the slow-convergence problem is solved. After a link break, if there is some alternative path from router R to destination D, then R will eventually receive D’s latest even sequence number, which will be greater than any sequence number associated with any report listing D as unreachable. If, on the other hand, the break partitioned the network and there is no longer any path to D from R, then the highest sequence number circulating in R’s half of the original network will be odd and the associated table entries will all list D at cost ∞. One way or another, the network will quickly settle down to a state where every destination’s reachability is accurately described.
In fact, a stronger statement is true: not even transient routing loops are created. We outline a proof. First, whenever router R has next_hop N for a destination D, then N’s sequence number for D must be greater than or equal to R’s, as R must have obtained its current route to D from one of N’s reports. A consequence is that all routers participating in a loop for destination D must have the same (even) sequence number s for D throughout. This means that the loop would have been created if only the reports with sequence number s were circulating. As we noted in 9.1.1 Distance-Vector Update Rules, any application of the next_hop-increase rule must trace back to a broken link, and thus must involve an odd sequence number. Thus, the loop must have formed from the sequence-number-s reports by the application of the first two rules only. But this violates the claim in Exercise 10.0.
There is one drawback to DSDV: nodes may sometimes briefly switch to routes that are longer than optimum (though still correct). This is because a router is required to use the route with the newest sequence number, even if that route is longer than the existing route. If A and B are two neighbors of router R, and B is closer to destination D but slower to report, then every time D’s sequence number is incremented R will receive A’s longer route first, and switch to using it, and B’s shorter route shortly thereafter.
DSDV implementations usually address this by having each router R keep track of the time interval between the first arrival at R of a new route to a destination D with a given sequence number, and the arrival of the best route with that sequence number. During this interval following the arrival of the first report with a new sequence number, R will use the new route, but will refrain from including the route in the reports it sends to its neighbors, anticipating that a better route will soon arrive.
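One possible way to keep this bookkeeping is sketched below. The class name `SettlingTracker`, its methods, and the choice to use the most recently observed first-to-best interval as the hold time are all assumptions for illustration; the text does not specify how the interval is estimated.

```python
class SettlingTracker:
    """Tracks, per destination, how long the best route lags the first route."""

    def __init__(self):
        self.first_seen = {}   # (dest, seq) -> arrival time of first route with this seq
        self.best_cost = {}    # (dest, seq) -> lowest cost seen so far for this seq
        self.settle = {}       # dest -> last observed first-to-best interval

    def record(self, dest, seq, cost, now):
        """Note the arrival at time `now` of a route to `dest` with the given seq and cost."""
        key = (dest, seq)
        if key not in self.first_seen:
            self.first_seen[key] = now
            self.best_cost[key] = cost
        elif cost < self.best_cost[key]:
            self.best_cost[key] = cost
            self.settle[dest] = now - self.first_seen[key]

    def may_advertise(self, dest, seq, now):
        """True once the expected settling interval has passed since first arrival."""
        wait = self.settle.get(dest, 0.0)
        return now - self.first_seen[(dest, seq)] >= wait

tracker = SettlingTracker()
tracker.record("D", 10, cost=5, now=0.0)    # A's longer route arrives first
tracker.record("D", 10, cost=3, now=0.5)    # B's shorter route arrives 0.5 s later
tracker.record("D", 12, cost=5, now=10.0)   # next cycle: the longer route is first again
print(tracker.may_advertise("D", 12, now=10.25))   # False: still waiting for the better route
print(tracker.may_advertise("D", 12, now=11.0))    # True
```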
This works best when the hopcount cost metric is being used, because in this case the best route is likely to arrive first (as the news had to travel the fewest hops), and at the very least will arrive soon after the first route. However, if the network’s cost metric is unrelated to the hop count, then the time interval between first-route and best-route arrivals can involve multiple update cycles, and can be substantial.
AODV, or Ad-hoc On-demand Distance Vector routing, is another routing mechanism often proposed for MANETs, though it is suitable for some wired networks as well. Unlike DSDV, above, AODV messages circulate only if a link breaks, or when a node is looking for a route to some other node; this second case is the rationale for the “on-demand” in the name. For larger MANETs, this may result in a significant reduction in routing-management traffic. AODV is described in [PR99] and RFC 3561.
The “ad hoc” in the name was intended to suggest that the protocol is well-suited for mobile nodes forming an ad hoc network (3.7.4 Access Points). It is, but the protocol also works well with infrastructure Wi-Fi networks (those with access points).
AODV has three kinds of messages: RouteRequest or RREQ, for nodes that are looking for a path to a destination, RouteReply or RREP, as the response, and RouteError or RERR for the reporting of broken links.
AODV performs reasonably well for MANETs in which the nodes are highly mobile, though it does assume all routing nodes are trustworthy.
AODV is loop-free, due to the way it uses sequence numbers. However, it does not always find the shortest route right away, and may in fact not find the shortest route for an arbitrarily long interval.
Each AODV node maintains a node sequence number and also a broadcast counter. Every routing message contains a sequence number for the destination, and every routing record kept by a node includes a field for the destination’s sequence number. Copies of a node’s sequence number held by other nodes may not be the most current; however, nodes always discard routes with an older (smaller) sequence number as soon as they hear about a route with a newer sequence number.
AODV nodes also keep track of other nodes that are directly reachable; in the diagram below we will assume these are the nodes connected by a line.
If node A wishes to find a route to node F, as in the diagram below, the first step is for A to increment its sequence number and send out a RouteRequest. This message contains the addresses of A and F, A’s just-incremented sequence number, the highest sequence number of any previous route to F that is known to A (if any), a hopcount field set initially to 1, and A’s broadcast counter. The end result should be a route from A to F, entered at each node along the path, and also a return route from F back to A.
The RouteRequest is sent initially to A’s direct neighbors, B and C in the diagram above, using UDP. We will assume for the moment that the RouteRequest reaches all the way to F before a RouteReply is generated. This is always the case if the “destination only” flag is set, though if not then it is possible for an intermediate node to generate the RouteReply.
A node that receives a RouteRequest must flood it (“broadcast” it) out all its interfaces to all its directly reachable neighbors, after incrementing the hopcount field. B therefore sends A’s message to C and D, and C sends it to B and E. For this example, we will assume that C is a bit slow sending the message to E.
Each node receiving a RouteRequest must hang on to it for a short interval (typically 3 seconds). During this period, if it sees a duplicate of the RouteRequest, identified by having the same source and the same broadcast counter, it discards it. This discard rule ensures that RouteRequest messages do not circulate endlessly around loops; it may be compared to the reliable-flooding algorithm in 9.5 Link-State Routing-Update Algorithm.
A node receiving a new RouteRequest also records (or updates) a routing-table entry for reaching the source of the RouteRequest. Unless there was a pre-existing newer route (that is, with larger sequence number), the entry is marked with the sequence number contained in the message, and with next_hop the neighbor from which the RouteRequest was received. This process ensures that, as part of each node’s processing of a RouteRequest message, it installs a return route back to the originator.
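The duplicate-discard and return-route rules can be sketched as follows. This is a simplified illustration under stated assumptions: the class and field names are invented for the example, the expiry of remembered requests after the short hold interval is omitted, and RouteReply generation is not shown.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class RouteRequest:
    source: str
    source_seq: int
    dest: str
    broadcast_id: int
    hopcount: int

class Node:
    def __init__(self, name):
        self.name = name
        self.seen = set()    # (source, broadcast_id) pairs already handled
        self.routes = {}     # destination -> (next_hop, hopcount, seq)

    def receive_rreq(self, rreq, from_neighbor):
        """Process a RouteRequest; return the copy to flood, or None for a duplicate."""
        key = (rreq.source, rreq.broadcast_id)
        if key in self.seen:
            return None                      # duplicate: discard
        self.seen.add(key)
        old = self.routes.get(rreq.source)
        # Install the return route unless a newer one (larger seq) already exists.
        if old is None or old[2] <= rreq.source_seq:
            self.routes[rreq.source] = (from_neighbor, rreq.hopcount, rreq.source_seq)
        return replace(rreq, hopcount=rreq.hopcount + 1)

b = Node("B")
forwarded = b.receive_rreq(RouteRequest("A", 8, "F", 1, 1), from_neighbor="A")
print(forwarded.hopcount)     # 2
print(b.routes["A"])          # ('A', 1, 8)
print(b.receive_rreq(RouteRequest("A", 8, "F", 1, 2), from_neighbor="C"))   # None
```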
We will suppose that the following happen in the order indicated:
- B forwards the RouteRequest to D
- D forwards the RouteRequest to E and G
- C forwards the RouteRequest to E
- E forwards the RouteRequest to F
Because E receives D’s copy of the RouteRequest first, it ignores C’s copy. This will mean that, at least initially, the return path will be longer than necessary. Variants of AODV (such as HWMP below) sometimes allow E to accept C’s message on the grounds that C has a shorter path back to A. This does mean that initial RouteRequest messages farther on in the network now have incorrect hopcount values, though these will be corrected by later RouteRequest messages.
After the above messages have been received, each node has a path back to A as indicated by the blue arrows below:
F now increments its own sequence number and creates a RouteReply message; F then sends it to A by following the highlighted (unicast) arrows above, F→E→D→B→A. As each node on the path processes the message, it creates (or updates) its route to the final destination, F; the return route to A had been created earlier when the node processed the corresponding RouteRequest.
At this point, A and F can communicate bidirectionally. (Each RouteRequest is acknowledged to ensure bidirectionality of each individual link.)
This path F→E→D→B→A is longer than necessary; a shorter path is F→E→C→A. The shorter path will be adopted if, at some future point, E learns that E→C→A is a better path, though there is no mechanism to seek out this route.
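The example can be replayed with a short simulation. The delivery order below is an assumption reconstructed from the description (A's neighbors are B and C, B floods to C and D, C floods to B and E, and C is slow in reaching E); each pair is a (sender, receiver) delivery of the same RouteRequest, and the function names are invented for the sketch.

```python
def reverse_routes(source, deliveries):
    """Return {node: next_hop toward source}, keeping only the first copy each node hears."""
    next_hop = {}
    for sender, receiver in deliveries:
        if receiver == source or receiver in next_hop:
            continue                  # duplicate RouteRequest: discarded
        next_hop[receiver] = sender
    return next_hop

def path_to_source(node, source, next_hop):
    """Follow the installed return routes from node back to source."""
    path = [node]
    while node != source:
        node = next_hop[node]
        path.append(node)
    return path

slow_c = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("D", "E"),
          ("D", "G"), ("C", "B"), ("C", "E"), ("E", "F")]
print(path_to_source("F", "A", reverse_routes("A", slow_c)))
# ['F', 'E', 'D', 'B', 'A']

# If C's copy had reached E before D's, the shorter return path would be installed.
fast_c = [("A", "B"), ("A", "C"), ("C", "E"), ("B", "D"), ("D", "E"), ("E", "F")]
print(path_to_source("F", "A", reverse_routes("A", fast_c)))
# ['F', 'E', 'C', 'A']
```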
If the “destination only” flag were not set, any intermediate node reached by the RouteRequest flooding could have answered with a route to F, if it had one. Such a node would generate the RouteReply on its own, without involving F. The sequence number of the intermediate node’s route to F must be greater than the sequence number in the RouteRequest message.
If two neighboring nodes can no longer reach one another, each sends out a RouteError message, to invalidate the route. Nodes keep track of what routes pass through them, for just this purpose. One node’s message will reach the source and the other’s the destination, at which point the route is invalidated.
In larger networks, it is standard for the originator of a RouteRequest to set the IPv4 header TTL value (or the IPv6 Hop_Limit) to a smallish value (RFC 3561 recommends an initial value of 1) to limit the scope of the RouteRequest messages. If no answer is received, the originator tries again, with a slightly larger TTL value. In a large network, this reduces the volume of RouteRequest messages that have gone too far and therefore cannot be of use in finding a route.
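A minimal sketch of this retry strategy follows. Only the initial TTL of 1 comes from the text; the increment, the threshold, and the final network-wide value are illustrative defaults, and RFC 3561 should be consulted for the recommended parameters. The `try_request` callback is a hypothetical stand-in for sending a RouteRequest and waiting for a RouteReply.

```python
def ttl_schedule(start=1, increment=2, threshold=7, net_diameter=35):
    """Yield successively larger TTL values, ending with a network-wide attempt."""
    ttl = start
    while ttl <= threshold:
        yield ttl
        ttl += increment
    yield net_diameter

def find_route(try_request, schedule):
    """Call try_request(ttl) for each TTL until it returns a reply (non-None)."""
    for ttl in schedule:
        reply = try_request(ttl)
        if reply is not None:
            return ttl, reply
    return None

print(list(ttl_schedule()))    # [1, 3, 5, 7, 35]

# Hypothetical destination four hops away: TTLs of 1 and 3 fall short, 5 succeeds.
print(find_route(lambda ttl: "route" if ttl >= 4 else None, ttl_schedule()))
# (5, 'route')
```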
AODV cannot form even short-term loops. To show this, we start with the observation that whenever a ⟨destination,next_hop⟩ forwarding entry is installed at a node, due either to a RouteRequest or to a RouteReply, the next_hop is always the node from which the RouteRequest or RouteReply was received, and therefore the destination sequence number cannot get smaller as we move from the original node to its next_hop. That is, as we follow any route to a destination, the destination sequence numbers are nondecreasing. It immediately follows that, for a routing loop, the destination sequence number is constant along the loop. This means that each node on the route must have heard of the route via the same RouteRequest or RouteReply message, as forwarded.
The second observation, completing the argument, is that the hopcount field must strictly decrease as we travel along the route to the destination; the processing rules for RouteRequests and RouteReplies mean that each node installs a hopcount of one more than that of the neighboring node from which the route was received. This is impossible for a route that returns to the same node.
The Hybrid Wireless Mesh Protocol, or HWMP, is based on AODV, and has been chosen for the IEEE 802.11s Wi-Fi mesh networking standard (Mesh Networks). In the discussion here, we will assume HWMP is being used in a Wi-Fi network, though the protocol applies to any type of network. A set of nodes is designated as the routing (or forwarding) nodes; ordinary Wi-Fi stations may or may not be included here.
HWMP replaces the hopcount metric used in AODV with an “airtime link metric” which decreases as the link throughput increases and as the link error rate decreases. This encourages the use of higher-quality wireless links.
HWMP has two route-generating modes: an on-demand mode very similar to AODV, and a proactive mode used when there is at least one identified “root” node that connects to the Internet. In this case, the route-generating protocol determines a loop-free subset of the relevant routing links (that is, a spanning tree) by which each routing node can reach the root (or one of the roots). This tree-building process does not attempt to find best paths between pairs of non-root nodes, though such nodes can use the on-demand mode as necessary.
In the first, on-demand, mode, HWMP implements a change to classic AODV in that if a node receives a RouteRequest message and then later receives a second RouteRequest message with the same sequence number but a lower-cost route, then the second route replaces the first.
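The difference from classic AODV amounts to a small change in the acceptance test, sketched below. The function name and the (sequence number, metric) tuple representation are assumptions for illustration.

```python
def hwmp_should_replace(current, incoming):
    """Decide whether an incoming RouteRequest's route replaces the current one.

    Each route is a (seq, metric) pair; current is None if no route exists yet.
    """
    if current is None:
        return True
    cur_seq, cur_metric = current
    new_seq, new_metric = incoming
    if new_seq != cur_seq:
        return new_seq > cur_seq          # newer sequence number always wins
    return new_metric < cur_metric        # same sequence number: lower cost wins

print(hwmp_should_replace((8, 40), (8, 25)))   # True: same sequence number, cheaper route
print(hwmp_should_replace((8, 40), (8, 55)))   # False
print(hwmp_should_replace((8, 40), (7, 10)))   # False: older sequence number
```

Under classic AODV, by contrast, the first of these cases would be discarded as a duplicate RouteRequest.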
In the proactive mode, the designated root node – typically the node with wired Internet access – periodically sends out specially marked RouteRequest messages. These are sent to the broadcast address, rather than to any specific destination, but otherwise propagate in the usual way. Routing nodes receiving two copies from two different neighbors pick the one with the shortest path. Once this process stabilizes, each routing node knows the best path to the root (or to a root); the fact that each routing node chooses the best path from among all RouteRequest messages received ensures eventual route optimality. Routing nodes that have traffic to send can at any time generate a RouteReply, which will immediately set up a reverse route from the root to the node in question. Finally, reversing each link to the root allows the root to send broadcast messages.
HWMP has yet another mode: the root nodes can send out RootAnnounce (RANN) messages. These let other routing nodes know what the root is, but are not meant to result in the creation of routes to the root.
EIGRP, or the Enhanced Interior Gateway Routing Protocol, is a once-proprietary Cisco distance-vector protocol that was released as an Internet Draft in February 2013. As with DSDV, it eliminates the risk of routing loops, even ephemeral ones. It is based on the “distributed update algorithm” (DUAL) of [JG93]. EIGRP is an actual protocol; we present here only the general algorithm. Our discussion follows [CH99].
Each router R keeps a list of neighbor routers NR, as with any distance-vector algorithm. Each R also maintains a data structure known (somewhat misleadingly) as its topology table. It contains, for each destination D and each N in NR, an indication of whether N has reported the ability to reach D and, if so, the reported cost c(D,N). The router also keeps, for each N in NR, the cost cN of the link from R to N. Finally, the forwarding-table entry for any destination can be marked “passive”, meaning safe to use, or “active”, meaning updates are in process and the route is temporarily unavailable.
Initially, we expect that for each router R and each destination D, R’s next_hop to D in its forwarding table is the neighbor N for which the following total cost is a minimum:
c(D,N) + cN
Now suppose R receives a distance-vector report from neighbor N1 that it can reach D with cost c(D,N1). This is processed in the usual distance-vector way, unless it represents an increased cost and N1 is R’s next_hop to D; this is the third case in 9.1.1 Distance-Vector Update Rules. In this case, let C be R’s current cost to D, and let us say that neighbor N of R is a feasible next_hop (feasible successor in Cisco’s terminology) if N’s cost to D (that is, c(D,N)) is strictly less than C. R then updates its route to D to use the feasible neighbor N for which c(D,N) + cN is a minimum. Note that this may not in fact be the shortest path; it is possible that there is another neighbor M for which c(D,M)+cM is smaller, but c(D,M)≥C. However, because N’s path to D is loop-free, and because c(D,N) < C, this new path through N must also be loop-free; this is sometimes summarized by the statement “one cannot create a loop by adopting a shorter route”.
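The feasibility test can be sketched as below. The function name and the dictionary arguments are assumptions for the example: `reported` maps each neighbor N to c(D,N), and `link_cost` maps N to cN.

```python
def choose_feasible(current_cost, reported, link_cost):
    """Return (neighbor, total_cost) for the best feasible next_hop, or None.

    A neighbor N is feasible if its reported cost c(D,N) is strictly less
    than R's current cost C to the destination.
    """
    totals = {n: c + link_cost[n] for n, c in reported.items() if c < current_cost}
    if not totals:
        return None                      # no feasible neighbor
    best = min(totals, key=totals.get)
    return best, totals[best]

# R's current cost to D is C = 10.
reported = {"N": 8, "M": 10}     # c(D,N) = 8, c(D,M) = 10
link_cost = {"N": 5, "M": 1}     # cN = 5, cM = 1
print(choose_feasible(10, reported, link_cost))   # ('N', 13)
```

Here the route through M would cost only 11, but M is not feasible because c(D,M) is not strictly less than C, so R settles for the longer route through N that is guaranteed to be loop-free.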
If no neighbor N of R is feasible (which would be the case in the D–A–B example of 9.2 Distance-Vector Slow-Convergence Problem), then R invokes the “DUAL” algorithm. This is sometimes called a “diffusion” algorithm as it invokes a diffusion-like spread of table changes proceeding away from R.
Let C in this case denote the new cost from R to D as based on N1’s report. R marks destination D as “active” (which suppresses forwarding to D) and sends a special query to each of its neighbors, in the form of a distance-vector report indicating that its cost to D has now increased to C. The algorithm terminates when all R’s neighbors reply back with their own distance-vector reports; at that point R marks its entry for D as “passive” again.
Some neighbors may be able to process R’s report without further diffusion to other nodes, remain “passive”, and reply back to R immediately. However, other neighbors may, like R, now become “active” and continue the DUAL algorithm. In the process, R may receive other queries that elicit its distance-vector report; as long as R is “active” it will report its cost to D as C. We omit the argument that this process – and thus the network – must eventually converge.