** From: Routing TCP/IP, Volume I, 2nd Edition **
Routing Protocol Basics
- A routing protocol is the language a router speaks with other routers to share information about the reachability and status of networks.
- Obviously, for communication to occur, the communicators must speak the same language, i.e. use the same routing protocol.
- All dynamic routing protocols are built around an algorithm.
- At a minimum, a routing algorithm must specify the following:
- how to pass reachability information about networks to other routers
- how to receive reachibility information from other routers
- how to determine optimal routes and install them into the routing table
- how to react to and advertise changes in a network
- All subnets within a network must be connected to a router, and wherever a router has an interface on a network, that interface must have an address on the network.
- When there are multiple routes to the same destination, a router must have a mechanism for calculating the best path. A metric is a variable assigned to routes as a means of ranking them from best to worst or from most preferred to least preferred.
- Different routing protocols use different metrics. For example, RIP defines the “best” route as the one with the least number of router hops; EIGRP defines the “best” route based on a combination of the lowest bandwidth along the route and the total delay of the route.
- Example metric components:
- Hop count - A hop-count metric simply counts router hops. The best route is the one with the fewest hops.
- Bandwidth - A bandwidth metric would choose a higher-bandwidth path over a lower-bandwidth link.
- Load - Load reflects the amount of traffic utilizing the links along the path. The best path is the one with the lowest load. Unlike hop count and bandwidth, the load on a route changes, and, therefore, the metric will change. Care must be taken here. Route flaps can have adverse effects on the router’s CPU, the bandwidth of the data links, and the overall stability of the network.
- Delay - Delay is a measure of the time a packet takes to traverse a route. The path with the least delay would be chosen as the best path.
- Reliability - Reliability measures the likelihood that the link will fail in some way and can be either variable or fixed. The path with highest reliability would be selected as best.
- Cost - Cost is configured by a network administrator to reflect more- or less-preferred routes. Cost might be defined by any policy or link characteristic or might reflect the arbitrary judgment of the network administrator. Therefore, “cost” is a term of convenience describing a dimensionless metric.
- A dynamic routing protocol must include a set of procedures for a router to inform other routers about its directly connected networks, to receive and process the same information from other routers, and to pass along the information it receives from other routers.
- Further, the reachability information in the route tables of all routers in the network must be consistent.
- The process of bringing all route tables to a state of consistency is called convergence.
- The time it takes to share information across a network and for all routers to calculate best paths is the convergence time.
- It is during the intermediate time, when the network is in an unconverged state, that routing errors might occur. Therefore, convergence time is an important factor in any routing protocol. The faster a network can reconverge after a topology change, the better.
- Load balancing is the practice of distributing traffic among multiple paths to the same destination, so as to use bandwidth efficiently.
- Load balancing should be implemented to alternate traffic between the two paths.
- Load balancing can be equal cost or unequal cost, and per packet or per destination.
Distance Vector Routing Protocols
- Most routing protocols fall into one of two classes: distance vector or link state.
- The name distance vector is derived from the fact that routes are advertised as vectors of (distance, direction), where distance is defined in terms of a metric and direction is defined in terms of the next-hop router.
- For example, “Destination A is a distance of five hops away, in the direction of next-hop Router X.”
- Because each router depends on its neighbors for information, which the neighbors in turn might have learned from their neighbors, and so on, distance vector routing is sometimes facetiously referred to as “routing by rumor.”
- Notable distance vector routing protocols: RIP and (E)IGRP.
- A typical distance vector routing protocol uses a routing algorithm in which routers periodically send routing updates to all neighbors by broadcasting their entire route tables.
- Periodic updates means that at the end of a certain time period, updates will be transmitted. The interval typically ranges from 10 seconds to 90 seconds. Too frequent updates might cause CPU congestion and too infrequent updates might cause the convergence time to be unacceptably high.
- A distance vector routing protocol sends its updates to neighboring routers and depends on them to pass the update information along to their neighbors.
- When a router first becomes active on a network, how does it find other routers and how does it announce its own presence? Several methods are available. The simplest is to send the updates to the broadcast address (in the case of IP, 255.255.255.255). Interested routers will process the broadcasts, other will drop them.
- Most distance vector routing protocols take the very simple approach of telling their neighbors everything they know by broadcasting their entire route table (some exceptions). Neighbors receiving these updates glean the information they need and discard everything else.
- Distance vector algorithms provide road signs to networks. They provide the direction and the distance, but no details about what lies along the route.
- How will distance vector routing handle reconvergence when some part of the topology changes? Simple- in its next scheduled update, the router flags the network as unreachable and passes the information along.
- Typical periods for route timeouts range from three to six update periods. A router would not want to invalidate a route after a single update has been missed, because this event might be the result of a corrupted or lost packet or some sort of network delay. At the same time, if the period is too long, reconvergence will be excessively slow.
- As described so far, at every update period each router broadcasts its entire route table to every neighbor. But is this really necessary?
- A route pointing back to the router from which packets were received is called a reverse route.
- Split horizon is a technique for preventing reverse routes between two routers.
- Implementing split horizon prevents the possibility of a routing loop.
- There are two categories of split horizon: simple split horizon and split horizon with poisoned reverse.
- The rule for simple split horizon is, when sending updates out a particular interface, do not include networks that were learned from updates received on that interface.
- Simple split horizon works by suppressing information. Split horizon with poisoned reverse is a modification that provides more positive information.
- The rule for split horizon with poisoned reverse is, when sending updates out a particular interface, designate any networks that were learned from updates received on that interface as unreachable.
- Notice that a route is marked as unreachable by setting the metric to infinity; in other words, the network is an infinite distance away.
- Split horizon with poisoned reverse is considered safer and stronger than simple split horizon—a sort of “bad news is better than no news at all” approach.
- Most modern distance vector implementations use split horizon with poisoned reverse. The trade-off is that routing update packets are larger, which might exacerbate any congestion problems on a link.
- Triggered updates, also known as flash updates, are very simple: If a metric changes for better or for worse, a router will immediately send out an update without waiting for its update timer to expire.
- Reconvergence will occur far more quickly than if every router had to wait for regularly scheduled updates, and the problem of counting to infinity is greatly reduced, although not completely eliminated.
- Regular updates might still occur along with triggered updates.
- A further refinement is to include in the update only the networks that actually triggered it, rather than the entire route table.
- Holddown timers introduce a certain amount of skepticism to reduce the acceptance of bad routing information.
- If the distance to a destination increases (for example, the hop count increases from two to four), the router sets a holddown timer for that route. Until the timer expires, the router will not accept any new updates for the route.
- Obviously, a trade-off is involved here. The likelihood of bad routing information getting into a table is reduced but at the expense of the reconvergence time.
Counting to Infinity
- Split horizon will break loops between neighbors, but it will not stop loops in a network.
- The way to alleviate the effects of counting to infinity is to define infinity. Most distance vector protocols define infinity to be 16 hops.
- At that time, the network will be considered unreachable.
- This method is also how routers advertise a network as unreachable. Whether it is a poisoned reverse route, a network that has failed, or a network beyond the maximum network diameter of 15 hops, a router will recognize any 16-hop route as unreachable.
- Setting a maximum hop count of 15 helps solve the counting-to-infinity problem, but convergence will still be very slow. Given an update period of 30 seconds, a network could take up to 7.5 minutes to reconverge and is susceptible to routing errors during this time. Triggered updates can be used to reduce this convergence time.
Link State Routing Protocols
- Distance vector routing protocols has been compared to the information available from a road sign.
- Link state routing protocols are like a road map.
- A link state router cannot be fooled as easily into making bad routing decisions, because it has a complete picture of the network.
- Each router originates information about itself, its directly connected links, the state of those links (hence the name), and any directly connected neighbors.
- This information is passed around from router to router, each router making a copy of it, but never changing it.
- The ultimate objective is that every router has identical information about the network, and each router will independently calculate its own best paths.
- Link state routing protocols are built around a well-known algorithm from graph theory, E. W. Dijkstra’s shortest path algorithm.
- The most well-known link state routing protocols are OSPF and IS-IS.
- The basic functionality of link state routing protocols is not complex at all:
- Each router establishes a relationship—an adjacency—with each of its neighbors.
- Each router sends a data unit we will call link state advertisements (LSAs) to each neighbor. Each neighbor receiving an advertisement in turn forwards (floods) the advertisement to its own neighbors.
- Each router stores a copy of all the LSAs it has seen in a database. If all works well, the databases in all routers should be identical.
- The completed topological database, also called the link state database, is used by the Dijkstra algorithm to compute a graph of the network indicating the shortest path to each router. The link state protocol can then consult the link state database to find what subnets are attached to each router, and enter this information into the routing table.
- Neighbor discovery is the first step to setting up link state routing.
- A Hello protocol is used for this purpose.
- At a minimum, the Hello packet will contain a router ID (RID) and the address of the network on which the packet is being sent.
- The router ID is something by which the router originating the packet can be uniquely distinguished from all other routers.
- Other fields of the packet might carry a subnet mask, Hello intervals, a specified maximum period the router will wait to hear a Hello before declaring the neighbor “dead,” a descriptor of the circuit type, and flags to help in bringing up adjacencies.
- When two routers have discovered each other as neighbors, they go through a process of synchronizing their databases in which they exchange and verify database information until their databases are identical.
- Beyond building adjacencies, Hello packets serve as keepalives to monitor the adjacency.
- If Hellos are not heard from an adjacent neighbor within a certain established time, the neighbor is considered unreachable and the adjacency is broken.
Link State Flooding
- After the adjacencies are established, the routers might begin sending out LSAs.
- The term flooding implies the advertisements are sent to every neighbor.
- Each received LSA is copied and forwarded to every neighbor except the one that sent the LSA.
- The flooding process is the most complex piece of a link state protocol.
- There are several ways to make flooding more efficient and more reliable, such as using unicast and multicast addresses, checksums, and positive acknowledgments.
- LSAs contain a sequence number.
- When a router receives an LSA that is already in the database and its sequence number is the same, the received information is discarded.
- If the sequence number is greater, the received information and new sequence number are entered into the database and the LSA is flooded.
- In this way, flooding is abated when all routers have seen a copy of the most recent LSA.
- Because the sequence numbers are carried in a set field within the LSAs, the numbers must have some upper boundary. What happens when this maximum sequence number is reached?
- IS-IS uses a 32-bit linear sequence number space.
- If a restarted router issues to a neighbor an LSA with a sequence number that appears to be older than the neighbor’s stored sequence number, the neighbor will send its own stored LSA and sequence number back to the router. The router will thus learn the sequence number it was using before it restarted and can adjust accordingly.
- The problem with circular spaces is that there is no number less than all other numbers. The problem with linear spaces is that they are—well—not circular. That is, their set of sequence numbers is finite.
- An implementation of the lollipop-shaped sequence number space. A 32-bit signed number space N is used, yielding 231 positive numbers and 231 negative numbers. –N (–231, or 0x80000000) and N – 1 (231 – 1, or 0x7FFFFFFF) are not used. A router coming online will begin its sequence numbers at –N + 1 (0x80000001) and increment up to zero, at which time it has entered the circular number space. When the sequence reaches N – 2 (0x7FFFFFFE), the sequence wraps back to zero (again, N – 1 is unused).
- Next, suppose the router restarts. The sequence number of the last LSA sent before the restart is 0x00005de3 (part of the circular sequence space). As it synchronizes its database with its neighbor after the restart, the router sends an LSA with a sequence number of 0x80000001 (–N + 1). The neighbor looks into its own database and finds the pre-restart LSA with a sequence number of 0x00005de3. The neighbor sends this LSA to the restarted router, essentially saying, “This is where you left off.” The restarted router then records the LSA with the positive sequence number. If it needs to send a new copy of the LSA at some future time, the new sequence number will be 0x00005de6.
- The current version of OSPF, OSPFv2 (originally specified in RFC 1247), adopts the best features of linear and lollipop sequence number spaces. It uses a signed number space like lollipop sequence numbers, beginning with 0x80000001. However, when the sequence number goes positive, the sequence space continues to be linear until it reaches the maximum of 0x7FFFFFFF. At that point, the OSPF process must flush the LSA from all link state databases before restarting.
- The LSA format should include a field for the age of the advertisement. When an LSA is created, the router sets this field to one. As the packet is flooded, each router increments the age of the advertisement.
- The protocol defines a maximum age difference (MaxAgeDiff) value for the network. A router might receive multiple copies of the same LSA with identical sequence numbers but different ages. If the difference in the ages is lower than the MaxAgeDiff, it is assumed that the age difference was the result of normal network latencies; the original LSA in the database is retained, and the newer LSA (with the greater age) is not flooded. If the difference is greater than the MaxAgeDiff value, it is assumed that an anomaly has occurred in the network in which a new LSA was sent without incrementing the sequence number. In this case, the newer LSA will be recorded, and the packet will be flooded. A typical MaxAgeDiff value is 15 minutes (used by OSPF).
- If the age for a link state record is incremented up to some maximum age (MaxAge)—again defined by the specific routing protocol—the LSA, with age field set to the MaxAge value, is flooded to all neighbors and the record is deleted from the databases.
- If the LSA is to be flushed from all databases when MaxAge is reached, there must be a mechanism to periodically validate the LSA and reset its timer before MaxAge is reached. A link state refresh time (LSRefreshTime) is established; when this time expires, a router floods a new LSA to all its neighbors, who will reset the age of the sending router’s records to the new received age. OSPF defines a MaxAge of 1 hour and an LSRefreshTime of 30 minutes.
Link State Database
- The link state or topological database stores the LSAs as a series of records. Although a sequence number and age and possibly other information are included in the LSA, these variables exist mainly to manage the flooding process. The important information for the shortest path determination process is the advertising router’s ID, its attached networks and neighboring routers, and the cost associated with those networks or neighbors.
- LSAs might include two types of generic information:
- Router link information advertises a router’s adjacent neighbors with a triple of (Router ID, Neighbor ID, Cost), where cost is the cost of the link to the neighbor.
- Stub network information advertises a router’s directly connected stub subnets (subnets with no neighbors) with a triple of (Router ID, Network ID, Cost).
- The shortest path first (SPF) algorithm is run once for the router link information to establish shortest paths to each router, and then stub network information is used to add these networks to the routers.
- Link costs are calculated for the outgoing direction from an interface and do not necessarily have to be the same at all interfaces on a link.
- An area is a subset of the routers that make up a network.
- Dividing a network into areas is a response to three concerns commonly expressed about link state protocols:
- The necessary databases require more memory than a distance vector protocol requires.
- The complex algorithm requires more CPU time than a distance vector protocol requires.
- The flooding of link state packets adversely affects available bandwidth, particularly in unstable networks.
- When a network is subdivided into areas, the routers within an area need to flood LSAs only within that area and therefore need to maintain a link state database only for that area.
- The smaller database means less required memory in each router and fewer CPU cycles to run the SPF algorithm on that database.
- If frequent topology changes occur, the resulting flooding will be confined to the area of the instability.
- The routers connecting two areas (Area Border Routers, in OSPF terminology) belong to both areas and must maintain separate topological databases for each.
- A router in one area that wants to send a packet to another area only needs to know how to find its local Area Border Router.
Interior and Exterior Gateway Protocols
- Another layer is added to this hierarchical structure by grouping areas into larger areas. These higher-level areas are called autonomous systems in the IP world and routing domains in the ISO world.
- The routing protocols that run within an autonomous system are referred to as Interior Gateway Protocols (IGP).
- Routing protocols that route between autonomous systems or routing domains are referred to as Exterior Gateway Protocols (EGP).
- Whereas IGPs discover paths between networks, EGPs discover paths between autonomous systems.
Static or Dynamic Routing?
- The price of this “automation” (dynamic routing) is paid in bandwidth and maybe queue space, in memory, and in processing time.
- The routing protocol decides what the best path is to a given destination, you don’t.
- When designing a network, the simplest solution is almost always the best solution. It is good practice to choose a dynamic routing protocol only after determining that static routing is not a practical solution for the design at hand.
- Hub and spoke (star) topology: Configure one static route in the hub router for each spoke router and a single default route in each spoke router pointing to the hub, and the network is ready to go.