Closed-loop control is central to most modern dynamical systems, such as aircraft, automobile engines, spacecraft, and large distributed power grids. Engineers begin the design of these complex systems by modeling the governing differential equations, referred to as the "characteristic equations." They then apply control theory to attain optimal performance using real-time measurements. Applying a feedback controller to a dynamical system is called "closing the loop."
Packet-switched routing networks are dynamical systems that, to date, have not benefited from closed-loop control. For many decades, a closed-form solution to their characteristic equations remained elusive. In the absence of closed-loop control, basic heuristic protocols such as BGP and OSPF were developed to control these networks.
The National Science Foundation sponsored research at Cornell University to end this decades-long quest. The researchers derived the characteristic equations that define any packet-switched network. They then implemented their breakthrough as HALO, the world's first distributed real-time control system for packet-switched networks (described in the original paper).
Mode was founded by these same researchers, who designed a commercial version, Mode HALO, and an autonomous global virtual network implementation, Mode SD-CORE.
Mode HALO was evaluated in a series of experiments including the National Science Foundation GENI Test, the AT&T SDN Network Design Challenge, and others, described below.
The National Science Foundation (NSF) supported the original research, and ultimately facilitated an evaluation of the initial implementation of Mode HALO on the NSF network testbed, GENI (Figure 1). In this experiment, the researchers deployed a network across the United States, using the shared compute resources at GENI Points of Presence (PoPs) to create virtual routers (connected as shown in Figure 2). They then used a randomly generated, high-demand traffic matrix to set the communication rates among PoPs, while the routers at each PoP ran Mode HALO. The resulting network path diversity required traffic at each PoP to be apportioned in some non-trivial manner, for optimal bandwidth use and minimized overall network delay in the face of dynamic traffic spikes.
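To make the idea of apportioning traffic across diverse paths concrete, here is a minimal sketch of feedback-driven splitting between two parallel paths. It is a generic textbook-style illustration, not Mode HALO's algorithm: the M/M/1 delay model, the capacities, the demand, the step size, and the names `marginal_delay` and `balance_two_paths` are all assumptions made for this example.

```python
def marginal_delay(load, capacity):
    """Derivative of the assumed M/M/1 delay cost load / (capacity - load)."""
    return capacity / (capacity - load) ** 2


def balance_two_paths(demand, cap1, cap2, step=0.01, iterations=200):
    """Return the share of demand for path 1 that equalizes marginal delay.

    Assumes demand < cap1 + cap2, so a feasible split exists.
    """
    # Keep both path loads strictly below their capacities.
    low = max(0.0, 1.0 - cap2 / demand) + 1e-6
    high = min(1.0, cap1 / demand) - 1e-6
    share = min(max(0.5, low), high)
    for _ in range(iterations):
        m1 = marginal_delay(share * demand, cap1)
        m2 = marginal_delay((1.0 - share) * demand, cap2)
        # Feedback step: move traffic toward the path whose marginal delay is lower.
        share = min(max(share + step * (m2 - m1), low), high)
    return share


# Hypothetical numbers: 9 Gbps of demand over a 10 Gbps and a 5 Gbps path.
share = balance_two_paths(demand=9.0, cap1=10.0, cap2=5.0)
load1, load2 = share * 9.0, (1.0 - share) * 9.0
print(f"path 1: {load1:.2f} Gbps, path 2: {load2:.2f} Gbps")
```

In this toy setting the split settles where both paths have the same marginal delay, which is a non-trivial ratio rather than an even or capacity-proportional one. A real network has many more paths and changing demand, so the sketch only shows the flavor of the problem.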
As shown in Figure 3, Mode HALO supported 300% of the throughput, at the lowest possible delay between hosts in New York and Sunnyvale, when compared with the prior state-of-the-art used by network operators to handle traffic spikes.
Unlike heuristic protocols, Mode HALO was able to quickly adapt to dynamic traffic changes (Figure 4) without prior knowledge. This inherent dynamism is the key to the performance, flexibility, and reliability of Mode SD-CORE.
The Mode HALO victory at the AT&T SDN Network Design Challenge further validated the operational advantage revealed by the NSF experiments. The AT&T challenge was to provide an optimal solution on a prototypical carrier network (Figure 5) in the face of rapidly rising dynamic traffic demand, with an emphasis on efficiency and cost-effectiveness. The Mode team used Mode HALO to deliver a near-optimal solution in approximately thirty seconds. Mode HALO proved unique in its ability to handle the traffic demands and scale of one of the world's largest networks in an efficient and sustainable manner.
The next test was designed to confirm the ability of Mode HALO to scale to hundreds of locations, while handling Tbps of traffic. The test setup used an actual customer network topology, with over 1,000 routing nodes, and an asymmetric link capacity of approximately 1 Gbps. The first set of tests introduced traffic changes designed to overload individual link capacity. Even at this scale, Mode HALO maintained its ability to converge and adapt rapidly (Figure 6 shows link utilization vs. time, revealing the response of the test setup to multiple, impactful traffic changes).
Test results from all cases confirmed that even in large-scale networks with widely varying (and unplanned) traffic changes, Mode HALO was able to adapt and respond in real time, optimizing link utilization and system throughput.
Another set of tests was used to reveal differences between Mode HALO and best-practice Shortest Path Routing among a varying set of PoPs with a theoretical maximum throughput of 40 Gbps. A uniformly random traffic pattern was generated and input to both the Mode HALO and Shortest Path Routing solutions.
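As a rough illustration of what a uniformly random traffic pattern can look like, the sketch below draws an independent rate for every ordered pair of PoPs. The text does not specify how the actual pattern was generated, so the interpretation (independent uniform draws), the PoP labels, the rate ceiling, and the function name `random_traffic_matrix` are assumptions for this example only.

```python
import random


def random_traffic_matrix(pops, max_rate_gbps, seed=None):
    """Map each (source, destination) pair to a rate drawn uniformly from [0, max]."""
    rng = random.Random(seed)  # seeded so the same matrix can feed both routing schemes
    return {
        src: {
            dst: 0.0 if src == dst else rng.uniform(0.0, max_rate_gbps)
            for dst in pops
        }
        for src in pops
    }


# Hypothetical PoP labels and rate ceiling.
POPS = ["NYC", "CHI", "SNV"]
matrix = random_traffic_matrix(POPS, max_rate_gbps=1.0, seed=42)
for src in POPS:
    print(src, {dst: round(rate, 3) for dst, rate in matrix[src].items()})
```

Fixing the seed matters for a comparison like this one, since both routing solutions should be driven by exactly the same demand.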
In the Shortest Path test, the network achieved 12-13 Gbps, roughly 30-33% of the network's theoretical capacity of 40 Gbps. Link utilization for this configuration is shown in Figure 7. This figure highlights the problem areas encountered by the Shortest Path algorithm, including many unused links and visible choke points that throttled traffic. Note that in Figure 7 and in Figure 8, for ease of exposition, only a subset of routing locations from the core of the network is shown.
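The way shortest-path routing leaves links idle while a choke point throttles traffic can be seen even on a tiny example. The sketch below is illustrative only: the five-link topology, the uniform 10 Gbps capacity, the 15 Gbps demand, and the hop-count metric are assumptions, not details of the test network.

```python
from collections import deque

# Hypothetical topology: a 2-hop route A-B-D and a 3-hop route A-C-E-D.
LINKS = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")]
LINK_CAPACITY_GBPS = 10.0


def shortest_path(links, src, dst):
    """Breadth-first search for the fewest-hop path from src to dst."""
    neighbors = {}
    for u, v in links:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in neighbors.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if dst not in parent:
        raise ValueError(f"no path from {src} to {dst}")
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


demand_gbps = 15.0
path = shortest_path(LINKS, "A", "D")
used = {frozenset(hop) for hop in zip(path, path[1:])}
unused = [link for link in LINKS if frozenset(link) not in used]
# All demand follows one path, so a single link's capacity caps the throughput.
delivered = min(demand_gbps, LINK_CAPACITY_GBPS)
print(path, unused, delivered)
```

Here every packet takes the two-hop route, the three links on the longer route carry nothing, and a third of the demand cannot be delivered even though the network as a whole has spare capacity.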
Figure 8 shows link utilization for the same test using Mode HALO, which delivered almost 36 Gbps of traffic: 90% of theoretical capacity, and more than 2.7x the throughput of the Shortest Path implementation. The numbers and colors in the network graph show the percentage splits of traffic flows between each router node. In the Mode HALO case, only three of the network links remain unused. In a large-scale network, Mode HALO provides a consistent performance improvement over the prior state-of-the-art due to its ability to use network capacity optimally.
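The percentages quoted for the two tests follow directly from the reported throughputs. The short check below uses only the figures given in the text, treating "almost 36 Gbps" as 36 Gbps and keeping the 12-13 Gbps range as two bounds; the variable and function names are illustrative.

```python
CAPACITY_GBPS = 40.0
SHORTEST_PATH_GBPS = (12.0, 13.0)  # reported range
HALO_GBPS = 36.0                   # "almost 36 Gbps", taken as an upper bound


def utilization(throughput_gbps, capacity_gbps=CAPACITY_GBPS):
    """Fraction of theoretical capacity actually delivered."""
    return throughput_gbps / capacity_gbps


sp_util = tuple(utilization(t) for t in SHORTEST_PATH_GBPS)
halo_util = utilization(HALO_GBPS)
ratio = tuple(HALO_GBPS / t for t in SHORTEST_PATH_GBPS)

print(f"Shortest Path utilization: {sp_util[0]:.1%} to {sp_util[1]:.1%}")
print(f"Mode HALO utilization: {halo_util:.0%}")
print(f"Throughput ratio: {ratio[1]:.2f}x to {ratio[0]:.2f}x")
```

The ratio stays above 2.7x across the whole reported Shortest Path range, which matches the "more than 2.7x" claim.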