5.3. Shifting the problem to the centralized site

Another possible approach is to shift the Internet access resiliency problem to a central site when the branch has redundant private WAN connectivity provisioned through one or more of the available methods (SDH/PDH/OTN links, MPLS VPNs, customer-managed overlays). In this scenario, the backhaul towards the central site (which then performs the ultimate handoff to the Internet) happens over physically or virtually dedicated links, and the actual addressing solution offered by the carrier serving the branch becomes irrelevant; in the extreme case the branch carrier does not provide an IP service at all (as with optical transport or a layer-2 VPN), or the provided addressing is only used to establish tunnels (as with overlays). It is not uncommon for the branch edge device to have no default route configured towards any of the carrier next hops; instead, its default route points through the overlay or uses as next hop a loopback interface address reachable through the overlay itself.
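The routing arrangement described above can be illustrated with a minimal sketch of a branch-edge forwarding table. All prefixes (taken from documentation ranges), the egress names "carrier-a", "carrier-b", "lan", and "overlay", and the lookup() helper are hypothetical and chosen only for illustration; the sketch assumes plain longest-prefix matching and does not represent any particular product.

```python
import ipaddress

# Hypothetical branch-edge routing table: (prefix, egress).
# The carrier next hops are used only to reach the hub tunnel endpoints;
# both default routes point into the overlay.
ROUTES = [
    ("198.51.100.1/32", "carrier-a"),  # hub tunnel endpoint via carrier A
    ("203.0.113.1/32", "carrier-b"),   # hub tunnel endpoint via carrier B
    ("2001:db8:100::/48", "lan"),      # branch LAN prefix
    ("::/0", "overlay"),               # IPv6 default route via the overlay
    ("0.0.0.0/0", "overlay"),          # IPv4 default route via the overlay
]


def lookup(destination: str) -> str:
    """Return the egress for `destination` using longest-prefix match."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, egress in ROUTES:
        net = ipaddress.ip_network(prefix)
        if net.version != addr.version:
            continue  # never compare IPv4 addresses with IPv6 prefixes
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, egress)
    if best is None:
        raise LookupError(f"no route to {destination}")
    return best[1]
```

With this table, only the two tunnel endpoints are reached through a carrier next hop; any other non-local destination, such as lookup("2001:db8:ffff::1"), resolves to the overlay.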
The reasoning behind this choice is based on several factors and commonly involves one or more of the following assumptions:
1. Branch site Multi-Homing is mainly a matter of first-mile redundancy due to the increased difficulty of providing stable connectivity to remote sites compared to a large central site. If the traffic can be made to traverse the first mile in an optimal environment (because the entire path is under the network administrator's control, at least at the network level), the relatively high-quality Internet circuits found at the central site can be managed using more traditional and resource-intensive techniques (for example, by significantly increasing capacity and carrier diversity, tuning routing advertisements, and using ECMP).
2. Obtaining enterprise-class connectivity, where the customer has the option to announce their own address space dynamically to the carriers, can be complex and thus cannot be done for every site.
3. Managing a geographically-distributed Internet breakout may pose greater operational, financial, and security-related challenges when the proper orchestration tools are not employed.
It is typically much easier to arrange resiliency over internal WAN links, primarily because the addressing structure and the path selection are under the control of a single entity throughout the LAN and the private WAN. This makes it possible, for example, to number the entire internetwork using a single type of addressing, recreating the benefits of the solution in section 5.1 while avoiding some of its disadvantages, namely:
- "In case of provider failure, a prefix may not be correctly deprecated": in the tunnel-based solution, all network devices are under the control of the same entity experiencing the hypothetical issue, simplifying troubleshooting and speeding up resolution.
- "Need carriers to accept PI prefix advertisements": in the tunnel-based solution, this needs to be done only once, at the central site, instead of at every site implementing the solution.
- "Carriers typically charge significantly more for such type of attachment": see the previous observation.
- "Return traffic path not guaranteed to mirror the outbound path": path symmetry in the more critical first mile can be guaranteed by the network administrator, either by configuring the network devices at the far end of the WAN to honor the original link choice of the incoming sessions in a stateful manner, or by applying the same deterministic forwarding algorithms on devices at both ends. Several vendors provide this functionality out of the box in certain products. Symmetry is thus guaranteed where it matters, i.e., where traffic must traverse links having vastly different characteristics and quality (it is not uncommon for remote sites to be served by a primary DSL/fiber link and a secondary, much more limited cellular link).
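The two ways of obtaining first-mile symmetry mentioned in the last item can be sketched as follows. This is an illustrative sketch only, not the mechanism of any specific vendor: the Flow tuple layout, the HubSessionTable class, the symmetric_pick() function, and the tunnel names are all hypothetical. The stateless variant additionally assumes that both ends are configured with the same tunnel list in the same order.

```python
import hashlib
from typing import Dict, List, Tuple

# (source, destination, protocol, source port, destination port)
Flow = Tuple[str, str, int, int, int]


def reverse(flow: Flow) -> Flow:
    """Return the 5-tuple of the opposite direction of `flow`."""
    src, dst, proto, sport, dport = flow
    return (dst, src, proto, dport, sport)


class HubSessionTable:
    """Stateful variant: the hub remembers the tunnel a session arrived on."""

    def __init__(self) -> None:
        self._tunnel_by_flow: Dict[Flow, str] = {}

    def note_inbound(self, flow: Flow, tunnel: str) -> None:
        # Keep the branch's original link choice for the session.
        self._tunnel_by_flow.setdefault(flow, tunnel)

    def tunnel_for_return(self, return_flow: Flow, default: str) -> str:
        # Return traffic is matched against the reversed 5-tuple.
        return self._tunnel_by_flow.get(reverse(return_flow), default)


def symmetric_pick(flow: Flow, tunnels: List[str]) -> str:
    """Stateless variant: a hash that is identical in both directions."""
    src, dst, proto, sport, dport = flow
    # Sorting the two endpoints makes the key direction-independent.
    a, b = sorted([(src, sport), (dst, dport)])
    key = f"{a}|{b}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return tunnels[int.from_bytes(digest[:4], "big") % len(tunnels)]
```

The stateful variant preserves whatever choice the branch made, including policy-based choices, at the cost of per-session state on the hub. The stateless variant needs no state but can only reproduce choices that the same hash function would make at the branch.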
The tunnel-based solution described in this section may also be implemented as a scheme in which the central site is not owned by the organization at all and is instead part of a service offered by a tunnel broker somewhere on the Internet. Such a choice can be appealing due to factors such as outsourcing of the operational burden and the possibility of superior performance, since the broker may operate a globally distributed and fine-tuned network of "hubs," with each site connecting to the closest one.
Another advantage of having control over the path crossing the first mile of the branch site lies in the possibility of applying error-correcting algorithms to the traffic. Several vendors offer this functionality which, although proprietary, can be made to work by placing compatible devices at the branch edge and at the central site, usually the same devices that terminate the tunnels comprising the overlay. Such techniques typically include forward error correction, compression, and packet duplication and deduplication across multiple low-quality links, to prevent or lessen packet loss across the overlay. These techniques, however, cannot improve other metrics such as latency.
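As an illustration of the general idea behind forward error correction, the following minimal sketch uses a single XOR parity packet per group of packets, which allows the receiver to rebuild one lost packet without retransmission. This is a textbook scheme chosen for brevity and is not claimed to match any vendor's proprietary algorithm; the function names are hypothetical, and the sketch assumes that packet lengths are known to the receiver (shorter packets are zero-padded here).

```python
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """XOR all packets (zero-padded to equal length) into one parity packet."""
    size = max(len(p) for p in packets)
    parity = bytes(size)
    for p in packets:
        parity = xor_bytes(parity, p.ljust(size, b"\x00"))
    return parity


def recover(received: List[Optional[bytes]], parity: bytes) -> List[Optional[bytes]]:
    """Rebuild at most one missing packet (marked as None) from the parity."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("single-parity FEC can repair only one loss per group")
    result = list(received)
    if missing:
        rebuilt = parity
        for p in received:
            if p is not None:
                rebuilt = xor_bytes(rebuilt, p.ljust(len(parity), b"\x00"))
        # A rebuilt packet shorter than the parity keeps its zero padding.
        result[missing[0]] = rebuilt
    return result
```

The sender transmits the group followed by the parity packet. The cost is one extra packet per group, which is the kind of bandwidth-for-loss trade-off that these techniques make; as noted above, it does nothing for latency.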
The principal downside of the tunnel-based solution is not making use of the "local Internet breakout": users at the branch site are almost guaranteed to experience worse performance towards Internet destinations compared to solutions listed in sections 5.1-5.2, 5.4-5.5 due to the traffic having to be backhauled to the central site first.
However, it should be noted that as long as the centralized site uses a solution similar to 5.1 or 5.2, this approach will also enable communication that would otherwise have failed or required complex use-case-specific workarounds.
In fact, despite the alleviating factors discussed above, shifting the problem to a different area of the network might not be considered a technical solution at all because the central site would face the same fundamental challenges, and it would ultimately have the same options for multi-homing as discussed in sections 5.1-5.2, 5.4-5.5. As such, what is described in this section could be considered a non-technical solution for a small site.
Enterprise WAN design in itself remains outside the scope of this document.
Advantages:
- Shifting the problem to a different location may help solve it more efficiently,
- The simplest solution for the small site,
- Supported by existing CE products (for example, through WireGuard),
- No need to renumber, hence no issues with prefix deprecation,
- No need for special support on hosts,
- Traffic steering is easy to implement, including traffic symmetry requirements or active/passive failover,
- A centralized Internet gateway simplifies perimeter security,
- Possibility of applying WAN optimization techniques to the first section of the path toward the Internet,
- Multiple free or paid tunnel brokers exist with different SLAs,
- Avoids unnecessarily polluting the global routing table, and may also yield better AS paths thanks to aggregation, compared to using a PI prefix.
Disadvantages:
- Looping the Internet traffic through the centralized site might increase latency, and additional links on the traffic path may contribute to jitter and packet loss,
- More bandwidth is needed for rented WAN links,
- Side effects related to tunneling, such as encapsulation processing and overhead,
- Convergence time in case of underlay network failures may be affected by the need to re-establish the tunnels and routing neighborships of the overlay,
- The central site becomes a single point of failure for the Internet access of the entire organization,
- Some or all of the disadvantages listed in sections 5.1-5.2, 5.4-5.5 apply, depending on the specific solution selected to solve the multi-homing issue at the central site. These may include end-to-end connectivity and traffic steering issues toward Internet destinations.
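The encapsulation overhead listed among the disadvantages can be quantified with a short calculation. The sketch below assumes a WireGuard-based overlay and uses the commonly cited per-packet overhead of that protocol (a 16-byte data message header plus a 16-byte authentication tag) carried over UDP; the function names and the example underlay MTU values are assumptions made for illustration, and payload padding is ignored.

```python
# Header sizes in bytes.
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8
# Assumed WireGuard overhead: 16-byte data header + 16-byte auth tag.
WIREGUARD_OVERHEAD = 32
# Minimum link MTU that IPv6 requires of every link it runs over.
IPV6_MIN_MTU = 1280


def overlay_mtu(underlay_mtu: int, underlay_is_ipv6: bool) -> int:
    """Largest inner packet that fits in one underlay packet."""
    outer_ip = IPV6_HEADER if underlay_is_ipv6 else IPV4_HEADER
    return underlay_mtu - outer_ip - UDP_HEADER - WIREGUARD_OVERHEAD


def can_carry_ipv6(underlay_mtu: int, underlay_is_ipv6: bool) -> bool:
    """True if the overlay can carry IPv6 without fragmenting the outer packet."""
    return overlay_mtu(underlay_mtu, underlay_is_ipv6) >= IPV6_MIN_MTU
```

For a 1500-byte underlay MTU, overlay_mtu(1500, True) leaves 1420 bytes for the inner packet over an IPv6 underlay, and overlay_mtu(1500, False) leaves 1440 bytes over an IPv4 underlay. Links with a reduced MTU, as can be the case on DSL or cellular access, shrink this further.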