SD-WAN Overlay vs Underlay: Fixing the Distorted View of the WAN

Wikipedia defines a Fata Morgana as “significantly distorting the object or objects on which they are based, often such that the object is completely unrecognizable.” – in other words, a mirage. Reading through this excellent article on future network architectures earlier today, I started thinking about what that means for enterprise networking.

If you have followed the Software-Defined Networking discussion over the years, it all started here. I urge you to watch this presentation by Prof. Scott Shenker of UC Berkeley, one of the “fathers” of SDN, to truly understand the fundamental motivation for SDN. The key moment is at about 7:45, when Shenker states that, unlike the software world, networking had completely failed to provide abstractions, and that abstractions are key to simplification and effective reuse.

Virtual Overlay Network: What is it?

Based on this vision, abstractions in networking did start to appear, first in the data center and then in the wide area network with SD-WAN. The solution is generically called a virtual overlay network. Virtual overlay networks create a software-defined, elegantly orchestrated, agile and intent-based solution on top of a static, slow-to-change underlay (based on MPLS, internet, 4G/5G or direct connectivity). I will make a contentious point: while agile virtual overlays do help the digital enterprise adapt much faster to business needs, they can also be a Fata Morgana. They can present an illusory picture of reality: you get the benefits of software-defined agility, but in real life the pitfalls of the static underlay remain, only now they are hidden from you. I know I will get challenged on this, and it is a complex discussion, but let me provide two proof points:

SD-WAN Overlay vs Underlay

  1. In the Gartner Network Performance Monitoring and Diagnostics report for 2019 [1], the main frustrations of customers are tool sprawl as well as the time required to establish RCA (root cause analysis). I am not surprised. It used to be that your single-source router vendor’s management panel provided you with complete visibility into your routed WAN network.

    Figure: Virtual tunnel overlay vs physical underlay

    With SD-WAN virtual overlays, you very often keep that infrastructure as your underlay (and if you don’t, you still have a routing infrastructure at the core of your MPLS provider’s network), and you roll out a more agile virtual infrastructure on top of it. So now you have two separate layers to manage, which adds to the tool sprawl. You also end up doing swivel-chair root cause analysis, correlating the data from the two layers in your head, which does nothing for time to resolution. You will see application performance issues pop up in your virtual overlay management tool. But how do they correlate with events in the underlay? A single virtual tunnel may well map to several physical routers and MPLS paths in the physical underlay. Mapping between the two invariably takes valuable expert time, time those experts could otherwise spend on strategic business initiatives. A commonly cited industry benchmark is that enterprises spend over $60B a year simply troubleshooting their network infrastructures. That’s about as much as they spend acquiring core network gear.

  2. The increasingly virtualized network function environment that both layers have adopted adds a third dimension to network performance troubleshooting. It used to be that you could trust the ASICs in your router (or multilayer switch; the distinction has become nearly meaningless over the years) to always deliver wire-speed performance. If you had an issue, it was seldom wire-speed forwarding performance; it could probably be attributed to capacity planning or erroneous DSCP re-marking along the way. That line gets dramatically blurred with virtualization. Of course the cause can still be insufficient bandwidth to meet new demands, but it can also be the CPU or memory allocated to a particular virtualized network function falling short within a virtualized service chain, in the overlay as well as the underlay. I don’t think it would be fun to troubleshoot an environment with that many uncorrelated variables.
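
To make the correlation problem in these two proof points concrete, here is a minimal Python sketch of what an engineer otherwise does by hand: take a degraded overlay tunnel, look up the underlay hops it rides on, and flag whether each hop looks like a capacity problem or a virtualized-function resource problem. Everything in it is hypothetical and for illustration only: the tunnel and hop names, the HopMetrics fields, the suspect_hops helper and both thresholds are assumptions, not the data model of any real SD-WAN or monitoring product.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class HopMetrics:
    """Underlay measurements for one hop (fractions between 0.0 and 1.0)."""
    link_utilization: float
    cpu_utilization: float     # CPU of the virtualized network function on this hop
    memory_utilization: float  # memory of the same function


def suspect_hops(
    tunnel: str,
    tunnel_to_hops: Dict[str, List[str]],
    hop_metrics: Dict[str, HopMetrics],
    link_threshold: float = 0.90,      # assumed threshold, tune for your network
    resource_threshold: float = 0.85,  # assumed threshold, tune for your network
) -> List[Tuple[str, str]]:
    """Return (hop, likely cause) pairs for the underlay hops behind a tunnel."""
    findings: List[Tuple[str, str]] = []
    for hop in tunnel_to_hops.get(tunnel, []):
        metrics = hop_metrics.get(hop)
        if metrics is None:
            # The overlay tool knows the tunnel, but no underlay tool reports this hop.
            findings.append((hop, "no underlay data"))
            continue
        if metrics.link_utilization >= link_threshold:
            findings.append((hop, "link capacity"))
        if max(metrics.cpu_utilization, metrics.memory_utilization) >= resource_threshold:
            findings.append((hop, "VNF CPU/memory"))
    return findings


# Hypothetical sample data: one overlay tunnel riding on three underlay hops.
TUNNEL_TO_HOPS = {"tunnel-nyc-lon": ["pe-nyc-1", "p-core-3", "pe-lon-2"]}
HOP_METRICS = {
    "pe-nyc-1": HopMetrics(0.40, 0.30, 0.50),
    "p-core-3": HopMetrics(0.95, 0.20, 0.30),  # congested link
    "pe-lon-2": HopMetrics(0.50, 0.92, 0.60),  # CPU-starved virtual function
}

print(suspect_hops("tunnel-nyc-lon", TUNNEL_TO_HOPS, HOP_METRICS))
# [('p-core-3', 'link capacity'), ('pe-lon-2', 'VNF CPU/memory')]
```

The hard part in practice is not this loop but keeping the tunnel-to-hop mapping and the per-hop metrics current when they live in separate tools, which is exactly the swivel-chair work described above.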

This is great for Network Performance Monitoring and Diagnostics companies, who offer a myriad of tools to address many (but not even remotely all) of these issues. It’s up to enterprise network managers to consolidate the information from all of these tools and, again, it’s no surprise to me that IT departments are immensely frustrated with the tool sprawl required to monitor underlays, overlays and the related application performance.

Cloud-First Global SD-WAN Solutions by Aryaka

If you are a network manager who would rather spend time planning strategically than live a reactive life troubleshooting issues 70% of the time, Aryaka has a solution for you: design with intent first, and then get 100% end-to-end, single-pane-of-glass visibility into network and application performance across a global network infrastructure. How does it work?

  1. When it comes to last mile connectivity, Aryaka’s SmartLINK connectivity delivers MPLS-like SLAs at internet access cost (read more in my last blog).
  2. Aryaka’s SmartCONNECT global L2 network provides utterly deterministic performance when it comes to the QoS trifecta (packet loss, latency, jitter). This cannot be delivered over a global L3 overlay infrastructure.
  3. The MyAryaka customer web portal provides immediate, end-to-end network and application performance visibility and status, all the way from company-wide abstracted views down to individual links for very fast root cause analysis.
    Figure: Application and network monitoring portal
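
For readers who want to put numbers on the QoS trifecta mentioned in point 2, the following Python sketch derives packet loss, latency and jitter from a series of probe round-trip times. It is an illustration under stated assumptions, not how the MyAryaka portal computes its figures: the qos_summary helper is hypothetical, a lost probe is represented as None, and jitter is taken as the mean absolute difference between consecutive received samples (a simple definition; RFC 3550 defines a smoothed variant for RTP).

```python
from statistics import mean
from typing import Dict, Optional, Sequence


def qos_summary(rtts_ms: Sequence[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarize probe results; each entry is an RTT in ms, or None if the probe was lost."""
    if not rtts_ms:
        raise ValueError("need at least one probe result")

    received = [rtt for rtt in rtts_ms if rtt is not None]

    # Packet loss: share of probes that never came back.
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)

    # Latency: average RTT of the probes that did come back.
    latency_ms = mean(received) if received else None

    # Jitter: mean absolute change between consecutive received samples
    # (samples on either side of a lost probe are treated as consecutive).
    deltas = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter_ms = mean(deltas) if deltas else None

    return {"loss_pct": loss_pct, "latency_ms": latency_ms, "jitter_ms": jitter_ms}


# Five probes, the third one lost.
summary = qos_summary([20.0, 22.0, None, 21.0, 25.0])
print(summary)
```

With these sample values the summary reports 20% loss, a mean latency of 22 ms and a jitter of roughly 2.3 ms. "Deterministic" performance, in this vocabulary, means all three numbers stay within a narrow, predictable band over time.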

The proof is in the more than 800 enterprise customers who, after adopting the technology, never consider an alternative. As you can imagine, many of them previously implemented their global WANs with do-it-yourself technology, manually configuring complex policies and handling change management. They simply prefer to focus on the strategic business issues that matter for the network in the digital age, such as:

  • Fast global connectivity
  • Optimal support for XaaS, anytime and anywhere
  • Support for temporary, ephemeral sites and requirements (pop-up shops, large-scale physical and virtual events, construction sites, etc.)
  • Local presence in highly regulated markets (for example, China)
  • Immediate ability to accommodate ever-changing application and business needs

In our industry, we discuss Intent-Based Networking as the elusive ideal we should strive for. Aryaka delivers on it here and now. We can implement your global WAN infrastructure within 48 hours; just state your intent. We help you visualize any WAN issues or trends you should address, end to end, and our 24x7x365 expert support will also proactively inform you about them.

Instead of providing you with a Fata Morgana of what Intent-Based Networking could be, we deliver on the reality of it.

Want to learn more? Please book a demo with one of our experts.

Note [1]: The NPMD MQ report is gated for subscribers, but several NPMD vendors have it for download, for example https://blog.viavisolutions.com/2019/02/13/npmdmq/.

About the author

Paul Liesenberg
Paul is a Director in Aryaka’s Product Solutions Team. Paul has over 20 years of experience in product marketing, product management, sales engineering, business development and software engineering at Cisco, LiveAction, Bivio Networks and StrataCom. Paul enjoys scuba diving, motorcycles, open software projects and oil painting.