This article was originally published on CloudTweaks.com on September 4, 2018.
Network Challenges of Office 365
Microsoft’s focus on growing commercial adoption of Office 365 in 2018 has paid off: the number of licensed seats grew an average of 29% year over year in each quarter of fiscal 2018. With an estimated 135 million commercial users worldwide, Office 365 is performing well for Microsoft. But with that success come growing pains.
The problem for customers is network performance. A 2017 survey by TechValidate noted that nearly 70 percent of companies experienced weekly network-related performance issues after deploying Office 365, even after increasing both firewall and network bandwidth capacity.
To be clear, availability of the application’s different services is covered by Microsoft’s SLA, which guarantees 99.9% uptime for Office 365. However, availability is not the same thing as a frustration-free experience for end users. For one, the solution to a performance problem sometimes depends on which Office 365 application users are having trouble with. For example, live video streams from Microsoft Teams can saturate a corporate network, in which case companies can use proxy servers or enterprise content delivery network (eCDN) add-ons to reduce bandwidth consumption.
The other issue is that performance problems are frequently network related, yet IT personnel struggle to correlate network issues with application performance because of limitations in their monitoring and management tools.
Current “best practices” for network performance issues
Microsoft provides guidance on how users can mitigate network-related performance issues, suggesting things like adding memory to your computer, ensuring there’s no malware running, and the like. Among the most useful of the recommendations Microsoft offers: create a performance baseline. Understanding what performance should look like is essential to figuring out why there is deviation from the standard. Mapping your application data flows from device to egress point to understand latency is a key step. This includes:
- Measuring the time it takes to egress to Office 365, including identification of the devices between your client computer and your egress point. This could include proxy servers, switches, and the like. Any misconfigurations or malfunctions along this path could be contributing to performance issues.
- Knowing the location of the server that resolves the URLs for Office 365. Allowing the application to route requests to the nearest Microsoft datacenter, rather than the one nearest the corporate datacenter, could improve performance.
- Measuring the speed of the customer’s ISP DNS resolution in milliseconds (if not using a privately hosted or managed DNS service). ISPs don’t always do a good job of managing their DNS servers, which could result in performance issues for Office 365 apps.
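The measurements in the list above can be approximated with a few lines of Python. This is a minimal sketch, not Microsoft’s tooling: the helper names (`timed_ms`, `dns_lookup_ms`, `tcp_connect_ms`, `baseline`, `deviates`) and the "median plus three standard deviations" threshold are assumptions made for illustration. It also times the operating system’s resolver, which may answer from a local cache, so repeated lookups can look faster than the ISP’s DNS server really is.

```python
import socket
import statistics
import time


def timed_ms(fn, *args):
    """Run fn(*args) and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, (time.perf_counter() - start) * 1000.0


def dns_lookup_ms(hostname, resolver=socket.getaddrinfo):
    """Time one name resolution. The resolver is injectable for testing."""
    _, elapsed = timed_ms(resolver, hostname, 443)
    return elapsed


def tcp_connect_ms(hostname, port=443, timeout=5.0):
    """Time a TCP handshake to the host, a rough proxy for path latency."""
    start = time.perf_counter()
    with socket.create_connection((hostname, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def baseline(samples_ms):
    """Summarise samples as a median plus an illustrative alert threshold."""
    median = statistics.median(samples_ms)
    spread = statistics.pstdev(samples_ms)
    # Assumption: flag anything more than 3 standard deviations above the median.
    return {"median_ms": median, "alert_above_ms": median + 3 * spread}


def deviates(sample_ms, base):
    """True when a new sample exceeds the baseline's alert threshold."""
    return sample_ms > base["alert_above_ms"]
```

In practice one would collect `dns_lookup_ms(...)` and `tcp_connect_ms(...)` samples for the Office 365 hostnames in use at regular intervals during a known-good period, pass them to `baseline`, and compare later samples with `deviates`.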
Solutions – ExpressRoute or Managed SD-WAN?
If an enterprise is still fielding user complaints about Office 365 after implementing the best practices (like not “tromboning” Office 365 traffic through the corporate datacenter), there are two other network segments where issues may be occurring: the public internet path that the traffic is traversing, and the datacenter network where connecting (peering) with Microsoft’s network occurs.
ExpressRoute and Office 365 – Pros and Cons
One way to address the performance issue with cloud-based services like Office 365 is to move traffic off the public internet and onto more direct, private connections. ExpressRoute is Microsoft Azure’s direct connect service.
Enterprises can buy connectivity from a network service provider (NSP) that is a Microsoft ExpressRoute partner. The NSP has already connected with Microsoft at a multi-tenant datacenter, thus providing the “direct connect” between enterprise and the Office 365 service.
Both independent testing and real-world experience suggest that ExpressRoute will help performance for voice and video applications like Skype for Business and Microsoft Teams, but not for SharePoint and Outlook. In the latter instances, proximity to Microsoft’s POPs and sufficient bandwidth are the key determinants of service quality.
Considering that employees, partners, and customers are accessing services via mobile devices, and that companies often have many offices around the globe, IT managers quickly realize that ExpressRoute isn’t the answer for another reason: it’s hard to manage. Microsoft itself recommends the public internet over ExpressRoute for reasons of cost and ease of implementation.
Using the public internet for branch connectivity, while convenient and cost-effective, has trade-offs in terms of control over jitter, latency, and packet loss. Routing is done on a best-effort basis, meaning that the best network path to Office 365 will change based on traffic congestion at different network interconnection points. If a link goes down, your data is re-routed along alternative paths along with everyone else’s traffic.
Managed SD-WAN and Office 365 – Pros and Cons
Going back to the example of Skype for Business, the issue of controlling jitter, latency, and packet loss can be aided by the use of SD-WAN technology.
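To make the three metrics concrete, here is a small sketch of how they might be derived from a series of probe results. The function name `probe_stats` and the input format (round-trip times in milliseconds, with `None` for a lost probe) are assumptions for illustration. Jitter is computed here as the mean absolute difference between consecutive received samples, which is a simplification and not the smoothed estimator that real-time media protocols use.

```python
def probe_stats(rtts_ms):
    """Summarise probe results.

    rtts_ms: list of round-trip times in milliseconds, None for a lost probe.
    """
    if not rtts_ms:
        raise ValueError("no probes supplied")

    received = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)

    # Mean latency over the probes that came back (None if all were lost).
    latency = sum(received) / len(received) if received else None

    # Simplified jitter: average change between consecutive received samples.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter = sum(diffs) / len(diffs) if diffs else 0.0

    return {"latency_ms": latency, "jitter_ms": jitter, "loss_pct": loss_pct}


stats = probe_stats([20.0, 30.0, None, 25.0])
print(stats)  # {'latency_ms': 25.0, 'jitter_ms': 7.5, 'loss_pct': 25.0}
```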
One of the core features of SD-WAN is the ability to aggregate broadband links to remote or branch office sites and manage these links along with the core enterprise WAN through a centralized control plane. Unlike MPLS, SD-WAN is also programmable, which allows automated failover to different network paths, for instance. The network manager gains more control over Office 365 traffic running on the enterprise WAN, but traffic management and monitoring capabilities end there unless the SD-WAN network is extended all the way to a datacenter where Microsoft has a POP for Office 365.
The cons? SD-WAN alone doesn’t solve all performance problems if the underlying physical network isn’t architected with Office 365 in mind.
Managing Office 365 – What else is needed?
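The automated failover mentioned above can be pictured as a policy that a central controller evaluates against per-link health data. The sketch below is a toy model under stated assumptions: the `PathHealth` record, the link names, and the latency, jitter, and loss thresholds are all invented for illustration and do not reflect any particular SD-WAN product.

```python
from dataclasses import dataclass


@dataclass
class PathHealth:
    name: str
    up: bool
    latency_ms: float
    jitter_ms: float
    loss_pct: float


def usable(path, max_latency_ms=150.0, max_jitter_ms=30.0, max_loss_pct=1.0):
    """Illustrative thresholds only; real policies are set per application."""
    return (
        path.up
        and path.latency_ms <= max_latency_ms
        and path.jitter_ms <= max_jitter_ms
        and path.loss_pct <= max_loss_pct
    )


def choose_path(paths, current=None):
    """Stay on the current path while it is healthy, otherwise fail over."""
    healthy = [p for p in paths if usable(p)]
    if not healthy:
        # Nothing meets the thresholds: fall back to any link that is still up.
        still_up = [p for p in paths if p.up]
        return min(still_up, key=lambda p: p.loss_pct).name if still_up else None
    if current in {p.name for p in healthy}:
        return current  # avoid flapping between equally good links
    return min(healthy, key=lambda p: (p.loss_pct, p.latency_ms)).name
```

Keeping traffic on the current path while it stays within thresholds is a deliberate choice in this sketch: it avoids switching links on every small fluctuation.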
The next requirement from the SD-WAN service should be the ability to provide an integrated look into network and application health. A managed MPLS service provider is mainly concerned with ensuring the network is meeting SLAs for uptime. Even an SD-WAN service alone does not solve performance problems.
For instance, one of the challenges of managing a large Office 365 deployment is ensuring firewalls and antivirus applications are continually updated with the valid URLs and IP addresses used by Office 365. For many IT managers, this is still a manual copy-and-paste operation because their equipment doesn’t support updates via Microsoft’s web services.
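Where equipment cannot consume the published endpoint data directly, a small script can at least compute what has changed. The sketch below assumes the data has already been downloaded as JSON and that each entry carries `urls`, `ips`, and `required` fields; those field names reflect our understanding of the published format and should be checked against Microsoft’s documentation. The sample payload is invented, using reserved documentation addresses and example hostnames.

```python
import json

# Invented sample shaped like the published endpoint data (an assumption).
SAMPLE = """
[
  {"id": 1, "serviceArea": "Exchange", "urls": ["mail.example.com"],
   "ips": ["192.0.2.0/24", "198.51.100.0/24"], "required": true},
  {"id": 2, "serviceArea": "Common", "urls": ["optional.example.com"],
   "required": false}
]
"""


def published_entries(endpoint_sets, required_only=True):
    """Collect the URLs and IP ranges from a list of endpoint records."""
    urls, ips = set(), set()
    for item in endpoint_sets:
        if required_only and not item.get("required", False):
            continue
        urls.update(item.get("urls", []))  # some records carry no URLs
        ips.update(item.get("ips", []))    # some records carry no IP ranges
    return urls, ips


def allowlist_changes(current, published):
    """Return what to add to and remove from an existing allowlist."""
    return {
        "add": sorted(published - current),
        "remove": sorted(current - published),
    }


urls, ips = published_entries(json.loads(SAMPLE))
firewall_ips = {"192.0.2.0/24", "203.0.113.0/24"}  # hypothetical current config
print(allowlist_changes(firewall_ips, ips))
```

The resulting add and remove lists could then be reviewed by an administrator or fed into whatever change process the firewall supports.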
If network monitoring isn’t integrated with monitoring of other applications and hardware on the enterprise network, how does one go about determining the source of a performance issue? Is it simply because a config file in the firewall is outdated? Needless to say, it’s usually a lengthy process to correlate events across disparate systems. An integrated, end-to-end view of application and network performance is needed.
Office 365 adoption has increased dramatically over the last year. Application performance challenges have increased as a direct result of its popularity. Optimizing the performance of Office 365 applications presents challenges – each particular application has different usage characteristics and resource needs. Ultimately, a combination of SD-WAN technology, advanced monitoring and analytics is needed, underpinned by a re-architected enterprise network that leverages optimized network routes and peering relationships. All these elements need to come together to enable an application-aware network capable of properly supporting Office 365 performance requirements.