What are the connectivity options that can be used to build hybrid cloud architectures?

Introduction

Every cloud journey is unique, as they say, and so is ours! Cisco has seen exponential growth in our own SaaS (Software as a Service) offerings in the last few years, and we are one of the largest consumers of public clouds. Throughout our journey, with successes and failures, we continue to evolve, mature, and learn a great deal. With this white paper, we share our journey and the operational choices we made as we evolved our Cloud Center of Excellence under central operations. This paper focuses on hybrid network management and is part of a series covering various aspects of the multi-cloud journey we are on.

Around six or seven years ago, as we were carefully laying the building blocks of a central cloud operations practice for Cisco, one question became prominent: how do we provide a foundation to deliver a consistent customer experience, globally, across multi-vendor environments?

Cisco has a large and varied portfolio of cloud-hosted services and products that must be delivered securely to our customers and employees at scale. Building the foundation for operating and governing disparate, complex, and continuously evolving multi-cloud environments was a significant challenge then, and it is no less so now. Early in this journey, a partnership was formed between various operational teams that helped evolve our operating model. This paper reviews the overall journey; it does not go into the details of the decisions made or the processes we followed.

This paper shares our experience of hybrid cloud connectivity and our view of a desirable operational and governance architecture for the connectivity aspects of the multi-cloud problem space. We explain how we partner with major public cloud vendors to structure our operating model and get to where we need to be for managing and governing multi-cloud connectivity. The core objective of our operational governance was consistency and policy enforcement. Given the flexibility and agility required by the various business units, a single, centrally managed connectivity architecture was not the goal.

In a future paper, we will discuss the many other dimensions of building the foundation for operating and governing multi-cloud environments for large enterprises.

Hybrid cloud has been defined and interpreted differently across the industry, so let me first define it for the context of this paper. A hybrid cloud is an application deployment architecture that involves a minimum of one private cloud and one public cloud. You can have more than one private or public cloud in the mix, sometimes referred to as multi-cloud. In our hybrid cloud configuration, we have an OpenStack-based private cloud with AWS, GCP, Azure, and OCI as its public cloud counterparts. Also, please note the application deployment reference in the definition of the hybrid cloud. It is very intentional: even though there are foundational, long-lived infrastructure and platform components required to build the hybrid cloud, application deployment design is what drives the scope, size, and lifecycle of those components.

Why this is important

According to recent Gartner studies, more than 80% of large enterprises offer their services through hybrid clouds. Often, as enterprises expand their workloads (the amount of work performed by a group of resources in a specified period; typically, the resources required to run an application or a service) in the public cloud arena, secure, performant, and scalable interconnectivity has been an afterthought. Moreover, it is not uncommon that, in the rush and excitement to migrate applications and data to public clouds, the disposition is to use OTT (over the internet) connectivity and just make it work first!

Most hybrid cloud adoptions are driven by enterprises balancing growth, scale, cost, and security. As applications and workload teams (teams that design and build the deployment and operational criteria of the application) drive the underlying connectivity requirements and architectures, a common enterprise-wide hybrid network connectivity architecture falls through the cracks to the bottom of the priority list. This is not intentional; the problem space is complex, and the teams responsible for workloads are often not equipped to understand it properly. This is an example of where the separation of concerns works against us. In a traditional IT environment, the underlying on-premises connectivity is often just assumed to be a constant and reliable high-capacity LAN, into which the network operations team has good insight, and over which they have operational control. That is clearly not the case over the basic internet, and not as much the case over managed SD-WAN as it should be. The teams responsible for workloads on-premises are not always accustomed to consulting with their networking colleagues in the way that they need to when those workloads move off-premises to a CoLo or public cloud.

Problem Statement

Global and complex business models, tied to local regulatory requirements, drive application delivery models. Application performance and availability requirements, combined with data redundancy and localization requirements, determine the multi/hybrid cloud deployment models.

Currently, the wider industry has fragmented standards, limited best practices, and only a handful of comprehensive tools for end-to-end network management supporting Day 2 operations in a multi-cloud environment. Most hybrid networking solutions are organically grown and custom-built, which leads to complex governance and operations in the long run. We will explore the assorted options and choices we leveraged through our journey and lead into the framework and building blocks for hybrid network services.

Growth and benefits of hybrid clouds

One question every CIO asks: “Is hybrid cloud only a step toward public cloud adoption, or is it a long-term delivery model?”

Hybrid cloud is an architecture that integrates public and private cloud environments. This architecture provides a tremendous amount of flexibility for application deployment and delivery. It addresses some of the critical inhibitors to public cloud adoption around security, compliance, and governance. Additionally, a hybrid cloud is an architecture pattern that provides a path to digital transformation in many traditional IT environments.

According to the Enterprise Strategy Group (ESG) Hybrid Cloud Trends and Strategies e-book, the number of organizations committed to or interested in a hybrid cloud strategy has increased from 81% to 93% since 2017.

The report “The Cost of Cloud, a Trillion Dollar Paradox” explores the cost of the public clouds and reflects on cost as a driver for public cloud adoption. It challenges the idea that public clouds are always cost-effective, vis-à-vis private clouds, in all cases.

Key drivers do include cost optimization, but they also include regulatory and compliance factors, flexibility of service delivery, performance, and security. While public cloud adoption is gaining momentum, overall service delivery architectures are maturing through the balance and integration of public and private clouds in hybrid systems.

The graph below, from that report, illustrates a transition of spending where the “Data Center” spend, as a proxy for private, on-premises infrastructure, remains steady whilst spending on public cloud increases. The implication is that the public cloud spend is additional spend, as opposed to replacement spend.


Network - Foundation of the hybrid cloud

We, like many large enterprises, evolved our hybrid cloud strategy organically. We did so because, like many, we had to discover what worked in partnership with cloud and CoLo partners as we went along. A key difference is that we also used this experience to inform how we developed our various products, capabilities, and offers in this space, which we will discuss further below. Establishing connectivity between cloud environments is the foundation of building a hybrid cloud delivery model.

Initially, we faced a lack of maturity in the public cloud space, a lack of best practices, and a lack of architectural precedent. We knew that ease of configuring, operating, and optimizing interconnectivity between cloud environments would be critical. As the first step of our hybrid cloud evolution, we needed to establish interconnectivity that would allow data transfers between endpoints in various public and private cloud environments.

We realized that ease of configuring, operating, and optimizing the interconnectivity between cloud environments is critical and core to hybrid cloud transformation and sustainability. That sounds logical, but the lack of visibility into future application growth, fragmented architecture roadmaps, and the absence of industry standards, best practices, and reference architectures for hybrid cloud connectivity made it exceedingly difficult, if not impossible, to lay out foundational hybrid cloud connectivity. To add to the complexity, many business units were already delivering their services through various public clouds and had topical connectivity solutions in place.

Hybrid network connectivity interconnects the different control and data planes of public and private clouds. The major challenge with hybrid network configuration is the vendor-specific, disparate control planes, exacerbated by the lack of end-to-end visibility and integration standards across the cloud providers. This fragmentation leads to inconsistent and localized connectivity solutions for even moderately complex application architectures in any large enterprise. Hybrid cloud architecture may additionally require large enterprises with a global footprint to design and securely extend their backbone into various public cloud regions and AZs (Availability Zones). The Network Operations Center needs to restructure its monitoring, runbook automation, and processes to reflect the hybrid cloud composition.

We decided to focus on simplicity of configuration, abstraction of vendor-specific complexity, and seamless management, which together form the pillars of our hybrid cloud network operations.

Guiding principles of a Hybrid cloud network

Knowing that future application growth and architecture roadmaps were not under our control, we decided to gradually build out and enable a hybrid connectivity architecture. To build gradually, we decided on operational criteria and guiding principles organized around security, performance, scale, cost, and reliability.

The driving principles of our hybrid cloud connectivity were:

    Let the target workload determine the security, performance, and scale requirements for a specific hybrid interconnect.

    End-to-end security of data at rest and in transit would drive the interconnection choices and the endpoint selection and configurations.

    Performance and availability requirements would drive connectivity methods and redundancy design.

    Each design choice, at this stage, would be evaluated against the cost of implementation and operations (see below for the general design pattern options that we considered).

    Consider cost optimization a continuous process. The optimization process should start at the requirements phase and continue through design, implementation, and operations.

    Continue evolving cloud migration and adoption: for any new workload, if the chosen interconnectivity model between the source and target environments meets the SLAs (Service Level Agreements), we scale up.

    If the SLA demands a different interconnectivity model, we enable the next Lego block by going back to bullet 4 above (a simple decision sketch follows this list).
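To make these principles concrete, here is a minimal decision sketch, in Python, of how a workload's SLA requirements could map to one of the interconnect models described in the next section. The SLA fields, thresholds, and model names are illustrative assumptions, not our actual selection logic.

```python
# Hypothetical sketch: choosing an interconnect model for a workload from the
# guiding principles above. The thresholds and model names are illustrative
# assumptions only.
from dataclasses import dataclass

@dataclass
class WorkloadSla:
    availability_pct: float       # e.g., 99.9
    max_latency_ms: float         # end-to-end latency target
    requires_bandwidth_sla: bool  # are throughput guarantees needed?

def select_interconnect(sla: WorkloadSla) -> str:
    """Return the cheapest model that can plausibly meet the workload SLA."""
    if sla.requires_bandwidth_sla or sla.availability_pct >= 99.95:
        return "dedicated connectivity"   # only dedicated links carry hard SLAs
    if sla.max_latency_ms < 50:
        return "site-to-site VPN"         # predictable path via enterprise gateways
    if sla.availability_pct >= 99.5:
        return "custom IPsec tunnel"      # encrypted overlay, still best-effort transport
    return "over the internet"            # default, lowest cost

print(select_interconnect(WorkloadSla(99.99, 30, True)))   # -> dedicated connectivity
```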

Hybrid cloud connectivity design patterns

Large enterprises with a global footprint typically deliver their services and solutions through multiple data centers across the globe. As they expand into the public cloud regions, which are also global, the connectivity complexity increases drastically. Each of these connections must be evaluated against business and operational requirements and balanced against operating costs.

We came to realize that there are four prevalent hybrid connectivity design patterns, which we discuss below. Corporate and business security policies, infrastructure availability, and operating cost dictate which of these design patterns should be applied in given circumstances.


Figure 1. Hybrid Cloud Connectivity Architecture Patterns

Over the Internet

Over-the-internet connectivity is the most widely adopted connectivity model. This form of connectivity naturally supports the HTTP/S, SSH, and SNMP protocols, which cover most of the traffic patterns of the applications delivered. It is the most common because it does not require any additional setup and is cost-effective. However, internet connectivity does not support any Service Level Agreements (SLAs). Additionally, the internet is subject to disruptions and congestion. Internet connectivity can also expose applications that use unencrypted network traffic to security threats, e.g., “man-in-the-middle” attacks. All public consumer SaaS application access is over the internet, as is a sizable portion of enterprise SaaS access.

Custom IPsec Tunnel

Custom IPsec tunnels provide greater flexibility and control for secure connectivity over the internet. Secure connectivity is established between virtual Cloud Service Routers (e.g., Cisco CSR) or devices with L3 routing capability on both ends.

A router needs to be deployed on each side, potentially deeper inside the security boundaries, to enable secure data transfers through the IPsec tunnel setup. Depending on the routers and their configurations, you may be able to establish multiple IPsec tunnels between them and apply complex routing rules.

This architecture is the most flexible for secure and complex connectivity between multiple endpoints on either side of the hybrid cloud. With this configuration, you can leverage the features and capabilities of your routers to establish secure, custom routing schemes. However, as this routing is done over the internet, your bandwidth is limited and comes without SLA guarantees. Further, you can also create a universally available and scalable secure connectivity architecture by leveraging the capabilities of the Cloud Service Routers. This option is cost-effective for the features and functionality it provides. However, it does require higher operational overhead, as the router endpoints need to be explicitly configured.
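As a hedged illustration of that operational overhead, the sketch below uses the open-source netmiko library to push a basic site-to-site IPsec crypto configuration to a CSR-class router. The addresses, pre-shared key, ACL number, and interface name are placeholders, and the exact crypto commands vary by platform, software version, and design.

```python
# Hypothetical sketch: pushing a basic site-to-site IPsec crypto config to a
# CSR-class router with netmiko. All values are placeholders.
from netmiko import ConnectHandler

csr = ConnectHandler(
    device_type="cisco_xe",
    host="198.51.100.10",        # placeholder management address
    username="admin",
    password="example-password",
)

ipsec_config = [
    "crypto isakmp policy 10",
    " encryption aes 256",
    " hash sha256",
    " authentication pre-share",
    " group 14",
    "crypto isakmp key EXAMPLE-PSK address 203.0.113.20",
    "crypto ipsec transform-set HYBRID-TS esp-aes 256 esp-sha256-hmac",
    "crypto map HYBRID-MAP 10 ipsec-isakmp",
    " set peer 203.0.113.20",
    " set transform-set HYBRID-TS",
    " match address 100",        # ACL 100 selects the interesting traffic
    "access-list 100 permit ip 10.0.0.0 0.0.255.255 172.16.0.0 0.0.255.255",
    "interface GigabitEthernet1",
    " crypto map HYBRID-MAP",
]

print(csr.send_config_set(ipsec_config))
csr.disconnect()
```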

Site-to-site VPN

A site-to-site Virtual Private Network (VPN), with a customer-side IPsec overlay, is a commonly adopted approach to secure connectivity. This approach relies on customer gateway (GW) infrastructure existing on the enterprise side and leverages the enterprise GW routing schemes to establish connectivity.

A site-to-site VPN typically connects a Virtual Private Gateway, offered as a service endpoint by the public cloud provider, to the customer gateway on the enterprise side to establish secure connectivity. Depending on the cloud provider's capabilities, you can create more complex routing policies and share the connectivity with multiple workloads.

Customer gateways are typically deployed inside the enterprise DMZ and have limited direct access to resources behind the corporate firewalls. In a typical enterprise environment, connections to the outside world are allowed only if they are initiated from inside the firewall. Considering that limitation, one would need to work with the information security teams to create a special firewall policy exception to make this form of connectivity more meaningful. The Virtual Private Gateway service at the cloud provider has access to the VPC (Virtual Private Cloud) resources based on the configured ACLs (Access Control Lists), security policies, and any inter-VPC routing configurations.
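For illustration, the sketch below shows the cloud-provider side of this pattern using the AWS SDK for Python (boto3), since AWS is one of our public clouds; the other providers expose equivalent constructs. The VPC ID, addresses, ASN, and route are placeholder values, and a production design would add redundant tunnels and dynamic routing.

```python
# Hypothetical sketch: creating the cloud-side half of a site-to-site VPN on
# AWS with boto3. All IDs, addresses, and prefixes are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# The customer gateway object represents the enterprise DMZ gateway.
cgw = ec2.create_customer_gateway(
    BgpAsn=65000, PublicIp="203.0.113.10", Type="ipsec.1"
)["CustomerGateway"]

# The virtual private gateway is the cloud-provider VPN endpoint, attached to a VPC.
vgw = ec2.create_vpn_gateway(Type="ipsec.1")["VpnGateway"]
ec2.attach_vpn_gateway(VpcId="vpc-0123456789abcdef0", VpnGatewayId=vgw["VpnGatewayId"])

# The VPN connection ties the two together; static routing keeps the example simple.
vpn = ec2.create_vpn_connection(
    CustomerGatewayId=cgw["CustomerGatewayId"],
    VpnGatewayId=vgw["VpnGatewayId"],
    Type="ipsec.1",
    Options={"StaticRoutesOnly": True},
)["VpnConnection"]

ec2.create_vpn_connection_route(
    VpnConnectionId=vpn["VpnConnectionId"],
    DestinationCidrBlock="10.0.0.0/16",   # enterprise prefix reachable via the tunnel
)
```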

Dedicated connectivity

Dedicated connectivity is established with an Ethernet connection from an enterprise location to a CoLo that has a direct connection to the cloud provider network. Over that direct Ethernet connection can run any of the forms of VPN and IPsec discussed above.

The main reason to use a direct connection is SLAs. A direct connection has bandwidth and throughput guarantees, as well as higher availability and redundancy if architected appropriately, for example, with multiple redundant links from multiple locations, and/or with one of the internet-based options as a backup.

This connectivity method builds a dedicated connection between a Cloud Service Router (or a gateway) in the private DC and a Virtual Private Gateway on the public cloud side over Ethernet. This is point-to-point connectivity between a private DC location and a public cloud region. It typically has bandwidth and throughput guarantees, and it provides higher availability and redundancy if architected appropriately. It is expensive compared to the other options described above; however, it is the only form of connectivity that can have guaranteed SLAs. Public cloud providers define and support the architecture in collaboration with the CoLo partners and end customers.
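As one hedged example of this pattern, the sketch below uses boto3 to request an AWS Direct Connect dedicated connection at a CoLo and a private virtual interface toward a virtual private gateway. The location code, VLAN, ASN, and gateway ID are placeholders; Azure ExpressRoute, Google Cloud Interconnect, and OCI FastConnect are the analogous offerings on the other clouds we use.

```python
# Hypothetical sketch: ordering a dedicated link and a private virtual interface
# with the AWS Direct Connect API via boto3. All values are placeholders.
import boto3

dx = boto3.client("directconnect", region_name="us-east-1")

# A physical cross-connect is provisioned at the CoLo facility.
connection = dx.create_connection(
    location="EqDC2",            # placeholder Direct Connect location code
    bandwidth="1Gbps",
    connectionName="hybrid-dc1-to-us-east-1",
)

# A private virtual interface carries traffic to a virtual private gateway.
dx.create_private_virtual_interface(
    connectionId=connection["connectionId"],
    newPrivateVirtualInterface={
        "virtualInterfaceName": "hybrid-private-vif",
        "vlan": 101,
        "asn": 65000,                       # enterprise-side BGP ASN
        "virtualGatewayId": "vgw-0123456789abcdef0",
    },
)
```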

Hybrid cloud connectivity architectures and solutions

Having reviewed various connectivity design patterns above, we can now explore the different hybrid cloud architectures that have evolved alongside those connectivity options. While there are many ways you can build and operate in hybrid mode, the following are the most common.


Figure 2. Hybrid Cloud Architecture Patterns

Public cloud extensions

Public cloud extensions provide, on-premises, the same XaaS environment as is found within the public cloud provider's infrastructure. The main advantages are that the data stays on-premises, so regulatory and compliance requirements can be satisfied, whilst the development and operations environment is uniform across on-premises and cloud.

Azure offers this architecture as Azure Stack, which has successfully complemented the prevalence of Microsoft solutions in many private DCs. Variations of the same architecture are also offered by Amazon Web Services (AWS Outposts) and Google Cloud Platform (Google Anthos), among others.

This form of hybrid cloud architecture is based on deploying a small-form-factor public cloud stack (a minimum set of services and functionality replicated from the public cloud stack) in your private cloud and creating a management plane between the public cloud and that local stack in the private cloud. This component can be deployed on existing private DC hardware, creating a public cloud VPC extension into the private cloud. The private cloud extension stack is network-segmented and so must be given explicit connectivity to other private DC resources, as appropriate.

Connectivity to the corresponding public cloud is typically through a CoLo (a colocation facility operated by service providers) for SLA guarantees, as connectivity over the public internet introduces variability and thus operational challenges. CoLo facilities offer many ways to provide connectivity between private and public clouds, and I do not plan to cover all the options here. However, the CoLo is the connectivity termination point for each segment of private and public cloud connectivity and is necessary for setting up backhaul connectivity between the clouds. This architecture is comprehensive and provides ease of operations and SLA guarantees through built-in observability services. However, it creates an infrastructure island inside your private DC, to which you must migrate your applications to leverage the hybrid cloud functionality.

Over the internet – Best-effort connectivity

Over-the-internet integration is typical for smaller-footprint workloads. Given the potential for intermittent failures of the public internet, responsibility for managing partial failure is assumed by the platforms or the applications in this scenario. The application components deployed in public and private DCs communicate over the internet through secure interfaces, e.g., HTTPS, SSH, and similar.

This architecture is best-effort in terms of performance, availability, and security, without any SLA guarantees.

Most enterprises use a combination of over-the-internet and IPsec tunnel connectivity in this architecture. The application architecture, in this case, must take sole responsibility for connectivity failures and assume the operational risks of meeting the target SLAs. Given that most enterprises have evolved into a hybrid cloud state through the gradual adoption of public clouds, this architecture is the most common today. There is a high degree of variation in this architecture depending on the composition and the deployment location of application components.
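Because the application bears the connectivity risk in this architecture, application code typically wraps cross-cloud calls in timeouts and retries. The sketch below is a minimal Python illustration of that pattern; the endpoint URL is a placeholder, and a production service would add circuit breaking and idempotency safeguards.

```python
# Hypothetical sketch: application-level handling of intermittent internet
# connectivity with timeouts and exponential backoff. The URL is a placeholder.
import time
import requests

def call_remote_service(url: str, retries: int = 4, timeout_s: float = 3.0) -> dict:
    """Call a component on the other side of the hybrid boundary, tolerating blips."""
    for attempt in range(retries):
        try:
            response = requests.get(url, timeout=timeout_s)
            response.raise_for_status()
            return response.json()
        except requests.RequestException:
            if attempt == retries - 1:
                raise                      # surface the failure after the last attempt
            time.sleep(2 ** attempt)       # 1s, 2s, 4s ... backoff between attempts
    return {}

data = call_remote_service("https://private-dc.example.com/api/inventory")
```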

SLA-Driven hybrid cloud connectivity

This architecture uses dedicated connectivity, with direct connections via CoLo, between public and private cloud regions for SLA guarantees. It then provides an abstraction for configuring and managing the network configurations on either side seamlessly, providing a uniform operations model.

This operational uniformity, combined with SLA support, helps overcome some of the limitations of the models discussed above that use the public internet. Workloads can move more freely between on-premises and public cloud computing infrastructures, as the application architecture bears less responsibility for partial failure and so is less sensitive to differences in the connectivity models. Additionally, given that the connectivity is inherently secure and validated, application components can communicate seamlessly, and components running in public clouds can freely initiate connections to components inside the firewall.

Based on the cloud cost report and hybrid cloud trends cited above, it is safe to conclude that while public clouds do fuel innovation and agility at a lower cost, private clouds are better suited for workloads that require high availability, dedicated infrastructure, and deeper control across the full stack. Hybrid cloud architecture merges the best of both worlds.

Over the last decade, public cloud networking features have continuously evolved and matured to make such integrations feasible. Dedicated connectivity between the public and private endpoints addresses many of the operational challenges associated with hybrid deployments, including performance, security, and scale. Many SD-WAN (Software-Defined Wide Area Network) fabrics now provide cloud extensions (e.g., Cisco SD-WAN Cloud OnRamp) to support this architecture through SDCI (Software-Defined Cloud Interconnect) and are a notable example of this deployment architecture.

These realizations, coupled with growing experience and maturity, have led to the need for SLA-driven hybrid clouds.

Focus on performance, availability, and reliability

We evolved our hybrid journey by building a scalable private cloud (VMware and OpenStack) and later integrating with public cloud providers. As we integrated with public clouds, we started with an over-the-internet hybrid architecture. As we came to better understand how to operate scalable production workloads, we realized the need for an SLA-driven hybrid cloud architecture.

We needed to build a scalable solution that homogeneously manages public and private cloud networks and their interconnectivity. While there were operational best practices, visibility, and tools for public and private clouds individually, they were missing for hybrid cloud operations. Individual and topical solutions existed, as many peer enterprises had evolved and built them out, but they were difficult to scale and lacked ecosystems that allow you to configure, manage, and govern.

As we were evolving, we set out on a journey to provide SLA-driven connectivity and network operations. For hybrid cloud connectivity and operations, we needed to consider how the operational processes, network configurations and segmentation, policy management, and support models would align across the multiple cloud environments.

Abstraction of the network configurations and policy management became mandatory, and then the core of the hybrid network service. Abstraction, without compromising functionality, is the key to being able to operate such a service at scale.

Our hybrid connectivity service supports the availability, reliability, and performance of both inter-cloud and intra-cloud connectivity and management. This includes the Day 0 initial setup, Day 1 configurations, and ongoing Day 2 operations of hybrid networks. Hybrid cloud architectures are typically complex, so any type of disruption in such networks is not only difficult to isolate and recover from, but also has broad implications for application availability around the globe. Further, since the network is a multi-vendor, foundational service, single points of failure are almost inevitable, so careful design considerations are required to ensure high availability.

We set three core objectives for our hybrid network services architecture:

    Heterogeneous network management through automation

    Consistent and intent-based security policy enforcement

    Comprehensive event correlation and analytics

To achieve these goals, it is critical to have a methodical process in which the hybrid network operations concept is fully documented and reviewed. We use a hybrid network design document (HNDD) that contains the end-to-end network architecture, network policy definitions, day-to-day operations and support processes, metrics definitions, and integration interfaces.

Given such a document at its core, a hybrid network design process can be executed in four distinct phases over which we iterate for continuous optimization.


Figure 3. Hybrid Network Design Phases

Configure: Network connectivity, segmentation, policies, observability

The Configure phase is the initial setup phase, where you build the interconnectivity through the CoLo and deploy network configurations in each cloud environment. The focus is on setting up and automatically configuring connectivity, segmentation, policies, and observability in each cloud environment, and between them, as per the network design.

During this phase, you implement your network design and register the devices and platforms centrally. Monitoring and logging are configured as per the observability design. All integrations with third-party platforms are set up during this phase to follow the operational process design.

Consistent network policy management across multiple cloud environments is a critical component of hybrid cloud interconnectivity governance. Often, in a multi-cloud environment, the network policy controls and rules are managed locally and lack central visibility. This approach can be error-prone and, overall, creates inconsistent policy enforcement. A multi-site policy orchestrator with a target-specific interpretation of a common policy language (e.g., Cisco ACI Anywhere) is a scalable and secure way to manage network policies across multi-cloud environments. We leveraged a combination of ACI and Secure Interconnect (a custom-built, scalable network connectivity solution based on Cisco CSRv routers) on public and private clouds to establish secure network connectivity and consistent, app-centric hybrid network policy management.
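The sketch below illustrates the idea of a common policy language with target-specific interpretation. It is not the Cisco ACI object model; the policy schema and the two renderers are hypothetical, and they exist only to show how one intent can be rendered consistently into cloud security group rules and private DC ACL entries.

```python
# Hypothetical sketch: one intent-style policy rendered into target-specific rules.
# The schema and renderers are illustrative assumptions, not a product API.
intent_policy = {
    "name": "web-to-db",
    "source_tier": "web",
    "dest_tier": "db",
    "protocol": "tcp",
    "port": 5432,
    "action": "allow",
}

def to_aws_security_group_rule(policy: dict) -> dict:
    """Render the intent as an AWS-style security group ingress rule."""
    return {
        "IpProtocol": policy["protocol"],
        "FromPort": policy["port"],
        "ToPort": policy["port"],
        "UserIdGroupPairs": [{"GroupId": f"sg-{policy['source_tier']}"}],  # placeholder
    }

def to_onprem_acl_line(policy: dict) -> str:
    """Render the same intent as a firewall/ACL-style rule for the private DC."""
    return (f"permit {policy['protocol']} object-group {policy['source_tier']} "
            f"object-group {policy['dest_tier']} eq {policy['port']}")

print(to_aws_security_group_rule(intent_policy))
print(to_onprem_acl_line(intent_policy))
```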

Observe: End-to-end observability and correlations

After configuration, you can start to collect data from devices and platform services. This phase ensures that you have complete visibility into your network routes and traffic flows. To meet the multi-cloud SLAs, you need visibility into each component that participates in serving end-user requests from various regions around the world. This concept of layered but correlated visibility into end-user requests or business transactions is achieved through Full-Stack Observability (FSO) solutions. There are various FSO solutions on the market, with varying degrees of capability and cost. FSO can be achieved by aggregating various data sources and visibility tools (including logs). Depending on the priority and criticality of your business transactions, you may need to build this as a custom solution. We used various platforms, including Cisco AppDynamics, Cisco ThousandEyes, Cisco Intersight, Splunk, and the ELK stack, to name a few. The Observe phase establishes the quality and completeness of the data needed to meet the SLA commitments. Insights and intelligence derived from this data create the foundation for AIOps.
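A simplified sketch of the correlation idea is shown below: events from several telemetry sources are grouped by a shared transaction identifier and ordered in time. The source names, field names, and the assumption that a common correlation ID exists are illustrative; real FSO pipelines derive such correlations from traces, flow records, and synthetic tests.

```python
# Hypothetical sketch: correlating events from several telemetry sources into a
# per-transaction timeline. Sources, fields, and IDs are illustrative only.
from collections import defaultdict

events = [
    {"source": "appd",          "txn_id": "t-42", "ts": 1.001, "msg": "checkout latency 850ms"},
    {"source": "thousandeyes",  "txn_id": "t-42", "ts": 1.003, "msg": "path loss 2% via ISP-A"},
    {"source": "vpc-flow-logs", "txn_id": "t-42", "ts": 1.004, "msg": "REJECT tcp/443 10.1.2.3"},
    {"source": "appd",          "txn_id": "t-43", "ts": 1.010, "msg": "checkout latency 120ms"},
]

def correlate(events: list) -> dict:
    """Group events by transaction ID and order them in time."""
    timeline = defaultdict(list)
    for event in events:
        timeline[event["txn_id"]].append(event)
    for txn in timeline.values():
        txn.sort(key=lambda e: e["ts"])
    return timeline

for txn_id, txn_events in correlate(events).items():
    print(txn_id, [f'{e["source"]}: {e["msg"]}' for e in txn_events])
```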

Manage: Ability to manage heterogeneous network devices

The Manage phase assumes that the initial network configurations and policies are in place and that you have complete visibility into the network. Operational processes are followed during this phase and continuously optimized through data collection and configuration updates. This phase focuses on stabilizing and optimizing the end-to-end network operations.

Automate: automated configurations, scaling, healing

As we continue to manage and optimize, the focus needs to move to automating ongoing, Day 2, operational tasks. Whilst the initial configuration phase will, of course, also have been automated, this phase focuses on reducing the time to recovery and the manual interventions in daily operations, iterating and learning as we go along.

Existing solutions do address a portion of what is required here. Niche solution vendors focus on vendor-agnostic functionality, while public cloud providers focus on richer features and deeper integrations within their own services. Hybrid configuration and policy management, though, is an area that tracks behind the curve and so needs more focus. The bigger challenge, however, is to cohesively operate disparate tools to achieve a unified business outcome.
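A minimal sketch of the closed-loop, Day 2 automation idea described above is shown below. The desired state, the telemetry check, and the remediation step are placeholders that a real system would back with its monitoring platform and configuration engine.

```python
# Hypothetical sketch of a closed-loop Day 2 reconciliation cycle: observe the
# current state, compare it to the desired state, and remediate drift.
DESIRED_TUNNEL_STATE = {"hybrid-tunnel-1": "up", "hybrid-tunnel-2": "up"}

def observe_tunnel_state() -> dict:
    """Placeholder: in practice, pull tunnel status from the monitoring platform."""
    return {"hybrid-tunnel-1": "up", "hybrid-tunnel-2": "down"}

def remediate(tunnel: str) -> None:
    """Placeholder: in practice, re-push the known-good configuration and open a ticket."""
    print(f"re-applying configuration for {tunnel} and notifying the NOC")

def reconcile_once() -> None:
    observed = observe_tunnel_state()
    for tunnel, desired in DESIRED_TUNNEL_STATE.items():
        if observed.get(tunnel) != desired:
            remediate(tunnel)

# In production this reconciliation runs continuously, on a schedule or on events.
reconcile_once()
```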

Given what we have explained thus far, we can now review a reference architecture for a hybrid network service system based on these concepts.


Figure 4. A Reference Architecture for a Hybrid Cloud Services System

Aggregated pane of glass and self-service

Before I dive deeper into this concept, note the distinction between an aggregated and a single pane of glass. A single pane of glass becomes complicated, and frankly useless, at some point when you combine multiple disparate information sources into a single pane.

An aggregated pane of glass, on the other hand, focuses on providing end-to-end visibility into the information flow and actionable decision-making by building correlations between various sets of views. To be honest, an aggregated pane of glass for network connectivity is expected to continuously evolve and will always be an aspirational goal for any enterprise. It is aspirational for us given the challenge of operating a combination of brownfield environments (due to the preexisting production workloads) and the ever-evolving needs of the greenfield environments. This goal becomes more challenging with the onboarding of every new acquisition. Additionally, the compelling need of the DevOps teams to adopt emerging technologies on the public and private clouds pushes the goalpost further away. We are continually evolving and assimilating various connectivity solutions across our enterprise.

In the target state, a Hybrid cloud network services system should focus on two key goals:

    An aggregated pane of glass for management and monitoring

    Self-service provisioning of hybrid cloud connectivity between end points

Let us review the major components of this aggregated pane of glass.

Control Plane Automation

Overall, this architecture relies on a combination of network configuration via Application Programming Interfaces (APIs) for the public clouds and Software-Defined Wide Area Network (SD-WAN) fabric APIs for the private clouds. Some SD-WAN fabrics on the market today even provide public cloud extension functionality (e.g., Cisco SD-WAN Cloud OnRamp) to build hybrid cloud connectivity seamlessly. The control plane in the above architecture forms the foundation of the hybrid network service. It has three core components: a hybrid cloud network configurations engine, a hybrid cloud policy manager, and a monitoring engine. These components abstract the private and public cloud functions and leverage automation templates, APIs, and Command Line Interfaces (CLIs).
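A small sketch of that abstraction idea follows: a common driver interface with per-environment implementations, so the control plane can apply one intent everywhere. The interface and driver names are hypothetical; real drivers would wrap the SD-WAN controller APIs and each cloud provider's network APIs.

```python
# Hypothetical sketch: a thin control-plane abstraction with per-environment
# drivers. Interface and driver names are illustrative assumptions.
from abc import ABC, abstractmethod

class NetworkDriver(ABC):
    @abstractmethod
    def apply_segment(self, name: str, cidr: str) -> None: ...
    @abstractmethod
    def apply_policy(self, policy: dict) -> None: ...

class SdwanFabricDriver(NetworkDriver):
    def apply_segment(self, name: str, cidr: str) -> None:
        print(f"[sd-wan] create segment {name} ({cidr}) via controller API")
    def apply_policy(self, policy: dict) -> None:
        print(f"[sd-wan] push centralized policy {policy['name']}")

class PublicCloudDriver(NetworkDriver):
    def apply_segment(self, name: str, cidr: str) -> None:
        print(f"[cloud] create VPC/VNet {name} ({cidr}) via provider API")
    def apply_policy(self, policy: dict) -> None:
        print(f"[cloud] translate {policy['name']} into security groups/NSGs")

def configure_everywhere(drivers: list, segment: tuple, policy: dict) -> None:
    """The control plane applies one intent consistently across all environments."""
    for driver in drivers:
        driver.apply_segment(*segment)
        driver.apply_policy(policy)

configure_everywhere(
    [SdwanFabricDriver(), PublicCloudDriver()],
    ("hybrid-app-segment", "10.20.0.0/16"),
    {"name": "web-to-db"},
)
```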

Hybrid Network Configurations

A network configuration engine leverages a network controller (e.g., an SD-WAN controller) to push configurations to the private DC (Data Center), while, on the public cloud side, it uses a combination of the SD-WAN extension (e.g., Cisco Cloud OnRamp) and the public cloud network APIs to enable consistent configurations. Your ability to push configurations to both public and private clouds consistently would be limited without automation. It becomes even more challenging with fragmented brownfield solutions in place. We are gradually evolving centralized network management functions through various solutions, including Cisco SD-WAN Cloud OnRamp and Cisco DNA Center.

Hybrid Network Policy Management

Consistent policy management is the centerpiece of Day 2 operational governance, and it requires a tiered approach. We plan to document our approach to operational governance in a separate paper. Briefly, network policy governance is driven by four types of policy requirements: the security, cost, business, and compliance needs of applications.

The hybrid network policy engine relies on the policy controller (if one exists) in the private DC and manages the configuration of policies through public APIs for the public clouds. There are policy engines (e.g., Cisco ACI Anywhere) available on the market that allow you to manage private and public cloud network policies consistently through a single point of action. Again, due to the constraints stated above, we currently have both manual and automated policy management in place. The eventual goal is to assimilate network policy management under unified operational governance.

Hybrid Network visibility and data correlations

Deep network visibility is not new and is natively enabled on most network devices on the market today. The goal of this data correlation component is to consume the millions of events generated by hundreds of active network connections and correlate them into actionable intelligence. We leveraged Splunk for event correlation in some areas and a combination of Cisco ThousandEyes and Cisco AppDynamics in other areas. For public clouds, we relied on VPC/NSG flow logs and other higher-level telemetry data.
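As one concrete, hedged example of a telemetry source feeding this component, the snippet below uses boto3 to enable VPC flow logs delivered to S3. The VPC ID and bucket ARN are placeholders; NSG flow logs on Azure and VPC flow logs on GCP play the same role.

```python
# Hypothetical sketch: enabling VPC flow logs on AWS with boto3 so flow records
# can be fed into the correlation pipeline. IDs and ARNs are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],
    ResourceType="VPC",
    TrafficType="ALL",                       # capture accepted and rejected flows
    LogDestinationType="s3",
    LogDestination="arn:aws:s3:::example-hybrid-flow-logs",
)
```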

Unified Operations Management Control

The unified operational management console can further abstract control plane core functionality through ease-of-use workflows, use case-driven functions, and an automation template repository. This console provides day-to-day operational capabilities and enables network operators to implement the Hybrid Network Design Document (HNDD). In a large enterprise, this can be done for groups of applications and corresponding network connectivity.

Secure DevOps (DevSecOps) + AI Ops

Secure DevOps (DevSecOps) processes can be built on the foundational day-to-day operational workflows. DevSecOps focuses on building the processes and policies to achieve the network SLAs. This layer can leverage deep network visibility and automated configuration state management to build closed-loop network management. Continuous visibility of hybrid networks through monitoring agents and logs can form the foundation for AIOps, and the potential transition from reactive to proactive DevSecOps models. Cisco Threat Response (Cisco SecureX) is what we leverage for building the operational workflows and managing full-stack security.

DevSecOps workflows and AIOps insights can be assimilated and aggregated in a unified view to provide status, trends, and insights. SLA-driven network design, enabled by proactive, automated operations, can help simplify the typically complex hybrid network operations.

Cost and operational commitments of the hybrid cloud network services are key factors in making the design and operational choices throughout. Cost considerations include features and functionality offered by the hybrid cloud network service, as well as the availability and scalability of such a service.

Cost structures for network services are complex and, at times, hidden in periodic cost snapshots. Continuous cost monitoring of hybrid cloud network services is critical to a comprehensive understanding of the operational costs of such a network service. While cost may dictate the functionality and operational experience of hybrid cloud network services, security requirements should never be compromised.

As more workloads adopt hybrid deployment models, business requirements will drive the need for SLA-driven network services and operations. Even though most enterprises have working hybrid cloud connectivity, they lack standardization and have a fragmented operational experience. This moves quite a bit of the performance, availability, and reliability responsibility to the application architecture and provides fertile ground for security vulnerabilities. Reliable, stable, and agile network services are table stakes for an enterprise-grade hybrid cloud.

This is a high-level view of the hybrid network service building blocks. Most of these building blocks exist in some form or shape today but lack interoperability goals and standards. As I suggested early on, every cloud journey will be different, and this paper is not an attempt to prescribe solutions. Depending on your business and operational needs, you may make different decisions in a comparable situation.

We do recommend considering the following as some of the key calls to make along the way:

    Outline of business and application growth

    Framework for quality of service & potential global footprint for application delivery

    Security and compliance guardrails for data in transit

    Network segmentation requirements based on enterprise security guidelines

In addition to our rich product and services portfolio, Cisco is helping customers leverage products like Cisco Crosswork (assurance), Vitria (AIOps), and ServiceNow, all with deeper integration with Cisco products, which you can consider in your hybrid cloud network design.

More white papers will follow to address the standardization, KPIs/metrics, and SLAs needed.

Here are some of the Cisco products that we leveraged as hybrid network building blocks:

    SD-WAN Cloud OnRamp

    ACI Anywhere

    CSR 1000v

    AppDynamics & ThousandEyes

    DUO

Public cloud providers, on the other hand, offer their public cloud extensions into your private DC, as discussed above:

    Google Anthos

    Azure Stack

    AWS Outposts

References

·       The Cost of Cloud, a Trillion Dollar Paradox by Sarah Wang and Martin Casado

·       Hybrid cloud connectivity best practices (I came across this article as I was finishing up my write-up and thought it a great complementary read)

Glossary

Workload: The resources required to run a piece of software or an application in its intended target state.

Target Workload: The workload under consideration for sizing of the resources.

Hybrid Cloud: An architecture that integrates public and private cloud environments.

Hybrid Cloud Connectivity/Interconnect: Connectivity between public and private cloud endpoints. There can be multiple of these in a typical hybrid cloud architecture for a given enterprise.

Target Environment: The identified host environment for the workload under consideration.

Data at Rest: Data persisted on any physical storage. Typically used to identify data that is not currently being accessed or used.

Data in Transit: Data in active movement between two endpoints.

Authors

Hemal Surti, Principal Architect, CX-CXPM

Contributors

Chocks Ramiah, Senior Software Architect, CX-CXPM

Reviewers

Nathan Sowatskey, Principal Architect CX-CTO

Javier Guillermo, Principal Architect, CX-CXPM

Ashley Novak, Principal Architect CX-CTO

Shannon McFarland, Distinguished Engineer, ET&I

Vijay Raghavendran, Distinguished Engineer, CX-CTO
