By Jason English
Principal Analyst: Intellyx
Pull up to the fire, pardner. Let me tell you about a time before cloud, when we used to wrangle as many as 10 different prod and pre-prod environments in our data center, loaded with hundreds of VMs, largely by looking at system-level metrics and logs to see which racked servers we’d need to update or reboot.
The wild west days of operating physical IT infrastructure for use by application development teams—before the cloud abstracted everything—really aren’t that far in the past. Even as recently as a decade ago, most executives probably thought of the cloud as a place for hosting SaaS apps, rather than an operating model for application environments.
Today, we must support DevSecOps patterns across both private and public clouds, each with its own complexities. Operations teams are being asked to provide management and observability front ends, databases, and other services, many of which are still delivered as virtual machines while more and more are being delivered within customized Docker containers and ephemeral Kubernetes-orchestrated pods.
Now hold on a minute—it sure sounds like we are still living in the wild west! How can platform operations teams corral so many moving parts for application delivery teams, and bring this heterogeneous hybrid cloud herd into the future?
Drawing ops into the dev container wrangling business
The early stages of cloud-native development happened in fits and starts, as developers sought ways to run applications within highly portable containers that could run practically anywhere, encapsulating dependencies including code, OS, and libraries.
For the first time, developers could get past some of the operational constraints of legacy IT approaches. They could realistically provision their own target applications: just download a container image from a library that looks close enough, install code, and spin it up on their own test server, in the data center, or on AWS or Azure, in less than a second.
Kubernetes (or K8s) came along and offered even more promise for development teams to control their own destiny, by letting them deploy and orchestrate complete cloud-native environments with internal networking, security, and data handling features, within the release pipeline.
However, anyone who has ever had to install and set up a K8s cluster, much less maintain their clusters post-release in a real production environment, will tell you that wrangling containers across multiple releases and distributions in a way that supports an entire enterprise just ain’t that simple. Time for ops teams to ride to the rescue…
The changing role of Operations on the hybrid cloud frontier
Cloud-native development and CI/CD approaches encouraged us to govern the organization’s herd of K8s clusters and containers as ‘cattle not pets’ — so that developers favor ops-approved packages and configurations over highly bespoke, unique distributions.
The concept of platform engineering arose from the DevOps tenet that ops teams should provide a self-service platform, so devs could grab an environment that represents the organization’s current target architecture. Devs want to keep building new functionality, rather than requesting ops involvement to provision environments for each release.
Unfortunately, even the best-laid provisioning plans didn’t account for the widespread developer use of Git-style repos of readily downloadable K8s cluster definitions and container configurations that would be shared among developers in the wild over time.
Furthermore, no application stands alone, even if developers started building it in the ‘globally approved’ environment. Modern applications have many dependencies on third-party services and API calls, while still needing to bring forward the existing enterprise fleet of VMs and core systems that never really went away.
To address this moving architectural target, the discipline of platform operations picks up where the self-service concept of platform engineering leaves off on Day 1, and operationalizes the continuing management of their environments on Day 2 and beyond, when devs are equipping themselves to deliver the next set of features and rethink how they are deployed.
Normalizing the inconsistencies of operating dev environments
Over time, large companies will acquire many unique and heterogeneous technology stacks as they grow and adapt their hybrid cloud application and data estate to the bespoke needs of customers, partners and their own delivery groups serving different parts of the business.
This leaves platform operations teams with the difficult—but not insurmountable—task of normalizing a highly inconsistent development and delivery environment.
We’d love to be able to agree on standard definitions, but whose standard will everyone trust? VMware? Red Hat? Rancher? AWS? Azure? In any decent-sized enterprise, a massive number of permutations of IaC configurations, charts, and delivery mechanisms have already been selected in the past, and they may be hard to swap out or replace without serious impact on the business.
Rather than forcing dev teams to accept a single rigid enterprise standard, the platform operations approach offers developers a self-service ‘paved road’: a library of tested and well-maintained packages proven most likely to work for different groups of constituents and application types.
This paved road can extend to the underpinning K8s clusters themselves. While there are historical business and technical reasons a given distribution may have been selected, there is an opportunity to streamline how those clusters are requested, deployed, and managed. It’s also important that the road does not exist on an island, given that enterprise application libraries are still made up of more than just containers. The goal is to minimize the bumps and bruises along the way as the herd makes its way across the bare metal, virtualized, and hyperscale landscape.
Platform operations teams maintain awareness of all available hybrid IT infrastructure for the region and application use case, so developers and product teams can auto-pilot provisioning to continue down the ‘paved road’ approach, or specify which type of infrastructure would work best for a given project.
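As a rough illustration of that choice between auto-pilot and an explicit request, here is a minimal Python sketch. Everything in it is a hypothetical stand-in: the `Blueprint` fields, the catalog entries, the region names, and the `select_blueprint` function are assumptions made for the example, not the API of any particular product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Blueprint:
    """One ops-approved entry on the paved road (hypothetical schema)."""
    name: str
    infra: str      # e.g. "k8s", "vm", "bare-metal"
    region: str
    app_type: str   # e.g. "web", "database"
    default: bool = False  # True if this is the auto-pilot choice


# Illustrative catalog; a real one would come from the platform team's tooling.
CATALOG: List[Blueprint] = [
    Blueprint("web-k8s-us", "k8s", "us-east", "web", default=True),
    Blueprint("web-vm-us", "vm", "us-east", "web"),
    Blueprint("db-vm-us", "vm", "us-east", "database", default=True),
    Blueprint("web-k8s-eu", "k8s", "eu-west", "web", default=True),
]


def select_blueprint(region: str, app_type: str,
                     infra: Optional[str] = None) -> Blueprint:
    """Pick a blueprint for a region and application type.

    With no infra given, follow the paved road (the default entry).
    With infra given, honor the developer's explicit request.
    """
    candidates = [b for b in CATALOG
                  if b.region == region and b.app_type == app_type]
    if infra is None:
        candidates = [b for b in candidates if b.default]
    else:
        candidates = [b for b in candidates if b.infra == infra]
    if not candidates:
        raise LookupError(
            f"no approved blueprint for {app_type!r} in {region!r}"
            + (f" on {infra!r}" if infra else ""))
    return candidates[0]


if __name__ == "__main__":
    print(select_blueprint("us-east", "web").name)        # auto-pilot
    print(select_blueprint("us-east", "web", "vm").name)  # explicit request
```

The design point is that the default path costs the developer nothing, while an explicit request is still served from the same approved catalog rather than from something rustled up in the wild.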
Keeping the paved road clear of potholes
Some production environments and networked services trend toward peak demand levels and become costly bottlenecks to successful outcomes, whereas others become as abandoned as ghost towns with tumbleweeds rolling by. Platform operations teams work on behalf of finance and other key stakeholders to continuously optimize the utilization of clusters, containers, VMs, and more across the hybrid cloud estate. Frequently used environments need high availability and autoscaling capabilities on highly performant infrastructure, while underutilized environments should be deprioritized or decommissioned to save costs.
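To make the busy-town versus ghost-town triage concrete, here is a small Python sketch. The thresholds, the `Environment` fields, and the recommendation labels are all assumptions chosen for illustration; a real team would tune them against its own metrics and would likely weigh more signals than average CPU utilization.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List

# Assumed thresholds, for illustration only.
HIGH_UTILIZATION = 0.75   # at or above this average, treat as a hot spot
LOW_UTILIZATION = 0.10    # at or below this average, treat as a ghost town


@dataclass
class Environment:
    name: str
    # CPU utilization samples as fractions between 0.0 and 1.0
    cpu_samples: List[float] = field(default_factory=list)


def recommend(env: Environment) -> str:
    """Return a coarse recommendation based on average utilization."""
    if not env.cpu_samples:
        return "review"  # no data: needs a human look before any action
    average = mean(env.cpu_samples)
    if average >= HIGH_UTILIZATION:
        return "scale-up"
    if average <= LOW_UTILIZATION:
        return "decommission-candidate"
    return "keep"


if __name__ == "__main__":
    fleet = [
        Environment("checkout-prod", [0.80, 0.90, 0.85]),
        Environment("legacy-demo", [0.02, 0.05, 0.01]),
        Environment("staging", [0.40, 0.50]),
    ]
    for env in fleet:
        print(env.name, recommend(env))
```

Note that the sketch only flags a decommission candidate; actually retiring an environment is a decision for its owners and stakeholders.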
Who owns what in this hybrid cloud, and who pays for what? Answering authorization and FinOps questions in a highly interdependent microservices world requires more than simply assigning account ownership rights to individuals or groups, or arbitrarily rate-limiting services to meet cost-cutting goals.
Platform operations teams also offer a central point of policy for governing access and chargebacks, allowing companies to continue using their preferred service management tools, in order to realize efficiencies of scale and management effort over the long term.
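One simple form such a chargeback policy could take is splitting the cost of a shared cluster in proportion to each team's measured usage. The Python sketch below assumes that model; the function name, the team names, and the usage units are hypothetical, and real FinOps policies often add fixed fees, tiers, or reserved capacity on top.

```python
from typing import Dict


def allocate_chargeback(total_cost: float,
                        usage_by_team: Dict[str, float]) -> Dict[str, float]:
    """Split a shared cost across teams in proportion to usage.

    usage_by_team maps a team name to any consistent usage measure,
    such as CPU-hours. Amounts are rounded to cents, so the rounded
    shares may differ from total_cost by a cent or two.
    """
    if any(usage < 0 for usage in usage_by_team.values()):
        raise ValueError("usage values must not be negative")
    total_usage = sum(usage_by_team.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive to allocate cost")
    return {
        team: round(total_cost * usage / total_usage, 2)
        for team, usage in usage_by_team.items()
    }


if __name__ == "__main__":
    shares = allocate_chargeback(
        1000.0, {"payments": 600, "search": 300, "batch": 100})
    print(shares)  # {'payments': 600.0, 'search': 300.0, 'batch': 100.0}
```

Keeping the rule in one place, owned by platform operations, is what lets it be applied consistently across clusters, containers, and VMs.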
The Intellyx Take
Platform operations is a practice that advances the goals of the enterprise for maturing the use of hybrid cloud infrastructure. In an organization, platform operations teams fulfill the -Ops requirement of DevSecOps, where usage patterns are variable and the demand for ready environments that meet the application form factor is highest.
Platform operations practices don’t generally dictate a particular technology stack, but instead can best support development teams through a well-oiled self-service portal that eliminates friction while still providing operational guardrails.
Morpheus Data offers a solution designed for hybrid cloud platform operations teams that can be represented to the organization as a self-service platform operations portal, managing most known flavors of containers, K8s clusters, VMs, cloud-native PaaS workloads, and more, so development teams can use their preferred CI/CD, GitOps, service management, and collaboration tooling, atop almost any private or public cloud infrastructure. Beyond that, their platform includes significant day-2 capabilities to tie into centralized monitoring, logging, backup, and other technologies to complete the picture.
Even as the DevOps frontier expands outward over the cloud-native deployment horizon, platform engineering and platform operations make it easier for developers to benefit from well-governed, tested, and defined images and cluster layouts that help the org meet performance, cost, and compliance goals, rather than riding out like rogue cowboys and rustling their own.
Copyright ©2023 Intellyx LLC. Intellyx is solely responsible for the content of this article. As of the time of writing, Morpheus Data is an Intellyx customer. No AI chatbots were used to write this article.