3 Reasons We Need Data Protection in Kubernetes – The New Stack

To understand how increasingly important it is to protect Kubernetes applications, it’s instructive to compare today’s container environments with what they looked like when Kubernetes started just eight years ago.

Gaurav Rishi

Gaurav is Vice President of Products and Partnerships at Kasten by Veeam. He is at the forefront of several Kubernetes ecosystem partnerships and has been a frequent speaker and author on cloud native innovations. He previously led product strategy and management for Cisco’s Cloud Media Processing business.

Back then, protecting data in container environments was an afterthought. Containers were envisioned as lightweight, stateless constructs that could be spun up quickly to launch applications. Since these simplistic applications were not dependent on data and could be stopped or restarted without any appreciable side effects, companies did not place a high priority on protection strategies.

That changed as Kubernetes became a far more widely adopted project. With users running hundreds of nodes inside clusters, it quickly became clear that any large Kubernetes application that handles business functions (customer shopping carts, banking, etc.) requires data to persist beyond the initial container launch.

Kubernetes has evolved rapidly to add features for managing state, including constructs such as StatefulSets and the Container Storage Interface (CSI). This evolution has made data protection initiatives such as backup and disaster recovery an imperative and a priority in organizations.

Let’s dive into some of the drivers behind native Kubernetes data protection:

  1. The rise of cloud-native applications.
  2. The proliferation of stateful applications.
  3. Changing roles and scopes in computing.

Cloud-native apps

As architectures (servers, virtual servers, containers) have evolved and become more dynamic and distributed, core data protection has remained an imperative.

With cloud-native applications in a Kubernetes operating environment, the underlying application architecture is completely different from hypervisor-based environments. Therefore, a new Kubernetes-native data protection approach is needed. Two examples highlight the changes. First, with Kubernetes, pods are constantly rescheduled to different physical nodes, so using the VM as the unit of backup doesn't work. Second, there is an order of magnitude increase in the number of metadata objects (Secrets, ConfigMaps, etc.) that need to be backed up in addition to storage volume data, which makes hypervisor-based backups unsuitable.

Therefore, a Kubernetes-native solution that uses cloud-native applications as the unit of atomicity for backup and restore operations should be the goal of any organization looking to modernize its infrastructure and applications.
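To make the idea of the application as the unit of backup concrete, here is a minimal sketch in Python. It assumes a hypothetical in-memory inventory of resources (not a real API response) and groups workloads, metadata objects and volume claims by namespace and by the `app.kubernetes.io/name` label, which is one of the labels Kubernetes recommends for identifying an application. A real tool would discover these objects through the Kubernetes API and handle unlabeled objects as well.

```python
from collections import defaultdict

# Recommended Kubernetes label for naming an application.
APP_LABEL = "app.kubernetes.io/name"


def group_by_application(resources):
    """Group resource records into per-application backup units.

    Each record is a plain dict with "kind", "namespace", "name" and
    "labels" keys. This inventory format is an assumption made for
    illustration, not the shape of a real API response.
    """
    units = defaultdict(list)
    for res in resources:
        app = res.get("labels", {}).get(APP_LABEL)
        if app is None:
            continue  # unlabeled objects would need separate handling
        units[(res["namespace"], app)].append((res["kind"], res["name"]))
    return {key: sorted(items) for key, items in units.items()}


# Hypothetical inventory: names and namespaces are made up.
inventory = [
    {"kind": "StatefulSet", "namespace": "shop", "name": "cart-db",
     "labels": {APP_LABEL: "cart"}},
    {"kind": "Secret", "namespace": "shop", "name": "cart-db-credentials",
     "labels": {APP_LABEL: "cart"}},
    {"kind": "ConfigMap", "namespace": "shop", "name": "cart-config",
     "labels": {APP_LABEL: "cart"}},
    {"kind": "PersistentVolumeClaim", "namespace": "shop",
     "name": "data-cart-db-0", "labels": {APP_LABEL: "cart"}},
    {"kind": "Deployment", "namespace": "bank", "name": "ledger-api",
     "labels": {APP_LABEL: "ledger"}},
]

backup_units = group_by_application(inventory)
for (namespace, app), items in sorted(backup_units.items()):
    print(f"{namespace}/{app}: {len(items)} objects")
```

The point of the grouping is that the Secret, the ConfigMap and the volume claim travel with the workload they belong to, regardless of which node the pods happen to be running on.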

Stateful apps

While Kubernetes-based applications started out as simplistic, short-lived stateless workloads, a lot has changed since then. Applications that solve serious business functions need state. It was not developmentally or operationally optimal to run your stateless builds in a Kubernetes environment and a stateful database in a legacy environment.

Thus, Kubernetes itself has evolved to include constructs that allow cloud-native applications to contain state that persists across individual pods. These additions included the introduction of StatefulSets in 2017, which made it possible to manage distributed database clusters in a highly available environment. Operator frameworks started gaining popularity in 2018, allowing applications to control their lifecycle operations and define dependencies for individual microservices, including those that contain state. In the same year, the Container Storage Interface (CSI) was made generally available to allow storage vendors to expose standard block and file interfaces to applications. In 2020, volume snapshots became generally available in the Kubernetes v1.20 release, allowing you to restore or clone data from a previous snapshot. Many more features have since been added, and others are being worked on, to make stateful applications a breeze to run in your favorite Kubernetes environment.
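As a rough illustration of the snapshot and restore flow, the Python sketch below builds two manifests as plain dictionaries: a `VolumeSnapshot` that captures an existing PersistentVolumeClaim, and a new claim that is populated from that snapshot through its `dataSource` field. The API group and field names follow the `snapshot.storage.k8s.io/v1` API, but the object names, namespace, classes and size are hypothetical, and the cluster is assumed to have a CSI driver with snapshot support installed.

```python
import json

SNAPSHOT_API_GROUP = "snapshot.storage.k8s.io"


def volume_snapshot(name, namespace, pvc_name, snapshot_class):
    """Manifest for a snapshot of an existing PersistentVolumeClaim."""
    return {
        "apiVersion": f"{SNAPSHOT_API_GROUP}/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def pvc_from_snapshot(name, namespace, snapshot_name, storage_class, size):
    """Manifest for a new claim restored (or cloned) from a snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "storageClassName": storage_class,
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": SNAPSHOT_API_GROUP,
            },
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# All names and classes below are placeholders for illustration.
snapshot = volume_snapshot(
    "cart-db-snap", "shop", "data-cart-db-0", "csi-snapclass")
restored = pvc_from_snapshot(
    "data-cart-db-restore", "shop", "cart-db-snap", "csi-storage", "10Gi")

# kubectl accepts JSON as well as YAML, so this output could be
# piped to `kubectl apply -f -`.
print(json.dumps(snapshot, indent=2))
print(json.dumps(restored, indent=2))
```

A volume snapshot on its own covers only the storage; the Secrets, ConfigMaps and workload definitions still have to be captured separately, which is why the article argues for application-level protection.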

The net result of all these advancements is that databases are among the most popular workloads on Kubernetes today. Redis, Postgres, MySQL, etc., are all examples of widely used technologies running on containers. This has brought immense productivity gains and simplified operations. It has also made protecting your environment with easy-to-use, native Kubernetes backup and DR tools even more imperative.

Changing roles

One of the ways Kubernetes accelerates and improves application development and delivery is by bridging the gap between infrastructure and application teams. Infrastructure teams are typically responsible for building and delivering the tools that manage secure cloud-native infrastructure. Let’s call them suppliers. Application teams are the consumers of these tools and focus on building business applications.

Kubernetes enables infrastructure teams to create flexible environments that can span on-premises and cloud deployments. These environments can be complemented by a platform that provides common features such as security, backup, and DR to protect the applications deployed in a Kubernetes cluster. Application teams, on the other hand, do not need to open service tickets and wait for a long process to perform functions such as data recovery or restores. Instead, they can leverage self-service capabilities to perform these functions if they have been authenticated and authorized to do so.

This is where Kubernetes-native role-based access control (RBAC) comes in. A native Kubernetes data protection tool is aware of these RBAC constructs and can ensure that application teams can see and operate on only the applications and namespaces that their Kubernetes administrator has configured. This, along with container-optimized operating systems like Bottlerocket or Red Hat Enterprise Linux, ensures that the attack surface is contained while maintaining operational agility and separation of concerns.
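The namespace scoping described above can be sketched with a standard RBAC `Role` and `RoleBinding`. The Python below builds both as dictionaries: the role lets a team view and create volume snapshots in one namespace, and the binding grants it to a group. The group name, role name and namespace are hypothetical, and the small `allows` helper is a deliberately simplified check written for this example, not the logic of the real Kubernetes authorizer.

```python
RBAC_API_GROUP = "rbac.authorization.k8s.io"


def snapshot_role(name, namespace):
    """Role that permits viewing and creating snapshots in one namespace."""
    return {
        "apiVersion": f"{RBAC_API_GROUP}/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [{
            "apiGroups": ["snapshot.storage.k8s.io"],
            "resources": ["volumesnapshots"],
            "verbs": ["get", "list", "create"],
        }],
    }


def bind_role_to_group(role, group):
    """RoleBinding that grants the role to a group in the role's namespace."""
    meta = role["metadata"]
    return {
        "apiVersion": f"{RBAC_API_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{meta['name']}-binding",
                     "namespace": meta["namespace"]},
        "subjects": [{"kind": "Group", "name": group,
                      "apiGroup": RBAC_API_GROUP}],
        "roleRef": {"kind": "Role", "name": meta["name"],
                    "apiGroup": RBAC_API_GROUP},
    }


def allows(role, api_group, resource, verb):
    """Simplified rule check for illustration (ignores wildcards etc.)."""
    return any(
        api_group in rule["apiGroups"]
        and resource in rule["resources"]
        and verb in rule["verbs"]
        for rule in role["rules"]
    )


# Hypothetical team and namespace names.
role = snapshot_role("snapshot-operator", "shop")
binding = bind_role_to_group(role, "cart-team")

print(allows(role, "snapshot.storage.k8s.io", "volumesnapshots", "create"))
print(allows(role, "snapshot.storage.k8s.io", "volumesnapshots", "delete"))
```

Because a `Role` is namespaced, the same team gets no access to snapshots in other namespaces unless an administrator creates a binding there too, which is the separation of concerns the article describes.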

Conclusion

As enterprises adopt Kubernetes as their operating environment, data protection initiatives such as backup and disaster recovery have become imperative and a priority. This will require choosing the right Kubernetes-native data protection tool that gives infrastructure and application teams the ability to innovate at DevOps speed while ensuring that cloud-native applications can scale and operate smoothly.