Testing cloud native best practices with the CNF Test Suite


Here at The Linux Foundation’s blog, we share content from our projects, such as this article by Joel Hans from the Cloud Native Computing Foundation’s blog.

The telecommunications industry is the backbone of today’s increasingly digital economies, but it faces a difficult new challenge in evolving to meet modern infrastructure practices. How did telecommunications get itself into this situation? Because the risks of incidents or downtime are so severe, the industry has focused almost exclusively on system designs that minimize risk and maximize reliability. That’s fantastic for mission-critical services, whether public air traffic control or private high-speed banking, but it prioritizes stability over productivity and over the adoption of new technologies that might make operations more resilient and performant.

Telecommunications is playing catch-up on cloud native technology, and the downstream effects are starting to show. These organizations are now behind the times on the de facto choices for enterprise and IT, which means they’re less likely to recruit the top-tier engineering talent they need. In increasingly competitive landscapes, they need to increase productivity and bring new telephony platforms to market faster, not get mired in old custom solutions built in-house.

To make that leap from internally trusted to industry-trusted tooling, telecommunications organizations need confidence that they’re on track to properly evolve their virtual network function (VNF) infrastructure to enable cloud native network functions using Kubernetes. That’s where CNCF aims to help.

Enter the CNF Test Suite for telecommunications

A cloud native network function (CNF) is an application that implements or facilitates network functionality in a cloud native way, developed using standardized principles and consisting of at least one microservice.

The CNF Test Suite (cncf/cnf-testsuite) is an open source test suite that shows telcos exactly how cloud native their CNFs are. It’s designed for telecommunications developers and network operators building with Kubernetes and other cloud native technology to validate how well they’re following cloud native principles and best practices, like immutable infrastructure, declarative APIs, and a repeatable deployment process.

The CNCF is bringing together the Telecom User Group (TUG) and the Cloud Native Network Function Working Group (CNF WG) to implement the CNF Test Suite, which helps telco developers and ops teams build faster feedback loops thanks to the suite’s flexible testing and optimized execution time. Because it can be integrated into any CI/CD pipeline, whether in development or pre-production checks, or run as a standalone test for a single CNF, telecommunications development teams get at-a-glance understanding of how their new deployments align with the cloud native ecosystem, including CNCF-hosted projects, technologies, and concepts.

It’s a powerful answer to a difficult question: How cloud native are we?

The CNF Test Suite leverages 10 CNCF-hosted projects and several open source tools. A modified version of CoreDNS serves as an example CNF so end users can get familiar with the test suite in five steps, and Prometheus is used in an observability test that checks whether CNFs actively expose metrics, as best practice recommends. The suite also packages other upstream tools, like OPA Gatekeeper, Helm linter, and Promtool, to make installation, configuration, and versioning repeatable. The CNF Test Suite team is also grateful for contributions from Kyverno on security tests, LitmusChaos for resilience tests, and Kubescape for security policies.

The minimal install for the CNF Test Suite requires only a running Kubernetes cluster, kubectl, curl, and helm, and it even supports running CNF tests on air-gapped machines or in environments that need to self-host their image repositories. Once installed, you can use an example CNF or bring your own: all you need is to supply the .yml file and run `cnf-testsuite all` to run all the available tests. There’s even a quick five-step process for deploying the suite and getting recommendations in less than 15 minutes.
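For teams wiring the suite into a CI/CD job, a thin wrapper could look something like the Python sketch below. Only the `cnf-testsuite all` command comes from this article; the function name `run_cnf_testsuite`, the injectable `runner` parameter, and the decision to hand back the raw exit code and output without interpreting them are illustrative assumptions, not part of the test suite.

```python
import subprocess
from typing import Callable, Sequence, Tuple

# The only command taken from the article; everything else is illustrative.
CNF_TESTSUITE_ALL = ("cnf-testsuite", "all")


def run_cnf_testsuite(
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    command: Sequence[str] = CNF_TESTSUITE_ALL,
) -> Tuple[int, str]:
    """Run the suite and return (exit_code, captured_stdout).

    `runner` is injectable (hypothetical design choice) so a pipeline can
    dry-run this wrapper without a cluster or the binary installed.
    """
    completed = runner(list(command), capture_output=True, text=True, check=False)
    return completed.returncode, completed.stdout
```

A pipeline step would call `run_cnf_testsuite()` once the CNF is installed and archive the returned output. How the exit code should gate a build is left open here, since that depends on the suite version and the team’s policy.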

What the CNF Test Suite covers and why

As of the start of 2022, the CNF Test Suite can run approximately 60 workload tests, which are grouped into seven categories.

Best practices

Compatibility, Installability & Upgradability: CNFs should work with any Certified Kubernetes product and any CNI-compatible network that meets their functionality requirements while using standard, in-band deployment tools such as Helm (version 3) charts. The CNF Test Suite checks whether the CNF can be horizontally and vertically scaled using `kubectl` to ensure it can leverage Kubernetes’ built-in functionality.

Microservice: The CNF should be developed and delivered as a microservice for improved agility, meaning less development time required between deployments. Agile organizations can deploy new features more frequently or allow multiple teams to safely deploy patches based on their functional area, like fixing security vulnerabilities, without having to sync with other teams first.

State: A cloud native infrastructure should be immutable, environment-agnostic, and resilient to node failure, which means properly managing configuration, persistent data, and state. A CNF’s configuration should be stateless, stored in a custom resource definition or a separate database rather than in local storage, with any persistent data managed by StatefulSets. Separating stateful and stateless information makes for infrastructure that’s easily reproduced, consistent, disposable, and always deployed in a repeatable way.
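To make the state guidance concrete, here is a small Python sketch that inspects a workload manifest (already parsed into a dictionary) for two patterns this category warns about: node-local `hostPath` volumes, and persistent volume claims mounted by something other than a StatefulSet. The field names follow the Kubernetes workload schema; the function name and the warning wording are hypothetical, and this is not how the CNF Test Suite itself implements its checks.

```python
from typing import Any, Dict, List


def find_state_warnings(manifest: Dict[str, Any]) -> List[str]:
    """Return warnings about how a workload manifest handles state."""
    warnings: List[str] = []
    kind = manifest.get("kind", "")
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    for volume in pod_spec.get("volumes", []):
        name = volume.get("name", "<unnamed>")
        # hostPath ties data to one node, so it does not survive node failure.
        if "hostPath" in volume:
            warnings.append(f"volume '{name}' uses hostPath (node-local storage)")
        # The article recommends StatefulSets for persistent data.
        if "persistentVolumeClaim" in volume and kind != "StatefulSet":
            warnings.append(f"volume '{name}' mounts a PVC from a {kind}")
    return warnings


# Illustrative manifest; all names are made up for this example.
example_deployment = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"volumes": [
        {"name": "cache", "hostPath": {"path": "/var/cache/cnf"}},
        {"name": "data", "persistentVolumeClaim": {"claimName": "cnf-data"}},
    ]}}},
}
```

Calling `find_state_warnings(example_deployment)` flags both volumes, while the same volumes list with only the claim under `kind: StatefulSet` produces no warnings.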

Reliability, Resilience & Availability: Reliability requirements in telco infrastructure resemble those of standard IT: it needs to be highly secure and reliable and support ultra-low latencies. Rather than depending only on a long mean time between failures (MTBF), cloud native best practices rely on redundant subcomponents with higher serviceability (a shorter mean time to recover, or MTTR), and then test those assumptions through chaos engineering and self-healing configurations. The Test Suite uses a type of chaos testing to ensure CNFs are resilient to the inevitable failures of public cloud environments or issues on an orchestrator level, such as what happens when pods are unexpectedly deleted or run out of computing resources. These tests ensure CNFs meet the telco industry’s standards for reliability on non-carrier-grade shared cloud hardware/software platforms.
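The trade-off between MTBF and MTTR can be illustrated with the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The Python sketch below also estimates the availability of several redundant replicas, under the loudly stated assumption that replicas fail independently and that one healthy replica is enough to serve traffic. The function names and the sample numbers are illustrative only and do not come from the CNF Test Suite.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` independent copies when any one suffices.

    Assumes failures are independent, which real clusters only approximate.
    """
    return 1.0 - (1.0 - single) ** replicas


# Made-up numbers: the same MTBF with slow versus fast recovery.
slow_recovery = availability(mtbf_hours=1000.0, mttr_hours=10.0)
fast_recovery = availability(mtbf_hours=1000.0, mttr_hours=1.0)

# Three replicas of a component that is individually 99% available.
three_replicas = redundant_availability(single=0.99, replicas=3)
```

With these inputs, cutting recovery time from ten hours to one raises availability without touching MTBF, and three modest replicas together outperform either single component, which is the reasoning behind favoring redundancy and fast recovery.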

Observability & Diagnostics: Each piece of production cloud native infrastructure must make its internal states observable through metrics, tracing, and logging. The CNF Test Suite looks for compatibility with Fluentd, Jaeger, Promtool, Prometheus, and OpenMetrics. These tools help DevOps or SRE teams maintain, debug, and gather insights about the health of their production environments, which must themselves be versioned, maintained in source control, and altered only through deployment pipelines.
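As a rough illustration of what “exposing metrics” means on the wire, the Python sketch below renders a few values in the Prometheus text exposition format (`# HELP`, `# TYPE`, then `name value`). A real CNF would normally use an official Prometheus client library and serve this over HTTP; the function name `render_metrics` and the metric `cnf_requests_total` are hypothetical names invented for this example.

```python
from typing import Dict, Tuple


def render_metrics(metrics: Dict[str, Tuple[str, str, float]]) -> str:
    """Render {name: (type, help_text, value)} as Prometheus exposition text."""
    lines = []
    for name, (metric_type, help_text, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {metric_type}")
        lines.append(f"{name} {value}")
    # The format expects a trailing newline after the last sample.
    return "\n".join(lines) + "\n"


# Hypothetical metric for an example CNF.
sample = render_metrics(
    {"cnf_requests_total": ("counter", "Total requests handled.", 3.0)}
)
```

Output in this shape is what a Prometheus scrape reads from a workload’s metrics endpoint, and it is the kind of payload that a tool such as Promtool is used to validate.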

Security: Cloud native security requires attention from experts at the operating system, container runtime, orchestration, application, and cloud platform levels. While many of these fall outside the scope of the CNF Test Suite, it still validates whether containers are isolated from one another and the host, do not allow privilege escalation, have defined resource limits, and are verified against common CVEs.
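A few of these checks can be approximated by inspecting a pod specification directly. The Python sketch below flags containers that run privileged, that do not explicitly disable privilege escalation, or that lack resource limits. The field names (`securityContext.privileged`, `securityContext.allowPrivilegeEscalation`, `resources.limits`) are standard Kubernetes container fields; the function name and the strict treatment of an unset `allowPrivilegeEscalation` are assumptions for illustration, not the CNF Test Suite’s own logic.

```python
from typing import Any, Dict, List


def find_security_issues(pod_spec: Dict[str, Any]) -> List[str]:
    """Return human-readable issues for each container in a pod spec."""
    issues: List[str] = []
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        security = container.get("securityContext", {})
        if security.get("privileged", False):
            issues.append(f"{name}: runs privileged")
        # Strict assumption: anything other than an explicit False is flagged.
        if security.get("allowPrivilegeEscalation") is not False:
            issues.append(f"{name}: privilege escalation not disabled")
        if not container.get("resources", {}).get("limits"):
            issues.append(f"{name}: no resource limits defined")
    return issues


# Illustrative pod spec; container names and images are made up.
example_pod_spec = {
    "containers": [
        {
            "name": "hardened",
            "image": "example/cnf:1.0",
            "securityContext": {"allowPrivilegeEscalation": False},
            "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}},
        },
        {
            "name": "risky",
            "image": "example/sidecar:1.0",
            "securityContext": {"privileged": True},
        },
    ]
}
```

Here `find_security_issues(example_pod_spec)` reports three issues, all for the `risky` container, and none for `hardened`.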

Configuration: Teams should manage a CNF’s configuration in a declarative manner, using ConfigMaps, Operators, or other declarative interfaces, to describe the desired outcome, not how to achieve said outcome. Declarative configuration doesn’t have to be executed to be understood, making it far less prone to error than imperative configuration or even the most well-maintained sequences of `kubectl` commands.
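The difference between declaring an outcome and scripting the steps can be sketched in a few lines of Python. Below, the desired configuration is plain data shaped like a ConfigMap manifest, and a small function works out the steps needed to get there from whatever is currently observed, which is loosely what a reconciling controller does on the team’s behalf. The function name `plan_changes`, the ConfigMap name, and the keys are hypothetical; real reconciliation in Kubernetes is considerably more involved.

```python
from typing import Dict, List

# Desired state as data, shaped like a ConfigMap manifest (names are made up).
desired_config_map = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "example-cnf-config"},
    "data": {"LOG_LEVEL": "info", "UPSTREAM_DNS": "10.0.0.10"},
}


def plan_changes(desired: Dict[str, str], observed: Dict[str, str]) -> List[str]:
    """Compute the steps that move `observed` key/value data to `desired`."""
    steps: List[str] = []
    for key in sorted(desired):
        if key not in observed:
            steps.append(f"add {key}={desired[key]}")
        elif observed[key] != desired[key]:
            steps.append(f"update {key}: {observed[key]} -> {desired[key]}")
    for key in sorted(set(observed) - set(desired)):
        steps.append(f"remove {key}")
    return steps


observed_data = {"LOG_LEVEL": "debug", "CACHE_TTL": "30"}
steps = plan_changes(desired_config_map["data"], observed_data)
```

The manifest can be read and reviewed without running anything, and applying it twice yields no further steps, whereas an imperative script has to be traced line by line to know what state it leaves behind.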

After running numerous tests in each category, the CNF Test Suite outputs flexible scoring and suggestions for remediation for each category (or one category if you chose that in the CLI), giving you practical next steps on improving your CNF to better follow cloud native best practices. It’s a powerful, and still growing, solution for the telecommunications industry to embrace cloud native in a way that’s controllable, observable, and validated by all the expertise under the CNCF umbrella.

What’s next for the CNF Test Suite?

The Test Suite initiative will continue to work closely with the Telecom User Group (TUG) and the Cloud Native Network Function Working Group (CNF WG), collecting feedback based on real-world use cases and evolving the project. As the CNF WG publishes more recommended practices for cloud native telcos, the CNF Test Suite team will add more tests to validate each.

In fact, v0.26.0, released on February 25, 2022, includes six new workload tests, bug fixes, and improved documentation around platform tests. If you’d like to get involved and shape the future of the CNF Test Suite, there are already several ways to provide feedback or contribute code, documentation, or example CNFs.

Looking ahead: The CNF Certification Program

The CNF Test Suite is just the first exciting step in the upcoming Cloud Native Network Function (CNF) Certification Program. We’re looking forward to making the CNF Test Suite the de facto tool for network equipment providers and CNF development teams to prove—and then certify—that they’re adopting cloud native best practices in new products and services.

The wins for the telecommunications industry are clear:

  • Providers get verification that their cloud native applications and architectures adhere to cloud native best practices.
  • Their customers get verification that the cloud native services or networks they’re procuring are actually cloud native.

And they both get even better reliability, reduced risk, and lowered capital/operating costs.

We’re planning on supporting any product that runs in a certified Kubernetes environment to make sure organizations build CNFs that are compatible with any major public cloud provider or on-premises environment. We haven’t yet published the certification requirements, but they will be similar to the k8s-conformance process, where you can submit results via pull request and receive updates on your certification process over email.

As the CNF Certification Program develops, both the TUG and CNF WG will engage with organizations that use the Test Suite heavily to make improvements and stay up-to-date on the latest cloud native best practices. We’re excited to see how the telecommunications industry evolves by adopting more cloud native principles, like loosely coupled systems and immutability, and gathering proof of their hard work via the CNF Test Suite. That’s how we ensure a complex and essential industry takes the right next steps toward the best that technology infrastructure has to offer, without sacrificing an inch on reliability.

To take the next steps with the CNF Test Suite and prepare your organization for the upcoming CNF Certification Program, schedule a personalized CNF Test Suite demo or attend Cloud Native Telco Day, a co-located event at KubeCon + CloudNativeCon Europe 2022 on May 16, 2022.