DataStax unveils K8ssandra for cloud-native Cassandra


Following up on its release of a Kubernetes (K8s) operator for Cassandra last spring, DataStax is unveiling a full-blown open source distribution of Cassandra built for Kubernetes. It extends the original operator with preconfigured tools and guides for deploying and configuring Cassandra on Kubernetes clusters, and it draws on several other open source projects for observability. This is the latest move by DataStax to jumpstart cloud-native development in the Apache open source project.

Announced at the virtual edition of KubeCon this week, K8ssandra (pronounced “Kay-sandra”) starts with the original DataStax Cass Operator. It also includes some tools whose names reflect the fact that at least some members of the project have a sense of humor. They include Cassandra Reaper, which is not the personification of death, but rather a fairly harmless garbage collector or disk defragger that cleans up committed or orphaned logs. Cassandra Reaper was created by Spotify and further developed by The Last Pickle, the Cassandra consulting firm, before DataStax bought the company last March. And then there is Cassandra Medusa, which is not the monster of Greek mythology, but a modest tool for backing up and restoring data.

In addition, K8ssandra builds on open source Prometheus for metrics collection and Grafana for visualization; both are preconfigured to collect specific metrics and provide some jumpstart dashboards. Rounding out the distro are Helm charts to guide database administrators and site reliability engineers in setting up and operating Cassandra clusters in a Kubernetes environment.

Much of the content for the distro, from the preconfiguration of Prometheus and Grafana to the best practices captured in the Helm charts, has come directly from DataStax’s experience running Astra, its managed DataStax Enterprise cloud service. For example, Prometheus is preconfigured to collect specific key metrics from Cassandra, and Grafana comes packaged with several prebuilt dashboards. Customers can, of course, extend and customize these jumpstarts or develop their own integrations.

More to the point, K8ssandra is part of DataStax’s strategy to re-engage with the open source community. While DataStax no longer controls the Apache Cassandra project, within the last few years it has redoubled its efforts to get back in step with it. The K8s operator, which is the starting point for the distribution, was introduced to the community as an open source project on GitHub last spring. As we noted a few months back, while job one for the community right now is getting Cassandra 4.0 out the door (it’s still in beta), the group is beginning to consider how and whether to bring cloud-native into the project mainstream.

So the DataStax operator is not currently part of the core Apache project; it was introduced to the community, some of whose members had previously developed their own operators. It is noteworthy that in the press release announcing K8ssandra, Orange, which developed one of those operators, voiced support for DataStax’s effort. “I’m pleased to see K8ssandra expand what we do as a community to make Cassandra the standard for data on Kubernetes,” said Franck Dehay, software engineer at Orange.