Denylisting Partitions

Like any database, Apache Cassandra can be impacted by data modeling choices gone awry or by the unexpected use of even a well-designed data model. Typically this happens when partitions grow too large or accumulate too many rows or tombstones. These problematic partitions can degrade reads and writes of other partitions, which can lead to service-impacting incidents. While the Cassandra community is actively working to reduce the impact of such partitions, operators sometimes need to take immediate action when they encounter one. Apache Cassandra 4.1 introduces a new feature to give operators this control: the Partition Denylist.

The Partition Denylist lets operators choose between providing access to the entire data set with reduced performance and reducing the available data set (by preventing reads and writes to specific partitions) to protect performance. This choice gives operators new control over how problematic partitions affect other reads and writes serviced by Cassandra: operations that would be slow and resource-hungry become immediate failures before any resources are consumed. Operators can prevent reads and/or writes to the partition, as well as range reads that include the partition.

The Operator's Perspective

For operators, JMX is the primary interface to the denylist. Four operations are available: add a key to the denylist, check whether a key is currently in the denylist, remove a key from the denylist, and force a refresh of a node's local denylist cache. The examples below use jmxterm, but any JMX client will work.

Denylisting a Key

To add a key to the denylist, all that is needed is the keyspace, table, and partition key. In jmxterm, denylisting the key baz in the keyspace foo and table bar looks like this:

bean org.apache.cassandra.db:type=StorageProxy
run denylistKey foo bar baz

All denylist operations are defined on the StorageProxy MBean.
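
For scripted use, the same operations can be issued without an interactive session. The sketch below only builds the jmxterm command text; it assumes jmxterm's `run -b <bean>` form, and the helper name `denylist_cmd` is hypothetical. It also assumes the keyspace, table, and key contain no whitespace.

```shell
#!/bin/sh
# Bean that hosts the denylist operations (taken from the examples above).
BEAN='org.apache.cassandra.db:type=StorageProxy'

# Print a single jmxterm "run" command for a denylist operation.
# $1 = operation name; any remaining arguments are passed to the operation.
denylist_cmd() {
    op=$1
    shift
    printf 'run -b %s %s' "$BEAN" "$op"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Print the command that denylists key baz in foo.bar.
denylist_cmd denylistKey foo bar baz

# Assumed invocation (jar path and host are placeholders; 7199 is
# Cassandra's default JMX port):
#   denylist_cmd denylistKey foo bar baz | java -jar jmxterm.jar -l localhost:7199 -n
```

Keeping command construction separate from the jmxterm invocation makes it easy to review exactly what will be sent to a node before it is run.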

Checking Denylist Status

To determine whether a key (baz) for a particular keyspace and table (foo.bar) is in the denylist, use the isKeyDenylisted operation:

bean org.apache.cassandra.db:type=StorageProxy
run isKeyDenylisted foo bar baz

Removing a Key from the Denylist

After mitigation or for other reasons, operators will eventually want to remove a key from the denylist. Like the other operations, all that is needed is the keyspace, table, and key. In this case, to remove the key baz in foo.bar from the denylist:

bean org.apache.cassandra.db:type=StorageProxy
run removeDenylistKey foo bar baz

Refreshing the Denylist Cache

For performance purposes, the denylist implementation maintains a local cache on every node. The cache is refreshed periodically (by default every 5 minutes), but it may be necessary to refresh it manually. This can be done with a quick JMX call on the nodes where it's necessary (typically the nodes that own a partition that was recently added to the denylist):

bean org.apache.cassandra.db:type=StorageProxy
run loadPartitionDenylist

Other Denylist Configuration

Other aspects of the denylist are controlled via properties in cassandra.yaml. Operators can control whether reads, range reads, and/or writes fail for partitions listed in the denylist. They can also control how often each node refreshes its local cache and the consistency level used to load it. A higher consistency level gives a more accurate denylist but raises the risk that the load query cannot complete; a lower consistency level makes a successful load more likely but increases the chance of stale or missing denylist entries.
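
As a rough illustration, a cassandra.yaml fragment covering these settings might look like the following. The property names are given as I understand them for Cassandra 4.1 and the values are examples rather than recommendations, so check both against the cassandra.yaml shipped with your version.

```yaml
# Illustrative fragment only: verify names and defaults for your version.
partition_denylist_enabled: true      # turn the denylist feature on
denylist_reads_enabled: true          # fail single-partition reads of denylisted keys
denylist_range_reads_enabled: true    # fail range reads that include a denylisted key
denylist_writes_enabled: true         # fail writes to denylisted keys
denylist_refresh: 300s                # example value: refresh the local cache every 5 minutes
denylist_consistency_level: QUORUM    # example value: consistency level used to load the cache
```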

The Client Perspective

Clients of Cassandra have no control over which partitions are in the denylist. They will, however, get an informative exception telling them why a read or write failed. In the example below, key 3 has been placed in the denylist while key 9 remains unaffected, and both reads and writes are configured to be denied.

cqlsh> select * from system_distributed.partition_denylist;

 ks_name | table_name | key
---------+------------+------------
     foo |        buz | 0x00000003

cqlsh> select * from foo.buz where key=3;
InvalidRequest: Error from server: code=2200 [Invalid query] message= \
  "Unable to read denylisted partition [0xDecoratedKey(9010454139840013625, 00000003)] in foo/buz"

cqlsh> select * from foo.buz where key=9;

 key | c1 | c2 | c3 | c4
-----+----+----+----+----
   9 |  1 |  3 |  4 |  5
   9 |  9 |  3 |  4 |  5

cqlsh> insert into foo.buz (key, c1, c2, c3, c4) VALUES (3, 1, 1, 1, 1);
InvalidRequest: Error from server: code=2200 [Invalid query] message= \
  "Unable to write to denylisted partition [0xDecoratedKey(9010454139840013625, 00000003)] in foo/buz"

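In the output above, the denylist table shows the partition key in its serialized form, so the integer key 3 appears as 0x00000003. A minimal sketch of that mapping, assuming a single non-negative int partition key (serialized as 4 big-endian bytes); the helper name `int_key_to_blob` is hypothetical:

```shell
#!/bin/sh
# Render a non-negative int partition key as the hex blob shown in the
# key column of system_distributed.partition_denylist.
int_key_to_blob() {
    printf '0x%08x\n' "$1"
}

int_key_to_blob 3   # the denylisted key from the example
int_key_to_blob 9   # the unaffected key
```

This can help when matching rows in the denylist table to the keys an application actually uses. Other key types and composite keys serialize differently.
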
The Partition Denylist provides operators with a powerful tool for when data models run off the tracks. The Cassandra community continues to strengthen and widen those tracks in the hope that the need to prevent reads and writes to specific partitions will diminish. However, users always find new ways to surprise database authors and administrators. For those cases, the Partition Denylist provides a quick solution with a significant impact.