Contributing to Cassandra

Getting Started

Initial Contributions

Writing a new feature is just one way to contribute to the Cassandra project. In fact, making sure that supporting tasks, such as quality testing, documentation, and helping users, are completed is just as important. Tracking the development of new features is an ongoing challenge for this project, as it is for most open source projects. We suggest learning how this project gets things done before tackling a new feature. Here are some suggestions for ways to contribute:

  • Update the documentation

  • Answer questions on the user list

  • Review and test a submitted patch

  • Investigate and fix a reported bug

  • Create unit tests and dtests

Updating documentation

The Cassandra documentation is maintained in the Cassandra source repository along with the Cassandra code base. To submit changes to the documentation, follow the standard process for submitting a patch.

Answering questions on the user list

Subscribe to the user list, look for some questions you can answer and write a reply. Simple as that! See the community page for details on how to subscribe to the mailing list.

Reviewing and testing a submitted patch

Reviewing patches is not the sole domain of committers. If others review a patch, it can reduce the load on the committers. Less time spent reviewing patches means committers can work on more great features or review more complex patches. Follow the instructions in How to review or, alternatively, create a build with the patch and test it with your own workload. Add a comment to the JIRA ticket to let others know you’ve reviewed and tested, along with the results of your work. For example:

"I tested this performance enhancement on our application’s standard production load test and found a 3% improvement."

Investigate and/or fix a reported bug

Often, the hardest work in fixing a bug is reproducing it. Even if you don’t have the knowledge to produce a fix, figuring out a way to reliably reproduce an issue can be a massive contribution. Document your method of reproduction in a JIRA comment or, better yet, produce an automated test that reproduces the issue and attach it to the ticket. If you go as far as producing a fix, follow the process for submitting a patch.

Create unit tests and dtests

Cassandra, like most code bases, will always benefit from more automated test coverage. Before starting work on a particular area of code, consider reviewing and enhancing the existing test coverage. You’ll both improve your knowledge of the code before you start on an enhancement and reduce the chance of introducing issues with your change. See testing and patches for more detail.

Building and IDE Integration

Building From Source

Building Cassandra from source is the first important step in contributing to the Apache Cassandra project. You’ll need to install Java 8, Git, and Ant first.

The source code for Cassandra is shared on the central Apache Git repository and organized by branch, one branch for each major version. You can access the code for the current development branch using:

git clone https://gitbox.apache.org/repos/asf/cassandra.git cassandra-trunk

Other branches will point to different versions of Cassandra. Switching to a different branch requires checking out the branch. For example, to check out the latest version of Cassandra 3.0, use:

git checkout cassandra-3.0

You can get a list of available branches with git branch.

Build Cassandra using ant:

ant

This may take a significant amount of time depending on artifacts that have to be downloaded or the number of classes that need to be compiled.

Hint

You can set up multiple working trees for different Cassandra versions from the same repository using git-worktree.

Now you can get started with Cassandra using IntelliJ IDEA or Eclipse.

Setting up Cassandra in IntelliJ IDEA

IntelliJ IDEA by JetBrains is one of the most popular IDEs for Cassandra and Java development in general. The Community Edition can be freely downloaded with all features needed to get started developing Cassandra.

Use the following procedure for Cassandra 2.1.5+. If you wish to work with older Cassandra versions, see our wiki for instructions.

First, clone and build Cassandra. Then execute the following steps to use IntelliJ IDEA.

  1. Generate the IDEA files using ant:

ant generate-idea-files

  2. Start IDEA.

  3. Open the IDEA project from the checked-out Cassandra directory using File > Open in IDEA’s menu.

The project generated by ant generate-idea-files contains nearly everything you need to debug Cassandra and execute unit tests. You should be able to:

  • Run/debug defaults for JUnit

  • Run/debug configuration for Cassandra daemon

  • Read/modify the license header for Java source files

  • Study Cassandra code style

  • Inspections

Opening Cassandra in Apache NetBeans

Apache NetBeans is an older open source Java IDE, and can be used for Cassandra development. There is no project setup or generation required to open Cassandra in NetBeans. Use the following procedure for Cassandra 4.0+.

First, clone and build Cassandra. Then execute the following steps to use NetBeans.

  1. Start Apache NetBeans

  2. Open the NetBeans project from the ide/ folder of the checked-out Cassandra directory using File > Open Project in NetBeans' menu.

You should be able to:

  • Build code

  • Run code

  • Debug code

  • Profile code

These capabilities use the build.xml script. Build/Run/Debug Project are available via the Run/Debug menus, or the project context menu. Profile Project is available via the Profile menu. In the opened Profiler tab, click the green "Profile" button. Cassandra’s code style is honored in ide/nbproject/project.properties. The JAVA8_HOME system environment variable must be set for NetBeans to execute the Run/Debug/Profile ant targets.

Setting up Cassandra in Eclipse

Eclipse is a popular open source IDE that can be used for Cassandra development. Various Eclipse environments are available from the download page. The following guide was created with "Eclipse IDE for Java Developers".

These instructions were tested on Ubuntu 16.04 with Eclipse Neon (4.6) using Cassandra versions 2.1 through 3.x.

First, clone and build Cassandra. Then execute the following steps to use Eclipse.

  1. Generate the Eclipse files using ant:

ant generate-eclipse-files

  2. Start Eclipse.

  3. Open the Eclipse project from the checked-out Cassandra directory using File > Import > Existing Projects into Workspace > Select git directory. Select the correct branch, such as cassandra-trunk.

  4. Confirm and select Finish to import your project.

Find the project in Package Explorer or Project Explorer. You should not get errors if you build the project automatically using these instructions. Don’t set up the project before generating the files with ant.

You should be able to:

  • Run/debug defaults for JUnit

  • Run/debug Cassandra

  • Study Cassandra code style

Unit tests can be run from Eclipse by simply right-clicking the class file or method and selecting Run As > JUnit Test. Tests can be debugged by defining breakpoints (double-click line number) and selecting Debug As > JUnit Test.

Alternatively all unit tests can be run from the command line as described in testing.

Debugging Cassandra Using Eclipse

There are two ways to start a local Cassandra instance with Eclipse for debugging. You can either start Cassandra from the command line or from within Eclipse.

Debugging Cassandra started at command line

  1. Set an environment variable to define remote debugging options for the JVM: export JVM_EXTRA_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=1414"

  2. Start Cassandra by executing ./bin/cassandra

Next, connect to the running Cassandra process:

  1. In Eclipse, select Run > Debug Configurations.

  2. Create a new remote application.

  3. Configure the connection settings by specifying a name and port 1414. Confirm Debug and start debugging.

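If the debugger cannot attach, one thing worth checking is whether the JDWP option from JVM_EXTRA_OPTS actually reached the JVM. The following is a minimal sketch, not part of Cassandra: the class and method names are hypothetical, and it only parses an argument list for the -agentlib:jdwp option and the port used above.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Illustrative helper, not part of Cassandra: the class and method names are hypothetical.
final class JdwpCheckSketch
{
    private static final String JDWP_PREFIX = "-agentlib:jdwp=";

    // Returns true if one of the JVM arguments enables the JDWP agent on the given port.
    static boolean hasJdwpAgent(List<String> jvmArgs, int port)
    {
        for (String arg : jvmArgs)
        {
            if (arg.startsWith(JDWP_PREFIX) && listensOn(arg.substring(JDWP_PREFIX.length()), port))
                return true;
        }
        return false;
    }

    // Accepts both "address=1414" and host-qualified forms such as "address=localhost:1414".
    private static boolean listensOn(String agentOptions, int port)
    {
        for (String option : agentOptions.split(","))
        {
            if (option.equals("address=" + port))
                return true;
            if (option.startsWith("address=") && option.endsWith(":" + port))
                return true;
        }
        return false;
    }

    public static void main(String[] args)
    {
        // Inspects the arguments of the JVM this sketch itself runs in.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JDWP on port 1414: " + hasJdwpAgent(jvmArgs, 1414));
    }
}
```

The main method only inspects its own JVM; to check a running Cassandra process, the same argument list would have to be obtained from that process instead.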
Debugging Cassandra started from Eclipse

Cassandra can also be started directly from Eclipse if you don’t want to use the command line.

  1. In Eclipse, select Run > Run Configurations.

  2. Create a new application.

  3. Specify a name, the project, and the main class org.apache.cassandra.service.CassandraDaemon

  4. Configure additional JVM-specific parameters that will start Cassandra with some of the settings created by the regular startup script. Change heap-related values as needed.

-Xms1024M -Xmx1024M -Xmn220M -Xss256k -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCondCardMark -javaagent:./lib/jamm-0.3.0.jar -Djava.net.preferIPv4Stack=true

  5. Confirm Debug and you should see the output of Cassandra starting up in the Eclipse console.

You can now set breakpoints and start debugging!

Testing

Creating tests is one of the most important and also most difficult parts of developing Cassandra. There are different ways to test your code depending on what you’re working on.

Unit Testing

The simplest test to write for Cassandra code is a unit test. Cassandra uses JUnit as a testing framework and test cases can be found in the test/unit directory. Ideally, you’d be able to create a unit test for your implementation that would exclusively cover the class you created (the unit under test). Unfortunately, this is not always possible, because Cassandra doesn’t have a very mock-friendly code base. Often you’ll find yourself in a situation where you have to use an embedded Cassandra instance to interact with your test. If you want to make use of CQL in your test, you can extend CQLTester and use some of the convenient helper methods, as shown here:

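import org.junit.Test;

import org.apache.cassandra.cql3.CQLTester;

// The class name is only illustrative; the helper methods come from CQLTester.
public class BatchAndListTest extends CQLTester
{
    @Test
    public void testBatchAndList() throws Throwable
    {
        // %s is replaced with the name of the table created for this test
        createTable("CREATE TABLE %s (k int PRIMARY KEY, l list<int>)");
        execute("BEGIN BATCH " +
                "UPDATE %1$s SET l = l + [ 1 ] WHERE k = 0; " +
                "UPDATE %1$s SET l = l + [ 2 ] WHERE k = 0; " +
                "UPDATE %1$s SET l = l + [ 3 ] WHERE k = 0; " +
                "APPLY BATCH");

        // All three appends must be visible, in order
        assertRows(execute("SELECT l FROM %s WHERE k = 0"),
                   row(list(1, 2, 3)));
    }
}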

Unit tests can be run using the command ant test. Both test suites and individual tests can be executed.

Test suite:

ant test -Dtest.name=<simple_classname>

For example, replace <simple_classname> with SimpleQueryTest to test all the methods in org.apache.cassandra.cql3.SimpleQueryTest.

Individual test:

ant testsome -Dtest.name=<FQCN> -Dtest.methods=<testmethod1>[,testmethod2]

For example, replace <FQCN> with org.apache.cassandra.cql3.SimpleQueryTest and <testmethod1> with testStaticCompactTables to test just the one method.
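When these commands are assembled by a script or tool rather than typed by hand, the two forms above can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and only the ant targets and -D properties shown above are used.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative helper, not part of Cassandra: builds the two command lines described above.
final class AntTestCommandSketch
{
    // ant test -Dtest.name=<simple_classname>
    static List<String> suite(String simpleClassName)
    {
        return Arrays.asList("ant", "test", "-Dtest.name=" + simpleClassName);
    }

    // ant testsome -Dtest.name=<FQCN> -Dtest.methods=<testmethod1>[,testmethod2]
    static List<String> methods(String fqcn, String... testMethods)
    {
        if (testMethods.length == 0)
            throw new IllegalArgumentException("at least one test method is required");
        return Arrays.asList("ant", "testsome",
                             "-Dtest.name=" + fqcn,
                             "-Dtest.methods=" + String.join(",", testMethods));
    }

    public static void main(String[] args)
    {
        System.out.println(String.join(" ", suite("SimpleQueryTest")));
        System.out.println(String.join(" ", methods("org.apache.cassandra.cql3.SimpleQueryTest",
                                                    "testStaticCompactTables")));
    }
}
```

The returned list can be passed to java.lang.ProcessBuilder, with the working directory set to the Cassandra checkout, to actually launch ant.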

If you get the following error for a unit test, install the ant-optional package because you need the JUnitTask class:

Throws: cassandra-trunk/build.xml:1134: taskdef A class needed by class org.krummas.junit.JStackJUnitTask cannot be found:
org/apache/tools/ant/taskdefs/optional/junit/JUnitTask  using the classloader
AntClassLoader[/.../cassandra-trunk/lib/jstackjunit-0.0.1.jar]

Tests that consume a significant amount of time during execution can be found in the test/long directory. They can be executed as a regular JUnit test or standalone program. Except for the execution time, there’s nothing really special about them, but ant will only execute these tests with the ant long-test target.

DTests

One way of doing integration or system testing at larger scale is using dtest (Cassandra distributed test). These dtests automatically set up Cassandra clusters with certain configurations and simulate use cases you want to test. Dtests are Python scripts that use ccmlib from the ccm project. The clusters set up with dtests run like ad-hoc clusters executed with ccm on your local machine.

Once a cluster is initialized, the Python driver is used to interact with the nodes, manipulate the file system, analyze logs, or change individual nodes.

The CI server uses dtests against new patches to prevent regression bugs. Committers can set up build branches and use the CI environment to run tests for your submitted patches. Read more on the motivation behind continuous integration.

The best way to learn how to write dtests is probably by reading the introduction "How to Write a Dtest" (http://www.datastax.com/dev/blog/how-to-write-a-dtest). Looking at existing, recently updated tests in the project is another good way to learn. New tests must follow certain style conventions that are checked before contributions are accepted. In contrast to Cassandra, dtest issues and pull requests are managed on GitHub, so make sure to link any created dtests in your Cassandra ticket and to refer to the ticket number in your dtest PR.

Creating a good dtest can be tough, but it should not prevent you from submitting patches! Please ask in the corresponding JIRA ticket how to write a good dtest for the patch. In most cases a reviewer or committer will be able to support you, and in some cases they may offer to write a dtest for you.

Performance Testing

Performance tests for Cassandra are a special breed of tests that are not part of the usual patch contribution process. In fact, many people contribute a lot of patches to Cassandra without ever running performance tests. However, they are important when working on performance improvements; such improvements must be measurable.

Several tools exist for running performance tests. Here are a few to investigate:

Code Style

General Code Conventions

Exception handling

  • Never ever write catch (…​) {} or catch (…​) { logger.error() } merely to satisfy Java’s compile-time exception checking. Always propagate the exception up or throw RuntimeException (or, if it "can’t happen," AssertionError). This makes the exceptions visible to automated tests.

  • Avoid propagating up checked exceptions that no caller handles. Rethrow as RuntimeException (or IOError, if that is more applicable).

  • Similarly, logger.warn() is often a cop-out: is this an error or not? If it is, don’t hide it behind a warn; if it isn’t, there is no need for the warning.

  • If you genuinely know an exception indicates an expected condition, it’s okay to ignore it BUT this must be explicitly explained in a comment.
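The rules above can be illustrated with a minimal sketch. The class and method names are hypothetical and not taken from the Cassandra code base; the file operations are only stand-ins for any code that throws checked exceptions.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Illustrative only: hypothetical helpers showing the three cases from the list above.
final class ExceptionHandlingSketch
{
    // A checked exception no caller can handle: rethrow it unchecked instead of
    // swallowing it or merely logging it, so automated tests see the failure.
    static String readConfig(Path path)
    {
        try
        {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        }
        catch (IOException e)
        {
            throw new RuntimeException("Unable to read " + path, e);
        }
    }

    // A "can't happen" condition: every JVM is required to support UTF-8.
    static byte[] utf8(String s)
    {
        try
        {
            return s.getBytes("UTF-8");
        }
        catch (UnsupportedEncodingException e)
        {
            throw new AssertionError("UTF-8 must be supported", e);
        }
    }

    // An expected condition may be ignored, but the reason is stated explicitly.
    static void deleteIfPresent(Path path)
    {
        try
        {
            Files.delete(path);
        }
        catch (NoSuchFileException e)
        {
            // Expected: the file is already gone, which is the state we want.
        }
        catch (IOException e)
        {
            throw new RuntimeException("Unable to delete " + path, e);
        }
    }
}
```

Calling readConfig on a missing file surfaces as a RuntimeException with the original IOException as its cause, whereas deleteIfPresent on the same path returns normally.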

Boilerplate

  • Avoid redundant @Override annotations when implementing abstract or interface methods.

  • Do not implement equals or hashCode methods unless they are actually needed.

  • Prefer public final fields to private fields with getters. (But prefer encapsulating behavior in "real" methods to either.)

  • Prefer requiring initialization in the constructor to setters.

  • Avoid redundant this references to member fields or methods.

  • Do not extract interfaces (or abstract classes) unless you actually need multiple implementations of them.

  • Always include braces for nested levels of conditionals and loops. Only avoid braces for single level.
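As a rough illustration of several of these conventions together, consider the following sketch. The class is hypothetical and not taken from the Cassandra code base.

```java
// Illustrative only: a hypothetical value class following the conventions above.
final class IntervalSketch
{
    // Public final fields instead of private fields with getters.
    public final long left;
    public final long right;

    // Initialization is required in the constructor; there are no setters.
    IntervalSketch(long left, long right)
    {
        if (left > right)
            throw new IllegalArgumentException("left must not exceed right");
        // "this" is needed here only because the parameters shadow the fields.
        this.left = left;
        this.right = right;
    }

    // Behavior is encapsulated in a real method; no redundant "this" references.
    boolean contains(long point)
    {
        return left <= point && point <= right;
    }

    // Nested loops and conditionals get braces; the innermost single-level "if" does not.
    static int countContained(IntervalSketch[] intervals, long[] points)
    {
        int count = 0;
        for (long point : points)
        {
            for (IntervalSketch interval : intervals)
            {
                if (interval.contains(point))
                    count++;
            }
        }
        return count;
    }
}
```

No equals or hashCode is defined because nothing in the sketch needs them, and no interface is extracted because there is only one implementation.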

Multiline statements

  • Try to keep lines under 120 characters, but use good judgement. It is better to exceed 120 by a little, than split a line that has no natural splitting points.

  • When splitting inside a method call, use one line per parameter and align the items called:

SSTableWriter writer = new SSTableWriter(cfs.getTempSSTablePath(),
                                         columnFamilies.size(),
                                         StorageService.getPartitioner());

  • When splitting a ternary, use one line per clause, carry the operator, and align by indenting with 4 spaces:

var = bar == null
    ? doFoo()
    : doBar();

Whitespace

  • Make sure to use 4 spaces instead of the tab character for all your indentation.

  • Many lines in the current files have a bunch of trailing whitespace. If you encounter incorrect whitespace, clean up in a separate patch. Current and future reviewers won’t want to review whitespace diffs.

Imports

Observe the following order for your imports:

java
[blank line]
com.google.common
org.apache.commons
org.junit
org.slf4j
[blank line]
everything else alphabetically
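One way to read this ordering is sketched below as a small sorter. It rests on assumptions the list does not spell out: only names starting with "java." belong to the first block (so javax falls under "everything else"), the middle block keeps the order in which its four prefixes are listed, and names are sorted alphabetically within each prefix. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: orders import names according to one reading of the list above.
final class ImportOrderSketch
{
    // Middle block, in the order the prefixes are listed (an assumption, see lead-in).
    private static final List<String> MIDDLE_BLOCK =
        Arrays.asList("com.google.common.", "org.apache.commons.", "org.junit.", "org.slf4j.");

    // 0 for java.*, 1..4 for the middle block prefixes, 5 for everything else.
    static int rank(String importName)
    {
        if (importName.startsWith("java."))
            return 0;
        for (int i = 0; i < MIDDLE_BLOCK.size(); i++)
        {
            if (importName.startsWith(MIDDLE_BLOCK.get(i)))
                return i + 1;
        }
        return MIDDLE_BLOCK.size() + 1;
    }

    // 0, 1 or 2: the blank-line-separated block an import belongs to.
    static int block(String importName)
    {
        int rank = rank(importName);
        if (rank == 0)
            return 0;
        return rank <= MIDDLE_BLOCK.size() ? 1 : 2;
    }

    // Returns the import statements in order, with an empty string between blocks.
    static List<String> order(List<String> importNames)
    {
        List<String> sorted = new ArrayList<>(importNames);
        sorted.sort((a, b) -> {
            int byRank = Integer.compare(rank(a), rank(b));
            return byRank != 0 ? byRank : a.compareTo(b);
        });

        List<String> lines = new ArrayList<>();
        int previous = -1;
        for (String name : sorted)
        {
            int current = block(name);
            if (previous != -1 && current != previous)
                lines.add("");
            lines.add("import " + name + ";");
            previous = current;
        }
        return lines;
    }
}
```

For example, org.apache.cassandra.db.Keyspace ends up in the last block because it matches none of the listed prefixes, while org.slf4j.Logger stays in the middle block after any org.junit imports.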

Format files for IDEs

How-to Commit

If you are a committer, feel free to pick any process that works for you - so long as you are planning to commit the work yourself.

Here is how committing and merging typically look for tickets that follow the convention (if patch-based). The hypothetical CASSANDRA-12345 ticket used in the example is a cassandra-3.0 based bug fix that requires different code for cassandra-3.3 and trunk. Contributor Jackie is supplying a patch for the root branch (12345-3.0.patch), and patches for the remaining branches (12345-3.3.patch, 12345-trunk.patch).

On cassandra-3.0
  1. git am -3 12345-3.0.patch (if we have a problem b/c of CHANGES.txt not merging anymore, we modify it ourselves, in place)

On cassandra-3.3
  1. git merge cassandra-3.0 -s ours

  2. git apply -3 12345-3.3.patch (likely to have an issue with CHANGES.txt here: modify it ourselves, then git add CHANGES.txt)

  3. git commit --amend

On trunk
  1. git merge cassandra-3.3 -s ours

  2. git apply -3 12345-trunk.patch (likely to have an issue with CHANGES.txt here: modify it ourselves, then git add CHANGES.txt)

  3. git commit --amend

On any branch
  1. git push origin cassandra-3.0 cassandra-3.3 trunk --atomic

Same scenario, but a branch-based contribution:

On cassandra-3.0
  1. git cherry-pick <sha-of-3.0-commit> (if we have a problem b/c of CHANGES.txt not merging anymore, we modify it ourselves, in place)

On cassandra-3.3
  1. git merge cassandra-3.0 -s ours

  2. git format-patch -1 <sha-of-3.3-commit>

  3. git apply -3 <sha-of-3.3-commit>.patch (likely to have an issue with CHANGES.txt here: modify it ourselves, then git add CHANGES.txt)

  4. git commit --amend

On trunk
  1. git merge cassandra-3.3 -s ours

  2. git format-patch -1 <sha-of-trunk-commit>

  3. git apply -3 <sha-of-trunk-commit>.patch (likely to have an issue with CHANGES.txt here: modify it ourselves, then git add CHANGES.txt)

  4. git commit --amend

On any branch
  1. git push origin cassandra-3.0 cassandra-3.3 trunk --atomic

Notes on git flags

The -3 flag used with git am or git apply will instruct git to perform a 3-way merge. If a conflict is detected, you can either resolve it manually or invoke git mergetool.

The --atomic flag to git push does the obvious thing: it pushes all or nothing. Without the flag, the command is equivalent to running git push once per branch. This is nifty if a race condition occurs - you won’t push half the branches, blocking other committers’ progress while you are resolving the issue.

Tip

The fastest way to get a patch from someone’s commit in a branch on GitHub, if you don’t have their repo among your remotes, is to append .patch to the commit URL: curl -O https://github.com/apache/cassandra/commit/7374e9b5ab08c1f1e612bf72293ea14c959b0c3c.patch

Review Checklist

When reviewing tickets in Apache JIRA, the following items should be covered as part of the review process:

General

  • Does it conform to the code style guidelines?

  • Is there any redundant or duplicate code?

  • Is the code as modular as possible?

  • Can any singletons be avoided?

  • Can any of the code be replaced with library functions?

  • Are units of measurement used in the code consistent, both internally and with the rest of the ecosystem?

Error-Handling

  • Are all data inputs and outputs checked (for the correct type, length, format, and range) and encoded?

  • Where third-party utilities are used, are returning errors being caught?

  • Are invalid parameter values handled?

  • Are any Throwable/Exceptions passed to the JVMStabilityInspector?

  • Are errors well-documented? Does the error message tell the user how to proceed?

  • Do exceptions propagate to the appropriate level in the code?

Documentation

  • Do comments exist and describe the intent of the code (the "why", not the "how")?

  • Are javadocs added where appropriate?

  • Is any unusual behavior or edge-case handling described?

  • Are data structures and units of measurement explained?

  • Is there any incomplete code? If so, should it be removed or flagged with a suitable marker like ‘TODO’?

  • Does the code self-document via clear naming, abstractions, and flow control?

  • Have NEWS.txt, the cql3 docs, and the native protocol spec been updated if needed?

  • Is the ticket tagged with "client-impacting" and "doc-impacting", where appropriate?

  • Has lib/licenses been updated for third-party libs? Are they Apache License compatible?

  • Is the Component on the JIRA ticket set appropriately?

Testing

  • Is the code testable? For example, it should not add too many dependencies or hide them, its objects should be easy to initialize, and test frameworks should be able to call its methods.

  • Do tests exist and are they comprehensive?

  • Do unit tests actually test that the code is performing the intended functionality?

  • Could any test code use common functionality (e.g. ccm, dtest, or CQLTester methods) or abstract it there for reuse?

  • If the code may be affected by multi-node clusters, are there dtests?

  • If the code may take a long time to test properly, are there CVH tests?

  • Is the test passing on CI for all affected branches (up to trunk, if applicable)? Are there any regressions?

  • If patch affects read/write path, did we test for performance regressions w/multiple workloads?

  • If adding a new feature, were tests added and performed confirming it meets the expected SLA/use-case requirements for the feature?

Logging

  • Are logging statements logged at the correct level?

  • Are there logs in the critical path that could affect performance?

  • Is there any log that could be added to communicate status or troubleshoot potential problems in this feature?

  • Can any unnecessary logging statement be removed?

Contributing Code Changes

Choosing What to Work on

Submitted patches can include bug fixes, changes to the Java code base, improvements for tooling (both Java and Python), documentation, testing, or any other changes that require changing the code base. Although the process of contributing code is always the same, the amount of work and time it takes to get a patch accepted depends on the kind of issue you’re addressing.

As a general rule of thumb:
  • Major new features and significant changes to the code base will likely not be accepted without deeper discussion within the developer community.

  • Bug fixes take higher priority compared to features.

  • The extent to which tests are required depends on how likely your changes are to affect the stability of Cassandra in production. Tooling changes require fewer tests than storage engine changes.

  • Less complex patches will be reviewed faster; consider breaking up an issue into individual tasks and contributions that can be reviewed separately.

Hint

Not sure what to work on? Just pick an issue marked as Low Hanging Fruit Complexity in JIRA, which flags issues that often turn out to be good starter tasks for beginners.

Before You Start Coding

Although contributions are highly appreciated, we do not guarantee that every contribution will become a part of Cassandra. Therefore, it’s generally a good idea to first get some feedback on the thing you plan to do, especially about any new features or major changes to the code base. You can reach out to other developers on the mailing list or Slack.

You should also:
  • Avoid redundant work by searching for already reported issues in JIRA to work on.

  • Create a new issue early in the process describing what you’re working on - before finishing your patch.

  • Link related JIRA issues with your own ticket to provide a better context.

  • Update your ticket from time to time by giving feedback on your progress and link a GitHub WIP branch with your current code.

  • Ping people whose advice you would like by mentioning them on JIRA.

There are also some fixed rules that you need to be aware of:
  • Patches will only be applied to branches by following the release model

  • Code must be testable

  • Code must follow the code style convention

  • Changes must not break compatibility between different Cassandra versions

  • Contributions must be covered by the Apache License

Choosing the Right Branches to Work on

There are currently multiple Cassandra versions maintained in individual branches:

Version   Policy
4.0       Code freeze (see below)
3.11      Critical bug fixes only
3.0       Critical bug fixes only
2.2       Critical bug fixes only
2.1       Critical bug fixes only

Corresponding branches in git are easy to recognize as they are named cassandra-<release> (e.g. cassandra-3.0). The trunk branch is an exception, as it contains the most recent commits from all other branches and is used for creating new branches for future tick-tock releases.

4.0 Code Freeze

Patches for new features are currently not accepted for 4.0 or any earlier versions. All efforts should focus on stabilizing the 4.0 branch before the first official release. During that time, only the following patches will be considered for acceptance:

  • Bug fixes

  • Measurable performance improvements

  • Changes not distributed as part of the release such as:

    • Testing related improvements and fixes

    • Build and infrastructure related changes

    • Documentation

Bug Fixes

Creating patches for bug fixes is a bit more complicated and will depend on how many different versions of Cassandra are affected. In each case, the order for merging such changes will be cassandra-2.1 → cassandra-2.2 → cassandra-3.0 → cassandra-3.x → trunk. But don’t worry: merging from 2.1 would be the worst case, for bugs that affect all currently supported versions, which is uncommon. As a contributor, you’re also not expected to provide a single patch for each version. What you need to do, however, is:

  • Be clear about which versions you could verify to be affected by the bug

  • For 2.x: ask if a bug qualifies to be fixed in this release line, as this may be handled on a case-by-case basis

  • If possible, create a patch against the lowest version in the branches listed above (e.g. if you found the bug in 3.9 you should try to fix it already in 3.0)

  • Test if the patch can be merged cleanly across branches in the direction listed above

  • Be clear which branches may need attention by the committer or even create custom patches for those if you can
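The merge direction described in this section can be sketched as a small lookup: given the lowest affected branch, list every branch the fix has to be merged through. The class and method names are hypothetical, and cassandra-3.x is kept as the placeholder used in the text rather than a concrete branch name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: computes which branches a bug fix passes through.
final class MergeOrderSketch
{
    // Merge order as given in this section; "cassandra-3.x" is the text's own placeholder.
    static final List<String> MERGE_ORDER =
        Arrays.asList("cassandra-2.1", "cassandra-2.2", "cassandra-3.0", "cassandra-3.x", "trunk");

    // Returns the lowest affected branch followed by every later branch, in merge order.
    static List<String> mergePath(String lowestAffectedBranch)
    {
        int start = MERGE_ORDER.indexOf(lowestAffectedBranch);
        if (start < 0)
            throw new IllegalArgumentException("Unknown branch: " + lowestAffectedBranch);
        return new ArrayList<>(MERGE_ORDER.subList(start, MERGE_ORDER.size()));
    }

    public static void main(String[] args)
    {
        // A bug first present in 3.0 is patched there and merged forward.
        System.out.println(mergePath("cassandra-3.0"));
    }
}
```

A fix against cassandra-3.0 therefore touches cassandra-3.0, cassandra-3.x and trunk, which is also the set of branches on which a clean merge should be checked.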

Creating a Patch

So you’ve finished coding and the great moment arrives: it’s time to submit your patch!

  1. Create a branch for your changes if you haven’t done so already. Many contributors name their branches based on ticket number and Cassandra version, e.g. git checkout -b 12345-3.0

  2. Verify that you follow Cassandra’s code style

  3. Make sure all tests (including yours) pass using ant as described in testing. If you suspect a test failure is unrelated to your change, it may be useful to check the test’s status by searching the issue tracker or looking at CI results for the relevant upstream version. Note that the full test suites take many hours to complete, so it is common to only run specific relevant tests locally before uploading a patch. Once a patch has been uploaded, the reviewer or committer can help set up CI jobs to run the full test suites.

  4. Consider going through the review checklist for your code. This will help you to understand how others will consider your change for inclusion.

  5. Don’t make the committer squash commits for you in the root branch. Multiple commits are fine - and often preferable - during review stage, especially for incremental review, but once +1d, do either:

    1. Attach a patch to JIRA with a single squashed commit in it (per branch), or

    2. Squash the commits in-place in your branches into one

  6. Include a CHANGES.txt entry (put it at the top of the list), and format the commit message appropriately in your patch as below. Please note that only user-impacting items should be listed in CHANGES.txt. If you fix a test that does not affect users and does not require changes in runtime code, then no CHANGES.txt entry is necessary.

    <One sentence description, usually Jira title and CHANGES.txt summary>
    <Optional lengthier description>
    patch by <Authors>; reviewed by <Reviewers> for CASSANDRA-#####

  7. When you’re happy with the result, create a patch:

    git add <any new or modified file>
    git commit -m '<message>'
    git format-patch HEAD~1
    mv <patch-file> <ticket-branchname.txt> (e.g. 12345-trunk.txt, 12345-3.0.txt)

Alternatively, many contributors prefer to make their branch available on GitHub. In this case, fork the Cassandra repository on GitHub and push your branch:

git push --set-upstream origin 12345-3.0

  8. To make life easier for your reviewer/committer, you may want to make sure your patch applies cleanly to later branches and create additional patches/branches for later Cassandra versions to which your original patch does not apply cleanly. That said, this is not critical, and you will receive feedback on your patch regardless.

  9. Attach the newly generated patch to the ticket/add a link to your branch and click "Submit Patch" at the top of the ticket. This will move the ticket into "Patch Available" status, indicating that your submission is ready for review.

  10. Wait for other developers or committers to review it and hopefully +1 the ticket (see the review checklist). If your change does not receive a +1, do not be discouraged. If possible, the reviewer will give suggestions to improve your patch or explain why it is not suitable.

  11. If the reviewer has given feedback to improve the patch, make the necessary changes and move the ticket into "Patch Available" once again.

Once the review process is complete, you will receive a +1. Wait for a committer to commit it. Do not delete your branches immediately after they’ve been committed - keep them on GitHub for a while. Alternatively, attach a patch to JIRA for historical record. It’s not that uncommon for a committer to mess up a merge. In case of that happening, access to the original code is required, or else you’ll have to redo some of the work.
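The commit message template and the patch file naming used in the steps above can be sketched as two small helpers. This is an illustration under assumptions: the class and method names are hypothetical, the message mirrors the template line for line, and the file name follows the 12345-3.0.txt and 12345-trunk.txt examples by dropping the cassandra- prefix from the branch name.

```java
// Illustrative only: formats a commit message and a patch file name as described above.
final class CommitMessageSketch
{
    // Mirrors the template: summary, optional description, then the attribution line.
    static String format(String summary, String description, String authors, String reviewers, int ticket)
    {
        StringBuilder message = new StringBuilder(summary).append('\n');
        if (description != null && !description.isEmpty())
            message.append(description).append('\n');
        message.append("patch by ").append(authors)
               .append("; reviewed by ").append(reviewers)
               .append(" for CASSANDRA-").append(ticket);
        return message.toString();
    }

    // "cassandra-3.0" becomes "12345-3.0.txt"; "trunk" becomes "12345-trunk.txt".
    static String patchFileName(int ticket, String branch)
    {
        String prefix = "cassandra-";
        String suffix = branch.startsWith(prefix) ? branch.substring(prefix.length()) : branch;
        return ticket + "-" + suffix + ".txt";
    }

    public static void main(String[] args)
    {
        // The summary and reviewer name are made up for the example.
        System.out.println(format("Fix list append ordering in batches", null, "Jackie", "Sam", 12345));
        System.out.println(patchFileName(12345, "cassandra-3.0"));
    }
}
```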

CI Environments

About CI testing and Apache Cassandra

Cassandra can be automatically tested using various test suites that are implemented either with JUnit or as dtest scripts written in Python. As outlined in testing, each kind of test suite addresses a different way to test Cassandra. Eventually, all of the tests will be executed together on the CI platform at builds.apache.org, running Jenkins.

Setting up your own Jenkins server

Jenkins is an open source solution that can be installed on a large number of platforms. Setting up a custom Jenkins instance for Cassandra may be desirable for users who have hardware to spare, or organizations that want to run Cassandra tests for custom patches before contribution.

Please refer to the Jenkins download and documentation pages for details on how to get Jenkins running, possibly also including slave build executor instances. The rest of the document will focus on how to set up Cassandra jobs in your Jenkins environment.

Required plugins

In addition, the following plugins need to be installed along with the standard plugins (git, ant, ..).

You can install any missing plugins using the install manager.

Go to Manage Jenkins → Manage Plugins → Available and install the following plugins and respective dependencies:

  • Job DSL

  • Javadoc Plugin

  • description setter plugin

  • Throttle Concurrent Builds Plug-in

  • Test stability history

  • Hudson Post build task

Setup seed job

  1. Config New Item

    • Name it Cassandra-Job-DSL

    • Select Freestyle project

  2. Under Source Code Management select Git using the repository: github.com/apache/cassandra-builds

  3. Under Build, confirm Add build step → Process Job DSLs and enter at Look on Filesystem: jenkins-dsl/cassandra_job_dsl_seed.groovy

Generated jobs will be created based on the Groovy script’s default settings. You may want to override settings by checking This project is parameterized and adding a String Parameter for any of the variables that can be found at the top of the script. This will allow you to set up jobs for your own repository and branches (e.g. working branches).

  4. When done, confirm "Save".

You should now find a new entry with the given name in your project list. However, building the project will still fail and abort with the error message "Processing DSL script cassandra_job_dsl_seed.groovy ERROR: script not yet approved for use". Go to Manage Jenkins → In-process Script Approval to fix this issue. Afterwards you should be able to run the script and have it generate numerous new jobs based on the found branches and configured templates.

Jobs are triggered either by changes in Git or by a periodic schedule, e.g. on a daily basis. Once a job is to be run, Jenkins will use any available executor with the label "cassandra". Please make sure such executors are available by selecting Build Executor Status → Configure → Add “cassandra” as label and save.

Executors need to have "JDK 1.8 (latest)" installed. This is done under Manage Jenkins → Global Tool Configuration → JDK Installations…. Executors also need to have the virtualenv package installed on their system.
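A quick sanity check for an executor host could look like the sketch below. It is not part of the Cassandra build scripts: both function names are made up for illustration, and it assumes that `java -version` prints a banner line such as `openjdk version "1.8.0_292"`. A JDK installed through Global Tool Configuration is managed by Jenkins and may not be on the PATH, so the Java part is only meaningful for a system-wide JDK.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the `java -version` banner on stdin
# reports a 1.8 JDK.
is_jdk8_banner() {
    grep -q 'version "1\.8\.'
}

# Hypothetical helper: prints what is missing on this host and returns
# non-zero if anything is missing.
check_executor() {
    status=0
    if ! command -v java >/dev/null 2>&1; then
        echo "java not found on PATH"
        status=1
    elif ! java -version 2>&1 | is_jdk8_banner; then
        # `java -version` writes its banner to stderr, hence the 2>&1.
        echo "java found, but it does not report version 1.8"
        status=1
    fi
    if ! command -v virtualenv >/dev/null 2>&1; then
        echo "virtualenv not found on PATH"
        status=1
    fi
    return $status
}

# Usage on an executor:
#   check_executor && echo "executor looks ready"
```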

CircleCI

Cassandra ships with a default CircleCI configuration to enable running tests on your branches. Go to the CircleCI website, click "Login" and log in with your GitHub account. Then give CircleCI permission to watch your repositories.

Once you have done that, you can optionally configure CircleCI to run tests in parallel if you wish:

  1. Click Projects and select your GitHub account, and then click the settings for your project.

  2. Set the parallelism setting. If you leave the default value of 1 for Cassandra, only ant eclipse-warnings and ant test will be run. If you change the value to 4, CircleCI also runs ant long-test, ant test-compression and ant stress-test.

Dependency Management

Managing libraries for Cassandra is a bit less straightforward than in other projects, as the build process is based on ant, maven and manually managed jars. Make sure to follow the steps below carefully. After any change to project dependencies, pay attention to emerging issues in the CI and to related issues reported on Jira or the mailing list.

As Cassandra is an Apache product, all included libraries must follow Apache’s software license requirements.

Required steps to add or update libraries

  • Add or replace jar file in lib directory

  • Add or update lib/license files

  • Update dependencies in build.xml

    • Add to parent-pom with correct version

    • Add to all-pom if simple Cassandra dependency (see below)
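The first two steps can be cross-checked with a small script. The sketch below lists jars that have no matching license file. It rests on two assumptions that the text above does not confirm: that license files live in a `licenses` subdirectory of the lib directory, and that each one is named after its jar with a `.txt` extension. Adjust both to the layout of your checkout. The function name is hypothetical.

```shell
#!/bin/sh
# Prints the name of every jar in <lib_dir> that has no matching license file.
# ASSUMPTIONS: licenses live in "<lib_dir>/licenses" and are named
# "<jar name without .jar>.txt".
jars_missing_license() {
    lib_dir=$1
    for jar in "$lib_dir"/*.jar; do
        [ -e "$jar" ] || continue    # no jars: the glob stays literal
        name=$(basename "$jar" .jar)
        if [ ! -f "$lib_dir/licenses/$name.txt" ]; then
            echo "$name.jar"
        fi
    done
}

# Usage from the root of a checkout:
#   jars_missing_license lib
```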

POM file types

  • parent-pom - contains all dependencies with the respective version. All other poms will refer to the artifacts with specified versions listed here.

  • build-deps-pom(-sources) + coverage-deps-pom - used by ant build compile target. Listed dependencies will be resolved and copied to build/lib/{jar,sources} by executing the maven-ant-tasks-retrieve-build target. This should contain libraries that are required for build tools (grammar, docs, instrumentation), but are not shipped as part of the Cassandra distribution.

  • test-deps-pom - referred to by maven-ant-tasks-retrieve-test to retrieve and save dependencies to build/test/lib. Exclusively used during JUnit test execution.

  • all-pom - pom for cassandra-all.jar that can be installed or deployed to public maven repos via ant publish

Troubleshooting and conflict resolution

Here are some useful commands that may help you resolve conflicts.

  • ant realclean - gets rid of the build directory, including build artifacts.

  • mvn dependency:tree -f build/apache-cassandra-*-SNAPSHOT.pom -Dverbose -Dincludes=org.slf4j

    • shows transitive dependency tree for artifacts, e.g. org.slf4j. In case the command above fails due to a missing parent pom file, try running ant mvn-install.

  • rm -r ~/.m2/repository/org/apache/cassandra/apache-cassandra/ - removes cached local Cassandra maven artifacts

Working on Documentation

How Cassandra is documented

The official Cassandra documentation lives in the project’s git repository. We use a static site generator, Antora, to create pages hosted at cassandra.apache.org.

Using a static site generator often requires the use of a markup language instead of visual editors (which some people would call good news). Antora processes AsciiDoc, the markup language used to generate our documentation. Markup languages allow you to format text using certain syntax elements. Your document structure will also have to follow specific conventions. Feel free to take a look at existing documents to get a better idea of how we structure our documents.

So how do you actually start making contributions?

GitHub based work flow

Recommended for shorter documents and minor changes on existing content (e.g. fixing typos or updating descriptions)

Follow these steps to contribute using GitHub. It’s assumed that you’re logged in with an existing account.

  1. Fork the GitHub mirror of the Cassandra repository

  2. Create a new branch that you can use to make your edits. It’s recommended to have a separate branch for each of your working projects. It will also make it easier to create a pull request later when you decide you’re ready to contribute your work.

  3. Navigate to the document sources in doc/source/modules to find the .adoc file to edit. The URL of the document should correspond to the directory structure within modules: first the component name, such as cassandra, and then the actual pages inside the pages directory. New files can be created using the "Create new file" button.

  4. At this point you should be able to edit the file using the GitHub web editor. Start by naming your file and add some content. Have a look at other existing .adoc files to get a better idea of what format elements to use.

Make sure to preview added content before committing any changes.

  5. Commit your work when you’re done. Make sure to add a short description of all your edits since your last commit.

  6. Finally, if you decide that you’re done working on your branch, it’s time to create a pull request!

Afterwards the GitHub Cassandra mirror will list your pull request and you’re done. Congratulations! Please give us some time to look at your suggested changes before we get back to you.

Jira based work flow

Recommended for major changes

Significant changes to the documentation are best managed through our Jira issue tracker. Please follow the same contribution guides as for regular code contributions. Creating high quality content takes a lot of effort. It’s therefore always a good idea to create a ticket before you start and explain what you’re planning to do. This will create the opportunity for other contributors and committers to comment on your ideas and work so far. Eventually your patch gets a formal review before it is committed.

Working on documents locally using Antora

Recommended for advanced editing

Using the GitHub web interface should allow you to use most common layout elements including images. More advanced formatting options and navigation elements depend on Antora to render correctly. Therefore, it’s a good idea to set up Antora locally for any serious editing. Please follow the instructions in the Cassandra source directory at doc/README.md. Setup is very easy (at least on OSX and Linux).

Notes for committers

Please feel free to get involved and merge pull requests created on the GitHub mirror if you’re a committer. As this is a read-only repository, you won’t be able to merge a PR directly on GitHub. You’ll have to commit the changes against the Apache repository with a comment that will close the PR when the commit syncs with GitHub.

You may use a git work flow like this:

git remote add github https://github.com/apache/cassandra.git
git fetch github pull/<PR-ID>/head:<PR-ID>
git checkout <PR-ID>

Now either rebase or squash the commit, e.g. for squashing:

git reset --soft origin/trunk
git commit --author <PR Author>

Make sure to add a proper commit message including a "Closes #<PR-ID>" text to automatically close the PR.
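One way to avoid forgetting the closing line is to generate the message. The sketch below is only an illustration: `pr_commit_message` is a hypothetical helper, and the summary, author and PR number in the usage comment are placeholders. It relies on `git commit -F -`, which reads the commit message from standard input.

```shell
#!/bin/sh
# Hypothetical helper: builds a commit message whose last line closes the PR.
pr_commit_message() {
    summary=$1
    pr_id=$2
    printf '%s\n\nCloses #%s\n' "$summary" "$pr_id"
}

# Usage after `git reset --soft origin/trunk` (all values are placeholders):
#   pr_commit_message "Fix typos in the getting started guide" 123 |
#       git commit --author "Jane Doe <jane@example.org>" -F -
```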

Publishing

Details for building and publishing of the site at cassandra.apache.org can be found here.

Release Process

The steps for Release Managers to create, vote on, and publish releases for Apache Cassandra.

While a committer can perform the initial steps of creating and calling a vote on a proposed release, only a PMC member can complete the process of publishing and announcing the release.

Prerequisites

A Debian-based Linux OS is required to run the release steps. Debian-based distros provide the required RPM, dpkg and repository management tools.

Create and publish your GPG key

To create a GPG key, follow the guidelines. The key must be 4096 bit RSA. Include your public key in:

https://dist.apache.org/repos/dist/release/cassandra/KEYS

Publish your GPG key in a PGP key server, such as MIT Keyserver.
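To confirm that a key meets the 4096 bit RSA requirement, you can inspect GnuPG's machine-readable listing. The sketch below assumes the colon-delimited format printed by `gpg --list-keys --with-colons`, where a `pub` record carries the key length in the third field and the algorithm id in the fourth (1 means RSA). The function name is hypothetical.

```shell
#!/bin/sh
# Succeeds if the colon-delimited key listing on stdin contains a primary
# key ("pub" record) that is RSA (algorithm id 1) with 4096 bits.
has_rsa4096_pub() {
    awk -F: '$1 == "pub" && $3 == "4096" && $4 == "1" { found = 1 }
             END { exit !found }'
}

# Usage (the key id is a placeholder):
#   gpg --list-keys --with-colons <key-id> | has_rsa4096_pub && echo "key is 4096 bit RSA"
```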

Bintray account with access to Apache organisation

Publishing a successfully voted upon release requires Bintray access to the Apache organisation. Please verify that you have a Bintray account and the Apache organisation is listed here.

Create Release Artifacts

Any committer can perform the following steps to create and call a vote on a proposed release.

Check that there are no open urgent Jira tickets currently being worked on. Also check with the PMC whether any security vulnerabilities are currently being worked on in private. The current project habit is to check the timing for a new release on the dev mailing list.

Perform the Release

Run the following commands to generate and upload release artifacts to the ASF nexus staging repository and dev distribution location:

cd ~/git
git clone https://github.com/apache/cassandra-builds.git
git clone https://github.com/apache/cassandra.git

# Edit the variables at the top of the `prepare_release.sh` file
edit cassandra-builds/cassandra-release/prepare_release.sh

# Ensure your 4096 RSA key is the default secret key
edit ~/.gnupg/gpg.conf # update the `default-key` line
edit ~/.rpmmacros # update the `%gpg_name <key_id>` line

# Ensure DEBFULLNAME and DEBEMAIL are defined and exported, in the debian scripts configuration
edit ~/.devscripts

# The prepare_release.sh is run from the actual cassandra git checkout,
# on the branch/commit that we wish to tag for the tentative release along with version number to tag.
cd cassandra
git switch cassandra-<version-branch>

# The following cuts the release artifacts (including deb and rpm packages) and deploys them to staging environments
../cassandra-builds/cassandra-release/prepare_release.sh -v <version>

Follow the prompts.

If building the deb or rpm packages fails, those steps can be repeated individually using the -d and -r flags, respectively.
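Before running `prepare_release.sh`, it can help to confirm that the three configuration files edited above contain the expected settings. The following pre-flight check is a sketch and not part of cassandra-builds: the function names are hypothetical, and the patterns assume the settings appear as `default-key ...`, `%gpg_name ...`, `DEBFULLNAME=...` and `DEBEMAIL=...`.

```shell
#!/bin/sh
# Prints a message and fails if <file> has no line matching <pattern>.
require_line() {
    file=$1
    pattern=$2
    if ! grep -Eq "$pattern" "$file" 2>/dev/null; then
        echo "missing in $file: $pattern"
        return 1
    fi
}

# Hypothetical pre-flight check; $1 is the home directory to inspect.
# Returns non-zero if any expected setting is missing.
check_release_config() {
    home=$1
    missing=0
    require_line "$home/.gnupg/gpg.conf" '^default-key[[:space:]]' || missing=1
    require_line "$home/.rpmmacros"      '^%gpg_name[[:space:]]'   || missing=1
    require_line "$home/.devscripts"     'DEBFULLNAME='            || missing=1
    require_line "$home/.devscripts"     'DEBEMAIL='               || missing=1
    return $missing
}

# Usage:
#   check_release_config "$HOME" && echo "release configuration looks complete"
```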

Call for a Vote

Fill out the following email template and send to the dev mailing list:

I propose the following artifacts for release as <version>.

sha1: <git-sha>

Git: https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/<version>-tentative

Artifacts: https://repository.apache.org/content/repositories/orgapachecassandra-<nexus-id>/org/apache/cassandra/apache-cassandra/<version>/

Staging repository: https://repository.apache.org/content/repositories/orgapachecassandra-<nexus-id>/

The distribution packages are available here: https://dist.apache.org/repos/dist/dev/cassandra/<version>/

The vote will be open for 72 hours (longer if needed).

[1]: (CHANGES.txt) https://git1-us-west.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=<version>-tentative
[2]: (NEWS.txt) https://git1-us-west.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=<version>-tentative
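Filling in the placeholders by hand is error-prone, so the template can be generated. The sketch below reproduces the template text with a here-document. `vote_email` is a hypothetical helper, and its three arguments (version, git sha and nexus id) are values the release manager has to supply.

```shell
#!/bin/sh
# Hypothetical helper: prints the vote email for <version> <git-sha> <nexus-id>.
vote_email() {
    version=$1
    sha=$2
    nexus_id=$3
    staging="https://repository.apache.org/content/repositories/orgapachecassandra-${nexus_id}"
    cat <<EOF
I propose the following artifacts for release as ${version}.

sha1: ${sha}

Git: https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/${version}-tentative

Artifacts: ${staging}/org/apache/cassandra/apache-cassandra/${version}/

Staging repository: ${staging}/

The distribution packages are available here: https://dist.apache.org/repos/dist/dev/cassandra/${version}/

The vote will be open for 72 hours (longer if needed).

[1]: (CHANGES.txt) https://git1-us-west.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=${version}-tentative
[2]: (NEWS.txt) https://git1-us-west.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=${version}-tentative
EOF
}

# Usage (all three values are placeholders):
#   vote_email <version> <git-sha> <nexus-id> > vote-email.txt
```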

Post-vote operations

Any PMC member can perform the following steps to formalize and publish a successfully voted release.

Publish Artifacts

Run the following commands to publish the voted release artifacts:

cd ~/git
# edit the variables at the top of the `finish_release.sh` file
edit cassandra-builds/cassandra-release/finish_release.sh

# After cloning the cassandra-builds repo, `finish_release.sh` is run from the actual cassandra git checkout,
# on the tentative release tag that is to be tagged as the final release version.
cd ~/git/cassandra/
git checkout <version>-tentative
../cassandra-builds/cassandra-release/finish_release.sh -v <version>

If successful, take note of the email text output which can be used in the next section "Send Release Announcement". The output will also list the next steps that are required.

Promote Nexus Repository

  • Log in to the Nexus repository again.

  • Click on "Staging" and then on the repository with id "cassandra-staging".

  • Find your closed staging repository, right click on it and choose "Promote".

  • Select the "Releases" repository and click "Promote".

  • Next click on "Repositories", select the "Releases" repository and validate that your artifacts exist as you expect them.

Publish the Bintray Uploaded Distribution Packages

Log in to Bintray and publish the uploaded artifacts.

Update and Publish Website

See docs for building and publishing the website.

Also update the CQL doc if appropriate.

Release version in JIRA

Release the JIRA version.

  • In JIRA go to the version that you want to release and release it.

  • Create a new version, if it has not been done before.

Update to Next Development Version

Update the codebase to point to the next development version:

cd ~/git/cassandra/
git checkout cassandra-<version-branch>
edit build.xml          # update `<property name="base.version" value="…"/> `
edit debian/changelog   # add entry for new version
edit CHANGES.txt        # add entry for new version
git commit -m "Increment version to <next-version>" build.xml debian/changelog CHANGES.txt

# …and forward merge and push per normal procedure
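The `build.xml` edit can also be scripted. The sketch below assumes that the property sits on a single line in the form `<property name="base.version" value="..."/>`. `set_base_version` is a hypothetical helper. It writes through a temporary file rather than using `sed -i`, whose syntax differs between GNU and BSD sed.

```shell
#!/bin/sh
# Hypothetical helper: sets the base.version property in <file> to <new_version>.
# Assumes the property is written on one line.
set_base_version() {
    file=$1
    new_version=$2
    tmp=$(mktemp) || return 1
    sed "s|\(<property name=\"base.version\" value=\"\)[^\"]*\"|\1$new_version\"|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"    # overwrite in place, keeping the file's permissions
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage from the root of the checkout (the version is a placeholder):
#   set_base_version build.xml <next-version>
```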

Wait for Artifacts to Sync

Wait for the artifacts to sync at downloads.apache.org/cassandra/

Send Release Announcement

Fill out the following email template and send to both user and dev mailing lists:

The Cassandra team is pleased to announce the release of Apache Cassandra version <version>.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is <the first|a bug fix> release[1] on the <version-base> series. As always,
please pay attention to the release notes[2] and let us know[3] if you
were to encounter any problem.

Enjoy!

[1]: (CHANGES.txt) https://git1-us-west.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=<version>
[2]: (NEWS.txt) https://git1-us-west.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=<version>
[3]: https://issues.apache.org/jira/browse/CASSANDRA

Update Slack Cassandra topic

Update the topic in the cassandra Slack room:

/topic cassandra.apache.org | Latest releases: 3.11.4, 3.0.18, 2.2.14, 2.1.21 | ask, don’t ask to ask

Tweet from @Cassandra

Tweet the new release, from the @Cassandra account

Delete Old Releases

As described in When to Archive.

An example of removing old releases:

svn co https://dist.apache.org/repos/dist/release/cassandra/ cassandra-dist
cd cassandra-dist
svn rm <previous_version> debian/pool/main/c/cassandra/<previous_version>*
svn st
# check and commit