Cassandra Case Studies

From startups to the largest enterprises, the world runs on Cassandra.

Ably

Apache Cassandra is trusted to scale at internet level and designed to scale without limits, which is why, at Ably Realtime, we use Cassandra for our persistent storage of messages.

Activision

Activision built a new system to message players with highly personalised communication. It used large amounts of real-time data and was built with Apache Cassandra.

AdStage

AdStage is constantly monitoring performance trends and optimizing campaigns on behalf of advertisers. Apache Cassandra delivered the operational speed the company needed to ensure that the platform has low latency and the required throughput.

Airship

Where we originally stored device data in a set of Postgres shards, our scale quickly outpaced our capacity to add new shards, so we moved to a multiple database architecture using HBase and Cassandra.

Apple

A year ago, Apple said that it was running over 75,000 Cassandra nodes, storing more than 10 petabytes of data. At least one cluster was over 1,000 nodes, and Apple regularly gets millions of operations per second (reads/writes) with Cassandra.

Backblaze

We needed something that would handle really high write throughput and keep scaling on the write throughput. That forced us to look at distributed stores, and Apache Cassandra was the option that fitted what we needed.

Bazaarvoice

EmoDB is an open source RESTful data store built on top of Cassandra that stores JSON documents and, most notably, offers a databus that allows subscribers to watch for changes to those documents in real time.

Best Buy

Best Buy uses Apache Cassandra to manage massive spikes in holiday traffic — 7x traffic spikes and bursts > 50,000 rps — and calls it “flawless.”

Bigmate

In vetting MySQL, MongoDB, and other potential databases for IoT scale, Bigmate found they couldn’t match the scalability they could get with open source Apache Cassandra, which allows them to handle millions of operations or concurrent users each second.

BlackBerry

BlackBerry deployed Apache Cassandra as the NoSQL database solution for its Internet of Things (IoT) platform. The BlackBerry IoT platform powers the BlackBerry Radar IoT solution designed to provide continuous visibility into an organization’s transportation fleet.

BlackRock

At BlackRock, we use Apache Cassandra in a variety of ways to help power our Aladdin investment management platform. In this talk I will give an overview of our use of Cassandra, with an emphasis on how we manage multi-tenancy in our Cassandra infrastructure.

Bloomberg Engineering

Bloomberg Engineering is working on a multi-year build, creating a new Index Construction Platform to handle the daily production of the Bloomberg Barclays fixed income indices, using Apache Cassandra and Apache Solr.

Bundesagentur für Arbeit (Federal Agency)

The IT system department needed a new solution for real-time monitoring of applications and business processes, and to be able to quickly counteract any negative influences. They selected Apache Cassandra because it could be tailored to their needs.

Campaign Monitor

Campaign Monitor knew that shifting to a new database technology was a major undertaking. They chose Apache Cassandra as their strategic operational database platform due to its exceptional reliability, manageability at scale and open source community.

CERN

P-BEAST consists of 20,000 applications running on 2,400 interconnected computers. CERN uses Apache Cassandra to satisfy the large time series data rates, flexibility and scalability requirements entailed by the project.

Clear Capital

Clear Capital is a leader in property valuation solutions for North America. Cassandra provides the foundation of the Clear Capital technology platform.

Cloudkick

Cloudkick uses Apache Cassandra for configuration data as well as metrics storage, a key element in keeping up with metrics processing as well as providing a high quality user experience with fast loading graphs.

CloudTrax

The Open-Mesh team knew that Apache Cassandra was ideal for their intended capability. The solution had the scalability and data storage requirements to meet the needs of the CloudTrax platform.

Constant Contact

Constant Contact uses Cassandra to manage social media data for over 400k small business customers. Its largest production cluster has over 100 TB of data in over 150 machines.

DataCloud

The oil & gas industry stores sensor data in an industry-specific document database, where data access is only available through a proprietary API based on SOAP and XML. DataCloud solved this by transferring this data into an Apache Cassandra database cluster.

Discord

Cassandra was the only database that fulfilled all of Discord’s requirements, as they can add nodes to scale it and it can tolerate a loss of nodes without any impact on the application. Related data is stored contiguously on disk providing minimum seeks and easy distribution around the cluster.

Dream11

The company started its operations in 2008 and began offering single-match fantasy sports in 2012. It is India’s biggest sports gaming platform, with users playing fantasy cricket, football, kabaddi, basketball and hockey. Dream11 is the official fantasy partner of the VIVO Indian Premier League (IPL) and the International Cricket Council (ICC).

eBay

A glimpse of our Cassandra deployment: dozens of nodes across multiple clusters; 200 TB+ storage provisioned; 400M+ writes and 100M+ reads per day, and growing; QA, LnP, and multiple production clusters.

Equinix

Equinix uses Cassandra to collect and store streaming data from infrastructure instruments, citing its ease of operation and always-on node architecture; its peer-to-peer architecture guarantees no single point of failure.

Flant

Flant has been successfully using the Rook operator to operate its Cassandra cluster in Kubernetes and provides tips on how it changed some parameters in the Cassandra config.

Fractal Labs

Fractal’s APIs aggregate and analyse permission-based banking, accounting and payments data so that financial institutions can provide timely nudges and insights to help their small business clients with funding and to better understand their finances.

Grubhub

Grubhub runs a service oriented platform that primarily operates out of multiple AWS data centers (regions). It moved to cloud infrastructure to accelerate its growth, using Apache Cassandra as its primary persistent data store.

Home Depot

Home Depot also used DataStax and Apache Cassandra to stand up curbside apps quickly. Siddiqui said Home Depot is a big open source shop.

Hornet

This is probably going to be the most engineering non-answer ever, which is simply that I haven’t really had to care about Cassandra since we made the changes and upgrades. Usually if I was getting paged in the middle of the night, it probably had something to do with a brief Cassandra blip that was causing an increased response time. That has just gone away completely.

Hulu

Hulu selected the Apache Cassandra system when its previous system was having trouble expanding to its growing subscriber base. “We needed something that could scale quickly and would be easy to maintain because we have a very small team.”

IBM

IBM determined that the Apache Cassandra NoSQL database would be the platform architecture’s key technology to deliver the requirements of scalability, performance and high availability.

Instaclustr

At Instaclustr we also have a big data challenge that we are solving with Apache Cassandra and Apache Spark. Instametrics provides us with the perfect opportunity to dogfood the Instaclustr technology stack.

Instana

“Cassandra works well; it runs really nicely and smoothly. We’ve never lost data, and things are easy to fix. Quite frankly, without Cassandra, we couldn’t run Instana.”

Instagram

At Instagram, we have one of the world’s largest deployments of the Apache Cassandra database. We began using Cassandra in 2012 to replace Redis and support product use cases like fraud detection, Feed, and the Direct inbox.

Intuit Mint

Mint Bills selected Apache Cassandra to store user account data. “When you are selecting between accounts on your Mint Bills app, you are actually retrieving information from Cassandra directly,” Csasznik-Shaked added.

Intuit TurboTax

Intuit is supporting over 42,000 Peak TPS in production in AWS, over eight clusters in production. Cassandra has to process massive amounts of data, such as entitlements, tax returns, filings, user experience, and everything needed to support TurboTax.

Keen

Keen leverages Kafka, Apache Cassandra NoSQL database and the Apache Spark analytics engine, adding a RESTful API and a number of SDKs for different languages. It enriches streaming data with relevant metadata and enables customers to stream enriched data to Amazon S3 or any other data store.

Kinetic Data

“Once it’s set up and running it’s hands-off. Quite frankly, it’s easy from an operations perspective. So our customers, they’re using Cassandra, but they don’t really realize it. But they do say, ‘it’s always up. It’s always fast.’ It’s all these benefits that you really want the end-user to know about.”

Liquibase

"We hear our customers say all the time that there is no platform that can take all that data as well as Apache Cassandra. If you’re generating tons of data, you need global resiliency; you are going to pick Cassandra. When you need to scale, it does that."

Locstat

Locstat showed a Geotrellis generated heat map with flight data from aircraft and flight patterns around the Cape Town International Airport. Data is stored in Cassandra and then pushed through Apache Spark and visualized using Geotrellis in a Cesium spatial interface.

Macquarie Bank

Cassandra provides a smart data storage layer that is fed with information from back-end systems within Macquarie through an open API platform and then serves customer requests with great speed, due largely to its in-memory capabilities.

Macy’s

Growth in business led us to want 10x growth in data, move from a read-mostly model to one which could handle near-real-time updates, and a move into multiple data centers. POC Result: Cassandra & ActiveSpaces - Very close. MongoDB - Failed tests. YMMV!

Maths Pathway

Maths Pathway is a Learning and Teaching Model that supports students along an individual pathway to build a deep appreciation and knowledge of mathematics. Maths Pathway delivers that individual and personalized learning with the help of Apache Cassandra.

METRO

METRO wanted to consolidate development and top management believed Apache Cassandra would be a good starting point. The entire platform has been migrated and teams are beginning to use native services from Google Cloud to interact with Cassandra effectively.

MobilePay

“We wanted to implement a distributed database that would fit with our microservices-based application strategy and that would be able to handle the availability and scalability needs of the applications too,” Jakobsen said. “Cassandra matched this model perfectly…”

Monzo

Monzo employs a microservice architecture (on Go and Kubernetes) and profiled and optimized key platform components such as Apache Cassandra and Linkerd for a recent crowdfunding effort run entirely through its app.

Netflix

Netflix manages petabytes of data in Apache Cassandra which must be reliably accessible to users in mere milliseconds. They built sophisticated control planes that turn their persistence layer based on Apache Cassandra into a truly self-driving system.

The New York Times

The New York Times uses Apache Cassandra with Python for the company’s ⨍aбrik messaging platform.

NHN Techorus

NHN Techorus provides IT infrastructure and managed services through the company’s Data Hotel division. The team has identified that there are a rapidly growing number of customers looking to deploy applications and solutions using Apache Cassandra as their data store.

Ooyala

Ooyala built a real-time analytics engine using Cassandra. Evan Chan, a software engineer at Ooyala, describes his experience using the Spark and Shark frameworks for running real-time queries on top of Cassandra data.

Outbrain

Outbrain has 30 production clusters of Apache Cassandra of different sizes, ranging from small ones to clusters with 100 nodes across 3 datacenters. Cassandra has proven to be a very reliable choice as a datastore which employs an eventual consistency model.

Paidy

Paidy offers real-time monthly consolidated credit services across Japan. The company identified Apache Cassandra as the most suitable database technology for its event sourcing and reactive architecture.

Penn Mutual

Penn Mutual stores its data in a 6-node Cassandra ring. Now, the company is able to leverage data to innovate and make more informed decisions so it can provide a truly personalized and premium experience to its customers.

ProtectWise

“With the advent of the Internet of Things, the need to keep track of the growing number of touch points of a network is becoming increasingly challenging. Fortunately, Stevens and his team had some previous experience with Apache Cassandra…”

PubNub

PubNub offers realtime infrastructure-as-a-service, and provides enterprise-grade security, 99.999% SLA-backed reliability, and global scalability to support the largest realtime deployments, all via simple APIs and 70+ SDKs.

RevTrax

RevTrax chose Cassandra for its uptime and linear scale: “If we need to scale out, it’s easier to scale the reads and writes with Cassandra than it is with MySQL.” But most of all, it was chosen for its durability and no single point of failure.

Sky

Sky uses Cassandra for database persistence in its Online Video Platform - the system which delivers all OTT video content to both Sky and NOW TV customers - including handling huge spikes in traffic for popular sports games and TV shows.

Spotify

Overall, we’ve been very satisfied with Cassandra as a solution for all our personalization needs and are confident that we can scale it up to serve personalized experiences to our ever-growing base of engaged users.

Stibo Systems

“At the operational level, being on Cassandra, with an infrastructure in containers and microservices, based on Docker, allows services to be resized dynamically,” explains Jérôme Reboul.

Target

Apache Cassandra has been used for many years at Target - since around 2014. Here, they discuss how they learned to deploy Cassandra as a Docker container in Kubernetes, while still maintaining stability and consistency — reliably in every location on their map.

Uber

Uber has been running an open-source Apache Cassandra® database as a service that powers a variety of mission-critical OLTP workloads for more than six years now at Uber scale, with millions of queries per second and petabytes of data.

Walmart

We had good experience with Cassandra in the past; hence, it was the first choice. Apache Cassandra has the best write and read performance. Like Kafka, it is distributed, highly scalable and fault-tolerant.

Woods Hole Oceanographic Institution

The Ocean Observatories Initiative (OOI) is a science-driven ocean observing network that delivers real-time data from more than 800 instruments to address critical science questions regarding the world’s oceans. Apache Cassandra has served as the heart of this system, which lives on hybrid infrastructure.

Yelp

Yelp is transitioning from the management of Cassandra clusters in EC2 to orchestrating the same clusters in production on Kubernetes. Here, they discuss the EC2-based deployment and how they are using the Cassandra operator and etcd for cross-region coordination.