Cassandra Case Studies

From startups to the largest enterprises, the world runs on Cassandra.

Ably

Apache Cassandra is trusted to scale at internet level and designed to scale without limits, which is why, at Ably Realtime, we use Cassandra for our persistent storage of messages.

Activision

Activision built a new system to message players with highly personalised communication. It used large amounts of real-time data and was built with Apache Cassandra.

AdStage

AdStage is constantly monitoring performance trends and optimizing campaigns on behalf of advertisers. Apache Cassandra delivered the operational speed the company needed to ensure that the platform has low latency and the required throughput.

Airship

Where we originally stored device data in a set of Postgres shards, our scale quickly outpaced our capacity to add new shards, so we moved to a multiple database architecture using HBase and Cassandra.

Apple

A year ago, Apple said that it was running over 75,000 Cassandra nodes, storing more than 10 petabytes of data. At least one cluster was over 1,000 nodes, and Apple regularly gets millions of operations per second (reads/writes) with Cassandra.

Backblaze

We needed something that would handle really high write throughput and keep scaling on the write throughput. That forced us to look at distributed stores, and Apache Cassandra was the option that fitted what we needed.

BazaarVoice

EmoDB is an open source RESTful data store built on top of Cassandra that stores JSON documents and, most notably, offers a databus that allows subscribers to watch for changes to those documents in real time.

Best Buy

Best Buy uses Apache Cassandra to manage massive spikes in holiday traffic — 7x traffic spikes and bursts > 50,000 rps — and calls it “flawless.”

Bigmate

In vetting MySQL, MongoDB, and other potential databases for IoT scale, Bigmate found they couldn’t match the scalability they could get with open source Apache Cassandra, which allows them to handle millions of operations or concurrent users each second.

BlackBerry

BlackBerry deployed Apache Cassandra as the NoSQL database solution for its Internet of Things (IoT) platform. The BlackBerry IoT platform powers the BlackBerry Radar IoT solution designed to provide continuous visibility into an organization’s transportation fleet.

BlackRock

At BlackRock, we use Apache Cassandra in a variety of ways to help power our Aladdin investment management platform. In this talk I will give an overview of our use of Cassandra, with an emphasis on how we manage multi-tenancy in our Cassandra infrastructure.

Bloomberg

Bloomberg Engineering is working on a multi-year build, creating a new Index Construction Platform to handle the daily production of the Bloomberg Barclays fixed income indices, using Apache Cassandra and Apache Solr.

Bundesagentur für Arbeit (Federal Agency)

The IT system department needed a new solution for real-time monitoring of applications and business processes, and to be able to quickly counteract any negative influences. They selected Apache Cassandra because it could be tailored to their needs.

Campaign Monitor

Campaign Monitor knew that shifting to a new database technology was a major undertaking. They chose Apache Cassandra as their strategic operational database platform due to its exceptional reliability, manageability at scale and open source community.

CERN

P-BEAST consists of 20,000 applications running on 2,400 interconnected computers. CERN uses Apache Cassandra to satisfy the large time series data rates, flexibility and scalability requirements entailed by the project.

Clear Capital

Clear Capital is a leader in property valuation solutions for North America. Cassandra provides the foundation of the Clear Capital technology platform.

Cloudkick

Cloudkick uses Apache Cassandra for configuration data as well as metrics storage, a key element in keeping up with metrics processing as well as providing a high quality user experience with fast loading graphs.

CloudTrax

The Open-Mesh team knew that Apache Cassandra was ideal for their intended capability. The solution had the scalability and data storage capacity to meet the needs of the CloudTrax platform.

Constant Contact

Constant Contact uses Cassandra to manage social media data for over 400k small business customers. Its largest production cluster has over 100 TB of data in over 150 machines.

Coursera

Cassandra outperformed MongoDB and others with its masterless architecture that ensures 100% application uptime, and flexible linear scalability. Software engineer Daniel Chia said, “We found it very challenging to manage MongoDB operationally, and HBase had some latency issues along with difficulty in deployment.”

DataCloud

The oil & gas industry stores sensor data in an industry-specific document database, where data access is only available through a proprietary API based on SOAP and XML. DataCloud solved this by transferring this data into an Apache Cassandra database cluster.

Discord

Cassandra was the only database that fulfilled all of Discord’s requirements, as they can add nodes to scale it and it can tolerate a loss of nodes without any impact on the application. Related data is stored contiguously on disk providing minimum seeks and easy distribution around the cluster.
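The idea that related data sits together while partitions spread across the cluster can be illustrated with a toy model. This is only a sketch under stated assumptions: the node names, the `ToyStore` class, and the `channel-…` keys are hypothetical, the hash-modulo placement is a simplification (real Cassandra uses a token ring with replication), and nothing here reflects Discord's actual schema.

```python
import bisect
import hashlib
from collections import defaultdict

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster members


def node_for(partition_key: str) -> str:
    """Pick a node by hashing the partition key (simplified placement, no replicas)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


class ToyStore:
    """Toy model: each partition lives on one node, rows kept sorted by clustering key."""

    def __init__(self) -> None:
        # node name -> partition key -> list of (clustering_key, value), kept sorted
        self.nodes = {name: defaultdict(list) for name in NODES}

    def insert(self, partition_key: str, clustering_key: int, value: str) -> None:
        rows = self.nodes[node_for(partition_key)][partition_key]
        bisect.insort(rows, (clustering_key, value))  # preserve sort order on write

    def read_partition(self, partition_key: str) -> list:
        # One node and one already-ordered run of rows serve the whole read.
        return list(self.nodes[node_for(partition_key)][partition_key])


store = ToyStore()
store.insert("channel-1", 3, "third")
store.insert("channel-1", 1, "first")
store.insert("channel-2", 1, "hello")
store.insert("channel-1", 2, "second")
print(store.read_partition("channel-1"))
# [(1, 'first'), (2, 'second'), (3, 'third')]
```

In this model, reading one partition touches a single node and a single ordered run of rows, while different partition keys can land on different nodes.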

Dream11

The company started its operations in 2008 and started offering single match fantasy sports in 2012. It is India’s biggest sports gaming platform, with users playing fantasy cricket, football, kabaddi, basketball and hockey. Dream11 is the official fantasy partner of the VIVO Indian Premier League (IPL) and the International Cricket Council (ICC).

eBay

A glimpse of our Cassandra deployment: dozens of nodes across multiple clusters; 200 TB+ storage provisioned; 400M+ writes and 100M+ reads per day, and growing; QA, LnP, and multiple production clusters.

Equinix

Equinix uses Cassandra to collect and store streaming data from infrastructure instruments, citing its ease of operation, its always-on node architecture, and a peer-to-peer design with no single point of failure.

Flant

Flant has been successfully using the Rook operator to operate its Cassandra cluster in Kubernetes and provides tips on how it changed some parameters in the Cassandra config.

Fractal Labs

Fractal’s APIs aggregate and analyse permission-based banking, accounting and payments data so that financial institutions can provide timely nudges and insights to help their small business clients with funding and to better understand their finances.

Grubhub

Grubhub runs a service oriented platform that primarily operates out of multiple AWS data centers (regions). It moved to cloud infrastructure to accelerate its growth, using Apache Cassandra as its primary persistent data store.

Home Depot

Home Depot used DataStax and Apache Cassandra to stand up curbside apps quickly. Siddiqui said Home Depot is a big open source shop.

Hornet

This is probably going to be the most engineering non-answer ever, which is simply that I haven’t really had to care about Cassandra since we made the changes and upgrades. Usually if I was getting paged in the middle of the night, it probably had something to do with a brief Cassandra blip that was causing an increased response time. That has just gone away completely.

Hulu

Hulu selected the Apache Cassandra system when its previous system was having trouble expanding to its growing subscriber base. “We needed something that could scale quickly and would be easy to maintain because we have a very small team.”

IBM

IBM determined that the Apache Cassandra NoSQL database would be the platform architecture’s key technology to deliver the requirements of scalability, performance and high availability.

Instaclustr

At Instaclustr we also have a big data challenge that we are solving with Apache Cassandra and Apache Spark. Instametrics provides us with the perfect opportunity to dogfood the Instaclustr technology stack.

Instana

The Instana components are already containerized and run in our SaaS platform, but we still needed to create containers for our databases, Clickhouse, Cassandra, etc., and set up the release pipeline for them.

Instagram

At Instagram, we have one of the world’s largest deployments of the Apache Cassandra database. We began using Cassandra in 2012 to replace Redis and support product use cases like fraud detection, Feed, and the Direct inbox.

Intuit Mint

Mint Bills selected Apache Cassandra to store user account data. “When you are selecting between accounts on your Mint Bills app, you are actually retrieving information from Cassandra directly,” Csasznik-Shaked added.

Intuit Turbo Tax

Intuit is supporting over 42,000 Peak TPS in production in AWS, over eight clusters in production. Cassandra has to process massive amounts of data, such as entitlements, tax returns, filings, user experience, and everything needed to support TurboTax.

Keen.io

Keen leverages Kafka, the Apache Cassandra NoSQL database and the Apache Spark analytics engine, adding a RESTful API and a number of SDKs for different languages. It enriches streaming data with relevant metadata and enables customers to stream enriched data to Amazon S3 or any other data store.

Locstat

Locstat showed a Geotrellis generated heat map with flight data from aircraft and flight patterns around the Cape Town International Airport. Data is stored in Cassandra and then pushed through Apache Spark and visualized using Geotrellis in a Cesium spatial interface.

Macquarie Bank

Cassandra provides a smart data storage layer that is fed with information from back-end systems within Macquarie through an open API platform and then serves customer requests with great speed, due largely to its in-memory capabilities.

Macy’s

Growth in business led us to want 10x growth in data, move from a read-mostly model to one which could handle near-real-time updates, and a move into multiple data centers. POC Result: Cassandra & ActiveSpaces - Very close. MongoDB - Failed tests. YMMV!

Maths Pathway

Maths Pathway is a Learning and Teaching Model that supports students along an individual pathway to build a deep appreciation and knowledge of mathematics. Maths Pathway delivers that individual and personalized learning with the help of Apache Cassandra.

METRO

METRO wanted to consolidate development and top management believed Apache Cassandra would be a good starting point. The entire platform has been migrated and teams are beginning to use native services from Google Cloud to interact with Cassandra effectively.

MobilePay

“We wanted to implement a distributed database that would fit with our microservices-based application strategy and that would be able to handle the availability and scalability needs of the applications too,” Jakobsen said. “Cassandra matched this model perfectly…”

Monzo

Monzo employs a microservice architecture (on Go and Kubernetes) and profiled and optimized key platform components such as Apache Cassandra and Linkerd for a recent crowdfunding effort run entirely through its app.

Netflix

Netflix manages petabytes of data in Apache Cassandra which must be reliably accessible to users in mere milliseconds. They built sophisticated control planes that turn their persistence layer based on Apache Cassandra into a truly self-driving system.

New York Times

The New York Times uses Apache Cassandra with Python for the company’s ⨍aбrik messaging platform.

NHN Techorus

NHN Techorus provides IT infrastructure and managed services through the company’s Data Hotel division. The team has identified that there are a rapidly growing number of customers looking to deploy applications and solutions using Apache Cassandra as their data store.

Ooyala

Ooyala built a real-time analytics engine using Cassandra. Evan Chan, Software Engineer at Ooyala, describes his experience using the Spark and Shark frameworks for running real-time queries on top of Cassandra data.

Outbrain

Outbrain has 30 production clusters of Apache Cassandra of different sizes, ranging from small ones to clusters with 100 nodes across 3 datacenters. Cassandra has proven to be a very reliable choice as a datastore which employs an eventual consistency model.

Paidy

Paidy offers real-time monthly consolidated credit services across Japan. The company identified Apache Cassandra as the most suitable database technology for its event sourcing and reactive architecture.

Penn Mutual

Penn Mutual stores its data in a 6-node Cassandra ring. Now, they’re able to leverage data to innovate and make more informed decisions so they can provide a truly personalized and premium experience to their customers.

ProtectWise

“With the advent of the Internet of Things, the need to keep track of the growing number of touch points of a network is becoming increasingly challenging. Fortunately, Stevens and his team had some previous experience with Apache Cassandra…”

PubNub

PubNub offers realtime infrastructure-as-a-service, and provides enterprise-grade security, 99.999% SLA-backed reliability, and global scalability to support the largest realtime deployments, all via simple APIs and 70+ SDKs.

RevTrax

RevTrax chose Cassandra for its uptime and linear scale: “If we need to scale out, it’s easier to scale the reads and writes with Cassandra than it is with MySQL.” But most of all, it was chosen for its durability and no single point of failure.

Sky

Sky uses Cassandra for database persistence in its Online Video Platform - the system which delivers all OTT video content to both Sky and NOW TV customers - including handling huge spikes in traffic for popular sports games and TV shows.

Spotify

Overall, we’ve been very satisfied with Cassandra as a solution for all our personalization needs and are confident we can scale it up to serve personalized experiences to our ever-growing base of engaged users.

Stibo Systems

“At the operational level, being on Cassandra, with an infrastructure in containers and microservices, based on Docker, allows services to be resized dynamically,” explains Jérôme Reboul.

Target

Apache Cassandra has been used for many years at Target - since around 2014. Here, they discuss how they learned to deploy Cassandra as a Docker container in Kubernetes, while still maintaining stability and consistency — reliably in every location on their map.

Uber

Uber’s solution is comprehensive. They built their own system that runs Cassandra on top of Mesos. It’s all explained in a good talk by Abhishek Verma, Software Engineer at Uber: Cassandra on Mesos Across Multiple Datacenters at Uber.

Walmart

We had good experience with Cassandra in the past, so it was our first choice. Apache Cassandra has the best write and read performance. Like Kafka, it is distributed, highly scalable and fault-tolerant.

Woods Hole Oceanographic Institution

The Ocean Observatories Initiative (OOI) is a science-driven ocean observing network that delivers real-time data from more than 800 instruments to address critical science questions regarding the world’s oceans. Apache Cassandra has served as the heart of this system, which lives on hybrid infrastructure.

Yelp

Yelp is transitioning from the management of Cassandra clusters in EC2 to orchestrating the same clusters in production on Kubernetes. Here, they discuss the EC2-based deployment and how they are using the Cassandra operator and etcd for cross-region coordination.