Size Tiered Compaction Strategy
The basic idea of SizeTieredCompactionStrategy (STCS) is to merge sstables of approximately the same size. All sstables are put in different buckets depending on their size. An sstable is added to a bucket if its size is within bucket_low and bucket_high of the current average size of the sstables already in that bucket. This creates several buckets, and the most interesting of them is compacted. The most interesting bucket is the one whose sstables take the most reads.
When running a major compaction with STCS you will end up with two sstables per data directory (one for repaired data and one for unrepaired data). There is also an option (-s) to do a major compaction that splits the output into several sstables. The sizes of the sstables are approximately 50%, 25%, 12.5%… of the total size.
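As a rough illustration of the 50%, 25%, 12.5%… series, the following sketch halves the remaining data for each output sstable. The function name `split_sizes` and the `parts` parameter are hypothetical; the text above does not say how many sstables are produced or where the series stops, so here the last sstable simply receives whatever is left.

```python
def split_sizes(total_size, parts):
    """Return `parts` output sizes that follow a halving series.

    Illustrative only: `parts` is an assumed cut-off, not a documented option.
    """
    sizes = []
    remaining = total_size
    for _ in range(parts - 1):
        half = remaining / 2      # each sstable takes half of what is left
        sizes.append(half)
        remaining -= half
    sizes.append(remaining)       # the last sstable takes the remainder
    return sizes


# Splitting 1000 units of data into 4 sstables
print(split_sizes(1000, 4))
```

The sizes always add up to the total, so no data is lost in the split.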
min_sstable_size
Sstables smaller than this size are all put in the same bucket.
bucket_low
How much smaller than the average size of a bucket an sstable may be before it is no longer included in the bucket. That is, if bucket_low * avg_bucket_size < sstable_size (and the bucket_high condition holds, see below), then the sstable is added to the bucket.
bucket_high
How much bigger than the average size of a bucket an sstable may be before it is no longer included in the bucket. That is, if sstable_size < bucket_high * avg_bucket_size (and the bucket_low condition holds, see above), then the sstable is added to the bucket.
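The bucketing rule described by these options can be sketched in a few lines of Python. This is a simplified illustration, not the actual implementation: the function name `bucket_sstables`, the use of plain numbers for sstable sizes, and the parameter values (0.5, 1.5, and a minimum size of 50) are assumptions chosen for the example. The sketch also assumes that sstables are visited from smallest to largest and that the "smaller than this" rule applies when both the sstable and the bucket average fall below the minimum size.

```python
def bucket_sstables(sizes, bucket_low=0.5, bucket_high=1.5, min_sstable_size=50):
    """Group sstable sizes into buckets of approximately equal size."""
    buckets = []  # each bucket is a list of sstable sizes
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)  # current average size of the bucket
            # Both the bucket_low and the bucket_high condition must hold.
            similar = bucket_low * avg < size < bucket_high * avg
            # Small sstables share a bucket regardless of the ratio checks.
            both_small = size < min_sstable_size and avg < min_sstable_size
            if similar or both_small:
                bucket.append(size)
                break
        else:
            # No existing bucket accepted the sstable, so it starts a new one.
            buckets.append([size])
    return buckets


# Sizes in arbitrary units (for example MB)
print(bucket_sstables([10, 20, 100, 120, 400]))
```

With these inputs, 10 and 20 share a bucket because both are below the minimum size, 100 and 120 share a bucket because 120 lies between 0.5 and 1.5 times the average of 100, and 400 ends up alone.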