sstabledump
Dump contents of a given SSTable to standard output in JSON format.
You must supply exactly one sstable.
Cassandra must be stopped before this tool is executed, or unexpected results will occur. Note: the script does not verify that Cassandra is stopped.
Usage
sstabledump <options> <sstable file path>
-d          CQL row per line internal representation
-e          Enumerate partition keys only
-k <arg>    Partition key
-x <arg>    Excluded partition key(s)
-t          Print raw timestamps instead of iso8601 date strings
-l          Output each row as a separate JSON object
If necessary, run sstableutil first to find out which SSTables a table uses.
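Because sstabledump accepts exactly one SSTable per invocation, a table with several SSTables needs one run per data file. The following is a minimal Python sketch of that idea, not part of the tool itself. It assumes the data files follow the `*-Data.db` naming seen in the examples on this page and that `sstabledump` is on the PATH; the helper names `find_data_files` and `dump_commands` are hypothetical. It only builds the command lines and does not execute anything.

```python
from pathlib import Path


def find_data_files(table_dir):
    """Return the *-Data.db files directly inside a table directory, sorted by name."""
    return sorted(p for p in Path(table_dir).glob("*-Data.db") if p.is_file())


def dump_commands(table_dir, options=()):
    """Build one sstabledump argument list per SSTable (assumed layout: options, then path)."""
    return [["sstabledump", *options, str(path)] for path in find_data_files(table_dir)]
```

Each returned list can be reviewed and then run by hand or passed to `subprocess.run` once Cassandra is stopped.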
Dump entire table
Dump the entire table without any options.
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db > eventlog_dump_2018Jul26
cat eventlog_dump_2018Jul26
[
  {
    "partition" : {
      "key" : [ "3578d7de-c60d-4599-aefb-3f22a07b2bc6" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 61,
        "liveness_info" : { "tstamp" : "2018-07-20T20:23:08.378711Z" },
        "cells" : [
          { "name" : "event", "value" : "party" },
          { "name" : "insertedtimestamp", "value" : "2018-07-20 20:23:08.384Z" },
          { "name" : "source", "value" : "asdf" }
        ]
      }
    ]
  },
  {
    "partition" : {
      "key" : [ "d18250c0-84fc-4d40-b957-4248dc9d790e" ],
      "position" : 62
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 123,
        "liveness_info" : { "tstamp" : "2018-07-20T20:23:07.783522Z" },
        "cells" : [
          { "name" : "event", "value" : "party" },
          { "name" : "insertedtimestamp", "value" : "2018-07-20 20:23:07.789Z" },
          { "name" : "source", "value" : "asdf" }
        ]
      }
    ]
  },
  {
    "partition" : {
      "key" : [ "cf188983-d85b-48d6-9365-25005289beb2" ],
      "position" : 124
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 182,
        "liveness_info" : { "tstamp" : "2018-07-20T20:22:27.028809Z" },
        "cells" : [
          { "name" : "event", "value" : "party" },
          { "name" : "insertedtimestamp", "value" : "2018-07-20 20:23:27.055Z" },
          { "name" : "source", "value" : "asdf" }
        ]
      }
    ]
  }
]
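Since a full dump is a single JSON array of partitions, it can be loaded with any JSON parser. The sketch below, in Python, flattens a dump into a mapping from partition key to the cell values of each row. The function name `summarize_dump` is hypothetical, and the sample string is the first partition from the output above, inlined so the snippet runs on its own. It assumes the field names shown on this page (`partition`, `rows`, `cells`, `name`, `value`).

```python
import json


def summarize_dump(text):
    """Map each partition key (as a tuple) to a list of {cell name: value} dicts, one per row."""
    summary = {}
    for entry in json.loads(text):
        key = tuple(entry["partition"]["key"])
        rows = []
        for row in entry.get("rows", []):
            if row.get("type") != "row":
                continue  # skip entries that are not plain rows
            rows.append({cell["name"]: cell.get("value") for cell in row.get("cells", [])})
        summary[key] = rows
    return summary


# First partition of the example dump, inlined for illustration.
sample = """
[ { "partition" : { "key" : [ "3578d7de-c60d-4599-aefb-3f22a07b2bc6" ], "position" : 0 },
    "rows" : [ { "type" : "row", "position" : 61,
                 "liveness_info" : { "tstamp" : "2018-07-20T20:23:08.378711Z" },
                 "cells" : [ { "name" : "event", "value" : "party" },
                             { "name" : "insertedtimestamp", "value" : "2018-07-20 20:23:08.384Z" },
                             { "name" : "source", "value" : "asdf" } ] } ] } ]
"""

for key, rows in summarize_dump(sample).items():
    print(key, rows)
```

In practice the text would come from the dump file, for example `open("eventlog_dump_2018Jul26").read()`.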
Dump table in a more manageable format
Use the -l option to dump each row as a separate JSON object, which makes the output easier to manipulate for large data sets. Reference: issues.apache.org/jira/browse/CASSANDRA-13848
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db -l > eventlog_dump_2018Jul26_justlines
cat eventlog_dump_2018Jul26_justlines
[
  {
    "partition" : {
      "key" : [ "3578d7de-c60d-4599-aefb-3f22a07b2bc6" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 61,
        "liveness_info" : { "tstamp" : "2018-07-20T20:23:08.378711Z" },
        "cells" : [
          { "name" : "event", "value" : "party" },
          { "name" : "insertedtimestamp", "value" : "2018-07-20 20:23:08.384Z" },
          { "name" : "source", "value" : "asdf" }
        ]
      }
    ]
  },
  {
    "partition" : {
      "key" : [ "d18250c0-84fc-4d40-b957-4248dc9d790e" ],
      "position" : 62
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 123,
        "liveness_info" : { "tstamp" : "2018-07-20T20:23:07.783522Z" },
        "cells" : [
          { "name" : "event", "value" : "party" },
          { "name" : "insertedtimestamp", "value" : "2018-07-20 20:23:07.789Z" },
          { "name" : "source", "value" : "asdf" }
        ]
      }
    ]
  },
  {
    "partition" : {
      "key" : [ "cf188983-d85b-48d6-9365-25005289beb2" ],
      "position" : 124
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 182,
        "liveness_info" : { "tstamp" : "2018-07-20T20:22:27.028809Z" },
        "cells" : [
          { "name" : "event", "value" : "party" },
          { "name" : "insertedtimestamp", "value" : "2018-07-20 20:22:27.055Z" },
          { "name" : "source", "value" : "asdf" }
        ]
      }
    ]
  }
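One way to consume output made of separate JSON objects is to decode them one after another instead of parsing a single array. The Python sketch below does this with the standard library's `json.JSONDecoder.raw_decode`. The function name `iter_json_objects` is hypothetical, and the sketch is deliberately tolerant: it skips whitespace, commas, and stray `[` or `]` between objects, so it does not depend on exactly how the objects are delimited. For brevity it works on a string already held in memory and is meant for partition objects, not for the keys-only output.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON object in text, skipping separators between objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        if text[pos] in " \t\r\n,[]":
            pos += 1  # separator between objects, not part of one
            continue
        # raw_decode parses one complete JSON value starting at pos
        # and returns it together with the index just past its end.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj
```

A caller can then handle one partition at a time, for example `for partition in iter_json_objects(text): ...`, without first repairing or wrapping the file into a valid array.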
Dump only keys
Dump only the keys by using the -e option.
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db -e > eventlog_dump_2018Jul26_justkeys
cat eventlog_dump_2018Jul26_justkeys
[ [ "3578d7de-c60d-4599-aefb-3f22a07b2bc6" ], [ "d18250c0-84fc-4d40-b957-4248dc9d790e" ], [ "cf188983-d85b-48d6-9365-25005289beb2" ]
Dump row for a single key
Dump a single key using the -k option.
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db -k 3578d7de-c60d-4599-aefb-3f22a07b2bc6 > eventlog_dump_2018Jul26_singlekey
cat eventlog_dump_2018Jul26_singlekey
[
  {
    "partition" : {
      "key" : [ "3578d7de-c60d-4599-aefb-3f22a07b2bc6" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 61,
        "liveness_info" : { "tstamp" : "2018-07-20T20:23:08.378711Z" },
        "cells" : [
          { "name" : "event", "value" : "party" },
          { "name" : "insertedtimestamp", "value" : "2018-07-20 20:23:08.384Z" },
          { "name" : "source", "value" : "asdf" }
        ]
      }
    ]
  }
Exclude a key or keys in dump of rows
Dump a table while skipping the partitions whose keys are passed to the -x option. Multiple keys can be listed, separated by spaces.
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db -x 3578d7de-c60d-4599-aefb-3f22a07b2bc6 d18250c0-84fc-4d40-b957-4248dc9d790e > eventlog_dump_2018Jul26_excludekeys
cat eventlog_dump_2018Jul26_excludekeys
[
  {
    "partition" : {
      "key" : [ "cf188983-d85b-48d6-9365-25005289beb2" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 182,
        "liveness_info" : { "tstamp" : "2018-07-20T20:22:27.028809Z" },
        "cells" : [
          { "name" : "event", "value" : "party" },
          { "name" : "insertedtimestamp", "value" : "2018-07-20 20:22:27.055Z" },
          { "name" : "source", "value" : "asdf" }
        ]
      }
    ]
  }
Display raw timestamps
By default, timestamps are displayed as ISO 8601 date strings. Use the -t option to print the raw timestamps instead.
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db -t -k cf188983-d85b-48d6-9365-25005289beb2 > eventlog_dump_2018Jul26_times
cat eventlog_dump_2018Jul26_times
[
  {
    "partition" : {
      "key" : [ "cf188983-d85b-48d6-9365-25005289beb2" ],
      "position" : 124
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 182,
        "liveness_info" : { "tstamp" : "1532118147028809" },
        "cells" : [
          { "name" : "event", "value" : "party" },
          { "name" : "insertedtimestamp", "value" : "2018-07-20 20:22:27.055Z" },
          { "name" : "source", "value" : "asdf" }
        ]
      }
    ]
  }
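Comparing this output with the default dump of the same key suggests that the raw tstamp counts microseconds since the Unix epoch: 1532118147028809 corresponds to 2018-07-20T20:22:27.028809Z. The Python sketch below performs that conversion with the standard library only. The function name `raw_tstamp_to_iso` is hypothetical, and the microsecond unit is inferred from the two examples on this page.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def raw_tstamp_to_iso(tstamp):
    """Convert a raw tstamp (assumed microseconds since the Unix epoch) to an ISO 8601 UTC string."""
    # Integer timedelta arithmetic avoids the rounding a float division would introduce.
    moment = EPOCH + timedelta(microseconds=int(tstamp))
    return moment.strftime("%Y-%m-%dT%H:%M:%S.%f") + "Z"


print(raw_tstamp_to_iso("1532118147028809"))  # 2018-07-20T20:22:27.028809Z
```

The function accepts either the quoted string form shown in the JSON output or a plain integer.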
Display internal structure in output
Use the -d option to dump the table in a format that reflects the internal structure, with one CQL row per line.
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db -d > eventlog_dump_2018Jul26_d
cat eventlog_dump_2018Jul26_d
[3578d7de-c60d-4599-aefb-3f22a07b2bc6]@0 Row[info=[ts=1532118188378711] ]: | [event=party ts=1532118188378711], [insertedtimestamp=2018-07-20 20:23Z ts=1532118188378711], [source=asdf ts=1532118188378711]
[d18250c0-84fc-4d40-b957-4248dc9d790e]@62 Row[info=[ts=1532118187783522] ]: | [event=party ts=1532118187783522], [insertedtimestamp=2018-07-20 20:23Z ts=1532118187783522], [source=asdf ts=1532118187783522]
[cf188983-d85b-48d6-9365-25005289beb2]@124 Row[info=[ts=1532118147028809] ]: | [event=party ts=1532118147028809], [insertedtimestamp=2018-07-20 20:22Z ts=1532118147028809], [source=asdf ts=1532118147028809]