ALTER TABLE
Modifies the columns and properties of a table.
Use it to add new columns, drop existing columns, rename columns, and modify table properties. The command returns no results.
See also: CREATE TABLE, DROP TABLE, CREATE CUSTOM INDEX for Storage-Attached Indexes (SAI), CREATE INDEX for secondary indexes (2i)
Syntax
BNF definition:
alter_table_statement::= ALTER TABLE [ IF EXISTS ] table_name alter_table_instruction
alter_table_instruction::= ADD [ IF NOT EXISTS ] column_definition ( ',' column_definition)*
| DROP [ IF EXISTS ] column_name ( ',' column_name )*
| RENAME [ IF EXISTS ] column_name TO column_name ( AND column_name TO column_name )*
| ALTER [ IF EXISTS ] column_name ( column_mask | DROP MASKED )
| WITH options
column_definition::= column_name cql_type [ column_mask]
column_mask::= MASKED WITH ( DEFAULT | function_name '(' term ( ',' term )* ')' )
ALTER TABLE [<keyspace_name>.]<table_name>
  [ ADD ( <column_definition> | <column_definition_list> ) [ , ... ] ]
  [ DROP <column_name> [ , ... ] ]
  [ RENAME <column_name> TO <column_name> ]
  [ WITH <table_properties> [ , ... ] ];
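The ALTER instruction with a column_mask shown in the BNF above attaches or removes a mask on a column. The following is a minimal sketch, assuming the cyclist_base table has a text column named email (a hypothetical column used only for illustration):
ALTER TABLE cycling.cyclist_base
ALTER email MASKED WITH DEFAULT;

ALTER TABLE cycling.cyclist_base
ALTER email DROP MASKED;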
Syntax conventions | Description |
---|---|
UPPERCASE | Literal keyword. |
Lowercase | Not literal. |
Italics | Variable value. Replace with a user-defined value. |
[ ] | Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets. |
( ) | Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses. |
\| | Or. A vertical bar ( \| ) separates alternative elements. Type any one of the elements. Do not type the vertical bar. |
... | Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required. |
'Literal string' | Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve uppercase. |
{ key : value } | Map collection. Braces ( { } ) enclose map collections or key-value pairs. A colon separates the key and the value. |
<datatype1,datatype2> | Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma. |
cql_statement; | End CQL statement. A semicolon ( ; ) terminates all CQL statements. |
[--] | Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options. |
' <schema> ... </schema> ' | Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration. |
@xml_entity='xml_entity_type' | Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrConfig files. |
Required parameters
- table_name: Name of the table to alter.
- column_name: Name of the column to alter, drop, or add.
- column_definition: Enclosed in parentheses after the table name; use a comma-separated list to define multiple columns. All tables must have at least one primary key column. Each column is defined using the following syntax: column_name cql_type_definition [STATIC | PRIMARY KEY] [, ...]
  Restrictions:
  - A table must have at least one PRIMARY KEY.
  - When PRIMARY KEY is at the end of a column definition, that column is the only primary key for the table and is defined as the partition key.
  - A static column cannot be a primary key.
  - Primary keys can include frozen collections.
  - column_name: Use a unique name for each column in a table. To preserve case or use special characters, enclose the name in double quotation marks.
  - cql_type_definition: Defines the type of data allowed in the column. See CQL data types or a user-defined type.
  - STATIC: Optional. The column has a single value.
  - PRIMARY KEY: When the primary key is one column, append PRIMARY KEY to the end of the column definition. This is the only schema information required to create a table. When there is one primary key, it is the partition key; the data is divided and stored by the unique values in this column: column_name cql_type_definition PRIMARY KEY. Alternatively, you can declare a primary key consisting of only one column in the same way as you declare a compound primary key.
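As an illustration of the column definition syntax described above, here is a minimal sketch (the table and column names are hypothetical, and this form belongs to CREATE TABLE rather than ALTER TABLE):
CREATE TABLE cycling.example_riders (
  id uuid PRIMARY KEY,
  lastname text
);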
Optional parameters
- keyspace_name: Name of the keyspace that contains the table to alter. If no name is specified, the current keyspace is used.
- ADD ( <column_definition> | <column_definition_list> ): Add one or more columns and set the column data types. Specify the column names followed by the data types. The column value is automatically set to null. To add multiple columns, use a comma-separated list of columns placed inside parentheses:
  <column_name> <cql_type> [ , <column_name> <cql_type> [ , ... ] ]
  Restriction: You cannot add columns to the primary key after a table has been created.
- DROP ( <column> | <column_list> ): Drop one or more columns. The values contained in the row are also dropped and are not recoverable. To drop multiple columns, use a comma-separated list of columns placed inside parentheses.
- RENAME <column_name> TO <column_name>: Changes the name of a primary key column and preserves the existing values.
  Restriction: Not supported on base tables of materialized views or on tables with secondary indexes.
- table_options: Tunes data handling, including I/O operations, compression, and compaction. Table property options use the following syntax:
  - Single values: <option_name> = '<value>'
  - Multiple values: <option_name> = { '<subproperty>' : '<value>' [ , ... ] } [ AND ... ]
    Simple JSON format: key-value pairs in a comma-separated list enclosed by curly brackets.
  When no value is specified, the default is used.
  In a CREATE TABLE (or ALTER TABLE) CQL statement, use a WITH clause to define table property options. Separate multiple values with AND. A combined example appears after the table_properties entry below.
  CREATE TABLE [<keyspace_name>.]<table_name>
  WITH option_name = '<value>'
  AND option_name = {<option_map>};
- bloom_filter_fp_chance = <N>: False-positive probability for the SSTable bloom filter. When a client requests data, the bloom filter checks if the row exists before executing disk I/O. Values range from 0 to 1.0, where 0 is the minimum value, enabling the largest possible bloom filter (which uses the most memory), and 1.0 is the maximum value, disabling the bloom filter.
  Default: bloom_filter_fp_chance = '0.01'
- caching = { 'keys' : 'value', 'rows_per_partition' : 'value' }: Optimizes the use of cache memory without manual tuning. Weighs the cached data by size and access frequency. Coordinate this setting with the global caching properties in the cassandra.yaml file. Valid values:
  - ALL: all primary keys or rows
  - NONE: no primary keys or rows
  - <N>: (rows_per_partition only) specify a whole number
  Default: { 'keys': 'ALL', 'rows_per_partition': 'NONE' }
- cdc: Creates a Change Data Capture (CDC) log on the table. Valid values:
  - TRUE: create a CDC log
  - FALSE: do not create a CDC log
- comment = 'some text that describes the table': Provides documentation on the table. Enter a description of the types of queries the table was designed to satisfy.
- default_time_to_live: TTL (Time To Live) in seconds, where zero is disabled. The maximum configurable value is 630720000 (20 years). Beginning in 2018, the expiration timestamp can exceed the maximum value supported by the storage engine; see the warning below. If the value is greater than zero, TTL is enabled for the entire table and an expiration timestamp is added to each column. A new TTL timestamp is calculated each time the data is updated, and the row is removed after all the data expires.
  Default value: 0 (disabled).
  The database storage engine can only encode TTL timestamps through January 19 2038 03:14:07 UTC due to the Year 2038 problem. The TTL date overflow policy determines whether requests with expiration timestamps later than the maximum date are rejected or inserted.
- gc_grace_seconds: Seconds after data is marked with a tombstone (deletion marker) before it is eligible for garbage collection. Default value: 864000 (10 days). The default value allows time for the database to maximize consistency prior to deletion.
  Tombstoned records within the grace period are excluded from hints or batched mutations.
  In a single-node cluster, this property can safely be set to zero. You can also reduce this value for tables whose data is not explicitly deleted, for example, tables containing only data with TTL set, or tables with default_time_to_live set. However, if you lower the gc_grace_seconds value, consider its interaction with these operations:
  - hint replays: When a node goes down and then comes back up, other nodes replay the write operations (called hints) that are queued for that node while it was unresponsive. The database does not replay hints older than gc_grace_seconds after creation. The max_hint_window setting in the cassandra.yaml file sets the time limit (3 hours by default) for collecting hints for the unresponsive node.
  - batch replays: Like hint queues, batch operations store database mutations that are replayed in sequence. As with hints, the database does not replay a batched mutation older than gc_grace_seconds after creation. If your application uses batch operations, consider the possibility that decreasing gc_grace_seconds increases the chance that a batched write operation may restore deleted data. The batchlog_replay_throttle property in the cassandra.yaml file gives some control over the batch replay process. The most important factors, however, are the size and scope of the batches you use.
- memtable_flush_period_in_ms: Milliseconds before memtables associated with the table are flushed. When memtable_flush_period_in_ms = 0, the memtable is flushed when:
  - the flush threshold is met
  - on shutdown
  - on nodetool flush
  - when commitlogs get full
  Default: 0
- min_index_interval: Minimum gap between index entries in the index summary. A lower min_index_interval means the index summary contains more entries from the index, which allows the database to search fewer index entries to execute a read. A larger index summary may also use more memory. The value for min_index_interval is the densest possible sampling of the index.
- max_index_interval: If the total memory usage of all index summaries reaches this value, Apache Cassandra decreases the index summaries for the coldest SSTables to the maximum set by max_index_interval. The max_index_interval is the sparsest possible sampling in relation to memory pressure.
- speculative_retry: Configures rapid read protection. Normal read requests are sent to just enough replica nodes to satisfy the consistency level. With rapid read protection, extra read requests are sent to other replicas, even after the consistency level has been met. The speculative retry property specifies the trigger for these extra read requests.
  - ALWAYS: The coordinator node sends extra read requests to all other replicas after every read of that table.
  - <X>percentile: The database tracks each table's typical read latency (in milliseconds). The coordinator node retrieves the typical latency of the table being read and calculates X percent of that figure. The coordinator sends redundant read requests if the number of milliseconds it waits without responses exceeds that calculated figure. For example, if the speculative_retry property for Table_A is set to 80percentile, and that table's typical latency is 60 milliseconds, the coordinator node handling a read of Table_A sends a normal read request first, and sends out redundant read requests if it receives no responses within 48 ms, which is 80% of 60 ms.
  - <N>ms: The coordinator node sends extra read requests to all other replicas if the coordinator node has not received any responses within N milliseconds.
  - NONE: The coordinator node does not send extra read requests after any read of that table.
- table_properties: You can modify an existing table's properties. Some properties are single options that are set to a value:
  <option_name> = <value> [ AND ... ]
  For example, speculative_retry = '10ms'. Enclose the value for a string property in single quotation marks.
  Other table properties are set using a JSON map:
  option_name = { <subproperty_name> : <value> [ , ... ] }
  See table_options for more details.
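The following combined statement is a minimal sketch that sets several of the single-value options described above in one ALTER TABLE ... WITH clause. It assumes the cycling.cyclist_races table used in the examples below; the option values are illustrative only:
ALTER TABLE cycling.cyclist_races
WITH bloom_filter_fp_chance = 0.1
AND default_time_to_live = 86400
AND gc_grace_seconds = 432000
AND memtable_flush_period_in_ms = 3600000
AND min_index_interval = 128
AND max_index_interval = 2048
AND cdc = TRUE;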
Usage notes
Restrictions:
- You can only rename clustering columns in the primary key.
- You cannot change the data type of a column.
- For a table that has a materialized view, you cannot drop a column from the table, even if the column is not used in the materialized view.
- You cannot rename or drop columns that have dependent secondary indexes.
- Do not add a column with the same name as an existing column but with a different data type. Doing so prevents commit log replays and corrupts existing SSTables with old data.
Examples
The examples in this section use tables from the cycling keyspace, including cyclist_races, race_times, cyclist_base, and comments.
Adding a column
To add a column, use the ADD instruction:
ALTER TABLE cycling.cyclist_races
ADD manager UUID;
To add a column of a collection type:
ALTER TABLE cycling.cyclist_races
ADD completed list<text>;
This operation does not validate the existing data.
Restriction: You cannot use the ADD instruction to add:
- A column with the same name as an existing column
- A static column if the table has no clustering columns
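To add multiple columns in one statement, as described under the ADD parameter above, list the column definitions inside parentheses. A minimal sketch; the column names firstname and lastname are illustrative only:
ALTER TABLE cycling.cyclist_races
ADD (firstname text, lastname text);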
Dropping a column
To remove a column from the table, use the DROP instruction:
ALTER TABLE cycling.cyclist_races
DROP manager;
DROP removes the column from the table definition.
The column becomes unavailable for queries immediately after it is dropped.
The database drops the column data during the next compaction.
Restrictions:
- If you drop a column and then re-add it, Apache Cassandra does not restore the values written before the column was dropped.
- Do not re-add a dropped column that contained timestamps generated by a client; you can re-add columns with timestamps generated by the write time facility.
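The BNF above also allows an IF EXISTS guard on DROP, which prevents an error when the column does not exist. A minimal sketch, assuming a Cassandra version that supports this form:
ALTER TABLE cycling.cyclist_races
DROP IF EXISTS manager;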
Renaming a column
To rename a column in the race_times table:
ALTER TABLE cycling.race_times
RENAME race_date TO date;
Restriction: The following restrictions apply to RENAME:
- You can only rename clustering columns, which are part of the primary key.
- You cannot rename the partition key because the partition key determines the data storage location on a node. If a different partition key name is required, the table must be recreated and the data migrated.
  There are many restrictions when using RENAME because SSTables are immutable. To change the state of the data on disk, everything must be rewritten.
- You can index a renamed column.
- You cannot rename a column if an index has been created on it.
- You cannot rename a static column.
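Per the BNF above, one RENAME instruction can rename several clustering columns by chaining AND, and accepts an IF EXISTS guard. A minimal sketch, assuming race_start and race_end are clustering columns of race_times (hypothetical names used only for illustration):
ALTER TABLE cycling.race_times
RENAME IF EXISTS race_start TO start_time AND race_end TO end_time;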
Modifying table properties
To change an existing table's properties, use ALTER TABLE and WITH.
You can specify a:
- Single property name and value.
- Property map to set the names and values, as shown in the next section on compression and compaction.
For example, to add a comment to the cyclist_base table using WITH:
ALTER TABLE cycling.cyclist_base
WITH comment = 'basic cyclist information';
Enclose a text property value in single quotation marks.
Modifying compression and compaction
Use a property map to alter the comments table's compression or compaction settings. For example, change the compression codec:
ALTER TABLE cycling.comments
WITH compression = {
  'chunk_length_in_kb' : '64',
  'class' : 'DeflateCompressor'
};
Enclose the name of each key in single quotes. If the value is a string, enclose the string in quotes as well.
If you change the compaction strategy of a table with existing data, the database rewrites all existing SSTables using the new strategy. This can take hours, which can be a major problem for a production system. For strategies to minimize this disruption, see How to change the compaction strategy on a production cluster and Impact of Changing Compaction Strategy.
Changing caching
For the comments table, set the row cache to store 10 rows per partition:
ALTER TABLE cycling.comments
WITH caching = {
'keys' : 'NONE',
'rows_per_partition' : 10
};
Changing speculative retry
Modify the cyclist_base table to use the 95th percentile for speculative retry:
ALTER TABLE cycling.cyclist_base
WITH speculative_retry = '95percentile';
Modify the cyclist_base table to use 10 milliseconds for speculative retry:
ALTER TABLE cycling.cyclist_base
WITH speculative_retry = '10ms';
Enabling and disabling background compaction
The following example sets the enabled property to false to disable background compaction:
ALTER TABLE cycling.comments
WITH COMPACTION = {
'class' : 'SizeTieredCompactionStrategy',
'enabled' : 'false'
};
Disabling background compaction can be harmful: without it, the database does not regain disk space, and could allow zombies to propagate. Although compaction uses I/O, it is better to leave it enabled in most cases.
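To re-enable background compaction later, set enabled back to true, mirroring the statement above:
ALTER TABLE cycling.comments
WITH COMPACTION = {
  'class' : 'SizeTieredCompactionStrategy',
  'enabled' : 'true'
};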
Reading extended compaction logs
Set the log_all subproperty to true to collect in-depth information about compaction activity on a node in a dedicated log file.
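For example, a minimal sketch that keeps the SizeTieredCompactionStrategy used elsewhere in this section and turns on extended compaction logging:
ALTER TABLE cycling.comments
WITH compaction = {
  'class' : 'SizeTieredCompactionStrategy',
  'log_all' : 'true'
};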
If you enable extended compaction logging for any table on any node, it is enabled for all tables on all nodes in the cluster.
When extended compaction logging is enabled, the database creates a file named compaction-%d.log (where %d is a sequential number) in home/logs.
The compaction logging service logs detailed information about the following types of compaction events:
- type: enable
  Lists SSTables that have been flushed previously.
{"type":"enable","keyspace":"test","table":"t","time":1470071098866,"strategies": [ {"strategyId":"0","type":"LeveledCompactionStrategy","tables":[],"repaired":true,"folders": ["/home/carl/oss/cassandra/bin/../data/data"]}, {"strategyId":"1","type":"LeveledCompactionStrategy","tables":[],"repaired":false,"folders": ["/home/carl/oss/cassandra/bin/../data/data"] } ] }
- type: flush
  Logs a flush event from a memtable to an SSTable on disk, including the CompactionStrategy for each table.
{"type":"flush","keyspace":"test","table":"t","time":1470083335639,"tables": [ {"strategyId":"1","table": {"generation":1,"version":"mb","size":106846362,"details": {"level":0,"min_token":"-9221834874718566760","max_token":"9221396997139245178"} } } ] }
- type: compaction
  Logs a compaction event.
{"type":"compaction","keyspace":"test","table":"t","time":1470083660267, "start":"1470083660188","end":"1470083660267","input": [ {"strategyId":"1","table": {"generation":1372,"version":"mb","size":1064979,"details": {"level":1,"min_token":"7199305267944662291","max_token":"7323434447996777057"} } } ],"output": [ {"strategyId":"1","table": {"generation":1404,"version":"mb","size":1064306,"details": {"level":2,"min_token":"7199305267944662291","max_token":"7323434447996777057"} } } ] }
- type: pending
  Lists the number of pending tasks for a compaction strategy.
{"type":"pending","keyspace":"test","table":"t", "time":1470083447967,"strategyId":"1","pending":200}
Reviewing the table definition
Use DESCRIBE or DESC to view the table definition.
DESC cycling.comments;
The table details, including the column names, are returned.
CREATE TABLE cycling.comments (
id uuid,
created_at timestamp,
comment text,
commenter text,
record_id timeuuid,
PRIMARY KEY (id, created_at)
) WITH CLUSTERING ORDER BY (created_at DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'NONE', 'rows_per_partition': '10'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'enabled': 'true', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.DeflateCompressor'}
AND crc_check_chance = 1.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND speculative_retry = '99PERCENTILE';