CREATE TABLE

Define a new table.

Synopsis

CREATE TABLE [ IF NOT EXISTS ] table_ident ( [
    {
        base_column_definition
      | generated_column_definition
      | table_constraint
    }
    [, ... ] ]
)
[ PARTITIONED BY (column_name [, ...] ) ]
[ CLUSTERED [ BY (routing_column) ] INTO num_shards SHARDS ]
[ WITH ( table_parameter [= value] [, ... ] ) ]

where base_column_definition:

column_name data_type [ column_constraint [ ... ] ]

where generated_column_definition is:

column_name [ data_type GENERATED ALWAYS ]
AS [ ( ] generation_expression [ ) ]
[ column_constraint [ ... ] ]

where column_constraint is:

{ PRIMARY KEY |
  NOT NULL |
  INDEX { OFF | USING { PLAIN |
                        FULLTEXT [ WITH ( analyzer = analyzer_name ) ]  }
}

and table_constraint is:

{ PRIMARY KEY ( column_name [, ... ] ) |
  INDEX index_name USING FULLTEXT ( column_name [, ... ] )
       [ WITH ( analyzer = analyzer_name ) ]
}

Description

CREATE TABLE will create a new, initially empty table.

If the table_ident does not contain a schema, the table is created in the doc schema. Otherwise it is created in the given schema, which is implicitly created, if it didn’t exist yet.

A table consists of one or more base columns and any number of generated columns and/or table_constraints.

The optional constraint clauses specify constraints (tests) that new or updated rows must satisfy for an insert or update operation to succeed. A constraint is an SQL object that helps define the set of valid values in the table in various ways.

There are two ways to define constraints: table constraints and column constraints. A column constraint is defined as part of a column definition. A table constraint definition is not tied to a particular column, and it can encompass more than one column. Every column constraint can also be written as a table constraint; a column constraint is only a notational convenience for use when the constraint only affects one column.

Table Elements

Base Columns

A base column is a persistent column in the table metadata. In relational terms it is an attribute of the tuple of the table-relation. It has a name, a type and optional constraints. Base columns are readable and writable (if the table itself is writable). Values for base columns are given in DML statements explicitly or omitted, in which case their value is null.

Generated Columns

A generated column is a persistent column that is generated by applying a generation_expression which is evaluated in the context of the current row. So the value of a generated column is usually generated by applying a function or transformation to the values of other base columns in the same row. It is not possible to reference other generated columns from within the generation_expression.

Generated columns are read-only.

A generated column is not virtual. It is stored in the table like a base column is. The origin of its value is what makes it different.

Table Constraints

Table constraints are constraints that are applied to more than one column or to the table as a whole.

For further details see TABLE CONSTRAINTS.

Column Constraints

Column constraints are constraints that are applied on each column of the table separately. The supported column constraints are:

For further details see COLUMN CONSTRAINTS.

Parameters

table_ident:The name (optionally schema-qualified) of the table to be created.
column_name:The name of a column to be created in the new table.
data_type:The data type of the column. This can include array and object specifiers. For more information on the data types supported by Crate see .
generation_expression:An expression (usually a function call) that is applied in the context of the current row. As such it can reference other base columns of the table. Referencing other generated columns (including itself) is not supported. The generation expression is evaluated each time a row is inserted or the referenced base columns are updated.

IF NOT EXISTS Clause

If the optional IF NOT EXISTS clause is used this statement won’t do anything if the table exists already.

CLUSTERED Clause

The optional CLUSTERED clause specifies how a table should be distributed accross a cluster.

num_shards:specifies the number of shards a table is stored in. Must be greater than 0. If not provided the number of shards is calculated based on the number of currently active data nodes with the following formula:
num_shards = max(4, num_data_nodes * 2)

Note

The minimum value of num_shards is set to 4. This means if the calculation of num_shards does not exceeds its minimum it applies the minimum value to each table or partition as default.

routing_column:allows to explicitly specify a column or field on which basis rows are sharded. All rows having the same value in routing_column are stored in the same shard. The default is the primary key if specified, otherwise the internal _id column.

PARTITIONED BY Clause

The PARTITIONED clause splits the created table into separate partitions for every distinct combination of values in the listed columns.

[ PARTITIONED BY ( column_name [ , ... ] ) ]
column_name:a column from the table definition this table gets partitioned by.

Several restrictions apply to columns that can be used here:

Note

Columns referenced in the PARTITIONED clause cannot be altered by an UPDATE statement.

WITH Clause

The optional WITH clause can specify parameters for tables.

[ WITH ( table_parameter [= value] [, ... ] ) ]
table_parameter:specifies an optional parameter for the table.

Available parameters are:

number_of_replicas

Specifies the number or range of replicas each shard of a table should have for normal operation, the default is to have 1 replica.

The number of replicas is defined like this:

min_replicas [ - [ max_replicas ] ]
min_replicas:The minimum number of replicas required.
max_replicas:The maximum number of replicas. The actual maximum number of replicas is max(num_replicas, N-1), where N is the number of data nodes in the cluster. If max_replicas is the string all then it will always be N.

For further details and examples see Replication.

refresh_interval

Specifies the refresh interval of a shard in milliseconds. The default is set to 1000 milliseconds.

value:The refresh interval in milliseconds. A value of smaller or equal than 0 turns off the automatic refresh. A value of greater than 0 schedules a periodic refresh of the table.

Note

a refresh_interval of 0 does not guarantee that new writes are NOT visible to subsequent reads. Only the periodic refresh is disabled. There are other internal factors that might trigger a refresh.

For further details see Refresh or REFRESH.

blocks.read_only

Allows to have a read only table.

value:Table is read only if value set to true. Allows writes and table settings changes if set to false.

blocks.read

disable/enable all the read operations

value:Set to true to disable all read operations for a table, otherwise set false.

blocks.write

disable/enable all the write operations

value:Set to true to disable all write operations and table settings modifications, otherwise set false.

blocks.metadata

disable/enable the table settings modifications.

values:Disables the table settings modifications if set to true, if set to false — table settings modifications are enabled.

translog.flush_threshold_ops

Sets the number of operations before flushing.

value:Number of operations prior to flushing.

translog.flush_threshold_size

Sets size of transaction log prior to flushing.

value:Size (bytes) of translog.

translog.flush_threshold_period

Sets period of no flushing after which force flush occurs.

value:Period length in milliseconds.

translog.disable_flush

enable/disable flushing.

value:set true to disable flushing, otherwise set to false.

Note

It is recommended to use disable_flush only for short periods of time.

translog.interval

Sets frequency of flush necessity check.

value:frequency in milliseconds.

translog.sync_interval

How often the translog is fsynced to disk. Defaults to 5s

value:interval in milliseconds.

allocation.total_shards_per_node

Controls the total number of shards (replicas and primaries) allowed to be allocated on a single node. Defaults to unbounded (-1).

value:number of shards per node.

allocation.enable

Controls shard allocation for a specific table. Can be set to:

all:Allows shard allocation for all shards. (Default)
primaries:Allows shard allocation only for primary shards.
new_primaries:Allows shard allocation only for primary shards for new tables.
none:No shard allocation allowed.

recovery.initial_shards

When using local gateway a particular shard is recovered only if there can be allocated quorum of its copies in the cluster. It can be set to:

value:quorum (default), quorum-1, full, full-1 and number values (e.g. 1, 2 ...)

warming.enable

disable/enable table warming. Table warming allows to run registered queries to warm up the table before it is available. Enabled by default.

value:true to enable warming up, otherwise false

column_policy

Specifies the column policy of the table. The default column policy is dynamic.

The column policy is defined like this:

WITH ( column_policy = {'dynamic' | 'strict'} )
strict:Rejecting any column on insert, update or copy from which is not defined in the schema
dynamic:New columns can be added using insert, update or copy from. New columns added to dynamic tables are, once added, usable as usual columns. One can retrieve them, sort by them and use them in where clauses.

For futher details and examples see Column Policy or Configuration.