Glossary

This glossary defines key terms used in the CrateDB reference manual.

Terms

B

Binary operator

See operator.

C

CLUSTERED BY column

See routing column.

F

Function

A token (e.g., replace) that takes zero or more arguments (e.g., three strings), performs a specific task, and may return one or more values (e.g., a modified string). Functions that return more than one value are called multi-valued functions.

Functions may be called in an SQL statement, like so:

cr> SELECT replace('Hello world!', 'world', 'friend') as result;
+---------------+
| result        |
+---------------+
| Hello friend! |
+---------------+
SELECT 1 row in set (... sec)

M

Metadata gateway

Persists cluster metadata to disk every time the metadata changes. The persisted data survives full cluster restarts and is recovered when the nodes start again.

Multi-valued function

A function that returns two or more values.
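
As a rough illustration, and assuming that table functions such as unnest and generate_series count as multi-valued functions in the sense used here, the following queries each call a function that produces several values, one per row:

```sql
-- unnest expands an array into one row per element.
SELECT * FROM unnest([1, 2, 3]);

-- generate_series produces every integer from the start value to the stop value.
SELECT * FROM generate_series(1, 3);
```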

N

Nonscalar

A data type that can have more than one value (e.g., arrays and objects).

Contrast with a scalar.
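
The difference can be sketched with literals. The column aliases below are illustrative only:

```sql
-- Scalar values: each expression holds a single value.
SELECT 42 AS a_number,
       'hello' AS a_string;

-- Nonscalar values: an array holds several elements,
-- and an object holds several key-value pairs.
SELECT [1, 2, 3] AS an_array,
       {name = 'Alice', age = 30} AS an_object;
```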

O

Operand

See operator.

Operation

See operator.

Operator

A reserved keyword (e.g., IN) or sequence of symbols (e.g., >=) that can be used in an SQL statement to manipulate one or more expressions and return a result (e.g., true or false). This process is known as an operation and the expressions can be called operands or arguments.

An operator that takes one operand is known as a unary operator and an operator that takes two is known as a binary operator.
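
A minimal sketch of the terms above, using only literal operands so that no table is needed:

```sql
-- A symbol operator (>=) and a keyword operator (IN).
-- In "3 >= 2", the operands are 3 and 2, and >= is a binary operator.
SELECT 3 >= 2 AS comparison_result,
       2 IN (1, 2, 3) AS in_result;

-- Unary operators take a single operand.
SELECT NOT true AS logical_negation,
       -(2 + 3) AS arithmetic_negation;
```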

P

Partition column

A column used to partition a table. Specified by the PARTITIONED BY clause.

Also known as a PARTITIONED BY column or partitioned column.

A table may be partitioned by one or more columns:

  • If a table is partitioned by one column, a new partition is created for every unique value in that partition column

  • If a table is partitioned by multiple columns, a new partition is created for every unique combination of row values in those partition columns
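
The two cases above might look like this in practice. The table and column names are hypothetical and chosen only for illustration:

```sql
-- Partitioned by one column:
-- a new partition is created for every distinct value of "region".
CREATE TABLE metrics_by_region (
    region TEXT,
    reading DOUBLE PRECISION
) PARTITIONED BY (region);

-- Partitioned by two columns:
-- a new partition is created for every distinct
-- (region, reading_year) combination.
CREATE TABLE metrics_by_region_and_year (
    region TEXT,
    reading_year INTEGER,
    reading DOUBLE PRECISION
) PARTITIONED BY (region, reading_year);
```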

PARTITIONED BY column

See partition column.

Partitioned column

See partition column.

R

Regular expression

An expression used to search for patterns in a string.
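
For example, a pattern can be used to test whether a string matches or to rewrite the matching part. This sketch assumes the ~ match operator and the regexp_replace function are available:

```sql
-- '^foo' matches strings that start with "foo";
-- 'o+' matches one or more consecutive "o" characters.
SELECT 'foobar' ~ '^foo' AS starts_with_foo,
       regexp_replace('foobar', 'o+', '0') AS replaced;
```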

Routing column

Values in this column are used to compute a hash, which is then used to route the corresponding row to a specific shard.

Also known as the CLUSTERED BY column.

All rows that have the same value in the routing column are stored in the same shard.

Note

The routing of rows to a specific shard is not the same as the routing of shards to a specific node (also known as shard allocation).
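
A routing column is declared when the table is created. In the following sketch, the table name, column names, and shard count are hypothetical:

```sql
-- Rows are routed to one of four shards based on a hash of user_id,
-- so all rows for the same user are stored in the same shard.
CREATE TABLE user_events (
    user_id TEXT,
    event TEXT
) CLUSTERED BY (user_id) INTO 4 SHARDS;
```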

S

Scalar

A data type with a single value (e.g., numbers and strings).

Contrast with a nonscalar.

See also

Primitive types

Shard allocation

The process by which CrateDB allocates shards to specific nodes.

Note

Shard allocation is sometimes referred to as shard routing, which is not to be confused with row routing.
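
One way to inspect the current allocation is to query the sys.shards system table. This is a sketch that assumes the columns named below are present in your CrateDB version:

```sql
-- List each shard together with its state and the node it is allocated to.
SELECT table_name,
       id,
       state,
       routing_state,
       node['name'] AS node_name
FROM sys.shards
ORDER BY table_name, id;
```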

Shard recovery

The process by which CrateDB synchronizes a replica shard from a primary shard.

Shard recovery can happen during node startup, after node failure, when replicating a primary shard, when moving a shard to another node (i.e., when rebalancing the cluster), or during snapshot restoration.

A shard that is being recovered cannot be queried until the recovery process is complete.

Shard routing

See shard allocation.

Statement

Any valid SQL that serves as a database instruction (e.g., CREATE TABLE, INSERT, and SELECT) rather than producing a value.

Contrast with an expression.

Subquery

A SELECT statement used as a relation in the FROM clause of a parent SELECT statement.

Also known as a subselect.
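
A minimal sketch, assuming a hypothetical metrics table with region and reading columns:

```sql
-- The inner SELECT is the subquery. The parent SELECT treats its
-- result as a relation named "t".
SELECT t.region, t.total
FROM (
    SELECT region, sum(reading) AS total
    FROM metrics
    GROUP BY region
) AS t
WHERE t.total > 100;
```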

Subselect

See subquery.

U

Unary operator

See operator.

Uncorrelated subquery

A scalar subquery that does not reference any relations (e.g., tables) in the parent SELECT statement.
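
For illustration, the following query uses hypothetical employees and departments tables. The inner SELECT returns a single value and refers only to departments, never to the employees relation of the parent statement, so it is uncorrelated:

```sql
SELECT name
FROM employees
WHERE department_id = (SELECT max(id) FROM departments);
```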

V

Value expression

See expression.