Retrieve rows from a table.


SELECT [ ALL | DISTINCT ] * | expression [ [ AS ] output_name ] [, ...]
  [ FROM relation ]
  [ WHERE condition ]
  [ GROUP BY expression [, ...] [HAVING condition] ]
  [ ORDER BY expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] ]
  [ LIMIT num_results ]
  [ OFFSET start ]

where relation is:

table_reference | joined_relation | table_function | sub_select


SELECT retrieves rows from a table. The general processing of SELECT is as follows:

  • The FROM item points to the table where the data should be retrieved from. If no FROM item is specified, the query is executed against sys.cluster.
  • If the WHERE clause is specified, all rows that do not satisfy the condition are eliminated from the output. (See WHERE Clause below.)
  • If the GROUP BY clause is specified, the output is combined into groups of rows that match on one or more values.
  • The actual output rows are computed using the SELECT output expressions for each selected row or row group.
  • If the ORDER BY clause is specified, the returned rows are sorted in the specified order. If ORDER BY is not given, the rows are returned in whatever order the system finds fastest to produce.
  • If DISTINCT is specified, one unique row is kept. All other duplicate rows are removed from the result set.
  • If the LIMIT or OFFSET clause is specified, the SELECT statement only returns a subset of the result rows.



The SELECT list specifies expressions that form the output rows of the SELECT statement. The expressions can (and usually do) refer to columns computed in the FROM clause.

SELECT [ ALL | DISTINCT ] * | expression [ [ AS ] output_name ] [, ...]

Just as in a table, every output column of a SELECT has a name. In a simple SELECT this name is just used to label the column for display. To specify the name to use for an output column, write AS output_name after the column’s expression. (You can omit AS, but only if the desired output name does not match any reserved keyword. For protection against possible future keyword additions, it is recommended that you always either write AS or double-quote the output name.) If you do not specify a column name, a name is chosen automatically by Crate. If the column’s expression is a simple column reference then the chosen name is the same as that column’s name. In more complex cases a function or type name may be used, or the system may fall back on a generated name.

An output column’s name can be used to refer to the column’s value in ORDER BY and GROUP BY clauses, but not in the WHERE clause; there you must write out the expression instead.

Instead of an expression, * can be written in the output list as a shorthand for all the columns of the selected rows. Also, you can write table_name.* as a shorthand for the columns coming from just that table. In these cases it is not possible to specify new names with AS; the output column names will be the same as the table columns’ names.

FROM Clause

The FROM clause specifies the source relation for the SELECT:

FROM relation

The relation can be either a table reference or a joined relation.

Table Reference

A table_reference is a table ident with an optional table alias:

table_ident [ [AS] table_alias ]
table_ident:The name (optionally schema-qualified) of an existing table.
table_alias:A substitute name for the FROM item containing the alias. An alias is used for brevity. When an alias is provided, it completely hides the actual name of the table. For example given FROM foo AS f, the remainder of the SELECT must refer to this ‘FROM’ item as ‘f’ not ‘foo’.

Joined Relation

A joined_relation is a relation which joins two relations together. See Joins

relation { , | join_type JOIN } relation [ ON join_condition ]
join_condition:An expression which specifies which rows in a join are considered a match. The join_condition is not applicable for joins of type CROSS and must have a returning value of type boolean.

Table Function

table_function is a function that produces a set of rows and has columns.


The call declaration of the function. Usually in the form of `` function_name ( [ args ] )``.

Depending on the function the parenthesis and arguments are either optional or required.

Available functions are documented in the table functions section.

Sub Select

A sub_select can appear in the FROM clause. This acts like a temporary table for the outer select that wraps it. The sub_select must be surrounded by parentheses and an alias must be provided for it. Multiple levels of nested sub selects is also supported.

( select_stmt ) [ AS ] alias
select_stmt:A SELECT statement.
alias:An alias for the sub select.


Only sub selects that can be rewritten to a single select are supported. If the query cannot be rewritten by collapsing all nested sub selects then an error is thrown.

WHERE Clause

The optional WHERE clause defines the condition to be met for a row to be returned:

WHERE condition
condition:a where condition is any expression that evaluates to a result of type boolean. Any row that does not satisfy this condition will be eliminated from the output. A row satisfies the condition if it returns true when the actual row values are substituted for any variable references.


The optional GROUP BY clause will condense into a single row all selected rows that share the same values for the grouped expressions.

Aggregate expressions, if any are used, are computed across all rows making up each group, producing a separate value for each group.

GROUP BY expression [, ...] [HAVING condition]
expression:An arbitrary expression formed from column references of the queried relation that are also present in the result column list. Numeric literals are interpreted as ordinals referencing an output column from the select list. It can also reference output columns by name. In case of ambiguity, a GROUP BY name will be interpreted as a name of a column from the queried relation rather than an output column name.


The optional HAVING clause defines the condition to be met for values whitin a resulting row of a group by clause.

condition:a having condition is any expression that evaluates to a result of type boolean. Every row for which the condition is not satisfied will be eliminated from the output.


When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would otherwise be more than one possible value to return for an ungrouped column.


Grouping can only be applied on indexed fields. For more information, please refer to Disable indexing.


The ORDER BY clause causes the result rows to be sorted according to the specified expression(s).

ORDER BY expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...]
expression:can be the name or ordinal number of an output column, or it can be an arbitrary expression formed from input-column values.

The optional keyword ASC (ascending) or DESC (descending) after any expression allows to define the direction in which values have are sorted. The default is ascending.

If NULLS FIRST is specified, null values sort before non null values. If NULLS LAST is specified null values sort after non null values. If neither is specified nulls are considered larger than any value. That means the default for ASC is NULLS LAST and the default for DESC is NULLS FIRST.


If two rows are equal according to the leftmost expression, they are compared according to the next expression and so on. If they are equal according to all specified expressions, they are returned in an implementation-dependent order.


Sorting can only be applied on indexed fields. For more information, please refer to Disable indexing.

Character-string data is sorted by its UTF-8 representation.


Sorting on geo_point, geo_shape, array, and object is not supported.

LIMIT Clause

The optional LIMIT Clause allows to limit the number or retured result rows:

LIMIT num_results

specifies the maximum number of result rows to return.

If a client using the HTTP or Transport protocol is used a default limit of 10000 is implicitly added.


It is possible for repeated executions of the same LIMIT query to return different subsets of the rows of a table, if there is not an ORDER BY to enforce selection of a deterministic subset.


The optional OFFSET Clause allows to skip result rows at the beginning:

OFFSET start
start:specifies the number of rows to skip before starting to return rows.