Node Specific Settings

Basics

cluster.name
Default: crate
Runtime: no

The name of the CrateDB cluster the node should join.

node.name
Runtime: no

The name of the node. If no name is configured, a random one will be generated.

Note

Node names must be unique in a CrateDB cluster.

node.max_local_storage_nodes
Default: 1
Runtime: no

Defines how many nodes may be started on the same machine using the same data path, as configured via path.data.
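
As a rough illustration of the basic settings above, a crate.yml fragment might look like the following. The cluster and node names are made up for the example.

```yaml
# crate.yml (sketch; the names below are hypothetical)
cluster.name: my_cluster          # default would be "crate"
node.name: node-01                # must be unique within the cluster
node.max_local_storage_nodes: 1   # the default
```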

Node Types

CrateDB supports different kinds of nodes.

The following settings can be used to differentiate nodes upon startup:

node.master
Default: true
Runtime: no

Whether or not this node is able to get elected as master node in the cluster.

node.data
Default: true
Runtime: no

Whether or not this node will store data.

Using different combinations of these two settings, you can create four different types of node. Each type of node is differentiated by the types of load it will handle.

The four types of node possible are:

        | Master                               | No Master
Data    | Can handle all loads.                | Handles request handling and query execution loads.
No Data | Can handle cluster management loads. | Handles request handling loads.

Nodes marked as node.master will only handle cluster management loads if they are elected as the cluster master. All other loads are shared equally.
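
To make the combinations concrete, here is a sketch of a crate.yml fragment for a dedicated master-eligible node, with the other three combinations listed as comments for comparison.

```yaml
# crate.yml for a master-eligible node that stores no data (sketch)
node.master: true
node.data: false

# Other combinations, for comparison:
#   node.master: true,  node.data: true   -> can handle all loads (the defaults)
#   node.master: false, node.data: true   -> request handling and query execution
#   node.master: false, node.data: false  -> request handling only
```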

Read-only node

node.sql.read_only
Default: false
Runtime: no

If set to true, the node will only allow SQL statements that result in read operations.

Hosts

network.host
Default: _local_
Runtime: no

The IP address CrateDB will bind itself to. This setting sets both the network.bind_host and network.publish_host values.

network.bind_host
Default: _local_
Runtime: no

This setting determines the address to which CrateDB should bind itself.

network.publish_host
Default: _local_
Runtime: no

This setting is used by a CrateDB node to publish its own address to the rest of the cluster.

Note

Apart from IPv4 and IPv6 addresses, there are some special values that can be used for all of the above settings:

_local_               Any loopback addresses on the system, for example 127.0.0.1.
_site_                Any site-local addresses on the system, for example 192.168.0.1.
_global_              Any globally-scoped addresses on the system, for example 8.8.8.8.
_[networkInterface]_  Addresses of a network interface, for example _en0_.
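
A minimal sketch of how the host settings and special values might be combined in crate.yml. The commented-out alternative assumes an interface named en0 that carries the example address 192.168.0.1; both are illustrative only.

```yaml
# crate.yml (sketch)
# Bind to and publish a site-local address in one step:
network.host: _site_

# Alternatively, set the two values separately (use these instead of network.host).
# Assumes a hypothetical interface en0 holding the address 192.168.0.1.
# network.bind_host: _en0_
# network.publish_host: 192.168.0.1
```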

Ports

http.port
Runtime: no

This defines the TCP port range to which the CrateDB HTTP service will be bound. It defaults to 4200-4300. The first free port in this range is always used. If this is set to an integer value, it is treated as an explicit single port.

The HTTP protocol is used for the REST endpoint, which is used by all clients except the Java client.

http.publish_port
Runtime: no

The port HTTP clients should use to communicate with the node. It is necessary to define this setting if the bound HTTP port (http.port) of the node is not directly reachable from outside, e.g. when running it behind a firewall or inside a Docker container.

transport.tcp.port
Runtime: no

This defines the TCP port range to which the CrateDB transport service will be bound. It defaults to 4300-4400. The first free port in this range is always used. If this is set to an integer value, it is treated as an explicit single port.

The transport protocol is used for internal node-to-node communication.

transport.publish_port
Runtime: no

The port that the node publishes to the cluster for its own discovery. It is necessary to define this setting when the bound transport port (transport.tcp.port) of the node is not directly reachable from outside, e.g. when running it behind a firewall or inside a Docker container.

psql.port
Runtime: no

This defines the TCP port range to which the CrateDB Postgres service will be bound. It defaults to 5432-5532. The first free port in this range is always used. If this is set to an integer value, it is treated as an explicit single port.
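
The port settings above could be combined as in the following sketch. The publish ports are hypothetical host-side mappings (for instance, ports exposed by a container runtime) and are not defaults.

```yaml
# crate.yml (sketch; the publish ports are hypothetical)
http.port: 4200                # explicit single port instead of the 4200-4300 range
transport.tcp.port: 4300-4400  # range; the first free port is used (the default)
psql.port: 5432                # explicit single port instead of the 5432-5532 range

# Only needed when the bound ports are not directly reachable from outside:
http.publish_port: 14200
transport.publish_port: 14300
```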

Paths

path.conf
Runtime: no

Filesystem path to the directory containing the configuration files crate.yml and log4j2.properties.

path.data
Runtime: no

Filesystem path to the directory where this CrateDB node stores its data (table data and cluster metadata).

Multiple paths can be set by using a comma-separated list. Each of these paths will hold full shards (instead of striping data across them). If CrateDB finds striped shards at the provided locations (from CrateDB <0.55.0), these shards will be migrated automatically on startup.
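
For instance, two data paths could be given as a comma-separated list. The mount points in this sketch are hypothetical.

```yaml
# crate.yml (sketch; mount points are hypothetical)
path.data: /mnt/disk1/crate,/mnt/disk2/crate
```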

path.logs
Runtime: no

Filesystem path to a directory where log files should be stored.

Can be used as a variable inside log4j2.properties.

For example:

appender:
  file:
    file: ${path.logs}/${cluster.name}.log

path.repo
Runtime: no

A list of filesystem or UNC paths where repositories of type fs may be stored.

Without this setting a CrateDB user could write snapshot files to any directory that is writable by the CrateDB process. To safeguard against this security issue, the possible paths have to be whitelisted here.

See also the location setting of repository type fs.
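
A sketch of whitelisting two repository locations. The directories are hypothetical, and writing the value as a YAML list is an assumption about how the list is expressed in crate.yml.

```yaml
# crate.yml (sketch; directories are hypothetical)
path.repo: ["/mnt/backups", "/mnt/archive/backups"]
```

With a configuration like this, a repository of type fs would only be accepted if its location lies under one of the listed directories.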

Plugins

plugin.mandatory
Runtime: no

A list of plugins that are required for a node to start up.

If any plugin listed here is missing, the CrateDB node will fail to start.

Memory

bootstrap.memory_lock
Default: false
Runtime: no

CrateDB performs poorly when the JVM starts swapping: you should ensure that it never swaps. If set to true, CrateDB will use the mlockall system call on startup to ensure that the memory pages of the CrateDB process are locked into RAM.
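
Enabling memory locking is a one-line change, sketched below. Whether mlockall actually succeeds usually also depends on operating system limits for the CrateDB process (for example, the permitted amount of locked memory); that is an assumption about the deployment environment, not something this setting controls.

```yaml
# crate.yml (sketch)
bootstrap.memory_lock: true   # default is false
```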

Garbage Collection

CrateDB logs if JVM garbage collection on different memory pools takes too long. The following settings can be used to adjust these timeouts:

monitor.jvm.gc.collector.young.warn
Default: 1000ms
Runtime: no

CrateDB will log a warning message if it takes more than the configured timespan to collect the Eden Space (heap).

monitor.jvm.gc.collector.young.info
Default: 700ms
Runtime: no

CrateDB will log an info message if it takes more than the configured timespan to collect the Eden Space (heap).

monitor.jvm.gc.collector.young.debug
Default: 400ms
Runtime: no

CrateDB will log a debug message if it takes more than the configured timespan to collect the Eden Space (heap).

monitor.jvm.gc.collector.old.warn
Default: 10000ms
Runtime: no

CrateDB will log a warning message if it takes more than the configured timespan to collect the Old Gen / Tenured Gen (heap).

monitor.jvm.gc.collector.old.info
Default: 5000ms
Runtime: no

CrateDB will log an info message if it takes more than the configured timespan to collect the Old Gen / Tenured Gen (heap).

monitor.jvm.gc.collector.old.debug
Default: 2000ms
Runtime: no

CrateDB will log a debug message if it takes more than the configured timespan to collect the Old Gen / Tenured Gen (heap).
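
As an example of adjusting these thresholds, the fragment below raises some of them above their defaults. The values are purely illustrative and not recommendations.

```yaml
# crate.yml (sketch; thresholds are illustrative only)
monitor.jvm.gc.collector.young.warn: 2000ms    # default 1000ms
monitor.jvm.gc.collector.young.info: 1000ms    # default 700ms
monitor.jvm.gc.collector.young.debug: 500ms    # default 400ms
monitor.jvm.gc.collector.old.warn: 15000ms     # default 10000ms
```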

Elasticsearch HTTP REST API

es.api.enabled
Default: false
Runtime: no

Enable or disable the Elasticsearch HTTP REST API.

Warning

Manipulating your data via the Elasticsearch API and not via SQL might result in inconsistent data. You have been warned!

Cross-Origin Resource Sharing (CORS)

Many browsers enforce the same-origin policy, which requires web applications to explicitly allow requests across origins. The cross-origin resource sharing settings in CrateDB allow you to configure this.

http.cors.enabled
Default: false
Runtime: no

Enable or disable cross-origin resource sharing.

http.cors.allow-origin
Default: <empty>
Runtime: no

Defines the allowed origins of a request. * allows any origin, which can be a substantial security risk. A value that starts with / is treated as a regular expression; for example, /https?:\/\/crate.io/ allows requests from http://crate.io and https://crate.io. By default, this setting disallows all origins.

http.cors.max-age
Default: 1728000 (20 days)
Runtime: no

Max cache age of a preflight request in seconds.

http.cors.allow-methods
Default: OPTIONS, HEAD, GET, POST, PUT, DELETE
Runtime: no

Allowed HTTP methods.

http.cors.allow-headers
Default: X-Requested-With, Content-Type, Content-Length
Runtime: no

Allowed HTTP headers.

http.cors.allow-credentials
Default: false
Runtime: no

Add the Access-Control-Allow-Credentials header to responses.
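
Putting the CORS settings together, a restrictive configuration might look like the sketch below. It reuses the regular expression from the allow-origin description; the reduced method list and the one-day max age are illustrative choices, not defaults.

```yaml
# crate.yml (sketch)
http.cors.enabled: true
# Quoted so that YAML keeps the slashes and backslashes literally.
http.cors.allow-origin: '/https?:\/\/crate.io/'
http.cors.allow-methods: OPTIONS, HEAD, GET, POST
http.cors.max-age: 86400            # one day, in seconds (default is 1728000)
http.cors.allow-credentials: false  # the default
```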

Blobs

blobs.path
Runtime: no

Filesystem path to the directory where blob data allocated for this node is stored.

By default blobs will be stored under the same path as normal data. A relative path value is interpreted as relative to CRATE_HOME.
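
A short sketch of both forms; the directory names are hypothetical.

```yaml
# crate.yml (sketch; directory names are hypothetical)
blobs.path: /mnt/blobs   # absolute path
# blobs.path: blobs      # relative path, interpreted relative to CRATE_HOME
```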

Repositories

Repositories are used to back up a CrateDB cluster.

repositories.url.allowed_urls
Runtime: no

This setting only applies to repositories of type url.

With this setting, you can specify a list of URLs that are allowed to be used when a repository of type url is created.

Wildcards are supported in the host, path, query and fragment parts.

This setting is a security measure to prevent access to arbitrary resources.

In addition, the supported protocols can be restricted using the repositories.url.supported_protocols setting.

repositories.url.supported_protocols
Default: http, https, ftp, file and jar
Runtime: no

A list of protocols that are supported by repositories of type url.

The jar protocol is used to access the contents of jar files. For more info, see the Java JarURLConnection documentation.

See also the path.repo setting.
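
The two url repository settings might be combined as follows. The host name is a placeholder, and writing the values as YAML lists is an assumption about how lists are expressed in crate.yml.

```yaml
# crate.yml (sketch; backups.example.org is a placeholder host)
repositories.url.allowed_urls: ["https://backups.example.org/crate/*"]
repositories.url.supported_protocols: ["https"]
```

Here the wildcard sits in the path part of the URL, and restricting the protocols to https narrows the default list of http, https, ftp, file and jar.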

Queries

indices.query.bool.max_clause_count
Default: 8192
Runtime: no

This setting defines the maximum number of elements an array can have so that the != ANY(), LIKE ANY() and the NOT LIKE ANY() operators can be applied on it.

Note

Increasing this value to a large number (e.g. 10M) and applying those ANY operators on arrays of that length can lead to heavy memory consumption, which could cause nodes to crash with OutOfMemory exceptions.

Custom Attributes

The node.attr namespace is a bag of custom attributes.

You can create any attribute you want under this namespace, like node.attr.key: value. These attributes use the node.attr namespace to distinguish them from core node attributes like node.name.

Custom attributes are not validated by CrateDB, unlike core node attributes.

Custom attributes can, however, be used to control shard allocation.
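
For example, a node could be tagged with its physical location. The attribute names and values in this sketch are arbitrary; CrateDB does not validate them.

```yaml
# crate.yml (sketch; attribute names and values are made up)
node.attr.rack: rack-1
node.attr.zone: zone-a
```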