import-jobs and export-jobs

The clusters import-jobs and clusters export-jobs commands let you manage the import and export jobs, respectively, in your CrateDB Cloud cluster.

Tip

Import jobs are the easiest way to get data into CrateDB Cloud. Use them to import data from a local file, from an arbitrary URL, or from an Amazon S3-compatible service.

Most JSON, CSV and Parquet files are supported.

clusters import-jobs

Usage: croud clusters import-jobs [-h] {delete,list,create,progress} ...

clusters import-jobs create

Usage: croud clusters import-jobs create [-h]
                                         {from-url,from-file,from-s3,from-azure-blob-storage}
                                         ...

clusters import-jobs create from-url

Usage: croud clusters import-jobs create from-url [-h] --url URL --cluster-id
                                                  CLUSTER_ID --file-format
                                                  {csv,json,parquet}
                                                  [--compression {gzip}]
                                                  --table TABLE
                                                  [--create-table CREATE_TABLE]
                                                  [--transformations TRANSFORMATIONS]
                                                  [--region REGION]
                                                  [--output-fmt {table,wide,json,yaml}]
                                                  [--sudo]

Required Arguments

--url

The URL the import file will be read from.

--cluster-id

The cluster the data will be imported into.

--file-format

Possible choices: csv, json, parquet

The format of the structured data in the file.

--table

The table the data will be imported into.

Optional Arguments

--compression

Possible choices: gzip

The compression method the file uses.

--create-table

Whether the table should be created automatically if it does not exist. If true, new columns will also be added when the data requires them.

--transformations

The transformations to apply when fetching data. This is the SELECT statement of an SQL query that is executed on the internal DuckDB database that the data is loaded into before it is inserted into CrateDB. It can be used to apply arbitrary SQL functions to your data before inserting it into CrateDB, e.g. UNNEST(), SUM(), and similar (see the second example below).

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$ croud clusters import-jobs create from-url --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
    --file-format csv --table my_table_name --url https://s3.amazonaws.com/my.import.data.gz --compression gzip
+--------------------------------------+--------------------------------------+------------+
| id                                   | cluster_id                           | status     |
|--------------------------------------+--------------------------------------+------------|
| dca4986d-f7c8-4121-af81-863cca1dab0f | e1e38d92-a650-48f1-8a70-8133f2d5c400 | REGISTERED |
+--------------------------------------+--------------------------------------+------------+
==> Info: Status: REGISTERED (Your import job was received and is pending processing.)
==> Info: Done importing 3 records and 36 Bytes.
==> Success: Operation completed.
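
For illustration, the same command can combine --create-table and --transformations. The URL, table, and column names (sensor_id, readings) below are placeholders, and the SELECT statement merely shows the kind of expression the option accepts; adapt it to your own data.

sh$ croud clusters import-jobs create from-url --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
    --file-format parquet --table my_table_name --create-table true \
    --url https://example.com/my.import.data.parquet \
    --transformations 'SELECT sensor_id, UNNEST(readings) AS reading'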

clusters import-jobs create from-file

Usage: croud clusters import-jobs create from-file [-h] [--file-id FILE_ID]
                                                   [--file-path FILE_PATH]
                                                   --cluster-id CLUSTER_ID
                                                   --file-format
                                                   {csv,json,parquet}
                                                   [--compression {gzip}]
                                                   --table TABLE
                                                   [--create-table CREATE_TABLE]
                                                   [--transformations TRANSFORMATIONS]
                                                   [--region REGION]
                                                   [--output-fmt {table,wide,json,yaml}]
                                                   [--sudo]

Required Arguments

--cluster-id

The cluster the data will be imported into.

--file-format

Possible choices: csv, json, parquet

The format of the structured data in the file.

--table

The table the data will be imported into.

Optional Arguments

--file-id

The file ID that will be used for the import. If not specified, then --file-path must be specified. Refer to croud organizations files for more information.

--file-path

The file in your local filesystem that will be used. If not specified, then --file-id must be specified. Note that the file will become visible under croud organizations files list.

--compression

Possible choices: gzip

The compression method the file uses.

--create-table

Whether the table should be created automatically if it does not exist. If true, new columns will also be added when the data requires them.

--transformations

The transformations to apply when fetching data. This is the SELECT statement of an SQL query that is executed on the internal DuckDB database that the data is loaded into before it is inserted into CrateDB. It can be used to apply arbitrary SQL functions to your data before inserting it into CrateDB, e.g. UNNEST(), SUM(), and similar.

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$ croud clusters import-jobs create from-file --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
    --file-format csv --table my_table_name --file-id 2e71e5a6-a21a-4e99-ae58-705a1f15635c
+--------------------------------------+--------------------------------------+------------+
| id                                   | cluster_id                           | status     |
|--------------------------------------+--------------------------------------+------------|
| 9164f886-ae37-4a1b-b3fe-53f9e1897e7d | e1e38d92-a650-48f1-8a70-8133f2d5c400 | REGISTERED |
+--------------------------------------+--------------------------------------+------------+
==> Info: Status: REGISTERED (Your import job was received and is pending processing.)
==> Info: Done importing 3 records and 36 Bytes.
==> Success: Operation completed.
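
Alternatively, a file that has not been uploaded yet can be imported directly from the local filesystem with --file-path instead of --file-id. This is a sketch with a placeholder path; the uploaded file will afterwards appear under croud organizations files list.

sh$ croud clusters import-jobs create from-file --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
    --file-format json --table my_table_name --file-path /tmp/my_data.json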

clusters import-jobs create from-s3

Usage: croud clusters import-jobs create from-s3 [-h] --bucket BUCKET
                                                 --file-path FILE_PATH
                                                 --secret-id SECRET_ID
                                                 [--endpoint ENDPOINT]
                                                 --cluster-id CLUSTER_ID
                                                 --file-format
                                                 {csv,json,parquet}
                                                 [--compression {gzip}]
                                                 --table TABLE
                                                 [--create-table CREATE_TABLE]
                                                 [--transformations TRANSFORMATIONS]
                                                 [--region REGION]
                                                 [--output-fmt {table,wide,json,yaml}]
                                                 [--sudo]

Required Arguments

--bucket

The name of the S3 bucket that contains the file to be imported.

--file-path

The absolute path in the S3 bucket that points to the file to be imported. Globbing (use of *) is allowed.

--secret-id

The secret that contains the access key and secret key needed to access the file to be imported.

--cluster-id

The cluster the data will be imported into.

--file-format

Possible choices: csv, json, parquet

The format of the structured data in the file.

--table

The table the data will be imported into.

Optional Arguments

--endpoint

An Amazon S3-compatible endpoint.

--compression

Possible choices: gzip

The compression method the file uses.

--create-table

Whether the table should be created automatically if it does not exist. If true, new columns will also be added when the data requires them.

--transformations

The transformations to apply when fetching data. This is the SELECT statement of an SQL query that is executed on the internal DuckDB database that the data is loaded into before it is inserted into CrateDB. It can be used to apply arbitrary SQL functions to your data before inserting it into CrateDB, e.g. UNNEST(), SUM(), and similar.

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$ croud clusters import-jobs create from-s3 --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
    --secret-id 71e7c5da-51fa-44f2-b178-d95052cbe620 --bucket cratedbtestbucket \
    --file-path myfiles/cratedbimporttest.csv --file-format csv --table my_table_name
+--------------------------------------+--------------------------------------+------------+
| id                                   | cluster_id                           | status     |
|--------------------------------------+--------------------------------------+------------|
| f29fdc02-edd0-4ad9-8839-9616fccf752b | e1e38d92-a650-48f1-8a70-8133f2d5c400 | REGISTERED |
+--------------------------------------+--------------------------------------+------------+
==> Info: Status: REGISTERED (Your import job was received and is pending processing.)
==> Info: Done importing 3 records and 36 Bytes.
==> Success: Operation completed.
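
For storage services that are S3-compatible but not hosted on AWS, the same command accepts --endpoint, and --file-path supports globbing to match several files. The endpoint, bucket, and path below are placeholders for illustration only.

sh$ croud clusters import-jobs create from-s3 --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
    --secret-id 71e7c5da-51fa-44f2-b178-d95052cbe620 --bucket my-bucket \
    --endpoint https://storage.example.com --file-path 'exports/2024/*.parquet' \
    --file-format parquet --table my_table_name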

clusters import-jobs list

Usage: croud clusters import-jobs list [-h] --cluster-id CLUSTER_ID
                                       [--region REGION]
                                       [--output-fmt {table,wide,json,yaml}]
                                       [--sudo]

Required Arguments

--cluster-id

The cluster the import jobs belong to.

Optional Arguments

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$  croud clusters import-jobs list --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400
+--------------------------------------+--------------------------------------+-----------+--------+-------------------+
| id                                   | cluster_id                           | status    | type   | destination       |
|--------------------------------------+--------------------------------------+-----------+--------+-------------------|
| dca4986d-f7c8-4121-af81-863cca1dab0f | e1e38d92-a650-48f1-8a70-8133f2d5c400 | SUCCEEDED | url    | my_table_name     |
| 00de6048-3af6-41da-bfaa-661199d1c106 | e1e38d92-a650-48f1-8a70-8133f2d5c400 | SUCCEEDED | s3     | my_table_name     |
| 035f5ec1-ba9e-4a5c-9ce1-44e9a9cab6c1 | e1e38d92-a650-48f1-8a70-8133f2d5c400 | SUCCEEDED | file   | my_table_name     |
+--------------------------------------+--------------------------------------+-----------+--------+-------------------+
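
When scripting, the same listing can be emitted in a machine-readable format by switching the output format (output omitted here):

sh$ croud clusters import-jobs list --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 -o json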

clusters import-jobs delete

Usage: croud clusters import-jobs delete [-h] --cluster-id CLUSTER_ID
                                         --import-job-id IMPORT_JOB_ID
                                         [--region REGION]
                                         [--output-fmt {table,wide,json,yaml}]
                                         [--sudo]

Required Arguments

--cluster-id

The cluster the import job belongs to.

--import-job-id

The ID of the Import Job.

Optional Arguments

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$  croud clusters import-jobs delete \
      --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
      --import-job-id 00de6048-3af6-41da-bfaa-661199d1c106
==> Success: Success.

clusters import-jobs progress

Usage: croud clusters import-jobs progress [-h] --cluster-id CLUSTER_ID
                                           --import-job-id IMPORT_JOB_ID
                                           [--limit LIMIT] [--offset OFFSET]
                                           [--summary SUMMARY]
                                           [--region REGION]
                                           [--output-fmt {table,wide,json,yaml}]
                                           [--sudo]

Required Arguments

--cluster-id

The cluster the import jobs belong to.

--import-job-id

The ID of the Import Job.

Optional Arguments

--limit

The number of files returned. Use the keyword 'ALL' to apply no limit.

--offset

The offset to skip before beginning to return the files.

--summary

Show only global progress.

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Examples

sh$  croud clusters import-jobs progress \
      --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
      --import-job-id 00de6048-3af6-41da-bfaa-661199d1c106 \
      --summary true
+-----------+-----------+------------------+-----------------+---------------+
|   percent |   records |   failed_records |   total_records |   total_files |
|-----------+-----------+------------------+-----------------+---------------+
|       100 |       891 |                0 |             891 |             2 |
+-----------+-----------+------------------+-----------------+---------------+
sh$  croud clusters import-jobs progress \
      --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
      --import-job-id 00de6048-3af6-41da-bfaa-661199d1c106 \
      --limit ALL \
      --offset 0
+-----------+-----------+-----------+------------------+-----------------+
| name      |   percent |   records |   failed_records |   total_records |
|-----------+-----------+-----------+------------------+-----------------|
| file1.csv |       100 |       800 |                0 |             800 |
| file2.csv |       100 |        91 |                0 |              91 |
+-----------+-----------+-----------+------------------+-----------------+

clusters export-jobs

Usage: croud clusters export-jobs [-h] {delete,list,create} ...

clusters export-jobs create

Usage: croud clusters export-jobs create [-h] --cluster-id CLUSTER_ID --table
                                         TABLE --file-format
                                         {csv,json,parquet}
                                         [--compression {gzip}]
                                         [--save-as SAVE_AS] [--region REGION]
                                         [--output-fmt {table,wide,json,yaml}]
                                         [--sudo]

Required Arguments

--cluster-id

The cluster the data will be exported from.

--table

The table the data will be exported from.

--file-format

Possible choices: csv, json, parquet

The format of the data in the file.

Optional Arguments

--compression

Possible choices: gzip

The compression method of the exported file.

--save-as

The file on your local filesystem that the data will be exported to. If not specified, you will receive a URL to download the file.

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$  croud clusters export-jobs create --cluster-id f6c39580-5719-431d-a508-0cee4f9e8209 \
      --table nyc_taxi --file-format csv
+--------------------------------------+--------------------------------------+------------+
| id                                   | cluster_id                           | status     |
|--------------------------------------+--------------------------------------+------------|
| 85dc0024-b049-4b9d-b100-4bf850881692 | f6c39580-5719-431d-a508-0cee4f9e8209 | REGISTERED |
+--------------------------------------+--------------------------------------+------------+
==> Info: Status: SENT (Your creation request was sent to the region.)
==> Info: Status: IN_PROGRESS (Export in progress)
==> Info: Exporting... 2.00 K records and 19.53 KiB exported so far.
==> Info: Exporting... 4.00 K records and 39.06 KiB exported so far.
==> Info: Done exporting 6.00 K records and 58.59 KiB.
==> Success: Download URL: https://cratedb-file-uploads.s3.amazonaws.com/some/download
==> Success: Operation completed.

Note

This command will wait for the operation to finish or fail. It is only available to organization admins.
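
To write the exported data straight to a local file instead of receiving a download URL, --save-as can be combined with --compression. The path below is a placeholder; the progress output otherwise matches the example above.

sh$ croud clusters export-jobs create --cluster-id f6c39580-5719-431d-a508-0cee4f9e8209 \
      --table nyc_taxi --file-format csv --compression gzip \
      --save-as /tmp/nyc_taxi_export.csv.gz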

clusters export-jobs list

Usage: croud clusters export-jobs list [-h] --cluster-id CLUSTER_ID
                                       [--region REGION]
                                       [--output-fmt {table,wide,json,yaml}]
                                       [--sudo]

Required Arguments

--cluster-id

The cluster the export jobs belong to.

Optional Arguments

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$  croud clusters export-jobs list \
      --cluster-id f6c39580-5719-431d-a508-0cee4f9e8209
+--------------------------------------+--------------------------------------+-----------+---------------------+-----------------------------------------------+
| id                                   | cluster_id                           | status    | source              | destination                                   |
|--------------------------------------+--------------------------------------+-----------+---------------------+-----------------------------------------------|
| b311ba9d-9cb4-404a-b58d-c442ae251dbf | f6c39580-5719-431d-a508-0cee4f9e8209 | SUCCEEDED | nyc_taxi            | Format: csv                                   |
|                                      |                                      |           |                     | File ID: 327ad0e6-607f-4f99-a4cc-c1e98bf28e4d |
+--------------------------------------+--------------------------------------+-----------+---------------------+-----------------------------------------------+

clusters export-jobs delete

Usage: croud clusters export-jobs delete [-h] --cluster-id CLUSTER_ID
                                         --export-job-id EXPORT_JOB_ID
                                         [--region REGION]
                                         [--output-fmt {table,wide,json,yaml}]
                                         [--sudo]

Required Arguments

--cluster-id

The cluster the job belongs to.

--export-job-id

The ID of the export job.

Optional Arguments

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$  croud clusters export-jobs delete \
      --cluster-id f6c39580-5719-431d-a508-0cee4f9e8209 \
      --export-job-id b311ba9d-9cb4-404a-b58d-c442ae251dbf
==> Success: Success.