Bulk Inserts with unnest

Claus Matzinger
Filed under
June 13, 2016

Table functions are functions that produce rows in the same way as a table, and with Crate's latest testing release (0.55.0) we have added support for unnest. Unnest uses arrays to generate a transient table with each array representing a column:

cr> select * from unnest([1, 2, 3], ['Arthur', 'Trillian', 'Marvin']);
| col1 | col2     |
|    1 | Arthur   |
|    2 | Trillian |
|    3 | Marvin   |
SELECT 3 rows in set (... sec)

Bulk Inserts

To insert large amounts of data efficiently, you are recommended to use bulk inserts. With a client that does not support these or if you have no direct access to the rest endpoint you can be out of luck! The same is also unfortunately true for our ODBC driver since the ODBC specification does not have a way to perform bulk requests.

With unnest there is a way around this by using insert-by-query you can select from an 'unnested' table which results in a bulk insert internally:

cr> insert into authors (id, name) (select * from unnest([1, 2, 3], ['Arthur', 'Trillian', 'Marvin']));
INSERT OK, 3 rows affected (... sec)

As the response suggested, the three rows have now been inserted:

cr> select * from authors order by id;
| id | name     |
|  1 | Arthur   |
|  2 | Trillian |
|  3 | Marvin   |
SELECT 3 rows in set (... sec)

The resulting performance is also comparable to bulk requests

Bulk inserts vs Unnest

Comparison of a single bulk request and insert via 'unnest' of 1000 rows, repeated 10 times on a single machine.

We have extended the crate-version-comparison Github repository to include a way to insert with unnest and compare single-threaded performance.

Try it Out!

As an Open Source company we rely on your feedback and would be happy if you try it out and tell us how you liked it. Or even better, submit a pull request 😉!

Back to topAll posts