CrateDB Doubling Down on Permissive Licensing and the Elasticsearch Lockdown

2021-01-27, by Bernd Dorn

TL;DR Crate.io will:

  • no longer use Elastic’s Elasticsearch as an upstream project for CrateDB and will switch to a fork
  • open source its entire codebase under the Apache License 2.0 (APLv2) with CrateDB 4.5

As you might already know, CrateDB relies on Elasticsearch code for its inner workings. A few days ago Elastic announced that they closed down their Apache licensed code by relicensing it to SSPL, which is merely GPLv3 with a SaaS protection on top. We, and several others, were shocked that something like this happened, especially since Elastic stated in their x-pack blog post that they would never do so: “We did not change the license of any of the Apache 2.0 code of Elasticsearch, Kibana, Beats, and Logstash — and we never will.”

Crate founders, myself included, have a long history with Elastic, dating back 10 years to when we operated some of the largest Elasticsearch deployments in Europe. We liked what Elasticsearch did for search, and so our journey as good citizens in the open source community began: taking and giving back, and supporting the large-scale systems we built prior to founding Crate.io.

We realized that we wanted the same power and simplicity, not only for search, but for a database with standard SQL; thus, we founded Crate.io and set off on the journey to build an open source, deep-tech product.

The problem is the infectious “GPL” in “SSPL”

With the goal of building a database product, we started to write our own Apache licensed Elasticsearch plugins (some artefacts still exist, e.g. inout and timefacets), which were eventually merged into CrateDB.

All of this happened because Elasticsearch was licensed under the permissive Apache license. We would never have chosen Elasticsearch in the first place had it been licensed under the GPL, because some of our customers ban GPL-licensed software from their application stacks due to legal risks (and many large enterprises do so by default). Also, raising venture capital (and a lot is needed to build a database from scratch) is very difficult with copyleft code, since it simply reduces possible future opportunities.

Knowing that it’s extremely hard to be commercially successful with open source software, I was impressed by how big Elastic got and what they’ve achieved. Having said that, I’m also sure that using a permissive license was one of the key factors in the huge adoption of Elasticsearch, besides it being a sensational product, of course.

While I agree with Elastic’s position that what Amazon has done with their trademark is “not ok”, that has nothing to do with Elastic’s licensing change. What is “not ok” is the fact that switching to a GPL-based license forces projects that depend on the code, like CrateDB, to use a fork, since the new license kills their business model. I’m having a hard time believing that Elastic forgot about those projects while fighting against Amazon’s SaaS solution. Undoubtedly, the popularity of Elasticsearch soared when AWS started to offer it as well, and that very likely helped Elastic sell their own solution and SaaS.

Looking forward to joining forces on an Elasticsearch fork

With CrateDB 4.0, we switched away from using the Elasticsearch upstream directly to copying the code over into our repository, simply because we saw parts of the codebase lacking modularity. For example, things like the REST API (which we do not use) are still a bit entangled across the codebase. Handling the “transparent arrays” in the SQL world also required adaptations in various places. With this switch, our contributions to the upstream Elastic project actually diminished, and we ended up only backporting Elasticsearch code, since it was hard to integrate our requirements back into the upstream project.

Our plan is now to switch to, and contribute to, a maintained fork like the one Amazon already announced. We are looking forward to joining forces on a new project fork. This move might also pave the way to a more modular design, which would allow downstream projects to easily contribute and use the upstream as a framework. There are a few functionalities that could be extracted into a library; first examples are the discovery, transport, and cluster state handling.

Switching to full open source with CrateDB 4.5

We strongly believe in open source, and coincidentally, we at Crate.io had already decided in December 2020 to open our enterprise features with the 4.5 release in 2021 (even before Elastic announced their change). We made this decision as part of an improved 2021 strategy, with developers in mind and with the aim of further fostering growth in adoption. With this latest news, and seeing the reactions in the community to Elastic’s move, we are even more confident that this is the right decision.

I have to admit, we were fully open source before, when we started building CrateDB. We then went for an open core strategy by starting to license some of the new features under a commercial license. But times are changing, and we strongly believe that customers are looking for managed solutions operated by experts (call it cloud, fog or edge), and this is where we see our commercial success. Crate.io can be part of those emerging infrastructure designs by integrating an elastic, horizontally scaling database in an “IaaS-provider” independent way.

I am closing with a personal invitation to try out CrateDB. Please also support us on GitHub with your very welcome contributions, or even just by giving us a star (we love that!).

Bernd Dorn, Founder and CTO
