Machine Learning
Integrate CrateDB with machine learning frameworks and tools for MLOps and vector database operations.
Machine Learning Operations
Training a machine learning model, running it in production, and maintaining it require a significant amount of data processing and bookkeeping.
CrateDB, as a universal SQL database, supports this process through adapters to best-of-breed software components for MLOps procedures.
MLOps is a paradigm for deploying and maintaining machine learning models in production reliably and efficiently. It includes experiment tracking and follows the spirit of continuous development and DevOps.
Vector Store
CrateDB’s FLOAT_VECTOR data type implements a vector store and the k-nearest neighbour (kNN) search algorithm to find vectors that are similar to a query vector.
These feature vectors may be computed from raw data using machine learning methods such as feature extraction algorithms, word embeddings, or deep learning networks.
Vector databases can be used for similarity search, multi-modal search, recommendation engines, large language models (LLMs), retrieval-augmented generation (RAG), and other applications.
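As a rough illustration of what a k-nearest-neighbour search does, the following plain-Python sketch ranks a handful of vectors by Euclidean distance to a query vector and keeps the closest `k`. The function name `knn_search` and the sample vectors are hypothetical, and the brute-force scan is only meant to show the idea. It does not reflect how CrateDB stores, indexes, or searches `FLOAT_VECTOR` columns internally.

```python
import math


def knn_search(vectors, query, k):
    """Return the k (index, distance) pairs closest to `query`.

    Brute-force scan using Euclidean distance; for illustration only.
    """
    scored = [(i, math.dist(v, query)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


# Hypothetical 3-dimensional feature vectors.
vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.05, 0.0]

nearest = knn_search(vectors, query, k=2)
for index, distance in nearest:
    print(index, round(distance, 4))
```

In a real deployment, the vectors would typically be embeddings produced by a model, and the database would perform the search rather than application code.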
Anomaly Detection and Forecasting
MLflow
Tutorials and notebooks about using MLflow together with CrateDB.
Blog: Running Time Series Models in Production using CrateDB
Part 1: Introduction to Time Series Modeling using Machine Learning
The article introduces the concept of time series modeling and discusses the main obstacles to running it in production. It then introduces CrateDB, highlighting its key features and benefits, why it stands out in managing time series data, and why it is an especially good fit for supporting machine learning models in production.
Fundamentals
Time Series Modeling
Notebook: Create a Time Series Anomaly Detection Model
Guidelines and runnable code to get started with MLflow and CrateDB, exercising time series anomaly detection and time series forecasting / prediction using NumPy, Salesforce Merlion, and Matplotlib.
Fundamentals
Time Series
Anomaly Detection
Prediction / Forecasting
scikit-learn
Use scikit-learn with CrateDB.
Regression analysis with pandas and scikit-learn
Use pandas and scikit-learn to run a regression analysis within a Jupyter Notebook.
Fundamentals
Regression Analysis
TensorFlow
Use TensorFlow with CrateDB.
Predictive Maintenance
Build a machine learning model that predicts whether a machine will fail within a specified future time window.
Fundamentals
Prediction
LLMs / RAG
One of the most powerful applications enabled by LLMs is the sophisticated question-answering (Q&A) chatbot: an application that can answer questions about specific sources of information. Such applications use retrieval-augmented generation (RAG), a technique for augmenting LLM knowledge with additional data.
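The retrieval step of RAG can be sketched in a few lines of plain Python: rank stored documents by similarity to the question, then place the best matches into the prompt sent to the LLM. Everything below is an assumption made for illustration: the two-dimensional embeddings are invented by hand, the helper names `retrieve` and `build_prompt` are hypothetical, and no actual LLM or embedding model is called.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(question_vector, documents, k=1):
    """Return the k documents whose embeddings are most similar to the question."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(question_vector, doc["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, context_docs):
    """Combine the retrieved context and the question into one prompt string."""
    context = "\n".join(f"- {doc['text']}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical documents with hand-made embeddings (a real system would
# compute these with an embedding model and keep them in a vector store).
documents = [
    {"text": "Sensor A reports temperature every 10 seconds.", "embedding": [0.9, 0.1]},
    {"text": "Invoices are archived after 90 days.", "embedding": [0.1, 0.9]},
]
question = "How often does sensor A report?"
question_vector = [0.8, 0.2]  # hypothetical embedding of the question

top_docs = retrieve(question_vector, documents, k=1)
prompt = build_prompt(question, top_docs)
print(prompt)
```

The resulting prompt would then be passed to a language model, which answers from the retrieved context rather than from its training data alone.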
LangChain
Tutorials and notebooks about using LangChain together with CrateDB. LangChain has a number of components designed to help build Q&A applications, and RAG applications more generally. This integration uses CrateDB’s vector store implementation.
What can you build with LangChain?
Tutorial: Set up LangChain with CrateDB
LangChain is a framework for developing applications powered by language models. This tutorial uses it to interact with CrateDB in natural language only, without writing any SQL.
To follow along, you will need a running CrateDB instance, an OpenAI API key, and some Python knowledge.
Fundamentals
Vector Store
LLM
RAG