This week's agenda:
Open Source of the Week - The Orbital project
New learning resources - AI chatbot with Docker Model Runner, concept drift, dltHub workspace demo, introduction to Python OOP, data visualization with Svelte and D3
Book of the week - LLMs in Production by Chris Brousseau and Matthew Sharp
I share daily updates on Substack, Facebook, Telegram, WhatsApp, and Viber.
Are you interested in learning how to set up automation using GitHub Actions? If so, please check out my course on LinkedIn Learning:
Open Source of the Week
This week's focus is on a new project from Posit PBC - Orbital.
Project repo: https://github.com/posit-dev/orbital
Orbital is a Python library that converts scikit-learn pipelines into executable SQL queries, enabling you to run ML models directly in the database. In other words, instead of the usual process of pulling data out of the database, running the model, and pushing the results back, the entire prediction step is executed in the database as a SQL query.
Orbital also has an R version that provides similar functionality for tidymodels workflows.
The following example from the project documentation illustrates how the library converts a scikit-learn pipeline that fits a linear regression model into a SQL query:
And this is the resulting SQL query, with the model coefficients embedded in it:
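Below is a rough sketch of that end-to-end flow, assuming the parse_pipeline / export_sql functions and the column-type helpers shown in the project README; treat it as an approximation of the documented example rather than a copy of it:

```python
import orbital
import orbital.types
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Train a simple scikit-learn pipeline on the iris data
X, _ = load_iris(return_X_y=True, as_frame=True)
X.columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
y = X.pop("petal_width")  # predict petal width from the other measurements

pipeline = Pipeline([("scaler", StandardScaler()), ("model", LinearRegression())])
pipeline.fit(X, y)

# Translate the fitted pipeline, declaring the type of each input column
parsed = orbital.parse_pipeline(
    pipeline,
    features={col: orbital.types.DoubleColumnType() for col in X.columns},
)

# Export a SQL query that computes the prediction inside the database.
# The printed query looks roughly like:
#   SELECT <intercept> + <coef_1> * sepal_length + ... FROM iris_table
sql = orbital.export_sql("iris_table", parsed, dialect="duckdb")
print(sql)
```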
Currently, the library supports the following scikit-learn models:
Linear Regression
Logistic Regression
Lasso Regression
Elastic Net
Decision Tree Regressor
Decision Tree Classifier
Random Forest Classifier
Gradient Boosting Regressor
Gradient Boosting Classifier
You can find more information about the library in the release post and in the library documentation.
License: MIT
New Learning Resources
Here are some new learning resources that I came across this week.
AI chatbot using Docker Model Runner
This is a cool demo by Ajeet Singh Raina of setting up an AI chatbot using Docker Model Runner 🐳. The demo walks through the high-level architecture of a containerized AI chatbot, which covers the following (a minimal sketch of querying the locally served model follows the list):
Running LLaMA 3.2 models locally with GPU acceleration
Building a real-time streaming chatbot UI with React and Go
Containerizing the entire stack with Docker Compose
Adding observability using Prometheus, Grafana, and Jaeger
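The demo's backend is written in Go, but the same call pattern can be sketched in Python, since Docker Model Runner exposes an OpenAI-compatible API. The base URL and model name below are assumptions, not taken from the demo, so check its Docker Compose configuration for the exact values:

```python
from openai import OpenAI  # pip install openai

# Docker Model Runner serves an OpenAI-compatible endpoint.
# Both values below are assumptions -- adjust to match how the model
# runner is exposed in your setup (host TCP port or in-container DNS name).
client = OpenAI(
    base_url="http://localhost:12434/engines/v1",  # assumed host endpoint
    api_key="not-needed-for-local",                # a local runner ignores the key
)

# Stream a chat completion, the same pattern a chatbot backend would use
stream = client.chat.completions.create(
    model="ai/llama3.2",  # assumed model identifier
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```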
Detection and Interpretability of Concept Drift
The following talk by Gurgen Hovakimyan focuses on concept drift.
Concept drift refers to a change in the relationship between input data and the target variable in a machine learning model over time.
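To make the definition concrete, here is a small self-contained sketch (not from the talk) that simulates a change in the input-target relationship and flags it by watching the rolling accuracy of a model that was trained before the drift:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

def make_batch(n, flipped=False):
    """One feature; the labeling rule flips after the drift point."""
    x = rng.normal(size=(n, 1))
    y = (x[:, 0] > 0).astype(int)
    if flipped:  # concept drift: same inputs, new relationship to the target
        y = 1 - y
    return x, y

# Train once on pre-drift data
X_train, y_train = make_batch(2_000)
model = LogisticRegression().fit(X_train, y_train)

# Monitor accuracy on a stream of batches; the drift starts at batch 10
window, threshold = [], 0.7
for t in range(20):
    X_t, y_t = make_batch(200, flipped=(t >= 10))
    acc = model.score(X_t, y_t)
    window = (window + [acc])[-3:]  # rolling window of the last 3 batches
    if len(window) == 3 and np.mean(window) < threshold:
        print(f"batch {t}: possible concept drift (rolling accuracy {np.mean(window):.2f})")
```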
dltHub Workspace Demo
The following demo from the dlt team provides an example of using a dlt pipeline to monitor OpenAI API costs; a minimal sketch of the dlt pattern follows the two parts below.
Part 1:
Part 2:
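If you haven't used dlt before, the core pattern in a demo like this is a resource function that yields records and a pipeline that loads them into a destination. The sketch below is not the demo's actual code: the usage records and cost fields are made up for illustration, and only the dlt calls (dlt.resource, dlt.pipeline, pipeline.run) are real API.

```python
import dlt

@dlt.resource(name="openai_usage", write_disposition="append")
def openai_usage():
    # In the real demo this data would come from OpenAI usage/billing records;
    # the rows below are made-up examples of the shape you might track.
    yield {"date": "2025-06-01", "model": "gpt-4o", "tokens": 12_500, "cost_usd": 0.19}
    yield {"date": "2025-06-02", "model": "gpt-4o", "tokens": 8_300, "cost_usd": 0.12}

# Load the records into a local DuckDB database for dashboards or alerts
pipeline = dlt.pipeline(
    pipeline_name="openai_cost_monitor",
    destination="duckdb",
    dataset_name="llm_costs",
)
info = pipeline.run(openai_usage())
print(info)
```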
Python Object-Oriented Programming Explained
A concise introduction to object-oriented programming (OOP) in Python by NeuralNine.
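If you want the one-minute version before watching: a class bundles data (attributes) and behavior (methods), and subclasses can reuse and override that behavior. A minimal sketch:

```python
class Animal:
    """A class bundles data (attributes) with behavior (methods)."""

    def __init__(self, name: str):
        self.name = name  # instance attribute

    def speak(self) -> str:
        return f"{self.name} makes a sound"


class Dog(Animal):
    """Inheritance: Dog reuses Animal and overrides speak()."""

    def speak(self) -> str:
        return f"{self.name} says woof"


animals = [Animal("Generic"), Dog("Rex")]
for a in animals:  # polymorphism: the right speak() is chosen at runtime
    print(a.speak())
```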
Data Visualization with Svelte and D3
The following tutorial by Gregory Kirchoff provides an introduction to interactive data visualization with Svelte and D3.
Book of the Week
This week's focus is on LLMs in Production by Chris Brousseau and Matthew Sharp. As its name implies, the book is about deploying LLM-based applications to production. It provides an overview of how LLMs work and how they integrate with applications, covering the following topics:
Introduction to LLMs
Building a platform for LLMs
Data engineering for LLMs
Training LLMs
Prompt engineering
Building applications with LLMs
Productionizing
The book includes examples of practical use cases for LLM-based applications, such as building a VS Code AI coding extension and deploying a small model to a Raspberry Pi.
The book is ideal for data scientists and AI developers who wish to dive into the world of LLMOps.
The book is available for purchase on the publisher's website or on Amazon.
Have any questions? Please comment below!
See you next Saturday!
Thanks,
Rami