The PandasAI Project, Learning SQL Book, Fine-Tuning Local LLMs

A weekly curated update on data science and engineering topics and resources.

Aug 23, 2025

This week's agenda:

Open Source of the Week - The PandasAI project
New learning resources - Getting started with Docker Model Runner, MCP Dev Days, context engineering, training the Qwen 3 model, fine-tuning local LLMs
Book of the week - Learning SQL by Alan Beaulieu

I share daily updates on Substack, Facebook, Telegram, WhatsApp, and Viber.

Are you interested in learning how to set up automation using GitHub Actions? If so, please check out my course on LinkedIn Learning:

Get Started with GitHub Actions!

Open Source of the Week

This week's focus is on the PandasAI project. As the name implies, the project combines the Pandas library with LLMs, enabling users to interact with Pandas DataFrames using natural language.

Project repo: https://github.com/sinaptik-ai/pandas-ai

Key Highlights and Functionality

Provides a data preparation layer that lets users define semantic data schemas and establish relationships across tables
Supports various data formats, such as Parquet and CSV
Translates natural language into executable code
Generates dynamic visualizations and charts
Provides automated data insights

Here is a simple workflow with the library (via the project documentation):

import pandasai as pai
from pandasai_openai.openai import OpenAI

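# Replace the placeholder with your own OpenAI API key
llm = OpenAI("OPEN_AI_API_KEY")

# Register the LLM as the model PandasAI uses
pai.config.set({
    "llm": llm
})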

# Sample DataFrame
df = pai.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "revenue": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]
})

df.chat('Which are the top 5 countries by sales?')

This returns:

China, United States, Japan, Germany, Australia
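For comparison, here is a minimal sketch of the same question answered directly with plain Pandas, with no LLM involved. It assumes that "sales" in the question refers to the revenue column, and it is not the code PandasAI generates internally, just an equivalent deterministic query on the same sample data:

```python
import pandas as pd

# Same sample data as in the PandasAI example above
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "revenue": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000],
})

# Take the five rows with the highest revenue and keep only the country names
top5 = df.nlargest(5, "revenue")["country"].tolist()
print(", ".join(top5))
```

On this sample data, the fifth country by revenue is United Kingdom (3,200) rather than Australia (2,600), so it is worth checking LLM-generated answers against a direct computation like this one.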

More details are available in the project documentation.

License: MIT

New Learning Resources

Here are some new learning resources that I came across this week.

Getting Started with Docker Model Runner

This tutorial covers getting started with Docker Model Runner. It is the first in a series on using Docker Model Runner to run LLMs locally, and it is available to Medium members:

https://medium.com/data-science-collective/getting-started-with-docker-model-runner-c2ed4ce4b5ee

Alternatively, this tutorial and the rest of the series are available on The AIOps Newsletter.

MCP Dev Days

The talks from the Microsoft Developer MCP Dev Days conference are now available online. The conference focuses on different use cases of the Model Context Protocol (MCP) and its ecosystem. This includes topics such as:

MCP in production
MCP with AI agents
Security
Supporting tools

Context Engineering Explained

This concise tutorial by Shaw Talebi explains what context engineering is and offers practical tips for setting up a prompt.

Code & Train Qwen 3

This new tutorial by freeCodeCamp focuses on training and tuning the Qwen 3 model.

Fine-Tuning Local LLMs

The following tutorial by NeuralNine provides a step-by-step guide for fine-tuning LLMs locally with Unsloth and Ollama.

Book of the Week

I share a book here every week, and as far as I remember, we have not had an SQL book so far. This week's book is Learning SQL by Alan Beaulieu. The book focuses on the foundations of SQL, covering core concepts with practical examples. This includes topics such as:

SQL basic operations
Creating and populating a database
Query primer
Working with sets
Grouping and aggregates
Conditional logic
Indexes and constraints
Working with multiple tables
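To give a flavor of two of the topics above (grouping and aggregates, and working with multiple tables), here is a minimal sketch using Python's built-in sqlite3 module. The customer and orders tables and their data are hypothetical and made up for illustration; they are not taken from the book:

```python
import sqlite3

# In-memory database with two hypothetical tables (not from the book)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount REAL NOT NULL
    );
    INSERT INTO customer (id, name) VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders (customer_id, amount) VALUES (1, 20.0), (1, 30.0), (2, 15.0);
""")

# Join the two tables, then group by customer and aggregate the order amounts
rows = conn.execute("""
    SELECT c.name, COUNT(*) AS n_orders, SUM(o.amount) AS total
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY total DESC
""").fetchall()

print(rows)
conn.close()
```

The query returns one row per customer with the number of orders and the total amount, sorted from the largest total to the smallest.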

This book is ideal for data practitioners, from aspiring data scientists and analysts to ML/AI architects and database administrators, who want a solid, comprehensive foundation in SQL that spans core querying through advanced and big data use cases.

The book is available to read online on the O'Reilly platform (for subscribers) and can also be purchased in print on Amazon.

Have any questions? Please comment below!

See you next Saturday!

Thanks,

Rami

Rami's Data Newsletter
