Issue 61: The OpenMetadata Project, New ML Book, Stanford New LLM Course
A weekly curated update on data science and engineering topics and resources.
This week's agenda:
Open Source of the Week - The OpenMetadata project
New learning resources - Transformers & LLMs, Posit Conf 2025 talks, getting started with Codex
Book of the week - Hands-On Machine Learning with Scikit-Learn and PyTorch by Aurélien Géron
The newsletter is also available on LinkedIn and Medium.
Enjoying this newsletter? Here’s how you can support:
Click 👍 and share ♻️
Have a Medium subscription? Please read it on Medium!
Are you interested in learning how to set up automation using GitHub Actions? If so, please check out my course on LinkedIn Learning:
Open Source of the Week
This week’s focus is on OpenMetadata, an open-source project for data engineering applications. OpenMetadata provides a unified metadata platform for data discovery, data observability, and data governance, powered by a central metadata repository, in-depth column-level lineage, and seamless team collaboration.
Project repo: https://github.com/open-metadata/OpenMetadata
Key Features
The project provides a set of tools for the following data applications:
Discovery tools
Collaboration and integration
Quality and profiling
Insights and KPIs
Lineage and observability
Documentation
More details are available in the project documentation.
License: Apache 2.0
New Learning Resources
Here are some new learning resources that I came across this week.
Transformers & LLMs
A new course from Stanford focuses on Transformers and LLMs. The course is halfway through, and the first four lectures are available on the Stanford YouTube channel.
Posit Conference 2025
All the talks from the Posit Conference 2025 are now available to watch.
Getting Started with Codex
The following playlist by Net Ninja provides an introduction to the OpenAI Codex code generator. It covers setup, code review, using agents, MCP servers, IDE extensions, and more.
Book of the Week
This week’s focus is on a core ML book - Hands-On Machine Learning with Scikit-Learn and PyTorch by Aurélien Géron. The book covers core topics, from the foundations of ML to LLMs. This includes the following topics:
Foundations of machine learning
Classification models
Training ML models
Tree-based models
Dimensionality reduction approaches
Neural network and deep learning models
NLP and transformers
Reinforcement learning
This book is ideal for data scientists or ML engineers with some programming experience in Python who want to go beyond theory and actually build, train, and deploy machine-learning systems using modern tools.
The book is available online on the publisher’s website (for subscribers) and can be purchased in hard copy on Amazon.
Have any questions? Please comment below!
See you next Saturday!
Thanks,
Rami




