The JuLS Project, Model to Meaning Book, Claude Code Tips, Airflow DAG Testing, Python Dictionaries
A weekly curated update on data science and engineering topics and resources.
This week's agenda:
Open Source of the Week - The JuLS project
New learning resources - Claude Code tips, Airflow DAG testing framework, Python dictionaries
Book of the week - Model to Meaning by Vincent Arel-Bundock
I share daily updates on Substack, Facebook, Telegram, WhatsApp, and Viber.
Are you interested in learning how to set up automation using GitHub Actions? If so, please check out my course on LinkedIn Learning:
Open Source of the Week
It's been a while since we had a Julia project here, and this week's focus is on JuLS - a new open-source project from the Amazon Science team.
The JuLS project offers a local search solver that combines Constraint-Based Local Search (CBLS) and Constraint Programming (CP) to solve Constraint Optimization Problems (COPs). Solving such problems is often computationally intensive, so building the solver in Julia, a high-performance language, makes sense.
Project repository: https://github.com/amazon-science/JuLS
The JuLS solver is organized into the following components:
model: The local search model that optimizes the defined problem.
dag: The Directed Acyclic Graph (DAG) structure to evaluate the constraints and objectives of the problem to optimize. Each DAG component is declared in the invariant section.
cp: The constraint programming solver used to efficiently filter out infeasible moves. A builder is provided to convert a DAG into the corresponding CSP (Constraint Satisfaction Problem).
heuristics: The heuristics to be used during model optimization for initialization, neighbourhood definition, and move selection.
experiments: Optimizes a certain problem based on input data. The place to declare the problem's DAG and the customized heuristics.
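To make the heuristic roles listed above more concrete (initialization, neighbourhood definition, and move selection), here is a minimal sketch of a plain local search for the Traveling Salesman Problem. It is written in Python rather than Julia, it does not use the JuLS API, and it leaves out the DAG evaluation and CP filtering parts entirely. The function names `tour_length` and `two_opt` are made up for this illustration.

```python
import math
import random


def tour_length(tour, coords):
    """Total length of the closed tour that visits coords in the given order."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def two_opt(coords, seed=0):
    """Plain 2-opt local search (illustrative only, not the JuLS algorithm)."""
    rng = random.Random(seed)

    # Initialization heuristic: start from a random tour.
    tour = list(range(len(coords)))
    rng.shuffle(tour)
    best = tour_length(tour, coords)

    improved = True
    while improved:
        improved = False
        # Neighbourhood: every tour obtained by reversing one segment.
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                length = tour_length(candidate, coords)
                # Move selection: accept any strictly improving move.
                if length < best - 1e-12:
                    tour, best = candidate, length
                    improved = True
    return tour, best


# Four corners of a unit square, listed in a deliberately scrambled order.
square = [(0, 0), (1, 1), (1, 0), (0, 1)]
tour, best = two_opt(square)
```

The sketch recomputes the full tour length for every candidate move, which is simple but wasteful. Evaluating moves incrementally is the kind of work a DAG of invariants is meant to handle in a solver like JuLS.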
The project repo includes an example that solves the famous Traveling Salesman Problem with a few lines of code. That example, and more, can be found in the project's Jupyter notebook.
You can find more details about this search solver in the following paper.
License: Apache 2.0
New Learning Resources
Here are some new learning resources that I came across this week.
Claude Code Tips
The following short tutorial provides great tips for using Claude Code.
Create an Airflow DAG Testing Framework
The following concise tutorial by the Data Guy demonstrates how to set up a testing framework for Airflow DAGs.
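The tutorial's own code is not reproduced here. As a rough illustration of the kind of structural check a DAG test suite usually includes, the sketch below validates a task dependency map in plain Python: every dependency must refer to a defined task, and the dependencies must not form a cycle. It does not use Airflow, and the `validate_dependencies` helper and the task names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def validate_dependencies(upstream):
    """Check a mapping of task id -> set of upstream task ids.

    Returns one valid execution order, or raises ValueError if a task
    depends on an undefined task or the dependencies contain a cycle.
    """
    known = set(upstream)
    for task, deps in upstream.items():
        missing = set(deps) - known
        if missing:
            raise ValueError(
                f"{task} depends on undefined tasks: {sorted(missing)}"
            )
    try:
        # static_order() yields tasks so that dependencies come first.
        return list(TopologicalSorter(upstream).static_order())
    except CycleError as exc:
        # The detected cycle is the second element of the exception args.
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


# Hypothetical three-step pipeline used only for this example.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}
order = validate_dependencies(pipeline)
```

In a real project, a check like this would sit inside a pytest test and run against the DAGs loaded from the project's DAG folder rather than a hand-written dictionary.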
Python Dictionaries Explained
The following short tutorial by Visually Explained introduces what Python dictionaries are and how to use them.
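For readers who want a quick taste before watching, here is a short sketch of the basic dictionary operations. It is not taken from the tutorial, and the variable names are just for illustration.

```python
# A dictionary maps unique keys to values.
inventory = {"apples": 3, "pears": 5}

# Add a new key and update an existing one.
inventory["plums"] = 7
inventory["apples"] += 1

# get() returns a default instead of raising KeyError for a missing key.
missing = inventory.get("kiwis", 0)

# pop() removes a key and returns its value.
removed = inventory.pop("pears")

# items() iterates over key/value pairs in insertion order.
summary = [f"{name}: {count}" for name, count in inventory.items()]

# A common pattern: counting occurrences with get().
word_counts = {}
for word in "to be or not to be".split():
    word_counts[word] = word_counts.get(word, 0) + 1
```

The counting loop at the end is a frequent first use case. The standard library's `collections.Counter` does the same thing in one call.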
Book of the Week
This week's focus is on a new stats book - Model to Meaning by Vincent Arel-Bundock. The book focuses on how to interpret statistical models using marginal effects, with examples in R and Python.
The book covers the following topics:
Models and meaning
Quantities and tests
Counterfactual comparisons
Causal inference with G-computation
Interactions and polynomials
Multilevel regression with poststratification
Categorical and ordinal outcomes
Uncertainty
This book is ideal for data scientists, statisticians, social scientists, or ML practitioners who already build models and want to deepen their ability to interpret them (especially non-linear, complex, or “black box” models) and communicate results meaningfully to nontechnical stakeholders.
Thanks to the author, a free version of the book is available online.
If you'd like to support the author or get a physical copy, you can pre-order the book on the CRC Press website.
Have any questions? Please comment below!
See you next Saturday!
Thanks,
Rami