Using OpenAI’s CLIP model to support natural language search on a collection of 70k book covers In a previous post I built a small proof of concept to see if I could use OpenAI’s CLIP model to build a semantic book search. It worked surprisingly well, in my opinion, but I couldn’t help wondering if it would…
An accessible walkthrough of fundamental properties of this popular, yet often misunderstood metric from a predictive modeling perspective Photo by Josh Rakower on Unsplash. R² (R-squared), also known as the coefficient of determination, is widely used as a metric to evaluate the performance of regression models. It is commonly used to quantify goodness of fit in…
For additional ideas on how to improve the performance of your RAG pipeline to make it production-ready, continue reading here: This section discusses the required packages and API keys to follow along in this article. Required Packages This article will guide you through implementing a naive and an advanced RAG pipeline using LlamaIndex in Python.…
LDA Convergence Explained with a Dog Pedigree Model “What if my a priori understanding of dog breed group distribution is inaccurate? Is my LDA model doomed?” My wife asked. Welcome back to part 2 of the series, where I share my journey of explaining LDA to my wife. In the previous blog post, we discussed…
How to Stream and Apply Real-Time Prediction Models on High-Throughput Time-Series Data Photo by JJ Ying on Unsplash. Most stream processing libraries are not Python-friendly, while the majority of machine learning and data mining libraries are Python-based. Although the Faust library aims to bring Kafka Streaming ideas into the Python ecosystem, it…
Simplified utilizing the HuggingFace trainer object Image from Unsplash by Markus Spiske. HuggingFace serves as a home to many popular open-source NLP models. Many of these models are effective as is, but often require some sort of training or fine-tuning to improve performance for your specific use case. As the LLM explosion continues, we will take a…
Advanced techniques for beginners AI generated image using Kandinsky. In this story, I would like to raise a discussion on how we transform data. Whether it’s a database, data warehouse, or reporting solution, we run data transformations based on data models, but how do we organise them? I would like to talk about the modern data…
To start, I would choose an introductory or beginner course that I like the look of, or one recommended by someone who I know has good Python skills. You might have heard me say in one of my previous posts that there is no such thing as the “right” course. While this is definitely true, some courses…
Running a multimodal LLaVA model, camera, and speech synthesis Image by Enoc Valenzuela, Unsplash. Modern large multimodal models (LMMs) can process not only text but also different types of data. Indeed, “a picture is worth a thousand words,” and this functionality can be crucial during the interaction with the real world. In this “weekend project,” I…
Meta’s open-source Seamless models: A deep dive into translation model architectures and a Python implementation guide using HuggingFace This post was co-authored with Rafael Guedes. The growth of an organization is not limited to its country boundaries. Some organizations only sell or operate on external markets. This globalization comes with several challenges, one being how…