Functions are essential in a data science project because they make the code more modular, reusable, readable, and testable. However, writing a messy function that tries to do too much can introduce maintenance hurdles and diminish the code’s readability. In the following code, the function impute_missing_values is long, messy, and tries to do…
Introduction to AutoGen and Mistral AI: AutoGen is a framework developed by Microsoft and designed to simplify the development of multi-agent applications, particularly the orchestration of LLM agents. In a multi-agent application, multiple LLM or multi-modal agents interact with each other throughout the workflow to achieve specific goals or tasks. These agents…
Quick Success Data Science: A hands-on guide for beginners. If you’re going to do any serious programming with Python, you’ll need to understand object-oriented programming and the concept of a class and a dataclass. In this Quick Success Data Science article, you’ll get a quick and painless introduction to all three, including what they’re for,…
Optimize your data science workflow by automating matplotlib output with 1 line of code. Here’s how. Naming things is hard. After a long enough day, we’ve all ended up with the highly descriptive likes of “graph7(1)_FINAL(2).png” and “output.pdf”. Look familiar? We can do better, and quite easily, actually. When we use data-oriented “seaborn-esque” plotting…
Learn how a neural network with one hidden layer using ReLU activation can represent any continuous nonlinear function. Activation functions play an integral role in Neural Networks (NNs) since they introduce non-linearity and allow the network to learn more complex features and functions than just a linear regression. One of the most commonly used activation…
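To make the role of the non-linearity concrete, here is a minimal sketch in Python with NumPy. It is not taken from the article: the function names and the hand-picked weights are illustrative assumptions. It shows a one-hidden-layer ReLU network with two hidden units reproducing the nonlinear function |x| exactly, which a purely linear model cannot do.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: element-wise max(z, 0)."""
    return np.maximum(z, 0.0)

def one_hidden_layer(x, w1, b1, w2, b2):
    """Forward pass of a network with a single ReLU hidden layer.

    x  : (n,) inputs (scalar feature per sample)
    w1 : (h,) input-to-hidden weights, b1 : (h,) hidden biases
    w2 : (h,) hidden-to-output weights, b2 : scalar output bias
    """
    hidden = relu(np.outer(x, w1) + b1)  # shape (n, h)
    return hidden @ w2 + b2              # shape (n,)

# Hand-picked (hypothetical) weights: relu(x) + relu(-x) equals |x|.
w1 = np.array([1.0, -1.0])
b1 = np.zeros(2)
w2 = np.array([1.0, 1.0])
b2 = 0.0

x = np.linspace(-2.0, 2.0, 9)
y = one_hidden_layer(x, w1, b1, w2, b2)
```

Adding more hidden units adds more "kinks", so the network output becomes a finer piecewise-linear curve; that is the intuition behind approximating a smooth curve with ReLU units.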
When working with time-series data, it can be important to apply filtering to remove noise. This story shows how to implement a low-pass filter in SQL / BigQuery that can come in handy when improving ML features. Filtering of time-series data is one of the most useful preprocessing tools in Data Science. In reality, data…
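As a rough illustration of what a low-pass filter does, the sketch below applies exponential smoothing (a simple single-pole low-pass filter) in Python. This is a stand-in chosen for brevity, not the SQL / BigQuery implementation the story describes; the function name `low_pass` and the parameter `alpha` are assumptions made for this example.

```python
def low_pass(values, alpha=0.2):
    """Exponentially smooth a sequence of numbers.

    Each output is alpha * current_value + (1 - alpha) * previous_output,
    so rapid fluctuations are damped while slow trends pass through.
    A smaller alpha means stronger smoothing.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in the interval (0, 1]")
    smoothed = []
    previous = None
    for value in values:
        # The first sample initialises the filter state.
        previous = value if previous is None else alpha * value + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed

# A step from 0 to 4 is followed gradually rather than instantly.
step_response = low_pass([0.0, 4.0, 4.0], alpha=0.5)
```

The same recurrence can in principle be expressed over an ordered window in SQL, but the exact query depends on the dialect and is not reproduced here.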
Where it stands out from other swarm algorithms. This article is a continuation of my nature-inspired series. Previously, I talked about the Evolutionary Algorithm (EA), Particle Swarm Optimization (PSO), and Artificial Bee Colony (ABC). Nature is everywhere, and there are certainly more areas where humans can benefit by learning from it. Today, we focus on…
Automate resource provisioning with modern tools. Modern data stacks consist of various tools and frameworks to process data. Typically it would be a large collection of different cloud resources aimed to transform the data and bring it to the state…
The question is no longer whether we can solve the problem with AI but to what extent it returns sustainable and reliable results. Good craftsmanship, governance, ethics, and education on AI are what we need now. Since I was a kid, I have always been intrigued and interested in new…
A fully offline use of Whisper ASR and LLaMA-2 GPT Model. Nowadays, nobody will be surprised by running a deep learning model in the cloud. But the situation can be much more complicated in the edge or consumer device world. There are several reasons for that. First,…