Graph neural networks (GNNs) and large language models (LLMs) have emerged as two major branches of artificial intelligence, achieving immense success in learning from graph-structured and natural language data respectively. As graph-structured and natural language data become increasingly interconnected in real-world applications, there is a growing need for artificial intelligence systems that can perform multi-modal…
With A Tail of Cat Food Preferences. Welcome to the ‘Courage to learn ML’. This series aims to simplify complex machine learning concepts, presenting them as a relaxed and informative dialogue, much like the engaging style of “The Courage to Be Disliked,” but with a focus on ML. In this…
Describing nature with analytical expressions verified through experiments has been a hallmark of the success of science, especially in physics, from the fundamental law of gravitation to quantum mechanics and beyond. As challenges such as climate change, fusion, and computational biology pivot our focus toward more compute,…
Data Science Fundamentals: Beginner’s practical guide to discrete optimisation in Python. Data scientists tackle a wide range of real-life problems using data and various techniques. Mathematical optimisation, a powerful technique that can be applied to a wide range of problems in many domains, makes a…
The class neighborhood of a dataset can be learned using soft nearest neighbor loss. In this article, we discuss how to implement the soft nearest neighbor loss, which we also talked about here. Representation learning is the task of learning the most salient features in a given dataset by a deep neural network. It…
Why a static workload is insufficient and what I learned by comparing HNSWLIB and DiskANN using a streaming workload. Vector databases are built for high-dimensional vector retrieval. Today, many vectors are embeddings generated by deep neural nets like GPTs and CLIP to represent data points such as pieces of text, images, or audio tracks. Embeddings…
How to use the BigQuery GENERATE_TEXT remote function. “Everyone can code and do NLP analysis in BigQuery with SQL knowledge and a good prompt structure.” Since I started working with the Google Platform, Google has not stopped surprising me with its…
Pet Project for Data/Analytics Engineers: Explore Modern Data Stack Tools (dbt Core, Snowflake, Fivetran, GitHub Actions). Here is a simple and fast pet project for Data/Analytics Engineers who want to kick the tires on Modern Data Stack tools including dbt Core, Snowflake, Fivetran, and GitHub Actions. This hands-on experience…
When LLMs are used to evaluate qualities like the correctness, accuracy, or relevance of a piece of text, consistency is paramount. If an LLM exhibits inconsistent judgements, then its evaluations become unreliable and untrustworthy. If an LLM evaluates the reasoning quality of arguments, but contradicts itself by rating an invalid argument as more logically sound…
Transforming static plots into captivating narratives. Plotly provides an excellent foundation for animated plots. I highly recommend their basic tutorial here. However, Plotly animations are primarily set up to add another dimension to the visualization, usually time. This is fantastic for adding more meaning to a plot. Animation, however,…