What it is and how to apply it to a real-world scenario. This year, my resolution is to go back to the basics of data science. I work with data every day, but it’s easy to forget how some of the core algorithms function if you’re completing repetitive tasks. I’m…
Text-to-image generation is a unique field where language and visuals converge, creating an interesting intersection in the ever-changing world of AI. This technology converts textual descriptions into corresponding images, merging the complexities of understanding language with the creativity of visual representation. As the field matures, it encounters challenges, particularly in generating high-quality images efficiently from…
Exploring the Transformer’s Decoder Architecture: Masked Multi-Head Attention, Encoder-Decoder Attention, and Practical Implementation. This post was co-authored with Rafael Nardi. In this article, we delve into the decoder component of the transformer architecture, focusing on its differences and similarities with the encoder. The decoder’s unique feature is its loop-like, iterative nature, which contrasts with the…
In the rapidly evolving domain of augmented and virtual reality, creating 3D environments is a formidable challenge, particularly due to the complexities of 3D modeling software. This situation often deters end-users from crafting personalized virtual spaces, an increasingly significant aspect in diverse applications ranging from gaming to educational simulations.
Central to this challenge is the…
Inspired by progress in large-scale language modelling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same weights can play Atari, caption images, chat, stack blocks…
Turn a government PDF into a financial planning tool. Hierarchical data is a data model where items are linked to each other in parent-child relationships, forming a tree structure. Some obvious examples are family trees and corporate organization charts. A treemap is a diagram that represents hierarchical data using nested…
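As a rough illustration of the parent-child model described in this excerpt, here is a minimal Python sketch. The `Node` class, the `total` helper, and the budget figures are all hypothetical and are not taken from the article; the sketch only shows how a parent's size in a treemap could be derived from its children.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One item in a hierarchy (hypothetical structure for illustration)."""
    name: str
    value: float = 0.0  # only meaningful for leaf nodes in this sketch
    children: list["Node"] = field(default_factory=list)


def total(node: Node) -> float:
    """Return a leaf's own value, or the sum of its children's totals."""
    if not node.children:
        return node.value
    return sum(total(child) for child in node.children)


# Made-up example data: a small household budget tree.
budget = Node("Budget", children=[
    Node("Housing", children=[Node("Rent", 1200.0), Node("Utilities", 150.0)]),
    Node("Food", 400.0),
])

print(total(budget))  # area the root rectangle would represent
```

In a treemap, each rectangle's area would be proportional to the total of the node it represents, with child rectangles nested inside their parent.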
The increasing number of Internet of Things (IoT) devices makes everyday life easier and more convenient. However, these devices can also pose many security risks. Criminals are quick to take advantage of the expanding attack surface. Luckily, there are ways you can leverage developing cybersecurity measures like “zero-trust” architecture to prevent bad actors from succeeding. …
Bank Reconciliation is the process of matching the company's cash balance to the bank statement. The aim is to ensure all transactions, like customer payments, bank fees, outstanding checks, and refunds, are accurately recorded in the company's cashbooks. Bank reconciliation is crucial for identifying accounting errors and detecting fraud or theft. Without proper reconciliation of…
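The matching step described in this excerpt can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the article's method: each entry is a hypothetical `(description, amount_in_cents)` tuple, and two entries are treated as matching only when both fields are identical. Real reconciliation usually also has to handle dates, references, and timing differences.

```python
from collections import Counter


def reconcile(cashbook, bank_statement):
    """Return (entries only in the cashbook, entries only on the bank statement).

    Counter is used as a multiset so that duplicate transactions
    (e.g. two identical fees) are matched one-for-one.
    """
    book = Counter(cashbook)
    stmt = Counter(bank_statement)
    only_in_book = list((book - stmt).elements())
    only_in_bank = list((stmt - book).elements())
    return only_in_book, only_in_bank


# Made-up example data; amounts are in cents to avoid float rounding.
cashbook = [("customer payment", 50000), ("check 101", -12000)]
bank_statement = [("customer payment", 50000), ("bank fee", -1500)]

outstanding, unrecorded = reconcile(cashbook, bank_statement)
print("In cashbook but not on statement:", outstanding)
print("On statement but not in cashbook:", unrecorded)
```

Entries left in the first list would correspond to items such as outstanding checks, while entries in the second list would be items, such as bank fees, that still need to be recorded in the cashbook.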
Large Language Models (LLMs) have recently extended their reach beyond traditional natural language processing, demonstrating significant potential in tasks requiring multimodal information. Their integration with video perception abilities is particularly noteworthy, a pivotal move in artificial intelligence. This research takes a giant leap in exploring LLMs’ capabilities in video grounding (VG), a critical task in…
I like to think of ChatGPT as a smarter version of StackOverflow. Very helpful, but not replacing professionals any time soon. As a former data scientist, I spent a solid amount of time playing around with ChatGPT when it came out. I was pretty impressed with its coding capacity. It…