
Beyond English: Implementing a multilingual RAG solution | by Jesper Alkestrup | Dec, 2023

Splitting text, the simple way (Image generated by author w. Dall-E 3)

When preparing data for embedding and retrieval in a RAG system, splitting the text into appropriately sized chunks is crucial. This process is guided by two main factors: Model Constraints and Retrieval Effectiveness.

Model Constraints: Embedding models have a maximum token length for input;…
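To make the chunk-size constraint concrete, here is a minimal sketch of a fixed-size splitter with overlap. It is not the article's method: the function name `split_into_chunks`, the parameters `max_tokens` and `overlap`, and the use of whitespace-separated words as a stand-in for model tokens are all assumptions made for illustration. A real pipeline would count tokens with the embedding model's own tokenizer.

```python
def split_into_chunks(text, max_tokens=200, overlap=20):
    """Split text into chunks of at most `max_tokens` words.

    Assumption: whitespace-separated words approximate model tokens.
    Consecutive chunks share `overlap` words so that a sentence cut at a
    boundary still appears whole in at least one neighbouring chunk.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must satisfy 0 <= overlap < max_tokens")

    words = text.split()
    step = max_tokens - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        # Stop once the current window has reached the end of the text.
        if start + max_tokens >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for chunk in split_into_chunks(sample, max_tokens=4, overlap=1):
        print(chunk)
```

A smaller `max_tokens` keeps each chunk within the model's input limit, while a non-zero `overlap` trades some duplicated text for less context lost at chunk boundaries.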


Illuminating the Black Box of Textual GenAI | by Anthony Alcaraz | Dec, 2023

The need for insights

LLMs like ChatGPT, Claude 2, Gemini, and Mistral captivate the world with their articulateness and erudition. Yet these large language models remain black boxes, concealing the intricate machinery powering their responses. Their prowess at generating human-quality text outstrips our prowess at understanding how their machine minds function. But as artificial intelligence…


A Guide to 21 Feature Importance Methods and Packages in Machine Learning (with Code) | by Theophano Mitsa | Dec, 2023

From the OmniXAI, Shapash, and Dalex interpretability packages to the Boruta, Relief, and Random Forest feature selection algorithms

Image created by the author at DALL-E

“We are our choices.” - Jean-Paul Sartre

We live in the era of artificial intelligence, mostly because of the incredible advancement of Large Language Models (LLMs). As important as it is…


Visualizing trade flow in Python maps — Part I: Bi-directional trade flow maps | by Himalaya Bir Shrestha | Dec, 2023

In the trade flow maps, I aimed to represent two-way trade relationships between countries. For example, exports from Nepal to India would be represented by the first arrow (A1-A2), and imports by Nepal from India by a second arrow (A3-A4). In this way, each country-pair relationship would require four…
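One way to picture the two-arrow idea is the sketch below, which computes a pair of parallel, opposite-direction arrows between two points. It is an illustration under stated assumptions, not the author's implementation: the function name `bidirectional_arrows`, the `offset` parameter, and the treatment of coordinates as flat x/y values (rather than projected longitude and latitude) are all hypothetical.

```python
import math


def bidirectional_arrows(p, q, offset=0.5):
    """Return two arrows, ((A1, A2), (A3, A4)), between points p and q.

    The first arrow runs from p towards q, shifted to one side of the
    straight line p-q; the second runs from q back towards p, shifted to
    the other side, so the two arrows do not overlap when drawn.
    Assumption: p and q are planar (x, y) tuples.
    """
    (x1, y1), (x2, y2) = p, q
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("p and q must be distinct points")

    # Unit vector perpendicular to the line p-q.
    nx, ny = -dy / length, dx / length

    a1 = (x1 + nx * offset, y1 + ny * offset)
    a2 = (x2 + nx * offset, y2 + ny * offset)
    a3 = (x2 - nx * offset, y2 - ny * offset)
    a4 = (x1 - nx * offset, y1 - ny * offset)
    return (a1, a2), (a3, a4)


if __name__ == "__main__":
    export_arrow, import_arrow = bidirectional_arrows((0, 0), (10, 0), offset=1)
    print("A1-A2:", export_arrow)
    print("A3-A4:", import_arrow)
```

Each returned pair could then be handed to a plotting call such as matplotlib's `Axes.annotate` with an `arrowprops` argument to draw the arrow from its start point to its end point.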
