This Week in AI: Exploring LLMs & Building RAG-Based Chatbots


AI Tech Circle

Stay ahead in AI with the weekly AI roundup; read and listen on AITechCircle:

Welcome to the weekly AI Newsletter, your go-to source for practical and actionable ideas. I’m here to give you tips that you can immediately apply to your job and business.

Before we start, share this week’s updates with a friend or a colleague:

Today at a Glance:

  • Start your learning journey with code to build a RAG-based Chat with knowledge sources.
  • Generative AI use case: AI-powered virtual assistants that provide personalized responses to citizen inquiries about public services
  • AI Weekly news and updates covering Small Language Models
  • Open Tech Talk Podcast, the latest episode on Developing AI Products From Tech Stack to User Feedback with Jason Agouris

Chat with Knowledge Base through RAG

Retrieval-augmented generation (RAG) comes up frequently when implementing large language models in business. RAG is widely considered the solution of choice when business data must be combined with generative AI: it acts as a bridge between your organizational data and the LLM, helping ensure you receive the desired outputs.

This topic was briefly covered in an earlier edition of the newsletter, “Build Your Business-Specific LLMs Using RAG.” Read it to understand the fundamentals.

This week, I reviewed the technical aspects of RAG based on the article published by Cohere.

The notebook follows these steps:

  • Step 1: Ingest the documents. In this scenario, documents are fetched from web sources, then chunked, embedded, and indexed.
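The chunking part of Step 1 can be sketched in a few lines. This is a minimal illustration, not the notebook's actual code: it splits a document into overlapping word windows before embedding and indexing, and the chunk size and overlap values are arbitrary choices for demonstration.

```python
# Minimal sketch of the "chunk" stage: split text into overlapping
# word windows so each chunk can later be embedded and indexed.
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    words = text.split()
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        if window:
            chunks.append(" ".join(window))
        if start + chunk_size >= len(words):
            break  # the final window reached the end of the document
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk, a common trade-off between index size and recall.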

For each user-chatbot interaction:

  • Step 2: Get the user message through the chat interface
  • Step 3: Call the Chat endpoint in query-generation mode
  • If at least one query is generated:
      • Step 4: Retrieve and rerank the relevant documents
      • Step 5: Call the Chat endpoint in document mode to generate a grounded response with citations
  • If no query is generated:
      • Step 4: Call the Chat endpoint in normal mode to generate a response
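The per-turn branching above can be sketched as plain control flow. In this hypothetical skeleton the three model calls are injected as callables so the logic stands out; in the Cohere notebook they would map to Chat-endpoint calls in query-generation, document, and normal mode respectively.

```python
from typing import Callable

def chat_turn(
    message: str,
    generate_queries: Callable[[str], list[str]],       # query-generation mode
    retrieve_and_rerank: Callable[[str], list[dict]],   # document retrieval + rerank
    answer_grounded: Callable[[str, list[dict]], str],  # document mode, with citations
    answer_plain: Callable[[str], str],                 # normal mode
) -> str:
    queries = generate_queries(message)
    if queries:  # at least one search query was generated
        docs: list[dict] = []
        for q in queries:
            docs.extend(retrieve_and_rerank(q))
        return answer_grounded(message, docs)
    return answer_plain(message)  # no retrieval needed for this turn
```

Keeping the decision ("did the model emit search queries?") separate from the model calls makes the flow easy to test with stubs before wiring in real endpoints.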

The notebook is also available on GitHub; you can download it and work through it.

Eventually, in the business context, we will develop the chat interface using a chatbot or digital assistant platform.

Weekly News & Updates…

Last week’s AI breakthroughs marked another leap forward in the tech revolution.

  1. SearchGPT from OpenAI, a prototype of new AI search features, gives you fast and timely answers with clear and relevant sources. Link
  2. Mistral and Nvidia have collaboratively launched Mistral NeMo, a model with 12 billion parameters. It features a context window of 128,000 tokens, matching that of GPT-4o mini and surpassing most models of similar size. The model is available under the Apache 2.0 open-source license. Link
  3. Hugging Face has introduced SmolLM, a family of three compact models with 135 million, 360 million, and 1.7 billion parameters, specifically designed for on-device use. Both the base and instruction-tuned versions, including weights, are freely available without any restrictions on commercial use. Link
  4. Cohere has released Rerank 3 Nimble, a new model in its Rerank 3 series designed to improve search quality while maximizing throughput. It is approximately three times faster than Rerank 3 while maintaining high accuracy.
  5. Salesforce has unveiled xLAM, a new family of large action models (LAMs) engineered to plan and execute tasks autonomously. Available in 1.35 billion and 7 billion parameter versions, these models exhibit impressive capabilities despite their relatively compact size. Link
  6. Stable Audio Open: This open-weight text-to-audio model generates high-quality 44.1 kHz stereo audio from text prompts. It runs on consumer-grade GPUs and is well suited to synthesizing realistic sounds and field recordings, making it accessible for academic and artistic use. Link

The Cloud: the backbone of the AI revolution

  • Using GenAI to transform DevOps on OCI
  • Customize Generative AI Models for Enterprise Applications with Llama 3.1 from Nvidia

Gen AI Use Case of the Week:

Generative AI use cases in the Government and Public Sector:

Utilizing large language models (LLMs), AI-powered virtual assistants can provide personalized responses to citizen inquiries about public services.

Implementing large language models (LLMs) streamlines inquiry management, enhances citizen satisfaction, and reduces operational costs by leveraging advanced natural language processing (NLP). Integrating these AI solutions into existing government communication channels offers a scalable, efficient way to improve public service delivery.

Business Challenges

  1. Managing a large volume of citizen inquiries efficiently
  2. Ensuring accurate and timely responses to diverse questions
  3. Reducing the operational costs associated with customer service
  4. Enhancing citizen satisfaction and trust in public services

AI Solution Description

Utilize large language models (LLMs) to develop AI-powered virtual assistants that can provide personalized responses to citizen questions about public services. This will be implemented by training LLMs on comprehensive datasets, including public service information, FAQs, and previous inquiries. The virtual assistant will use natural language processing (NLP) to understand and respond to citizen questions, offering real-time, accurate, and contextually relevant answers.

Steps:

1. Data Collection and Training: Gather extensive datasets from government databases, public service records, and historical inquiry logs to train the LLM.

2. Integration and Deployment: Integrate the trained LLM into government communication channels, such as websites, mobile apps, and call centers.

3. Continuous Improvement: Implement feedback mechanisms to continuously update and improve the model based on citizen interactions and new public service data.
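To make the idea concrete, here is a deliberately simple, hypothetical sketch of the core retrieval step such an assistant performs: matching a citizen inquiry to the closest FAQ entry. It uses bag-of-words cosine similarity; a production system would use learned embeddings and an LLM to phrase the final answer, and all FAQ text below is invented for illustration.

```python
# Toy FAQ matcher: score a citizen question against each FAQ entry
# with bag-of-words cosine similarity and return the best answer.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_faq(question: str, faqs: dict[str, str]) -> str:
    q = Counter(question.lower().split())
    scored = {answer: cosine(q, Counter(faq.lower().split()))
              for faq, answer in faqs.items()}
    return max(scored, key=scored.get)

# Invented sample data for demonstration only.
faqs = {
    "how do i renew my passport": "Passport renewal: apply online or at a post office.",
    "where do i pay property tax": "Property tax can be paid via the city portal.",
}
print(best_faq("I need to renew my passport", faqs))
```

In the full pipeline described above, the matched FAQ (plus policy documents) would be passed to the LLM as grounding context rather than returned verbatim.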

Expected Impact/Business Outcome

  • Revenue: Indirect improvement by optimizing resource allocation and reducing costs
  • User Experience: Enhanced citizen satisfaction through quick, accurate, and personalized responses
  • Operations: Streamlined inquiry handling, freeing up human resources for more complex tasks
  • Process: Improved efficiency in managing and responding to a high volume of inquiries
  • Cost: Significant reduction in operational costs related to customer service.

Required Data Sources

  • Historical inquiry logs
  • Frequently Asked Questions (FAQs) documents
  • Government policies and service guidelines
  • Government services offering systems

Strategic Fit and Impact

  • Enhances public trust and satisfaction with government services
  • Increases operational efficiency and reduces costs
  • Leverages advanced AI technology to provide scalable and sustainable solutions
  • Aligns with digital transformation initiatives in the public sector

Rating: High Impact & strategic fit

Favorite Tip Of The Week:

Here’s my favorite resource of the week.

  • A must-read paper on the technical governance of AI, documenting 100+ open technical problems. Link

Potential of AI

  • Google DeepMind's AI solves International Mathematical Olympiad problems at a silver-medalist level: AlphaProof and AlphaGeometry 2 tackle advanced reasoning problems in mathematics.

Things to Know…

This week, we should all read in depth about the massive Windows outage worldwide, which The Verge narrated comprehensively, to understand what happened and what can be done in the future.

“Inside the 78 minutes that took down millions of Windows machines.”

The Opportunity…

Podcast:

  • This week’s Open Tech Talks episode 140 is “Developing AI Products From Tech Stack to User Feedback with Jason Agouris.” Jason’s extensive experience in systems integration across retail, fintech, wholesale, and supply chain logistics makes him our go-to data integration and strategy guru.

Apple | Spotify | YouTube

Courses to attend:

  • Federated Learning course from Flower, an open-source framework, to build a federated learning system and implement federated fine-tuning of LLMs with private data

Tech and Tools…

  • Dify is an open-source LLM app development platform.
  • Sourcery is an automated code reviewer that will review any pull request in any language to provide instant feedback on the proposed changes.
  • Everything-ai: a local, AI-powered chatbot assistant

Data Sets…

  • MELD (Multimodal EmotionLines Dataset): a dataset for emotion recognition in multiparty conversations.
  • Objaverse is a massive dataset of annotated 3D objects. It has two releases: Objaverse 1.0, “A Universe of Annotated 3D Objects” (800K+), and Objaverse-XL, “A Universe of 10M+ 3D Objects.”

Other Technology News

Want to stay on the cutting edge?

Here’s what else is happening in Information Technology you should know about:

  • FACT SHEET: Biden-Harris Administration Announces New AI Actions and Receives Additional Major Voluntary Commitment on AI
  • AI training costs are growing exponentially — IBM says quantum computing could be a solution, a story published by VentureBeat

Join a mini email course on Generative AI …

Introduction to Generative AI for Newbies

The opinions expressed here are solely my conjecture based on experience, practice, and observation. They do not represent the thoughts, intentions, plans, or strategies of my current or previous employers or their clients/customers. The objective of this newsletter is to share and learn with the community.