This Week in AI: Exploring LLMs & Building RAG-Based Chatbots


AI Tech Circle

Stay ahead in AI with the weekly AI roundup; read and listen on AITechCircle:

Welcome to the weekly AI Newsletter, your go-to source for practical and actionable ideas. I’m here to give you tips that you can immediately apply to your job and business.

Before we start, share this week’s updates with a friend or a colleague:

Today at a Glance:

  • Start your learning journey with code to build a RAG-based Chat with knowledge sources.
  • Generative AI use case: AI-powered virtual assistants that provide personalized responses to citizen inquiries about public services
  • AI Weekly news and updates covering Small Language Models
  • Open Tech Talk Podcast, the latest episode on Developing AI Products From Tech Stack to User Feedback with Jason Agouris

Chat with Knowledge Base through RAG

Retrieval-augmented generation (RAG) comes up frequently when implementing large language models in business. RAG is widely considered the solution of choice when business data must be combined with generative AI: it acts as a bridge between your organizational data and the LLM, helping ensure you receive the desired outputs.

This topic was briefly covered in an earlier edition of the newsletter, “Build Your Business-Specific LLMs Using RAG.” Read it to understand the fundamentals.

This week, I reviewed the technical aspects of RAG based on the article published by Cohere.

The notebook follows these steps:

  • Step 1: Ingest the documents. In this scenario, documents are fetched from web sources, then chunked, embedded, and indexed.
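The chunking part of Step 1 can be sketched in a few lines. This is a minimal illustration, not the notebook's actual code: it splits a document into overlapping word windows before embedding and indexing, and the chunk size and overlap values are arbitrary choices for demonstration.

```python
# Minimal sketch of the "chunk" stage: split text into overlapping
# word windows so each chunk can later be embedded and indexed.
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    words = text.split()
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        if window:
            chunks.append(" ".join(window))
        if start + chunk_size >= len(words):
            break  # the final window reached the end of the document
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk, a common trade-off between index size and recall.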

For each user-chatbot interaction:

  • Step 2: Get the user message through the chat interface
  • Step 3: Call the Chat endpoint in query-generation mode
  • If at least one query is generated:
      • Step 4: Retrieve and rerank the relevant documents
      • Step 5: Call the Chat endpoint in document mode to generate a grounded response with citations
  • If no query is generated:
      • Step 4: Call the Chat endpoint in normal mode to generate a response
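The per-turn branching above can be sketched as plain control flow. In this hypothetical skeleton the three model calls are injected as callables so the logic stands out; in the Cohere notebook they would map to Chat-endpoint calls in query-generation, document, and normal mode respectively.

```python
from typing import Callable

def chat_turn(
    message: str,
    generate_queries: Callable[[str], list[str]],       # query-generation mode
    retrieve_and_rerank: Callable[[str], list[dict]],   # document retrieval + rerank
    answer_grounded: Callable[[str, list[dict]], str],  # document mode, with citations
    answer_plain: Callable[[str], str],                 # normal mode
) -> str:
    queries = generate_queries(message)
    if queries:  # at least one search query was generated
        docs: list[dict] = []
        for q in queries:
            docs.extend(retrieve_and_rerank(q))
        return answer_grounded(message, docs)
    return answer_plain(message)  # no retrieval needed for this turn
```

Keeping the decision ("did the model emit search queries?") separate from the model calls makes the flow easy to test with stubs before wiring in real endpoints.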

The notebook is also available on GitHub; you can download it and work through it.

Eventually, in the business context, we will develop the chat interface using a chatbot or digital assistant platform.

Weekly News & Updates…

Last week’s AI breakthroughs marked another leap forward in the tech revolution.

  1. SearchGPT from OpenAI, a prototype of new AI search features, gives you fast and timely answers with clear and relevant sources. Link
  2. Mistral and Nvidia have collaboratively launched Mistral NeMo, a model with 12 billion parameters. It features a context window of 128,000 tokens, matching that of GPT-4o mini and surpassing most models of similar size. The model is available under the Apache 2.0 open-source license. Link
  3. Hugging Face has introduced SmolLM, a family of three compact models with 135 million, 360 million, and 1.7 billion parameters, specifically designed for on-device use. Both the base and instruction-tuned versions, including weights, are freely available without any restrictions on commercial use. Link
  4. Cohere has released Rerank 3 Nimble, a new model in its Rerank 3 series designed to improve search quality while maximizing throughput. It is approximately three times faster than Rerank 3 while maintaining high accuracy.
  5. Salesforce has unveiled xLAM, a new family of large action models (LAMs) engineered to plan and execute tasks autonomously. Available in 1.35 billion and 7 billion parameter versions, these models exhibit impressive capabilities despite their relatively compact size. Link
  6. Stable Audio Open: This open-weight text-to-audio model generates high-quality 44.1 kHz stereo audio from text prompts. It runs on consumer-grade GPUs and is well suited to synthesizing realistic sounds and field recordings, making it accessible for academic and artistic use. Link

The Cloud: the backbone of the AI revolution

  • Using GenAI to transform DevOps on OCI
  • Customize Generative AI Models for Enterprise Applications with Llama 3.1 from Nvidia

Gen AI Use Case of the Week:

Generative AI use cases in the Government and Public Sector:

Utilizing large language models (LLMs), AI-powered virtual assistants can provide personalized responses to citizen inquiries about public services.

Implementing large language models (LLMs) streamlines inquiry management, enhances citizen satisfaction, and reduces operational costs by leveraging advanced natural language processing (NLP). Integrating these AI solutions into existing government communication channels offers a scalable, efficient way to improve public service delivery.

Business Challenges

  1. Managing a large volume of citizen inquiries efficiently
  2. Ensuring accurate and timely responses to diverse questions
  3. Reducing the operational costs associated with customer service
  4. Enhancing citizen satisfaction and trust in public services

AI Solution Description

Utilize large language models (LLMs) to develop AI-powered virtual assistants that can provide personalized responses to citizen questions about public services. This will be implemented by training LLMs on comprehensive datasets, including public service information, FAQs, and previous inquiries. The virtual assistant will use natural language processing (NLP) to understand and respond to citizen questions, offering real-time, accurate, and contextually relevant answers.

Steps:

1. Data Collection and Training: Gather extensive datasets from government databases, public service records, and historical inquiry logs to train the LLM.

2. Integration and Deployment: Integrate the trained LLM into government communication channels, such as websites, mobile apps, and call centers.

3. Continuous Improvement: Implement feedback mechanisms to continuously update and improve the model based on citizen interactions and new public service data.
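To make the idea concrete, here is a deliberately simple, hypothetical sketch of the core retrieval step such an assistant performs: matching a citizen inquiry to the closest FAQ entry. It uses bag-of-words cosine similarity; a production system would use learned embeddings and an LLM to phrase the final answer, and all FAQ text below is invented for illustration.

```python
# Toy FAQ matcher: score a citizen question against each FAQ entry
# with bag-of-words cosine similarity and return the best answer.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_faq(question: str, faqs: dict[str, str]) -> str:
    q = Counter(question.lower().split())
    scored = {answer: cosine(q, Counter(faq.lower().split()))
              for faq, answer in faqs.items()}
    return max(scored, key=scored.get)

# Invented sample data for demonstration only.
faqs = {
    "how do i renew my passport": "Passport renewal: apply online or at a post office.",
    "where do i pay property tax": "Property tax can be paid via the city portal.",
}
print(best_faq("I need to renew my passport", faqs))
```

In the full pipeline described above, the matched FAQ (plus policy documents) would be passed to the LLM as grounding context rather than returned verbatim.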

Expected Impact/Business Outcome

  • Revenue: Indirect improvement by optimizing resource allocation and reducing costs
  • User Experience: Enhanced citizen satisfaction through quick, accurate, and personalized responses
  • Operations: Streamlined inquiry handling, freeing up human resources for more complex tasks
  • Process: Improved efficiency in managing and responding to a high volume of inquiries
  • Cost: Significant reduction in operational costs related to customer service.

Required Data Sources

  • Historical inquiry logs
  • Frequently Asked Questions (FAQs) documents
  • Government policies and service guidelines
  • Government services offering systems

Strategic Fit and Impact

  • Enhances public trust and satisfaction with government services
  • Increases operational efficiency and reduces costs
  • Leverages advanced AI technology to provide scalable and sustainable solutions
  • Aligns with digital transformation initiatives in the public sector

Rating: High Impact & strategic fit

Favorite Tip Of The Week:

Here’s my favorite resource of the week.

  • A must-read paper on the technical governance of AI, documenting 100+ open technical problems. Link

Potential of AI

  • Google DeepMind's AI solves International Mathematical Olympiad problems at a silver-medalist level: AlphaProof and AlphaGeometry 2 tackle advanced reasoning problems in mathematics.

Things to Know…

This week, we should all read in depth about the massive Windows outage worldwide, which The Verge narrated comprehensively, to understand what happened and what can be done in the future.

“Inside the 78 minutes that took down millions of Windows machines.”

The Opportunity…

Podcast:

  • This week’s Open Tech Talks episode 140 is “Developing AI Products From Tech Stack to User Feedback with Jason Agouris.” Jason’s extensive experience in systems integration across retail, fintech, wholesale, and supply chain logistics makes him our go-to data integration and strategy guru.

Apple | Spotify | YouTube

Courses to attend:

  • Federated Learning course from Flower, an open-source framework, to build a federated learning system and implement federated fine-tuning of LLMs with private data

Tech and Tools…

  • Dify is an open-source LLM app development platform.
  • Sourcery is an automated code reviewer that will review any pull request in any language to provide instant feedback on the proposed changes.
  • Everything-ai: a local, AI-powered chatbot assistant

Data Sets…

  • MELD (Multimodal EmotionLines Dataset): a dataset for emotion recognition in multiparty conversations.
  • Objaverse is a massive dataset of annotated 3D objects. It has two releases: Objaverse 1.0, “A Universe of Annotated 3D Objects” (800K+), and Objaverse-XL, “A Universe of 10M+ 3D Objects.”

Other Technology News

Want to stay on the cutting edge?

Here’s what else is happening in Information Technology you should know about:

  • FACT SHEET: Biden-Harris Administration Announces New AI Actions and Receives Additional Major Voluntary Commitment on AI
  • AI training costs are growing exponentially — IBM says quantum computing could be a solution, a story published by VentureBeat

Join a mini email course on Generative AI …

Introduction to Generative AI for Newbies

The opinions expressed here are solely my conjecture based on experience, practice, and observation. They do not represent the thoughts, intentions, plans, or strategies of my current or previous employers or their clients/customers. The objective of this newsletter is to share and learn with the community.