DeepSeek Special Edition: Action Steps for the Week


Welcome to your weekly AI Newsletter from AITechCircle!

I’m building and implementing AI solutions, and sharing everything I learn along the way…

Check out the updates from this week! Please take a moment to share them with a friend or colleague who might benefit from these valuable insights!

Today at a Glance:

  • Everything about DeepSeek
  • Generative AI Use cases repository
  • AI Weekly news and updates covering newly released LLMs
  • Courses and events to attend

Everything about DeepSeek

DeepSeek-R1 and DeepSeek-R1-Zero have taken the world by storm, surprise, and excitement: the DeepSeek app became the fastest-growing download in the world and shot to the top of the charts.

A lot has already been written about DeepSeek, and I considered adding to it; however, I realized it would be more useful to curate several resources in one place to help people start learning and using AI in their business processes. This can help grow revenue, reduce costs, and increase productivity.

As the DeepSeek team highlighted:

“DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models”

This week, there is a lot for you to read and practice on DeepSeek, but before you get to that stage, I want to leave you with a brief summary:

I also encourage you to spare some time during the coming week to read the paper “DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning.” link

Can I run DeepSeek R1 models locally?

  • Download LM Studio for your operating system
  • Search for a DeepSeek model and choose one based on the resources available on your laptop
  • With 16 GB of RAM, you can run the 7B or 8B parameter distilled models
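A quick way to sanity-check whether a model will fit in your laptop’s RAM is to estimate its weight footprint: parameter count times bytes per parameter, plus some runtime overhead. The sketch below is a back-of-the-envelope helper, not anything from LM Studio; the 1.2× overhead factor and the quantization choices are my assumptions.

```python
def estimated_memory_gb(params_billions: float, bits_per_param: int,
                        overhead: float = 1.2) -> float:
    """Rough RAM needed for a model's weights, with a fudge factor for
    KV cache and runtime overhead (the 1.2x factor is an assumption)."""
    bytes_per_param = bits_per_param / 8
    return params_billions * 1e9 * bytes_per_param * overhead / 1e9

# A 7B distilled model at 4-bit quantization: ~4.2 GB, fits in 16 GB
print(round(estimated_memory_gb(7, 4), 1))

# The full 671B model at 8-bit (Q8) quantization: ~805 GB, server territory
print(round(estimated_memory_gb(671, 8), 1))
```

This is also why the full 671B model discussed later in this issue needs a dedicated server-class build, while the distilled 7B/8B variants run on an ordinary laptop.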

Weekly News & Updates…

Last week’s AI breakthroughs marked another leap forward in the tech revolution.

  1. Mistral Small 3, a latency-optimized 24B-parameter model. It is a pre-trained and instruction-tuned model catering to the ‘80%’ of generative AI tasks: those that require robust language and instruction-following performance with very low latency. link
  2. Pika Art 2.1: Video Creation introduces stunning high-definition visuals and advanced motion controls. link
  3. Kimi k1.5: an o1-level multi-modal model. link
  4. Qwen2.5-Max: Exploring the Intelligence of Large-scale MoE Model. A large-scale MoE model that has been pre-trained on over 20 trillion tokens and further post-trained with curated Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) methodologies. link
  5. Janus Pro: DeepSeek’s Multimodal AI Model link
  6. WeatherNext, a new family of AI models from Google DeepMind and Google Research, produces state-of-the-art weather forecasts. link

The Cloud: the backbone of the AI revolution

  • DeepSeek-R1 Now Live With NVIDIA NIM link

Generative AI Use Case of the Week:

Several generative AI use cases are documented; you can access the library of Gen AI use cases here: link

Favorite Tip Of The Week:

Here’s my favorite resource of the week.

DeepSeek RAG Agent: 100% Local using Ollama link

Build a text-to-image generation and understanding app locally using DeepSeek Janus. link

Things to Know…

DeepSeek R1 on a $2K server: Deepseek R1 671b Running and Testing on a $2000 Local AI Server

Also noted: a complete hardware and software setup for running DeepSeek-R1 locally, using the actual model (no distillations) with Q8 quantization for full quality. Total cost: $6,000.

The Opportunity…

Podcast:

  • This week’s Open Tech Talks episode 154 is “Generative AI Risks and Governance: What Business Leaders Need to Know with Terry Ziemniak.” Terry is a Fractional CISO and Partner at TechCXO.

Apple | Spotify | Amazon Music

Courses to attend:

  • DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models. Youtube Video
  • How to Use DeepSeek-R1, from freeCodeCamp

Tech and Tools…

  • PDF Q&A Chatbot Tutorial: Create an AI chatbot that answers questions from PDFs using DeepSeek’s LLM and LangChain’s document processing. Features intelligent text processing, advanced reasoning, and an intuitive interface.
  • Compare OpenAI o3-mini and DeepSeek-R1 using RAG
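The PDF Q&A pattern in the tutorial above boils down to: extract the text, split it into overlapping chunks, embed and index the chunks, then retrieve the relevant ones for each question. As an illustration of just the splitting step, here is a minimal, dependency-free chunker; the chunk size and overlap values are arbitrary choices of mine, not the tutorial’s.

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping character chunks so that an answer
    spanning a chunk boundary is still retrievable from at least one chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars shared
    return chunks

doc = "word " * 300  # stand-in for text extracted from a PDF (1,500 chars)
chunks = split_into_chunks(doc)
print(len(chunks), len(chunks[0]))  # -> 4 500
```

Libraries such as LangChain provide more sophisticated splitters (recursive, token-aware), but the overlap idea is the same.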

Data Sets…

  • Human-Like-DPO-Dataset: This dataset was created as part of research aimed at improving conversational fluency and engagement in large language models. It is suitable for formats like Direct Preference Optimization (DPO), which guides models toward generating more human-like responses. link
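For context, DPO training data pairs each prompt with a preferred (“chosen”) and a dispreferred (“rejected”) response, and training nudges the model toward the chosen one. The sketch below shows the typical record shape; the field names follow the common DPO convention, and the example text is mine, not taken from the dataset.

```python
import json

# One hypothetical preference record in the usual DPO layout
record = {
    "prompt": "How's your day going?",
    "chosen": "Pretty good, thanks for asking! How about yours?",  # human-like, engaging
    "rejected": "As an AI language model, I do not have days.",    # stiff, robotic
}

# Such datasets are commonly distributed as JSON Lines, one record per line
line = json.dumps(record)
parsed = json.loads(line)
print(sorted(parsed.keys()))  # -> ['chosen', 'prompt', 'rejected']
```

Training frameworks that implement DPO generally consume records in exactly this prompt/chosen/rejected shape.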

Other Technology News

Want to stay updated on the latest information in the field of Information Technology? Here’s what you should know:

  • From Hugging Face, a summary of progress on reproducing DeepSeek-R1, with all discoveries and discussions about DeepSeek-R1
  • AI systems with ‘unacceptable risk’ are now banned in the EU, as reported by TechCrunch: “As of Sunday in the European Union, the bloc’s regulators can ban the use of AI systems they deem to pose ‘unacceptable risk’ or harm.”

And that’s a wrap!

Thank you, as always, for taking the time to read.

I’d love to hear your thoughts. Please reply and let me know what you find most valuable this week. Your feedback means a lot.

Until next week,

Kashif Manzoor

The opinions expressed here are solely my conjecture based on experience, practice, and observation. They do not represent the thoughts, intentions, plans, or strategies of my current or previous employers or their clients/customers. The objective of this newsletter is to share and learn with the community.