Gradient descent is the most widely used optimization strategy in machine learning and deep learning. Whenever it comes to training models, gradient descent is combined with other algorithms, and it is easy to implement and understand. It is commonly said that anyone who wants to work with machine learning must understand this concept in detail.
This article curates information from different sources so that you can learn the basics. This week, I was given an assignment on gradient descent in my MSc AI course.
If you are new to this journal, Open Tech Talks is your weekly sandbox for technology insights, experimentation, and inspiration, with the primary objective of learning and sharing. This podcast returns after a break of 3-4 months; due to a busy schedule at the office, along with focusing on my MSc AI, I was not able to spare the time. After careful consideration of my professional and personal passions, I am going to focus on the following three areas:
- AI: As a student of AI, I will write on topics related to artificial intelligence, data science, and machine learning.
- Oracle Technologies: My professional career is built on Oracle technologies, so I will share whatever I am newly learning about Oracle that is a good fit for the rest of the community.
- Digital/Business Transformation: As we go through massive change in all areas of our lives, it is important to keep an eye on what is working and what is not, and on how we can transform our businesses and communities to use emerging technologies to build a better place to live.
Introduction to Gradient Descent
There is always a question about a model's performance, and to find out how a model performs, 'optimization' is the technique used across the board. Optimization focuses on how we can get the best possible output from the model; it is also another way to check the precision of the output. In deep learning, by contrast, you don't know in advance what outcome you will get from new data. Is it correct? Is it what you were aiming for?
So the question arises: what to do? If you use an optimization algorithm to train a model, you get a measurable performance outcome. Put another way, it is gradient descent that helps us identify the parameters that minimize the cost function (the prediction error). Gradient descent does this by iteratively moving toward a set of parameter values that minimize the function, taking steps in the direction opposite to the gradient. A minimal sketch of this update rule follows; after that, let's dive into the formal definition of gradient descent.
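To make this concrete, here is a toy sketch in Python (my own illustration, not taken from any of the sources below) of the update rule theta := theta − alpha · dJ/dtheta, applied to a one-parameter cost J(theta) = (theta − 3)², whose minimum sits at theta = 3:

```python
# Toy gradient descent on the cost J(theta) = (theta - 3)**2.
# The minimum is at theta = 3, so the loop should converge there.

def cost_gradient(theta):
    # Derivative of (theta - 3)**2 with respect to theta.
    return 2 * (theta - 3)

theta = 0.0          # arbitrary starting guess
learning_rate = 0.1  # step size, often written as alpha

for step in range(50):
    # Take a step in the direction opposite to the gradient.
    theta -= learning_rate * cost_gradient(theta)

print(theta)  # approaches 3.0
```

The learning rate matters here: too large and the steps overshoot the minimum, too small and convergence becomes painfully slow.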
What is Gradient Descent?
Wikipedia gives us a primary definition, along with the history of who is behind the method:
Gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function. To find a local minimum of a function using gradient descent, we take steps proportional to the negative of the gradient (or approximate gradient) of the function at the current point. But if we instead take steps proportional to the positive of the gradient, we approach a local maximum of that function; the procedure is then known as gradient ascent. Gradient descent is generally attributed to Cauchy, who first suggested it in 1847,[1] but its convergence properties for non-linear optimization problems were first studied by Haskell Curry in 1944.[2]
And if we look at the definition of just the word 'gradient':
A gradient is a vector-valued function representing the slope of the tangent of the graph of the function, pointing in the direction of the greatest rate of increase of the function. It is a derivative that indicates the incline, or slope, of the cost function.
In plain terms, as I understood it, the gradient describes the direction and rate at which a quantity increases or decreases.
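As a quick illustration (again my own toy example, not from the sources below), the gradient of f(x, y) = x² + y² is the vector of partial derivatives (2x, 2y), and it points in the direction in which f increases fastest:

```python
# The gradient of f(x, y) = x**2 + y**2 is (df/dx, df/dy) = (2x, 2y).

def gradient(x, y):
    # Vector of partial derivatives of f at the point (x, y).
    return (2 * x, 2 * y)

gx, gy = gradient(3.0, 4.0)
print(gx, gy)  # 6.0 8.0

# Moving along (gx, gy) increases f fastest (steepest ascent);
# moving along (-gx, -gy) decreases it fastest, which is exactly
# the step gradient descent takes.
```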
You’ll Learn:
The most widely used optimization algorithm in machine learning
- Introduction to Gradient Descent
- What is Gradient Descent?
- How does it work?
- Types of Gradient Descent
1. Gradient Descent, Step-by-Step
2. Gradient Descent from Andrew Ng
3. Khan Academy: Why the gradient is the direction of steepest ascent
4. Gradient descent, how neural networks learn
5. MIT – Gradient Descent: Downhill to a Minimum
Resources:
- Open Tech Talks sessions on AI and Data Science
- Wikipedia's entry on gradient descent, which gives a primary definition and the history of who is behind the method
- An analogy for understanding gradient descent – Wikipedia
- "Machine Learning Crash Course" from Google
- Another excellent explanation of gradient descent from our favorite source, Khan Academy
- The definition of gradient descent according to DeepAI
- A paper hosted by Cornell University, 'An overview of gradient descent optimization algorithms'
To share your thoughts:
- Leave a comment in the section below this post
- Suggest any new topic we should cover in a future podcast
- Join us in the Mastermind tribe
- Share this on Twitter or Facebook if you enjoyed this episode, so we can keep learning new technologies together
To help out this initiative:
- Leave a candid review for the OTechTalks Podcast on iTunes! Your ratings and reviews help the show on iTunes.
- Subscribe to the Podcast on iTunes to get upcoming sessions automatically