Gradient Descent Methods
Understanding gradient descent and its variants for optimization
Introduction
Gradient descent is a fundamental optimization algorithm widely used in machine learning for minimizing loss functions.
Basic Gradient Descent
The simplest form updates parameters in the direction of the negative gradient:
θ = θ - α∇f(θ)

Where:
- θ: parameters
- α: learning rate
- ∇f(θ): gradient of the objective function
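A minimal sketch of this update rule in Python; the quadratic objective and its gradient are illustrative assumptions, not part of the original text:

```python
import numpy as np

def gradient_descent(grad_f, theta0, alpha=0.1, n_steps=100):
    """Repeatedly apply theta = theta - alpha * grad_f(theta)."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_steps):
        theta = theta - alpha * grad_f(theta)
    return theta

# Illustrative example: minimize f(theta) = ||theta||^2, whose gradient is 2*theta.
theta_min = gradient_descent(lambda th: 2 * th, theta0=[3.0, -2.0])
print(theta_min)  # approaches [0, 0]
```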
Variants
Stochastic Gradient Descent (SGD)
Updates parameters using gradients computed on individual data points or small mini-batches rather than the full dataset, making each step much cheaper.
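A rough mini-batch SGD sketch, assuming a dataset of (X, y) pairs and a user-supplied per-batch gradient function; the least-squares example and all names are illustrative:

```python
import numpy as np

def sgd(grad_batch, theta0, X, y, alpha=0.1, batch_size=32, n_epochs=20, seed=0):
    """Update theta using gradients computed on small random batches."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    n = len(X)
    for _ in range(n_epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            theta = theta - alpha * grad_batch(theta, X[idx], y[idx])
    return theta

# Illustrative example: least-squares gradient for a linear model.
def lsq_grad(theta, Xb, yb):
    return 2.0 / len(Xb) * Xb.T @ (Xb @ theta - yb)

X = np.random.default_rng(1).normal(size=(200, 2))
y = X @ np.array([1.5, -0.5])
print(sgd(lsq_grad, theta0=np.zeros(2), X=X, y=y))  # moves toward [1.5, -0.5]
```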
Momentum
Accumulates a velocity term from past gradients to accelerate convergence and damp oscillations:
v = βv + α∇f(θ)
θ = θ - v
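A small sketch of the momentum update above, following the same convention (velocity v, decay β, learning rate α); the quadratic test objective is again an illustrative assumption:

```python
import numpy as np

def momentum_descent(grad_f, theta0, alpha=0.1, beta=0.9, n_steps=200):
    """v = beta*v + alpha*grad_f(theta);  theta = theta - v."""
    theta = np.asarray(theta0, dtype=float)
    v = np.zeros_like(theta)
    for _ in range(n_steps):
        v = beta * v + alpha * grad_f(theta)
        theta = theta - v
    return theta

# Same illustrative quadratic as before: f(theta) = ||theta||^2.
print(momentum_descent(lambda th: 2 * th, theta0=[3.0, -2.0]))  # approaches [0, 0]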
Adam

Adaptive per-parameter learning rates combined with momentum:
m = β₁m + (1-β₁)∇f(θ)
v = β₂v + (1-β₂)(∇f(θ))²
m̂ = m/(1-β₁ᵗ),  v̂ = v/(1-β₂ᵗ)   (bias correction, with t the step count)
θ = θ - α * m̂/(√v̂ + ε)
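A compact sketch of these Adam updates; the default values for β₁, β₂, and ε and the quadratic test objective are illustrative assumptions:

```python
import numpy as np

def adam(grad_f, theta0, alpha=0.05, beta1=0.9, beta2=0.999, eps=1e-8, n_steps=500):
    """Adam: bias-corrected first and second moment estimates drive the update."""
    theta = np.asarray(theta0, dtype=float)
    m = np.zeros_like(theta)
    v = np.zeros_like(theta)
    for t in range(1, n_steps + 1):
        g = grad_f(theta)
        m = beta1 * m + (1 - beta1) * g        # first moment (running mean of gradients)
        v = beta2 * v + (1 - beta2) * g**2     # second moment (running mean of squared gradients)
        m_hat = m / (1 - beta1**t)             # bias correction
        v_hat = v / (1 - beta2**t)
        theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta

print(adam(lambda th: 2 * th, theta0=[3.0, -2.0]))  # approaches [0, 0]
```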
Applications
- Neural network training
- Linear regression
- Logistic regression
- Support vector machines