Gradient Descent Methods

Understanding gradient descent and its variants for optimization

Introduction  

Gradient descent is a fundamental optimization algorithm, widely used in machine learning to minimize loss functions by iteratively adjusting model parameters.

Basic Gradient Descent  

The simplest form updates parameters in the direction of the negative gradient:

θ = θ - α∇f(θ)

Where:

  • θ: parameters
  • α: learning rate
  • ∇f(θ): gradient of the objective function
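
As a concrete illustration, here is a minimal Python sketch of this update rule applied to a simple quadratic objective; the objective, learning rate, and iteration count are illustrative assumptions, not part of the original formulation.

```python
import numpy as np

# Minimal sketch of plain gradient descent on a simple quadratic objective
# f(theta) = ||theta - target||^2. The objective, learning rate (alpha), and
# iteration count are illustrative assumptions.

def grad_f(theta, target):
    # Gradient of f(theta) = sum((theta - target)**2)
    return 2.0 * (theta - target)

target = np.array([1.0, -2.0, 0.5])
theta = np.zeros(3)   # initial parameters
alpha = 0.1           # learning rate

for _ in range(100):
    theta = theta - alpha * grad_f(theta, target)   # θ = θ - α∇f(θ)

print(theta)  # approaches `target`
```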

Variants  

Stochastic Gradient Descent (SGD)  

SGD estimates the gradient from individual data points or small mini-batches instead of the full dataset, which makes each update much cheaper at the cost of noisier steps.
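
Below is a minimal sketch of mini-batch SGD for least-squares linear regression; the synthetic dataset, batch size, and learning rate are assumptions chosen only for illustration.

```python
import numpy as np

# Minimal sketch of mini-batch SGD for least-squares linear regression.
# The synthetic dataset, batch size, and learning rate are illustrative assumptions.

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = np.array([1.0, -1.0, 0.5, 0.0, 2.0])
y = X @ true_w + 0.1 * rng.normal(size=1000)

w = np.zeros(5)
alpha, batch_size = 0.01, 32

for epoch in range(20):
    perm = rng.permutation(len(X))             # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        # Gradient of the mean squared error on this mini-batch only
        grad = Xb.T @ (Xb @ w - yb) / len(idx)
        w -= alpha * grad

print(w)  # close to true_w
```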

Momentum  

Adds a velocity term that accumulates past gradients, which accelerates convergence and damps oscillations:

v = βv + α∇f(θ)
θ = θ - v
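
A minimal sketch of this momentum update on the same quadratic objective as above; the β and α values are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of the momentum update v = βv + α∇f(θ), θ = θ - v on the same
# quadratic objective as above. The β and α values are illustrative assumptions.

def grad_f(theta, target):
    return 2.0 * (theta - target)

target = np.array([1.0, -2.0, 0.5])
theta = np.zeros(3)
v = np.zeros_like(theta)
alpha, beta = 0.1, 0.9

for _ in range(300):
    v = beta * v + alpha * grad_f(theta, target)   # accumulate a velocity term
    theta = theta - v                              # step along the velocity

print(theta)  # approaches `target`
```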

Adam  

Adam maintains per-parameter adaptive learning rates together with a momentum-like first-moment estimate:

m = β₁m + (1-β₁)∇f(θ)
v = β₂v + (1-β₂)(∇f(θ))²
m̂ = m/(1-β₁ᵗ),  v̂ = v/(1-β₂ᵗ)
θ = θ - α m̂/(√v̂ + ε)

Here m̂ and v̂ are the bias-corrected estimates of the first and second moments at step t.
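
For completeness, a minimal Python sketch of these update equations, including the bias correction; the hyperparameter values are the commonly used defaults, assumed here for illustration.

```python
import numpy as np

# Minimal sketch of the Adam update above, including bias correction.
# Hyperparameters are the commonly used defaults, assumed here for illustration.

def grad_f(theta, target):
    return 2.0 * (theta - target)

target = np.array([1.0, -2.0, 0.5])
theta = np.zeros(3)
m = np.zeros_like(theta)   # first-moment estimate
v = np.zeros_like(theta)   # second-moment estimate
alpha, beta1, beta2, eps = 0.01, 0.9, 0.999, 1e-8

for t in range(1, 2001):
    g = grad_f(theta, target)
    m = beta1 * m + (1 - beta1) * g        # exponential moving average of gradients
    v = beta2 * v + (1 - beta2) * g**2     # exponential moving average of squared gradients
    m_hat = m / (1 - beta1**t)             # bias-corrected first moment
    v_hat = v / (1 - beta2**t)             # bias-corrected second moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)

print(theta)  # approaches `target`
```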

Applications  

  • Neural network training
  • Linear regression
  • Logistic regression
  • Support vector machines