How Gradient Descent Works for Linear Regression
Linear Regression Model
The model predicts: hθ(x) = θ₀ + θ₁x, where θ₀ is the intercept and θ₁ is the slope, both learned from the training data.
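As a quick illustration, here is a minimal Python sketch of this hypothesis (the function name predict and the use of NumPy are our choices, not part of the original):

```python
import numpy as np

def predict(theta0, theta1, x):
    """Hypothesis hθ(x) = θ₀ + θ₁·x, for a scalar or an array of inputs."""
    return theta0 + theta1 * np.asarray(x, dtype=float)

predict(1.0, 2.0, [0, 1, 2])  # -> array([1., 3., 5.])
```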
Cost Function (Mean Squared Error)
J(θ) = 1/(2m) · ∑ᵢ₌₁ᵐ (hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾)²
where m is the number of training examples and (x⁽ⁱ⁾, y⁽ⁱ⁾) is the i-th example. The extra factor of 2 in the denominator is a convention that cancels when the cost is differentiated.
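A direct translation of this cost into Python might look like the following; compute_cost is an illustrative name, and the vectorized NumPy form is one of several equivalent ways to write the sum:

```python
import numpy as np

def compute_cost(theta0, theta1, x, y):
    """Cost J(θ) = 1/(2m) · Σ (hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾)²."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    m = len(y)
    errors = (theta0 + theta1 * x) - y   # hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾ for every example
    return np.sum(errors ** 2) / (2 * m)
```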
Gradient Descent Algorithm
Repeat until convergence:
θⱼ := θⱼ − α · ∂J(θ)/∂θⱼ (for j = 0, 1)
Both parameters are updated simultaneously: compute both partial derivatives first, then apply both updates.
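Before deriving the regression-specific gradients, the update rule itself can be seen in isolation. The toy sketch below minimizes J(θ) = θ² (whose derivative is 2θ) rather than the regression cost, purely to show the θ := θ − α·∂J/∂θ mechanic:

```python
alpha = 0.1          # learning rate
theta = 5.0          # arbitrary starting point
for _ in range(100):
    grad = 2 * theta              # dJ/dθ for the toy objective J(θ) = θ²
    theta = theta - alpha * grad  # the generic update rule
print(theta)  # ≈ 0.0, the minimizer of θ²
```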
Partial Derivatives
∂J(θ)/∂θ₀ = 1/m · ∑ᵢ₌₁ᵐ (hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾)
∂J(θ)/∂θ₁ = 1/m · ∑ᵢ₌₁ᵐ (hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾) · x⁽ⁱ⁾
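Putting the update rule and these derivatives together gives batch gradient descent for the one-feature model. This is a minimal sketch; the name gradient_descent and the default hyperparameters are our own choices:

```python
import numpy as np

def gradient_descent(x, y, alpha=0.01, iterations=1000):
    """Fit hθ(x) = θ₀ + θ₁·x by batch gradient descent."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    m = len(y)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = (theta0 + theta1 * x) - y
        grad0 = errors.sum() / m          # ∂J/∂θ₀
        grad1 = (errors * x).sum() / m    # ∂J/∂θ₁
        # Simultaneous update: both gradients are computed
        # before either parameter changes.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1
```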
Key Parameters
- Learning Rate (α): Controls the step size. Too small and convergence is slow; too large and the cost can oscillate or diverge (compared concretely in the sketch after this list).
- Iterations: Number of optimization steps. More iterations let the parameters approach the minimum more closely, at proportionally higher compute cost.
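To make the learning-rate trade-off concrete, the following self-contained sketch runs the same descent loop with three values of α on synthetic data generated around y = 2 + 3x; all numbers here are assumptions chosen for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2 + 3 * x + rng.normal(0, 0.5, 100)   # noisy targets around 2 + 3x
m = len(y)

for alpha in (0.0001, 0.01, 0.06):  # too small / reasonable / too large here
    theta0 = theta1 = 0.0
    for _ in range(1000):
        errors = (theta0 + theta1 * x) - y
        grad0, grad1 = errors.sum() / m, (errors * x).sum() / m
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    cost = (((theta0 + theta1 * x) - y) ** 2).sum() / (2 * m)
    print(f"alpha={alpha}: final cost = {cost:.4g}")
```

On this data the smallest rate barely reduces the cost in 1000 steps, the middle rate converges near the noise floor, and the largest rate diverges, with the cost growing by many orders of magnitude.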