Basic Calculus for AI, part 5

The point of calculus – or most math, frankly – is to come up with cool shortcuts that bypass complicated, repetitive, and error-prone techniques. The idea of slopes and gradients, for example, makes it unnecessary to go through every .. single .. coordinate .. of a function to find its minima. Likewise, the chain rule is a shortcut for calculating the derivatives of composite functions – functions nested inside other functions.

A.6 The Chain Rule

DL uses the chain rule for backpropagation, the algorithm it uses to compute the gradients it learns from.
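To see that connection in action, here's a minimal sketch, assuming PyTorch is available: calling `backward()` makes autograd apply the chain rule automatically, which is backpropagation in miniature. The hand-computed comparison value uses the derivative we'll work out in the example below.

```python
# A minimal sketch, assuming PyTorch is installed.
# Autograd applies the chain rule for us: this is
# backpropagation on a one-node "network".
import torch

x = torch.tensor(2.0, requires_grad=True)
F = (x**3 - 1) ** 2   # composite function F(x) = f(g(x))
F.backward()          # the chain rule runs here

# Hand-derived chain-rule result for comparison: 6x^2(x^3 - 1)
manual = 6 * x.item()**2 * (x.item()**3 - 1)
print(x.grad.item(), manual)  # both print 168.0
```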

The formal definition is:


If \(g\) is differentiable at \(x\) and \(f\) is differentiable at \(g(x)\), then the composite function \(F = f \circ g\) defined by \(F(x) = f(g(x))\) is differentiable at \(x\), and \(F'\) is given by the product \(F'(x) = f'(g(x))\,g'(x)\).


That’s straight from the course slide, but it hurts the brain. What does it mean?
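In plain terms: take the derivative of the outer function (evaluated at the inner function, left untouched), then multiply by the derivative of the inner function. In Leibniz notation, with \(y = f(u)\) and \(u = g(x)\), the same statement reads:

$$ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} $$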

A.6.1 Example

$$ F(x) = (x^3 - 1)^2 $$

Ooh look. An exponent that works on a bracketed expression. That screams that a function is being passed to another function.

Here the outer function is \(f(u) = u^2\) and the inner function is \(g(x) = x^3 - 1\), so \(f'(u) = 2u\) and \(g'(x) = 3x^2\). The derivative is:

$$\begin{aligned} f'(g(x))\,g'(x) &= 2(x^3 - 1) \cdot 3x^2 \\ &= 6x^2(x^3 - 1) \end{aligned}$$
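A quick numerical spot-check never hurts. Here's a minimal sketch in plain Python that compares the chain-rule result against a centered finite difference at a few points:

```python
# A minimal sketch: verify F'(x) = 6x^2(x^3 - 1) numerically
# with a centered finite difference. No libraries needed.

def F(x):
    return (x**3 - 1) ** 2

def F_prime(x):
    return 6 * x**2 * (x**3 - 1)  # chain-rule result

h = 1e-6
for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    approx = (F(x + h) - F(x - h)) / (2 * h)  # centered difference
    print(f"x={x:5.1f}  chain rule={F_prime(x):12.4f}  numeric={approx:12.4f}")
```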

Just to show it works, let’s do the derivative the hard way, without the chain rule:

$$\begin{aligned} F(x) &= (x^3 - 1)^2 \\ &= (x^3 - 1)(x^3 - 1) \\ &= x^6 - x^3 - x^3 + 1 \\ &= x^6 - 2x^3 + 1 \\ F'(x) &= 6x^5 - 6x^2 \\ &= 6x^2(x^3 - 1) \end{aligned}$$
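Same answer both ways. If SymPy happens to be installed, the agreement can also be confirmed symbolically; a sketch:

```python
# A minimal sketch, assuming SymPy is installed: confirm that
# SymPy's derivative of F matches both hand-derived forms exactly.
import sympy as sp

x = sp.symbols("x")
F = (x**3 - 1) ** 2

dF = sp.diff(F, x)                    # SymPy applies the chain rule
chain_rule = 6 * x**2 * (x**3 - 1)    # our factored result
hard_way = 6 * x**5 - 6 * x**2        # result from expanding first

print(sp.simplify(dF - chain_rule))   # 0
print(sp.simplify(dF - hard_way))     # 0
```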


That wraps up differential calculus. I won’t cover integral calculus unless it turns out to be genuinely necessary for understanding area under the curve.