
Section 4.3 Taylor’s theorem

Note: less than a lecture (optional section)

Subsection 4.3.1 Derivatives of higher orders

When \(f \colon I \to \R\) is differentiable, we obtain a function \(f' \colon I \to \R\text{.}\) The function \(f'\) is called the first derivative of \(f\text{.}\) If \(f'\) is differentiable, we denote by \(f'' \colon I \to \R\) the derivative of \(f'\text{.}\) The function \(f''\) is called the second derivative of \(f\text{.}\) We similarly obtain \(f'''\text{,}\) \(f''''\text{,}\) and so on. With a larger number of derivatives the notation would get out of hand; we denote by \(f^{(n)}\) the \(n\)th derivative of \(f\text{.}\) When \(f\) possesses \(n\) derivatives, we say \(f\) is \(n\) times differentiable.

Subsection 4.3.2 Taylor’s theorem

Taylor’s theorem (at least the version we give here) is a generalization of the mean value theorem. (Footnote: Named for the English mathematician Brook Taylor (1685–1731). It was found by the Scottish mathematician James Gregory (1638–1675). The statement we give is due to Joseph-Louis Lagrange (1736–1813).) The mean value theorem says that up to a small error \(f(x)\) for \(x\) near \(x_0\) can be approximated by \(f(x_0)\text{:}\)
\begin{equation} f(x) = f(x_0) + f'(c)(x-x_0), \end{equation}
where the “error” \(f'(c)(x-x_0)\) is measured in terms of the first derivative at some point \(c\) between \(x\) and \(x_0\text{.}\) Taylor’s theorem generalizes this result to higher derivatives. It tells us that up to a small error, any \(n\) times differentiable function can be approximated near a point \(x_0\) by a polynomial of degree at most \(n\text{.}\) The error of this approximation goes to zero faster than \({(x-x_0)}^{n}\) as \(x\) goes to \(x_0\text{.}\) To see why this is a good approximation, notice that for large \(n\text{,}\) \({(x-x_0)}^n\) is very small in a small interval around \(x_0\text{.}\) To write the error in the same way as in the mean value theorem, we will require one more derivative, that is, \(n+1\) derivatives.

Definition 4.3.1.

For an \(n\) times differentiable function \(f\) defined near a point \(x_0 \in \R\text{,}\) define the \(n\)th order Taylor polynomial for \(f\) at \(x_0\) as
\begin{equation} \begin{split} P_n^{x_0}(x) & \coloneqq \sum_{k=0}^n \frac{f^{(k)}(x_0)}{k!}{(x-x_0)}^k \\ & = f(x_0) + f'(x_0)(x-x_0) + \frac{f''(x_0)}{2}{(x-x_0)}^2 + \frac{f^{(3)}(x_0)}{6}{(x-x_0)}^3 \\ & \qquad + \cdots + \frac{f^{(n)}(x_0)}{n!}{(x-x_0)}^n . \end{split} \end{equation}
See Figure 4.8 for the odd-degree Taylor polynomials for the sine function at \(x_0=0\text{.}\) The even-degree terms are all zero, as even derivatives of sine are again sines (up to sign), which are zero at the origin.

The graph of the sine function is given. Also several approximations are given. First a straight line that is tangent at the origin is labeled as y equals P sub 1 super 0 of x. Second a graph of a cubic that is tangent to the graph of sine at the origin and somewhat approximates it nearby is labeled as y equals P sub 3 super 0 of x. Similarly with degree 5 and degree 7 approximations. The higher the degree the further away from the origin is the graph reasonably close to the sine function, for example the degree 7 approximation seems relatively close between two points where the sine is zero around the origin, that is, plus and minus pi.
Figure 4.8. The odd degree Taylor polynomials for the sine function.
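
The behavior in Figure 4.8 can also be checked numerically. The following is a minimal Python sketch, not part of the original text: the helper names `taylor_poly` and `sine_derivs_at_zero` are made up for this illustration, and the sample point \(x = 1\) is an arbitrary choice. It evaluates the sum from Definition 4.3.1 directly from a list of derivative values at \(x_0\text{.}\)

```python
import math

def taylor_poly(derivs_at_x0, x0, x):
    """Evaluate sum_{k=0}^{n} derivs_at_x0[k] / k! * (x - x0)**k."""
    return sum(d / math.factorial(k) * (x - x0) ** k
               for k, d in enumerate(derivs_at_x0))

def sine_derivs_at_zero(n):
    """Values sin^(k)(0) for k = 0..n; they cycle through 0, 1, 0, -1."""
    cycle = [0.0, 1.0, 0.0, -1.0]
    return [cycle[k % 4] for k in range(n + 1)]

# Compare sin(1) with the odd-degree Taylor polynomials at x0 = 0.
x = 1.0
errors = {n: abs(math.sin(x) - taylor_poly(sine_derivs_at_zero(n), 0.0, x))
          for n in (1, 3, 5, 7)}
for n, err in errors.items():
    print(f"n = {n}: |sin(1) - P_n(1)| = {err:.2e}")
```

The printed errors shrink as the degree grows. Because the even-degree terms vanish, `taylor_poly` returns the same value for degree \(2\) as for degree \(1\text{.}\)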

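The statement of Taylor’s theorem that we give includes the mean value theorem, which can be thought of as Taylor’s theorem for \(n=0\text{.}\)

Theorem 4.3.2. (Taylor)

Suppose \(f \colon [a,b] \to \R\) is a function with \(n\) continuous derivatives on \([a,b]\) and such that \(f^{(n+1)}\) exists on \((a,b)\text{.}\) Given distinct points \(x_0\) and \(x\) in \([a,b]\text{,}\) there exists a point \(c\) between \(x_0\) and \(x\) such that
\begin{equation} f(x)=P_{n}^{x_0}(x)+\frac{f^{(n+1)}(c)}{(n+1)!}{(x-x_0)}^{n+1} . \end{equation}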
The term \(R_n^{x_0}(x)\coloneqq f(x)-P_n^{x_0}(x) = \frac{f^{(n+1)}(c)}{(n+1)!}{(x-x_0)}^{n+1}\) is called the remainder term. This form is called the Lagrange form of the remainder. There are other ways to write the remainder term, but we skip those. Note that \(c\) depends on both \(x\) and \(x_0\text{.}\)
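
The Lagrange form is useful mainly because it gives a bound on the error. As a hedged illustration (the functions `sine_taylor` and `lagrange_bound` and the sample points are assumptions made for this sketch, not part of the text), consider the sine at \(x_0 = 0\text{.}\) Every derivative of sine is bounded by \(1\) in absolute value, so the remainder satisfies \(\lvert R_n^{0}(x) \rvert \leq \frac{\lvert x \rvert^{n+1}}{(n+1)!}\text{.}\)

```python
import math

def sine_taylor(n, x):
    """P_n^0(x) for sine: only odd k contribute, with alternating signs."""
    return sum((-1) ** (k // 2) * x ** k / math.factorial(k)
               for k in range(1, n + 1, 2))

def lagrange_bound(n, x):
    """Bound |x|^(n+1)/(n+1)! on |R_n^0(x)|, using |sin^(n+1)(c)| <= 1."""
    return abs(x) ** (n + 1) / math.factorial(n + 1)

# Record (x, n, actual remainder, bound) for a few sample points.
checks = []
for x in (0.5, 2.0, -3.0):
    for n in (1, 3, 5, 9):
        remainder = abs(math.sin(x) - sine_taylor(n, x))
        bound = lagrange_bound(n, x)
        checks.append((x, n, remainder, bound))
        print(f"x = {x:5.1f}, n = {n}: |R_n| = {remainder:.3e} <= {bound:.3e}")
```

In each printed row the actual remainder stays below the bound, even though the point \(c\) itself is never computed.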

Proof.

Let \(M_{x,x_0}\) be the number (depending on \(x\) and \(x_0\)) solving the equation
\begin{equation} f(x)=P_{n}^{x_0}(x)+M_{x,x_0}{(x-x_0)}^{n+1} . \end{equation}
Define a function \(g(s)\) by
\begin{equation} g(s) \coloneqq f(s)-P_n^{x_0}(s)-M_{x,x_0}{(s-x_0)}^{n+1} . \end{equation}
We compute the \(k\)th derivative of the Taylor polynomial at \(x_0\) and find \({(P_n^{x_0})}^{(k)}(x_0) = f^{(k)}(x_0)\) for \(k=0,1,2,\ldots,n\) (the zeroth derivative of a function is the function itself). Therefore,
\begin{equation} g(x_0) = g'(x_0) = g''(x_0) = \cdots = g^{(n)}(x_0) = 0 . \end{equation}
In particular, \(g(x_0) = 0\text{.}\) We also have \(g(x) = 0\text{.}\) By the mean value theorem, there exists an \(x_1\) between \(x_0\) and \(x\) such that \(g'(x_1) = 0\text{.}\) Applying the mean value theorem to \(g'\text{,}\) we obtain that there exists \(x_2\) between \(x_0\) and \(x_1\) (and therefore between \(x_0\) and \(x\)) such that \(g''(x_2) = 0\text{.}\) We repeat the argument \(n+1\) times to obtain a number \(x_{n+1}\) between \(x_0\) and \(x_n\) (and therefore between \(x_0\) and \(x\)) such that \(g^{(n+1)}(x_{n+1}) = 0\text{.}\)
Let \(c \coloneqq x_{n+1}\text{.}\) We compute the \((n+1)\)th derivative of \(g\) to find
\begin{equation} g^{(n+1)}(s) = f^{(n+1)}(s)-(n+1)!\,M_{x,x_0} . \end{equation}
Plugging in \(c\) for \(s\text{,}\) we obtain \(M_{x,x_0} = \frac{f^{(n+1)}(c)}{(n+1)!}\text{,}\) and we are done.
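
For a concrete function, the number \(M_{x,x_0}\) and the point \(c\) from the proof can be computed explicitly. The sketch below is an illustration under assumptions of our own choosing: it takes \(f\) to be the exponential, \(x_0 = 0\text{,}\) and \(n = 2\text{,}\) and the names `exp_taylor` and `lagrange_point` are made up for the example. Since \(f^{(n+1)}(c) = e^c\text{,}\) the equation \(M_{x,x_0} = \frac{f^{(n+1)}(c)}{(n+1)!}\) can be solved for \(c\) with a logarithm.

```python
import math

def exp_taylor(n, x):
    """P_n^0(x) for the exponential: every derivative at 0 equals 1."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def lagrange_point(n, x):
    """Solve exp(c) = (n+1)! * M for c, where M is defined by
    exp(x) = P_n^0(x) + M * x**(n+1), as in the proof (x must be nonzero)."""
    M = (math.exp(x) - exp_taylor(n, x)) / x ** (n + 1)
    return math.log(math.factorial(n + 1) * M)

# For each sample x, the computed c should lie strictly between 0 and x.
points = {x: lagrange_point(2, x) for x in (1.0, -1.0, 0.5)}
for x, c in points.items():
    print(f"x = {x:4.1f}: c = {c:.4f}")
```

For the exponential the point \(c\) is unique because \(e^c\) is strictly increasing. In general the theorem only guarantees that at least one such \(c\) exists.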
In the proof, we found \({(P_n^{x_0})}^{(k)}(x_0) = f^{(k)}(x_0)\) for \(k=0,1,2,\ldots,n\text{.}\) Therefore, the Taylor polynomial has the same derivatives as \(f\) at \(x_0\) up to the \(n\)th derivative. That is why the Taylor polynomial is a good approximation to \(f\text{.}\) Notice how in Figure 4.8 the Taylor polynomials are reasonably good approximations to the sine near \(x=0\text{.}\)
We do not necessarily get good approximations by the Taylor polynomial everywhere. For example, expanding the function \(f(x) \coloneqq \frac{x}{1-x}\) around 0, for \(x < 1\text{,}\) gives the graphs in Figure 4.9. The dotted lines are the first, second, and third degree approximations. The dashed line is the 20th degree polynomial. The approximation does seem to get better as the degree rises for \(x > -1\text{.}\) For \(x < -1\text{,}\) it in fact gets visibly worse. The polynomials are the partial sums of the geometric series \(\sum_{n=1}^\infty x^n\text{,}\) and the series only converges on \((-1,1)\text{.}\) See the discussion of power series in Section 2.6.

In bold line is a graph of a function on the interval from minus 2 to 1, that comes in flat from the right from below the x-axis, crosses the x-axis at the origin, and then turns sharply upward leaving the picture before we get to 1. The graphs of the first three Taylor polynomials are shown in dotted line. That is, a straight line, a quadratic, and a cubic. They all seem to be somewhat reasonable approximation very near the origin but get worse as we get further. A dashed line signifying the degree 20 approximation is drawn and from some point slightly more than minus 1 until where the bold line leaves the picture on the right, we cannot tell the difference between it and the bold line. However, right around minus 1 if we move from right to left, it shoots upwards leaving the picture, so for x less than minus 1 the approximation would be ludicrously worse than the small degree approximations (which are not good already).
Figure 4.9. The function \(\frac{x}{1-x}\text{,}\) and the Taylor polynomials \(P_1^0\text{,}\) \(P_2^0\text{,}\) \(P_3^0\) (all dotted), and the polynomial \(P_{20}^0\) (dashed).
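
The same comparison can be made numerically. This is a small sketch with sample points chosen for illustration (the function name `geometric_partial_sum` is ours). It uses the fact stated above that the Taylor polynomials of \(\frac{x}{1-x}\) at \(0\) are the partial sums \(x + x^2 + \cdots + x^n\text{.}\)

```python
def f(x):
    return x / (1 - x)

def geometric_partial_sum(n, x):
    """P_n^0(x) for x/(1-x): the partial sum x + x**2 + ... + x**n."""
    return sum(x ** k for k in range(1, n + 1))

# Compare degree 3 and degree 20 inside and outside the interval (-1, 1).
errors = {}
for x in (0.5, -0.5, -1.5):
    for n in (3, 20):
        errors[(x, n)] = abs(f(x) - geometric_partial_sum(n, x))
        print(f"x = {x:5.1f}, n = {n:2d}: error = {errors[(x, n)]:.3e}")
```

At \(x = \pm 0.5\) the degree 20 error is far smaller than the degree 3 error, while at \(x = -1.5\) raising the degree makes the error much larger, matching the picture.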

If \(f\) is infinitely differentiable, that is, if \(f\) can be differentiated any number of times, then we define the Taylor series:
\begin{equation} \sum_{k=0}^\infty \frac{f^{(k)}(x_0)}{k!}{(x-x_0)}^k . \end{equation}
There is no guarantee that this series converges for any \(x \neq x_0\text{.}\) Even where it does converge, there is no guarantee that it converges to the function \(f\text{.}\) Functions \(f\) whose Taylor series at every point \(x_0\) converges to \(f\) in some open interval containing \(x_0\) are called analytic functions. Many functions one tends to see in practice are analytic. See Exercise 5.4.11 for an example of a non-analytic function.
The definition of derivative says that a function is differentiable if it is locally approximated by a line. We mention in passing that there exists a converse to Taylor’s theorem, which we will neither state nor prove, saying that if a function is locally approximated in a certain way by a polynomial of degree \(d\text{,}\) then it has \(d\) derivatives.
Taylor’s theorem gives a quick proof of a version of the second derivative test. By a strict relative minimum of \(f\) at \(c\text{,}\) we mean that there exists a \(\delta > 0\) such that \(f(x) > f(c)\) for all \(x \in (c-\delta,c+\delta)\) where \(x\neq c\text{.}\) A strict relative maximum is defined similarly. Continuity of the second derivative is not needed, but the proof is more difficult and is left as an exercise. The proof also generalizes into the \(n\)th derivative test, which is also left as an exercise.

Proposition 4.3.3. (Second derivative test)

Suppose \(f \colon (a,b) \to \R\) is twice continuously differentiable, \(x_0 \in (a,b)\text{,}\) \(f'(x_0) = 0\text{,}\) and \(f''(x_0) > 0\text{.}\) Then \(f\) has a strict relative minimum at \(x_0\text{.}\)

Proof.

As \(f''\) is continuous and \(f''(x_0) > 0\text{,}\) there exists a \(\delta > 0\) such that \(f''(c) > 0\) for all \(c \in (x_0-\delta,x_0+\delta)\) (see Exercise 3.2.11). Take \(x \in (x_0-\delta,x_0+\delta)\text{,}\) \(x \neq x_0\text{.}\) Taylor’s theorem says that for some \(c\) between \(x_0\) and \(x\text{,}\)
\begin{equation} f(x) = f(x_0) + f'(x_0) (x-x_0) + \frac{f''(c)}{2}{(x-x_0)}^{2} = f(x_0) + \frac{f''(c)}{2}{(x-x_0)}^{2} . \end{equation}
As \(f''(c) > 0\) and \({(x-x_0)}^2 > 0\text{,}\) we have \(f(x) > f(x_0)\text{.}\)
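
As a worked illustration of this argument, take \(f(x) \coloneqq x^2 + x^3\) and \(x_0 = 0\text{.}\) The function is chosen here only as an example and does not appear in the text.

```latex
\begin{align*}
f'(x)  &= 2x + 3x^2, & f'(0)  &= 0, \\
f''(x) &= 2 + 6x,    & f''(0) &= 2 > 0 .
\end{align*}
Since \(f''(c) = 2 + 6c > 0\) whenever \(\lvert c \rvert < \tfrac{1}{3}\), we may take
\(\delta = \tfrac{1}{3}\). For \(0 < \lvert x \rvert < \tfrac{1}{3}\), Taylor's theorem
gives some \(c\) between \(0\) and \(x\) with
\begin{equation*}
f(x) = f(0) + \frac{f''(c)}{2}\, x^2 = (1 + 3c)\, x^2 > 0 = f(0) .
\end{equation*}
```

In this example the point can be found by hand: comparing with \(f(x) = (1+x)\,x^2\) shows \(c = \frac{x}{3}\text{,}\) which indeed lies between \(0\) and \(x\text{.}\)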

Subsection 4.3.3 Exercises

4.3.1.

Compute the \(n\)th Taylor polynomial at \(0\) for the exponential function.

4.3.2.

Suppose \(p\) is a polynomial of degree \(d\text{.}\) Given \(x_0 \in \R\text{,}\) show that the \(d\)th Taylor polynomial for \(p\) at \(x_0\) is equal to \(p\text{.}\)

4.3.3.

Let \(f(x) \coloneqq \sabs{x}^3\text{.}\) Compute \(f'(x)\) and \(f''(x)\) for all \(x\text{,}\) but show that \(f^{(3)}(0)\) does not exist.

4.3.4.

Suppose \(f \colon \R \to \R\) has \(n\) continuous derivatives. Show that for every \(x_0 \in \R\text{,}\) there exist polynomials \(P\) and \(Q\) of degree \(n\) and an \(\epsilon > 0\) such that \(P(x) \leq f(x) \leq Q(x)\) for all \(x \in [x_0,x_0+\epsilon]\) and \(Q(x)-P(x) = \lambda {(x-x_0)}^n\) for some \(\lambda \geq 0\text{.}\)

4.3.5.

If \(f \colon [a,b] \to \R\) has \(n+1\) continuous derivatives and \(x_0 \in [a,b]\text{,}\) prove \(\lim\limits_{x\to x_0} \frac{R_n^{x_0}(x)}{{(x-x_0)}^n} = 0\text{.}\) (Footnote: The same statement holds if we only require \(n\) derivatives, but it is quite a bit harder to prove.)

4.3.6.

Suppose \(f \colon [a,b] \to \R\) has \(n+1\) continuous derivatives and \(x_0 \in (a,b)\text{.}\) Prove: \(f^{(k)}(x_0) = 0\) for all \(k = 0, 1, 2, \ldots, n\) if and only if \(\lim\limits_{x\to x_0} \frac{f(x)}{{(x-x_0)}^{n+1}}\) exists.

4.3.7.

Suppose \(a,b,c \in \R\) and \(f \colon \R \to \R\) is twice differentiable, \(f''(x) = a\) for all \(x\text{,}\) \(f'(0) = b\text{,}\) and \(f(0) = c\text{.}\) Find \(f\) and prove that it is the unique differentiable function with this property.

4.3.8.

(Challenging)   Show that a simple converse to Taylor’s theorem does not hold. Find a function \(f \colon \R \to \R\) with no second derivative at \(x=0\) such that \(\babs{f(x)} \leq \babs{x^3}\text{,}\) that is, \(f\) goes to zero at 0 faster than \(x^2\text{,}\) and while \(f'(0)\) exists, \(f''(0)\) does not.

4.3.9.

Suppose \(f \colon (0,1) \to \R\) is differentiable and \(f''\) is bounded.
  1. Show that there exists a once differentiable function \(g \colon [0,1) \to \R\) such that \(f(x) = g(x)\) for all \(x \neq 0\text{.}\) Hint: See Exercise 4.2.14.
  2. Find an example where \(g\) is not twice differentiable at \(x=0\text{.}\)

4.3.10.

Prove the \(n\)th derivative test. Suppose \(n \in \N\text{,}\) \(x_0 \in (a,b)\text{,}\) and \(f \colon (a,b) \to \R\) is \(n\) times continuously differentiable, with \(f^{(k)}(x_0) = 0\) for \(k=1,2,\ldots,n-1\text{,}\) and \(f^{(n)}(x_0) \neq 0\text{.}\) Prove:
  1. If \(n\) is odd, then \(f\) has neither a relative minimum nor a relative maximum at \(x_0\text{.}\)
  2. If \(n\) is even, then \(f\) has a strict relative minimum at \(x_0\) if \(f^{(n)}(x_0) > 0\) and a strict relative maximum at \(x_0\) if \(f^{(n)}(x_0) < 0\text{.}\)

4.3.11.

Prove the more general version of the second derivative test. Suppose \(f \colon (a,b) \to \R\) is differentiable and \(x_0 \in (a,b)\) is such that \(f'(x_0) = 0\text{,}\) \(f''(x_0)\) exists, and \(f''(x_0) > 0\text{.}\) Prove that \(f\) has a strict relative minimum at \(x_0\text{.}\) Hint: Consider the limit definition of \(f''(x_0)\text{.}\)