
Section 4.2 Mean value theorem

Note: 2 lectures (some applications may be skipped)

Subsection 4.2.1 Relative minima and maxima

We talked about absolute maxima and minima. These are the tallest peaks and lowest valleys in the whole mountain range. What about peaks of individual mountains and bottoms of individual valleys? The derivative, being a local concept, is like walking around in a fog; it can't tell you if you're on the highest peak, but it can help you find all the individual peaks.

Definition 4.2.1.

Let \(S \subset \R\) be a set and let \(f \colon S \to \R\) be a function. The function \(f\) is said to have a relative maximum at \(c \in S\) if there exists a \(\delta>0\) such that for all \(x \in S\) where \(\abs{x-c} < \delta\text{,}\) we have \(f(x) \leq f(c)\text{.}\) The definition of relative minimum is analogous.

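Lemma 4.2.2.

Suppose \(f \colon [a,b] \to \R\) is differentiable at \(c \in (a,b)\text{,}\) and \(f\) has a relative minimum or a relative maximum at \(c\text{.}\) Then \(f'(c) = 0\text{.}\)

Proof.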

We prove the statement for a maximum. For a minimum the statement follows by considering the function \(-f\text{.}\)

Let \(c\) be a relative maximum of \(f\text{.}\) That is, there is a \(\delta > 0\) such that as long as \(\abs{x-c} < \delta\text{,}\) we have \(f(x)-f(c) \leq 0\text{.}\) We look at the difference quotient. If \(c < x < c+\delta\text{,}\) then

\begin{equation*} \frac{f(x)-f(c)}{x-c} \leq 0 , \end{equation*}

and if \(c-\delta < y < c\text{,}\) then

\begin{equation*} \frac{f(y)-f(c)}{y-c} \geq 0 . \end{equation*}

See Figure 4.3 for an illustration.


Figure 4.3. Slopes of secants at a relative maximum.

As \(a < c < b\text{,}\) there exist sequences \(\{ x_n\}\) and \(\{ y_n \}\) in \([a,b]\) and within \(\delta\) of \(c\text{,}\) such that \(x_n > c\text{,}\) and \(y_n < c\) for all \(n \in \N\text{,}\) and such that \(\lim\, x_n = \lim\, y_n = c\text{.}\) Since \(f\) is differentiable at \(c\text{,}\)

\begin{equation*} 0 \geq \lim_{n\to\infty} \frac{f(x_n)-f(c)}{x_n-c} = f'(c) = \lim_{n\to\infty} \frac{f(y_n)-f(c)}{y_n-c} \geq 0. \qedhere \end{equation*}

For a differentiable function, a point where \(f'(c) = 0\) is called a critical point. When \(f\) is not differentiable at some points, it is common to also say that \(c\) is a critical point if \(f'(c)\) does not exist. The lemma says that a relative minimum or maximum at an interior point of an interval must be a critical point. As you remember from calculus, finding minima and maxima of a function can be done by finding all the critical points together with the endpoints of the interval and simply checking at which of these points the function is biggest or smallest.
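
As a small worked illustration of this procedure (the function and the interval are chosen here only for the example), consider \(f(x) := x^3 - 3x\) on \([-2,3]\text{.}\) The derivative \(f'(x) = 3x^2 - 3\) vanishes exactly at \(x = \pm 1\text{,}\) so the candidates are the critical points \(-1\) and \(1\) together with the endpoints \(-2\) and \(3\text{:}\)

\begin{equation*} f(-2) = -2, \qquad f(-1) = 2, \qquad f(1) = -2, \qquad f(3) = 18 . \end{equation*}

Hence, on \([-2,3]\text{,}\) the absolute maximum is \(18\text{,}\) attained at \(x=3\text{,}\) and the absolute minimum is \(-2\text{,}\) attained at both \(x=-2\) and \(x=1\text{.}\)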

Subsection 4.2.2 Rolle's theorem

Suppose a function has the same value at both endpoints of an interval. Intuitively, it ought to attain a minimum or a maximum in the interior of the interval, and at such a minimum or maximum the derivative should be zero. See Figure 4.4 for the geometric idea. This is the content of the so-called Rolle's theorem.¹


Figure 4.4. Point where the tangent line is horizontal, that is \(f'(c) = 0\text{.}\)

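Theorem 4.2.3 (Rolle).

Let \(f \colon [a,b] \to \R\) be a continuous function differentiable on \((a,b)\) such that \(f(a) = f(b)\text{.}\) Then there exists a \(c \in (a,b)\) such that \(f'(c) = 0\text{.}\)

Proof.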

As \(f\) is continuous on \([a,b]\text{,}\) it attains an absolute minimum and an absolute maximum in \([a,b]\text{.}\) We wish to apply Lemma 4.2.2, and so we need to find some \(c \in (a,b)\) where \(f\) attains a minimum or a maximum. Write \(K := f(a) = f(b)\text{.}\) If there exists an \(x\) such that \(f(x) > K\text{,}\) then the absolute maximum is bigger than \(K\) and hence occurs at some \(c \in (a,b)\text{,}\) and therefore \(f'(c) = 0\text{.}\) On the other hand, if there exists an \(x\) such that \(f(x) < K\text{,}\) then the absolute minimum occurs at some \(c \in (a,b)\text{,}\) and so \(f'(c) = 0\text{.}\) If there is no \(x\) such that \(f(x) > K\) or \(f(x) < K\text{,}\) then \(f(x) = K\) for all \(x\) and then \(f'(x) = 0\) for all \(x \in [a,b]\text{,}\) so any \(c \in (a,b)\) works.

It is absolutely necessary for the derivative to exist for all \(x \in (a,b)\text{.}\) Consider the function \(f(x) := \abs{x}\) on \([-1,1]\text{.}\) Clearly \(f(-1) = f(1)\text{,}\) but there is no point \(c\) where \(f'(c) = 0\text{.}\)
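
To see the failure concretely, one can compute the difference quotient of \(f(x) = \abs{x}\) at the origin. For \(x \not= 0\text{,}\)

\begin{equation*} \frac{\abs{x}-\abs{0}}{x-0} = \begin{cases} 1 & \text{if } x > 0, \\ -1 & \text{if } x < 0 . \end{cases} \end{equation*}

The quotient has no limit as \(x\) goes to \(0\text{,}\) so \(f'(0)\) does not exist. Away from the origin, \(f'(x) = 1\) for \(x > 0\) and \(f'(x) = -1\) for \(x < 0\text{.}\) So wherever the derivative exists on \((-1,1)\text{,}\) it is never zero.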

Subsection 4.2.3 Mean value theorem

We extend Rolle's theorem to functions that attain different values at the endpoints.

For a geometric interpretation of the mean value theorem, see Figure 4.5. The idea is that the value \(\frac{f(b)-f(a)}{b-a}\) is the slope of the line between the points \(\bigl(a,f(a)\bigr)\) and \(\bigl(b,f(b)\bigr)\text{.}\) Then \(c\) is the point such that \(f'(c) = \frac{f(b)-f(a)}{b-a}\text{,}\) that is, the tangent line at the point \(\bigl(c,f(c)\bigr)\) has the same slope as the line between \(\bigl(a,f(a)\bigr)\) and \(\bigl(b,f(b)\bigr)\text{.}\) The theorem follows from Rolle's theorem by subtracting from \(f\) the affine linear function that has derivative \(\frac{f(b)-f(a)}{b-a}\) and the same values at \(a\) and \(b\) as \(f\text{.}\) That is, we subtract the function whose graph is the straight line through \(\bigl(a,f(a)\bigr)\) and \(\bigl(b,f(b)\bigr)\text{.}\) Then we are looking for a point where this new function has derivative zero.


Figure 4.5. Graphical interpretation of the mean value theorem.

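Theorem 4.2.4 (Mean value theorem).

Let \(f \colon [a,b] \to \R\) be a continuous function differentiable on \((a,b)\text{.}\) Then there exists a point \(c \in (a,b)\) such that

\begin{equation*} f(b)-f(a) = f'(c)(b-a) . \end{equation*}

Proof.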

Define the function \(g \colon [a,b] \to \R\) by

\begin{equation*} g(x) := f(x)-f(b)-\frac{f(b)-f(a)}{b-a}(x-b) . \end{equation*}

The function \(g\) is differentiable on \((a,b)\) and continuous on \([a,b]\text{,}\) and it satisfies \(g(a) = 0\) and \(g(b) = 0\text{.}\) Thus, by Rolle's theorem, there exists a \(c \in (a,b)\) such that \(g'(c) = 0\text{,}\) that is,

\begin{equation*} 0 = g'(c) = f'(c)-\frac{f(b)-f(a)}{b-a} . \end{equation*}

In other words, \(f'(c)(b-a) = f(b)-f(a)\text{.}\)
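
As a quick check of the theorem on a concrete function (chosen here only for illustration), take \(f(x) := x^2\) on \([a,b]\text{.}\) Then \(f'(c) = 2c\) and

\begin{equation*} \frac{f(b)-f(a)}{b-a} = \frac{b^2-a^2}{b-a} = a+b , \end{equation*}

so the equation \(f'(c)(b-a) = f(b)-f(a)\) is solved by \(c = \frac{a+b}{2}\text{,}\) the midpoint, which indeed lies in \((a,b)\text{.}\) For a general \(f\text{,}\) the theorem guarantees only that some such \(c\) exists. It need not be the midpoint, and it need not be unique.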

The proof generalizes. By considering \(g(x) := f(x)-f(b)-\frac{f(b)-f(a)}{\varphi(b)-\varphi(a)}\bigl(\varphi(x)-\varphi(b)\bigr)\text{,}\) one can prove the following version. We leave the proof as an exercise.

Theorem 4.2.5 (Cauchy's mean value theorem).

Let \(f \colon [a,b] \to \R\) and \(\varphi \colon [a,b] \to \R\) be continuous functions differentiable on \((a,b)\text{.}\) Then there exists a point \(c \in (a,b)\) such that

\begin{equation*} \bigl(f(b)-f(a)\bigr)\varphi'(c) = f'(c)\bigl(\varphi(b)-\varphi(a)\bigr) . \end{equation*}

The mean value theorem has the distinction of being one of the few theorems commonly cited in court. That is, when police measure the speed of cars by aircraft, or via cameras reading license plates, they measure the time the car takes to go between two points. The mean value theorem then says that the car must have somewhere attained the speed you get by dividing the difference in distance by the difference in time.
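
For instance, with made-up numbers: suppose a car passes two checkpoints \(10\) km apart, and the two timestamps differ by \(5\) minutes, that is, \(\nicefrac{1}{12}\) of an hour. Writing \(s(t)\) for the position of the car at time \(t\) (a hypothetical function, assumed differentiable), the mean value theorem gives a time \(t_0\) between the two timestamps with

\begin{equation*} s'(t_0) = \frac{10 \text{ km}}{\nicefrac{1}{12} \text{ h}} = 120 \text{ km/h} . \end{equation*}

So at some moment the car was traveling at exactly \(120\) km/h, whatever it was doing the rest of the time.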

Subsection 4.2.4 Applications

We now solve our very first differential equation.

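Proposition 4.2.6.

Let \(I\) be an interval and let \(f \colon I \to \R\) be a differentiable function such that \(f'(x) = 0\) for all \(x \in I\text{.}\) Then \(f\) is constant.

Proof.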

Take arbitrary \(x,y \in I\) with \(x < y\text{.}\) As \(I\) is an interval, \([x,y] \subset I\text{.}\) Then \(f\) restricted to \([x,y]\) satisfies the hypotheses of the mean value theorem. Therefore, there is a \(c \in (x,y)\) such that

\begin{equation*} f(y)-f(x) = f'(c)(y-x). \end{equation*}

As \(f'(c) = 0\text{,}\) we have \(f(y) = f(x)\text{.}\) Hence, the function is constant.

Now that we know what it means for the function to stay constant, let us look at increasing and decreasing functions. We say \(f \colon I \to \R\) is increasing (resp. strictly increasing) if \(x < y\) implies \(f(x) \leq f(y)\) (resp. \(f(x) < f(y)\)). We define decreasing and strictly decreasing in the same way by switching the inequalities for \(f\text{.}\)

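Proposition 4.2.7.

Let \(I\) be an interval and let \(f \colon I \to \R\) be a differentiable function.

  (i) \(f\) is increasing if and only if \(f'(x) \geq 0\) for all \(x \in I\text{.}\)

  (ii) \(f\) is decreasing if and only if \(f'(x) \leq 0\) for all \(x \in I\text{.}\)

Proof.

Let us prove the first item. Suppose \(f\) is increasing. Then for all \(x,c \in I\) with \(x \neq c\text{,}\) we have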

\begin{equation*} \frac{f(x)-f(c)}{x-c} \geq 0 . \end{equation*}

Taking a limit as \(x\) goes to \(c\) we see that \(f'(c) \geq 0\text{.}\)

For the other direction, suppose \(f'(x) \geq 0\) for all \(x \in I\text{.}\) Take any \(x, y \in I\) where \(x < y\text{,}\) and note that \([x,y] \subset I\text{.}\) By the mean value theorem, there is some \(c \in (x,y)\) such that

\begin{equation*} f(y)-f(x) = f'(c)(y-x) . \end{equation*}

As \(f'(c) \geq 0\) and \(y-x > 0\text{,}\) we get \(f(y) - f(x) \geq 0\text{,}\) that is, \(f(x) \leq f(y)\text{.}\) So \(f\) is increasing.

We leave the decreasing part to the reader as an exercise.

A similar but weaker statement is true for strictly increasing and decreasing functions.

Proposition 4.2.8.

Let \(I\) be an interval and let \(f \colon I \to \R\) be a differentiable function.

  (i) If \(f'(x) > 0\) for all \(x \in I\text{,}\) then \(f\) is strictly increasing.

  (ii) If \(f'(x) < 0\) for all \(x \in I\text{,}\) then \(f\) is strictly decreasing.

The proof of i is left as an exercise. Then ii follows from i by considering \(-f\) instead. The converse of this proposition is not true. The function \(f(x) := x^3\) is strictly increasing, but \(f'(0) = 0\text{.}\)
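
The claim that \(x^3\) is strictly increasing can be checked directly, without derivatives. A short sketch: for \(x < y\text{,}\)

\begin{equation*} y^3 - x^3 = (y-x)(y^2+xy+x^2) = (y-x)\left( {\bigl(y+\tfrac{x}{2}\bigr)}^2 + \tfrac{3}{4} x^2 \right) . \end{equation*}

The second factor is a sum of squares that vanishes only when \(x = y = 0\text{,}\) which is excluded by \(x < y\text{.}\) Both factors are therefore positive, and \(y^3 - x^3 > 0\text{.}\)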

Another application of the mean value theorem is the following result about location of extrema, sometimes called the first derivative test. The theorem is stated for an absolute minimum and maximum. To apply it to find relative minima and maxima, restrict \(f\) to an interval \((c-\delta,c+\delta)\text{.}\)

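Proposition 4.2.9.

Let \(f \colon (a,b) \to \R\) be continuous. Let \(c \in (a,b)\) and suppose \(f\) is differentiable on \((a,c)\) and \((c,b)\text{.}\)

  (i) If \(f'(x) \leq 0\) for \(x \in (a,c)\) and \(f'(x) \geq 0\) for \(x \in (c,b)\text{,}\) then \(f\) has an absolute minimum at \(c\text{.}\)

  (ii) If \(f'(x) \geq 0\) for \(x \in (a,c)\) and \(f'(x) \leq 0\) for \(x \in (c,b)\text{,}\) then \(f\) has an absolute maximum at \(c\text{.}\)

Proof.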

We prove the first item and leave the second to the reader. Take \(x \in (a,c)\) and a sequence \(\{ y_n\}\) such that \(x < y_n < c\) for all \(n\) and \(\lim\, y_n = c\text{.}\) By the preceding proposition, \(f\) is decreasing on \((a,c)\) so \(f(x) \geq f(y_n)\text{.}\) As \(f\) is continuous at \(c\text{,}\) we take the limit to get \(f(x) \geq f(c)\) for all \(x \in (a,c)\text{.}\)

Similarly, take \(x \in (c,b)\) and \(\{ y_n\}\) a sequence such that \(c < y_n < x\) and \(\lim\, y_n = c\text{.}\) The function is increasing on \((c,b)\) so \(f(x) \geq f(y_n)\text{.}\) By continuity of \(f\) we get \(f(x) \geq f(c)\) for all \(x \in (c,b)\text{.}\) Thus \(f(x) \geq f(c)\) for all \(x \in (a,b)\text{.}\)

The converse of the proposition does not hold. See Example 4.2.12 below.
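
For a simple illustration of the test itself, consider \(f(x) := \abs{x}\) on \((-1,1)\) with \(c := 0\) (an example chosen here for concreteness). The proof above uses only that \(f\) is continuous at \(c\) and that the derivative has the right sign on each side of \(c\text{,}\) and here

\begin{equation*} f'(x) = -1 \leq 0 \quad \text{for } x \in (-1,0), \qquad f'(x) = 1 \geq 0 \quad \text{for } x \in (0,1) . \end{equation*}

So \(f\) has an absolute minimum at \(0\text{,}\) even though \(f\) is not differentiable there. The test does not need the derivative to exist at \(c\) itself.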

Another frequently used application of the mean value theorem, which you have possibly seen in calculus, is the following result on differentiability at the endpoints of an interval. The proof is Exercise 4.2.13.

Proposition 4.2.10.

  (i) Suppose \(f \colon [a,b) \to \R\) is continuous, differentiable in \((a,b)\text{,}\) and \(\lim_{x \to a} f'(x) = L\text{.}\) Then \(f\) is differentiable at \(a\) and \(f'(a) = L\text{.}\)

  (ii) Suppose \(f \colon (a,b] \to \R\) is continuous, differentiable in \((a,b)\text{,}\) and \(\lim_{x \to b} f'(x) = L\text{.}\) Then \(f\) is differentiable at \(b\) and \(f'(b) = L\text{.}\)

In fact, using the extension result Proposition 3.4.6, you do not need to assume that \(f\) is defined at the endpoint. See Exercise 4.2.14.

Subsection 4.2.5 Continuity of derivatives and the intermediate value theorem

Derivatives of functions satisfy an intermediate value property.

Theorem 4.2.11 (Darboux).

Let \(f \colon [a,b] \to \R\) be differentiable. Suppose \(y \in \R\) is such that \(f'(a) < y < f'(b)\) or \(f'(a) > y > f'(b)\text{.}\) Then there exists a \(c \in (a,b)\) such that \(f'(c) = y\text{.}\)

The proof follows by subtracting \(f\) and a linear function with derivative \(y\text{.}\) The new function \(g\) reduces the problem to the case \(y=0\text{,}\) where \(g'(a) > 0 > g'(b)\text{.}\) That is, \(g\) is increasing at \(a\) and decreasing at \(b\text{,}\) so it must attain a maximum inside \((a,b)\text{,}\) where the derivative is zero. See Figure 4.6.


Figure 4.6. Idea of the proof of Darboux theorem.

Proof.

Suppose \(f'(a) < y < f'(b)\text{.}\) Define

\begin{equation*} g(x) := yx - f(x) . \end{equation*}

The function \(g\) is continuous on \([a,b]\text{,}\) and so \(g\) attains a maximum at some \(c \in [a,b]\text{.}\)

The function \(g\) is also differentiable on \([a,b]\text{.}\) Compute \(g'(x) = y-f'(x)\text{.}\) Thus \(g'(a) > 0\text{.}\) As the derivative is the limit of difference quotients and is positive, there must be some difference quotient that is positive. That is, there must exist an \(x > a\) such that

\begin{equation*} \frac{g(x)-g(a)}{x-a} > 0 , \end{equation*}

or \(g(x) > g(a)\text{.}\) Thus \(g\) cannot possibly have a maximum at \(a\text{.}\) Similarly, as \(g'(b) < 0\text{,}\) we find an \(x < b\) (a different \(x\)) such that \(\frac{g(x)-g(b)}{x-b} < 0\) or that \(g(x) > g(b)\text{,}\) thus \(g\) cannot possibly have a maximum at \(b\text{.}\) Therefore, \(c \in (a,b)\text{,}\) and Lemma 4.2.2 applies: As \(g\) attains a maximum at \(c\) we find \(g'(c) = 0\) and so \(f'(c) = y\text{.}\)

Similarly, if \(f'(a) > y > f'(b)\text{,}\) consider \(g(x) := f(x)- yx\text{.}\)

We have already seen that there exist discontinuous functions that have the intermediate value property. While it is hard to imagine at first, there also exist functions that are differentiable everywhere but whose derivative is not continuous.

Example 4.2.12.

Let \(f \colon \R \to \R\) be the function defined by

\begin{equation*} f(x) := \begin{cases} {\bigl( x \sin(\nicefrac{1}{x}) \bigr)}^2 & \text{if } x \not= 0, \\ 0 & \text{if } x = 0. \end{cases} \end{equation*}

We claim that \(f\) is differentiable everywhere, but \(f' \colon \R \to \R\) is not continuous at the origin. Furthermore, \(f\) has a minimum at \(0\text{,}\) but the derivative changes sign infinitely often near the origin. See Figure 4.7.


Figure 4.7. A function with a discontinuous derivative. The function \(f\) is on the left and \(f'\) is on the right. Notice that \(f(x) \leq x^2\) on the left graph.

Proof: It is immediate from the definition that \(f\) has an absolute minimum at \(0\text{;}\) we know \(f(x) \geq 0\) for all \(x\) and \(f(0) = 0\text{.}\)

The function \(f\) is differentiable for \(x\not=0\text{,}\) and the derivative is \(2 \sin (\nicefrac{1}{x}) \bigl( x \sin (\nicefrac{1}{x}) - \cos(\nicefrac{1}{x}) \bigr)\text{.}\) As an exercise, show that for \(x_n = \frac{4}{(8n+1)\pi}\text{,}\) we have \(\lim\, f'(x_n) = -1\text{,}\) and for \(y_n = \frac{4}{(8n+3)\pi}\text{,}\) we have \(\lim\, f'(y_n) = 1\text{.}\) Hence if \(f'\) exists at \(0\text{,}\) then it cannot be continuous.
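
The exercise can be approached as follows. This is only a sketch, and it assumes the standard values of \(\sin\) and \(\cos\) from calculus. Since \(\nicefrac{1}{x_n} = 2n\pi + \nicefrac{\pi}{4}\text{,}\) we have \(\sin(\nicefrac{1}{x_n}) = \cos(\nicefrac{1}{x_n}) = \nicefrac{\sqrt{2}}{2}\text{,}\) and so

\begin{equation*} f'(x_n) = 2 \cdot \frac{\sqrt{2}}{2} \left( x_n \frac{\sqrt{2}}{2} - \frac{\sqrt{2}}{2} \right) = x_n - 1 . \end{equation*}

Similarly, \(\nicefrac{1}{y_n} = 2n\pi + \nicefrac{3\pi}{4}\text{,}\) where \(\sin(\nicefrac{1}{y_n}) = \nicefrac{\sqrt{2}}{2}\) and \(\cos(\nicefrac{1}{y_n}) = -\nicefrac{\sqrt{2}}{2}\text{,}\) which gives

\begin{equation*} f'(y_n) = 2 \cdot \frac{\sqrt{2}}{2} \left( y_n \frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2} \right) = y_n + 1 . \end{equation*}

As \(\lim\, x_n = \lim\, y_n = 0\text{,}\) the two limits are \(-1\) and \(1\) respectively.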

Let us show that \(f'\) exists at \(0\text{.}\) We claim that the derivative is zero. In other words, \(\abs{\frac{f(x)-f(0)}{x-0} - 0}\) goes to zero as \(x\) goes to zero. For \(x \not= 0\text{,}\)

\begin{equation*} \abs{\frac{f(x)-f(0)}{x-0} - 0} = \abs{\frac{x^2 \sin^2(\nicefrac{1}{x})}{x}} = \abs{x \sin^2(\nicefrac{1}{x})} \leq \abs{x} . \end{equation*}

And, of course, as \(x\) tends to zero, \(\abs{x}\) tends to zero, and hence \(\abs{\frac{f(x)-f(0)}{x-0} - 0}\) goes to zero. Therefore, \(f\) is differentiable at 0 and the derivative at 0 is 0. A key point in the calculation above is that \(\abs{f(x)} \leq x^2\text{,}\) see also Exercises 4.1.11 and 4.1.12.

It is sometimes useful to assume the derivative of a differentiable function is continuous. If \(f \colon I \to \R\) is differentiable and the derivative \(f'\) is continuous on \(I\text{,}\) then we say \(f\) is continuously differentiable. It is common to write \(C^1(I)\) for the set of continuously differentiable functions on \(I\text{.}\)

Subsection 4.2.6 Exercises

Exercise 4.2.3.

Suppose \(f \colon \R \to \R\) is a differentiable function such that \(f'\) is a bounded function. Prove that \(f\) is a Lipschitz continuous function.

Exercise 4.2.4.

Suppose \(f \colon [a,b] \to \R\) is differentiable and \(c \in [a,b]\text{.}\) Show there exists a sequence \(\{ x_n \}\) converging to \(c\text{,}\) \(x_n \not= c\) for all \(n\text{,}\) such that

\begin{equation*} f'(c) = \lim_{n\to \infty} f'(x_n). \end{equation*}

Do note this does not imply that \(f'\) is continuous (why?).

Exercise 4.2.5.

Suppose \(f \colon \R \to \R\) is a function such that \(\abs{f(x)-f(y)} \leq \abs{x-y}^2\) for all \(x\) and \(y\text{.}\) Show that \(f(x) = C\) for some constant \(C\text{.}\) Hint: Show that \(f\) is differentiable at all points and compute the derivative.

Exercise 4.2.6.

Finish the proof of Proposition 4.2.8. That is, suppose \(I\) is an interval and \(f \colon I \to \R\) is a differentiable function such that \(f'(x) > 0\) for all \(x \in I\text{.}\) Show that \(f\) is strictly increasing.

Exercise 4.2.7.

Suppose \(f \colon (a,b) \to \R\) is a differentiable function such that \(f'(x) \not= 0\) for all \(x \in (a,b)\text{.}\) Suppose there exists a point \(c \in (a,b)\) such that \(f'(c) > 0\text{.}\) Prove \(f'(x) > 0\) for all \(x \in (a,b)\text{.}\)

Exercise 4.2.8.

Suppose \(f \colon (a,b) \to \R\) and \(g \colon (a,b) \to \R\) are differentiable functions such that \(f'(x) = g'(x)\) for all \(x \in (a,b)\text{.}\) Show that there exists a constant \(C\) such that \(f(x) = g(x) + C\text{.}\)

Exercise 4.2.9.

Prove the following version of L'Hôpital's rule. Suppose \(f \colon (a,b) \to \R\) and \(g \colon (a,b) \to \R\) are differentiable functions and \(c \in (a,b)\text{.}\) Suppose that \(f(c) = 0\text{,}\) \(g(c)=0\text{,}\) \(g'(x) \not= 0\) when \(x \not= c\text{,}\) and that the limit of \(\nicefrac{f'(x)}{g'(x)}\) as \(x\) goes to \(c\) exists. Show that

\begin{equation*} \lim_{x \to c} \frac{f(x)}{g(x)} = \lim_{x \to c} \frac{f'(x)}{g'(x)} . \end{equation*}

Compare to Exercise 4.1.15. Note: Before you do anything else, prove that \(g(x) \not= 0\) when \(x \not= c\text{.}\)

Exercise 4.2.10.

Let \(f \colon (a,b) \to \R\) be an unbounded differentiable function. Show \(f' \colon (a,b) \to \R\) is unbounded.

Exercise 4.2.11.

Prove the theorem Rolle actually proved in 1691: If \(f\) is a polynomial, \(f'(a) = f'(b) = 0\) for some \(a < b\text{,}\) and there is no \(c \in (a,b)\) such that \(f'(c) = 0\text{,}\) then there is at most one root of \(f\) in \((a,b)\text{,}\) that is at most one \(x \in (a,b)\) such that \(f(x) = 0\text{.}\) In other words, between any two consecutive roots of \(f'\) is at most one root of \(f\text{.}\) Hint: Suppose there are two roots and see what happens.

Exercise 4.2.12.

Suppose \(a,b \in \R\) and \(f \colon \R \to \R\) is differentiable, \(f'(x) = a\) for all \(x\text{,}\) and \(f(0) = b\text{.}\) Find \(f\) and prove that it is the unique differentiable function with this property.

Exercise 4.2.13.

  1. Prove Proposition 4.2.10.

  2. Suppose \(f \colon (a,b) \to \R\) is continuous, and suppose \(f\) is differentiable everywhere except at \(c \in (a,b)\) and \(\lim_{x \to c} f'(x) = L\text{.}\) Prove that \(f\) is differentiable at \(c\) and \(f'(c) = L\text{.}\)

Exercise 4.2.14.

Suppose \(f \colon (0,1) \to \R\) is differentiable and \(f'\) is bounded.

  1. Show that there exists a continuous function \(g \colon [0,1) \to \R\) such that \(f(x) = g(x)\) for all \(x \not= 0\text{.}\)
    Hint: Proposition 3.4.6 and Exercise 4.2.3.

  2. Find an example where \(g\) is not differentiable at \(x=0\text{.}\)
    Hint: Consider something based on \(\sin(\ln x)\text{,}\) and assume you know basic properties of \(\sin\) and \(\ln\) from calculus.

  3. Instead of assuming that \(f'\) is bounded, assume that \(\lim_{x \to 0} f'(x) = L\text{.}\) Prove that not only does \(g\) exist but it is differentiable at \(0\) and \(g'(0) = L\text{.}\)

¹ Named after the French mathematician Michel Rolle (1652–1719), see https://en.wikipedia.org/wiki/Michel_Rolle