Section 6.3 Picard's theorem

Note: 1–2 lectures (can be safely skipped)

A first semester course in analysis should have a pièce de résistance caliber theorem. We pick a theorem whose proof combines everything we have learned. It is more sophisticated than the fundamental theorem of calculus, the first highlight theorem of this course. The theorem we are talking about is Picard's theorem [1] on existence and uniqueness of a solution to an ordinary differential equation. Both the statement and the proof are beautiful examples of what one can do with the material we have mastered so far. It is also a good example of how analysis is applied, as differential equations are indispensable in science of every stripe.

Subsection 6.3.1 First order ordinary differential equation

Modern science is described in the language of differential equations. That is, equations involving not only the unknown, but also its derivatives. The simplest nontrivial form of a differential equation is the so-called first order ordinary differential equation

\begin{equation*} y' = F(x,y) . \end{equation*}

Generally, we also specify an initial condition \(y(x_0)=y_0\text{.}\) The solution of the equation is a function \(y(x)\) such that \(y(x_0)=y_0\) and \(y'(x) = F\bigl(x,y(x)\bigr)\text{.}\)

When \(F\) involves only the \(x\) variable, the solution is given by the fundamental theorem of calculus. On the other hand, when \(F\) depends on both \(x\) and \(y\) we need far more firepower. It is not always true that a solution exists, and if it does, that it is the unique solution. Picard's theorem gives us certain sufficient conditions for existence and uniqueness.

Subsection 6.3.2 The theorem

We need a definition of continuity in two variables. A point in the plane \(\R^2 = \R \times \R\) is denoted by an ordered pair \((x,y)\text{.}\) For simplicity, we give the following sequential definition of continuity.

Definition 6.3.1.

Let \(U \subset \R^2\) be a set, \(F \colon U \to \R\) a function, and \((x,y) \in U\) a point. The function \(F\) is continuous at \((x,y)\) if for every sequence \(\bigl\{ (x_n,y_n) \bigr\}_{n=1}^\infty\) of points in \(U\) such that \(\lim\, x_n = x\) and \(\lim\, y_n = y\text{,}\) we have

\begin{equation*} \lim_{n \to \infty} F(x_n,y_n) = F(x,y) . \end{equation*}

We say \(F\) is continuous if it is continuous at all points in \(U\text{.}\)
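As a small numerical illustration of the sequential definition, here is a Python sketch. The function `G` is a hypothetical example chosen only for this illustration and is not used elsewhere in the section: \(G(0,0) := 0\) and \(G(x,y) := \frac{xy}{x^2+y^2}\) otherwise. Along the sequence \((\nicefrac{1}{n},\nicefrac{1}{n})\) the values stay at \(\nicefrac{1}{2}\text{,}\) while along \((\nicefrac{1}{n},0)\) they are \(0\text{,}\) so \(G\) fails the definition at the origin.

```python
def G(x, y):
    """Hypothetical example: G(0,0) = 0 and G(x,y) = xy/(x^2+y^2) otherwise."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)


def values_along(points, func):
    """Evaluate func along a finite piece of a sequence of points in the plane."""
    return [func(x, y) for (x, y) in points]


# Two sequences that both converge to (0, 0).
diagonal = [(1.0 / n, 1.0 / n) for n in range(1, 8)]
axis = [(1.0 / n, 0.0) for n in range(1, 8)]

diag_values = values_along(diagonal, G)  # stays at 1/2, not G(0,0) = 0
axis_values = values_along(axis, G)      # stays at 0

print(diag_values)
print(axis_values)
```

A finite computation cannot prove continuity, of course. It can only exhibit a sequence along which the limit disagrees with the value of the function.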

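Theorem 6.3.2. Picard's theorem on existence and uniqueness.

Let \(I, J \subset \R\) be closed bounded intervals, let \(I^\circ\) and \(J^\circ\) be their interiors [3], and let \((x_0,y_0) \in I^\circ \times J^\circ\text{.}\) Suppose \(F \colon I \times J \to \R\) is continuous and Lipschitz in the second variable, that is, there exists an \(L \in \R\) such that

\begin{equation*} \abs{F(x,y) - F(x,z)} \leq L \abs{y-z} \qquad \text{for all } y,z \in J, x \in I . \end{equation*}

Then there exists an \(h > 0\) and a unique differentiable function \(f \colon [x_0 - h, x_0 + h] \to J \subset \R\text{,}\) such that

\begin{equation} f'(x) = F\bigl(x,f(x)\bigr) \qquad \text{and} \qquad f(x_0) = y_0 .\tag{6.1} \end{equation}

Proof.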

Suppose we could find a solution \(f\text{.}\) Using the fundamental theorem of calculus we integrate the equation \(f'(x) = F\bigl(x,f(x)\bigr)\text{,}\) \(f(x_0) = y_0\text{,}\) and write (6.1) as the integral equation

\begin{equation} f(x) = y_0 + \int_{x_0}^x F\bigl(t,f(t)\bigr)\,dt .\tag{6.2} \end{equation}

The idea of our proof is that we try to plug in approximations to a solution to the right-hand side of (6.2) to get better approximations on the left-hand side of (6.2). We hope that in the end the sequence converges and solves (6.2) and hence (6.1). The technique below is called Picard iteration, and the individual functions \(f_k\) are called the Picard iterates.

Without loss of generality, suppose \(x_0 = 0\) (exercise below). Another exercise tells us that \(F\) is bounded, as it is continuous on the closed bounded rectangle \(I \times J\text{.}\) Therefore pick some \(M > 0\) so that \(\abs{F(x,y)} \leq M\) for all \((x,y) \in I\times J\text{.}\) Pick \(\alpha > 0\) such that \([-\alpha,\alpha] \subset I\) and \([y_0-\alpha, y_0 + \alpha] \subset J\text{.}\) Define

\begin{equation*} h := \min \left\{ \alpha, \frac{\alpha}{M+L\alpha} \right\} . \end{equation*}

Observe \([-h,h] \subset I\text{.}\)
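To make the choice of \(h\) concrete, here is a minimal Python sketch. The helper names `picard_h` and `contraction_constant` are hypothetical and introduced only for illustration. The sample values are an assumption: they correspond to \(F(x,y) = y\text{,}\) \(y_0 = 1\text{,}\) \(\alpha = 1\text{,}\) \(J = [0,2]\text{,}\) and any \(I \supset [-1,1]\text{,}\) so that \(M = 2\) and \(L = 1\) work.

```python
def picard_h(alpha, M, L):
    """Return h = min(alpha, alpha / (M + L*alpha)), the h chosen in the proof."""
    if alpha <= 0 or M <= 0 or L < 0:
        raise ValueError("need alpha > 0, M > 0, and L >= 0")
    return min(alpha, alpha / (M + L * alpha))


def contraction_constant(alpha, M, L):
    """Return C = L*alpha / (M + L*alpha), the constant used later in the proof."""
    if alpha <= 0 or M <= 0 or L < 0:
        raise ValueError("need alpha > 0, M > 0, and L >= 0")
    return L * alpha / (M + L * alpha)


# Assumed sample data: F(x,y) = y, y0 = 1, alpha = 1, J = [0,2], so M = 2, L = 1.
h = picard_h(1.0, 2.0, 1.0)
C = contraction_constant(1.0, 2.0, 1.0)
print(h, C)
```

Since \(M > 0\text{,}\) the value `C` is always strictly less than 1, which is what the convergence argument in the proof relies on.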

Set \(f_0(x) := y_0\text{.}\) We define \(f_k\) inductively. Assuming \(f_{k-1}([-h,h]) \subset [y_0-\alpha,y_0+\alpha]\text{,}\) we see \(F\bigl(t,f_{k-1}(t)\bigr)\) is a well-defined function of \(t\) for \(t \in [-h,h]\text{.}\) Further if \(f_{k-1}\) is continuous on \([-h,h]\text{,}\) then \(F\bigl(t,f_{k-1}(t)\bigr)\) is continuous as a function of \(t\) on \([-h,h]\) (left as an exercise). Define

\begin{equation*} f_k(x) := y_0+ \int_{0}^x F\bigl(t,f_{k-1}(t)\bigr)\,dt , \end{equation*}

and \(f_k\) is continuous on \([-h,h]\) by the fundamental theorem of calculus. To see that \(f_k\) maps \([-h,h]\) to \([y_0-\alpha,y_0+\alpha]\text{,}\) we compute for \(x \in [-h,h]\)

\begin{equation*} \abs{f_k(x) - y_0} = \abs{\int_{0}^x F\bigl(t,f_{k-1}(t)\bigr)\,dt } \leq M\abs{x} \leq Mh \leq M \frac{\alpha}{M+L\alpha} \leq \alpha . \end{equation*}

We now define \(f_{k+1}\) and so on, and we have defined a sequence \(\{ f_k \}\) of functions. We need to show that it converges to a function \(f\) that solves the equation (6.2) and therefore (6.1).
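The inductive definition of the Picard iterates can be imitated numerically. The following Python sketch is only an approximation under stated assumptions: it takes \(x_0 = 0\text{,}\) samples the functions on a uniform grid of \([-h,h]\text{,}\) and replaces the exact integral by the trapezoid rule, whereas the proof uses the exact integral. The function name `picard_iterates` and its parameters are hypothetical.

```python
import math


def picard_iterates(F, y0, h, num_iterates, half_steps=200):
    """Approximate f_0, ..., f_K on a uniform grid of [-h, h] with x0 = 0.

    Each iterate is stored as a list of values on the grid xs.
    The integral from 0 to x is approximated by the trapezoid rule.
    """
    n = half_steps
    dx = h / n
    xs = [(i - n) * dx for i in range(2 * n + 1)]  # xs[n] is exactly 0
    f = [y0] * len(xs)                             # f_0 is the constant y0
    iterates = [f]
    for _ in range(num_iterates):
        g = [F(x, y) for x, y in zip(xs, f)]       # t -> F(t, f_{k-1}(t))
        new = [0.0] * len(xs)
        new[n] = y0                                # f_k(0) = y0
        for i in range(n + 1, 2 * n + 1):          # integrate from 0 rightwards
            new[i] = new[i - 1] + 0.5 * dx * (g[i - 1] + g[i])
        for i in range(n - 1, -1, -1):             # integrate from 0 leftwards
            new[i] = new[i + 1] - 0.5 * dx * (g[i + 1] + g[i])
        f = new
        iterates.append(f)
    return xs, iterates


# Assumed test problem: F(x,y) = y, y0 = 1, whose solution is exp(x).
xs, iterates = picard_iterates(lambda x, y: y, 1.0, 0.3, 10)
max_error = max(abs(v - math.exp(x)) for x, v in zip(xs, iterates[-1]))
print(max_error)
```

The remaining error in this sketch comes from both the finite number of iterates and the trapezoid rule, so refining the grid alone does not make it arbitrarily small.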

We wish to show that the sequence \(\{ f_k \}\) converges uniformly to some function on \([-h,h]\text{.}\) First, for \(t \in [-h,h]\text{,}\) we have the following useful bound

\begin{equation*} \abs{F\bigl(t,f_{n}(t)\bigr) - F\bigl(t,f_{k}(t)\bigr)} \leq L \abs{f_n(t)-f_k(t)} \leq L \norm{f_n-f_k}_u , \end{equation*}

where \(\norm{f_n-f_k}_u\) is the uniform norm, that is the supremum of \(\abs{f_n(t)-f_k(t)}\) for \(t \in [-h,h]\text{.}\) Now note that \(\abs{x} \leq h \leq \frac{\alpha}{M+L\alpha}\text{.}\) Therefore

\begin{equation*} \begin{split} \abs{f_n(x) - f_k(x)} & = \abs{\int_{0}^x F\bigl(t,f_{n-1}(t)\bigr)\,dt - \int_{0}^x F\bigl(t,f_{k-1}(t)\bigr)\,dt} \\ & = \abs{\int_{0}^x F\bigl(t,f_{n-1}(t)\bigr)- F\bigl(t,f_{k-1}(t)\bigr)\,dt} \\ & \leq L\norm{f_{n-1}-f_{k-1}}_u \abs{x} \\ & \leq \frac{L\alpha}{M+L\alpha} \norm{f_{n-1}-f_{k-1}}_u . \end{split} \end{equation*}

Let \(C := \frac{L\alpha}{M+L\alpha}\) and note that \(C < 1\text{.}\) Taking supremum on the left-hand side we get

\begin{equation*} \norm{f_n-f_k}_u \leq C \norm{f_{n-1}-f_{k-1}}_u . \end{equation*}

Without loss of generality, suppose \(n \geq k\text{.}\) Then by induction we can show

\begin{equation*} \norm{f_n-f_k}_u \leq C^{k} \norm{f_{n-k}-f_{0}}_u . \end{equation*}

For \(x \in [-h,h]\text{,}\) we have

\begin{equation*} \abs{f_{n-k}(x)-f_{0}(x)} = \abs{f_{n-k}(x)-y_0} \leq \alpha . \end{equation*}

Therefore,

\begin{equation*} \norm{f_n-f_k}_u \leq C^{k} \norm{f_{n-k}-f_{0}}_u \leq C^{k} \alpha . \end{equation*}
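The bound \(C^k \alpha\) does not depend on \(n\text{,}\) so once the limit \(f\) of the sequence is known to exist, letting \(n \to \infty\) suggests the estimate \(\norm{f-f_k}_u \leq C^k \alpha\) for the \(k\)th iterate. The following Python sketch, with the hypothetical helper name `iterations_needed`, computes the smallest \(k\) for which this bound falls below a given tolerance. The sample values \(C = \nicefrac{1}{3}\) and \(\alpha = 1\) are assumptions for illustration.

```python
def iterations_needed(C, alpha, tol):
    """Smallest k >= 0 with C**k * alpha <= tol, for 0 <= C < 1."""
    if not (0 <= C < 1):
        raise ValueError("need 0 <= C < 1")
    if alpha <= 0 or tol <= 0:
        raise ValueError("need alpha > 0 and tol > 0")
    k = 0
    bound = alpha
    while bound > tol:
        bound *= C  # one more iterate shrinks the bound by the factor C
        k += 1
    return k


# Assumed sample data: C = 1/3, alpha = 1.
k = iterations_needed(1.0 / 3.0, 1.0, 1e-6)
print(k)
```

This is a worst-case count from the proof. In practice the iterates often converge faster.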

As \(C < 1\text{,}\) \(\{f_n\}\) is uniformly Cauchy and by Proposition 6.1.13 we obtain that \(\{ f_n \}\) converges uniformly on \([-h,h]\) to some function \(f \colon [-h,h] \to \R\text{.}\) The function \(f\) is the uniform limit of continuous functions and therefore continuous. Furthermore, since \(f_n([-h,h]) \subset [y_0-\alpha,y_0+\alpha]\) for all \(n\text{,}\) then \(f([-h,h]) \subset [y_0-\alpha,y_0+\alpha]\) (why?).

We now need to show that \(f\) solves (6.2). First, as before we notice

\begin{equation*} \abs{F\bigl(t,f_{n}(t)\bigr) - F\bigl(t,f(t)\bigr)} \leq L \abs{f_n(t)-f(t)} \leq L \norm{f_n-f}_u . \end{equation*}

As \(\norm{f_n-f}_u\) converges to 0, then \(F\bigl(t,f_n(t)\bigr)\) converges uniformly to \(F\bigl(t,f(t)\bigr)\) for \(t \in [-h,h]\text{.}\) Hence, for \(x \in [-h,h]\) the convergence is uniform for \(t \in [0,x]\) (or \([x,0]\) if \(x < 0\)). Therefore,

\begin{equation*} \begin{aligned} y_0 + \int_0^{x} F\bigl(t,f(t)\bigr)\,dt & = y_0 + \int_0^{x} F\bigl(t,\lim_{n\to\infty} f_n(t)\bigr)\,dt & & \\ & = y_0 + \int_0^{x} \lim_{n\to\infty} F\bigl(t,f_n(t)\bigr)\,dt & & \text{(by continuity of } F\text{)} \\ & = \lim_{n\to\infty} \left( y_0 + \int_0^{x} F\bigl(t,f_n(t)\bigr)\,dt \right) & & \text{(by uniform convergence)} \\ & = \lim_{n\to\infty} f_{n+1}(x) = f(x) . & & \end{aligned} \end{equation*}

We apply the fundamental theorem of calculus (Theorem 5.3.3) to show that \(f\) is differentiable and its derivative is \(F\bigl(x,f(x)\bigr)\text{.}\) It is obvious that \(f(0) = y_0\text{.}\)

Finally, what is left to do is to show uniqueness. Suppose \(g \colon [-h,h] \to J \subset \R\) is another solution. As before we use the fact that \(\abs{F\bigl(t,f(t)\bigr) - F\bigl(t,g(t)\bigr)} \leq L \norm{f-g}_u\text{.}\) Then

\begin{equation*} \begin{split} \abs{f(x)-g(x)} & = \abs{ y_0 + \int_0^{x} F\bigl(t,f(t)\bigr)\,dt - \left( y_0 + \int_0^{x} F\bigl(t,g(t)\bigr)\,dt \right) } \\ & = \abs{ \int_0^{x} F\bigl(t,f(t)\bigr) - F\bigl(t,g(t)\bigr)\,dt } \\ & \leq L\norm{f-g}_u\abs{x} \leq Lh\norm{f-g}_u \leq \frac{L\alpha}{M+L\alpha}\norm{f-g}_u . \end{split} \end{equation*}

As before, \(C = \frac{L\alpha}{M+L\alpha} < 1\text{.}\) By taking supremum over \(x \in [-h,h]\) on the left-hand side we obtain

\begin{equation*} \norm{f-g}_u \leq C \norm{f-g}_u . \end{equation*}

This is only possible if \(\norm{f-g}_u = 0\text{.}\) Therefore, \(f=g\text{,}\) and the solution is unique.

Subsection 6.3.3 Examples

Let us look at some examples. The proof of the theorem gives us an explicit way to find an \(h\) that works. It does not, however, give us the best \(h\text{.}\) It is often possible to find a much larger \(h\) for which the conclusion of the theorem holds.

The proof also gives us the Picard iterates as approximations to the solution. So the proof actually tells us how to obtain the solution, not just that the solution exists.

Example 6.3.3.

Consider

\begin{equation*} f'(x) = f(x), \qquad f(0) = 1 . \end{equation*}

That is, we suppose \(F(x,y) = y\text{,}\) and we are looking for a function \(f\) such that \(f'(x) = f(x)\text{.}\) Let us forget for the moment that we solved this equation in Section 5.4.

We pick any \(I\) that contains 0 in the interior. We pick an arbitrary \(J\) that contains 1 in its interior. We can use \(L = 1\text{.}\) The theorem guarantees an \(h > 0\) such that there exists a unique solution \(f \colon [-h,h] \to \R\text{.}\) This solution is usually denoted by

\begin{equation*} e^x := f(x) . \end{equation*}

We leave it to the reader to verify that by picking \(I\) and \(J\) large enough the proof of the theorem guarantees that we are able to pick \(\alpha\) such that we get any \(h\) we want as long as \(h < \nicefrac{1}{2}\text{.}\) We omit the calculation.

Of course, we know this function exists as a function for all \(x\text{,}\) so an arbitrary \(h\) ought to work. By the same reasoning as above, no matter what \(x_0\) and \(y_0\) are, the proof guarantees an arbitrary \(h\) as long as \(h < \nicefrac{1}{2}\text{.}\) Fix such an \(h\text{.}\) We get a unique function defined on \([x_0-h,x_0+h]\text{.}\) After defining the function on \([-h,h]\) we find a solution on the interval \([0,2h]\) and notice that the two functions must coincide on \([0,h]\) by uniqueness. We thus iteratively construct the exponential for all \(x \in \R\text{.}\) Therefore Picard's theorem could be used to prove the existence and uniqueness of the exponential.

Let us compute the Picard iterates. We start with the constant function \(f_0(x) := 1\text{.}\) Then

\begin{equation*} \begin{aligned} f_1(x) & = 1 + \int_0^x f_0(s)\,ds = 1+x, \\ f_2(x) & = 1 + \int_0^x f_1(s)\,ds = 1 + \int_0^x (1+s)\,ds = 1 + x + \frac{x^2}{2}, \\ f_3(x) & = 1 + \int_0^x f_2(s)\,ds = 1 + \int_0^x \left(1+ s + \frac{s^2}{2} \right)\,ds = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} . \end{aligned} \end{equation*}

We recognize the beginning of the Taylor series for the exponential.
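The iterates above can also be computed exactly by machine. The following Python sketch represents each polynomial by its list of coefficients, lowest degree first, and uses exact rational arithmetic. The helper names `integrate_from_zero` and `picard_exponential` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def integrate_from_zero(coeffs):
    """Coefficients of the integral from 0 to x of the polynomial sum coeffs[i] * s**i."""
    return [Fraction(0)] + [c / (i + 1) for i, c in enumerate(coeffs)]


def picard_exponential(num_iterates):
    """Picard iterates f_0, ..., f_K for f' = f, f(0) = 1, as coefficient lists."""
    f = [Fraction(1)]             # f_0(x) = 1
    iterates = [f]
    for _ in range(num_iterates):
        new = integrate_from_zero(f)
        new[0] += 1               # add the initial value y0 = 1
        f = new
        iterates.append(f)
    return iterates


for k, coeffs in enumerate(picard_exponential(3)):
    print(k, [str(c) for c in coeffs])
```

The coefficient lists produced for \(f_1\text{,}\) \(f_2\text{,}\) and \(f_3\) agree with the hand computation above.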

Example 6.3.4.

Consider the equation

\begin{equation*} f'(x) = {\bigl(f(x)\bigr)}^2 \qquad \text{and} \qquad f(0)=1. \end{equation*}

From elementary differential equations we know

\begin{equation*} f(x) = \frac{1}{1-x} \end{equation*}

is the solution. The solution is only defined on \((-\infty,1)\text{.}\) That is, we are able to use \(h < 1\text{,}\) but never a larger \(h\text{.}\) The function that takes \(y\) to \(y^2\) is not Lipschitz as a function on all of \(\R\text{.}\) As we approach \(x=1\) from the left, the solution becomes larger and larger. The derivative of the solution grows as \(y^2\text{,}\) and so the \(L\) required has to be larger and larger as \(y_0\) grows. If we apply the theorem with \(x_0\) close to 1 and \(y_0 = \frac{1}{1-x_0}\) we find that the \(h\) that the proof guarantees is smaller and smaller as \(x_0\) approaches 1.

The \(h\) from the proof is not the best \(h\text{.}\) By picking \(\alpha\) correctly, the proof of the theorem guarantees \(h=1-\nicefrac{\sqrt{3}}{2} \approx 0.134\) (we omit the calculation) for \(x_0=0\) and \(y_0=1\text{,}\) even though we saw above that any \(h < 1\) should work.
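One way to check the omitted calculation numerically is sketched below in Python. It rests on assumptions that are not spelled out in the text: we take \(J = [1-\alpha,1+\alpha]\) and \(I\) large enough to contain \([-\alpha,\alpha]\text{,}\) so that \(M = (1+\alpha)^2\) bounds \(y^2\) on \(J\) and \(L = 2(1+\alpha)\) is a Lipschitz constant there. With these choices \(\frac{\alpha}{M+L\alpha} = \frac{\alpha}{(1+\alpha)(1+3\alpha)}\text{,}\) and a simple grid search over \(\alpha\) locates the largest \(h\text{.}\) The name `proof_h` is hypothetical.

```python
import math


def proof_h(alpha):
    """h from the proof for F(x,y) = y^2, y0 = 1, J = [1-alpha, 1+alpha] (assumed)."""
    M = (1.0 + alpha) ** 2       # bound on y^2 for y in J
    L = 2.0 * (1.0 + alpha)      # Lipschitz constant of y^2 on J
    return min(alpha, alpha / (M + L * alpha))


# Grid search over alpha in (0, 3] with spacing 1e-5.
alphas = [i / 100000 for i in range(1, 300001)]
best_alpha = max(alphas, key=proof_h)
best_h = proof_h(best_alpha)

print(best_alpha, best_h, 1.0 - math.sqrt(3.0) / 2.0)
```

Under these assumptions the search settles near \(\alpha = \nicefrac{1}{\sqrt{3}}\text{,}\) and the resulting \(h\) matches \(1-\nicefrac{\sqrt{3}}{2}\) to within the resolution of the grid.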

Example 6.3.5.

Consider the equation

\begin{equation*} f'(x) = 2 \sqrt{\abs{f(x)}}, \qquad f(0) = 0 . \end{equation*}

The function \(F(x,y) = 2 \sqrt{\abs{y}}\) is continuous, but not Lipschitz in \(y\) (why?). The equation does not satisfy the hypotheses of the theorem. The function

\begin{equation*} f(x) = \begin{cases} x^2 & \text{if } x \geq 0,\\ -x^2 & \text{if } x < 0, \end{cases} \end{equation*}

is a solution, but \(f(x) = 0\) is also a solution. A solution exists, but is not unique.
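A quick numerical sanity check of the two solutions is sketched below in Python. The derivatives are entered by hand (\(2\abs{x}\) for the first solution and \(0\) for the second), which is an assumption of the sketch rather than something the code derives. The function names are hypothetical.

```python
import math


def F(x, y):
    """Right-hand side of the equation: F(x,y) = 2*sqrt(|y|)."""
    return 2.0 * math.sqrt(abs(y))


def f_parabola(x):
    """First solution: x^2 for x >= 0 and -x^2 for x < 0."""
    return x * x if x >= 0 else -x * x


def f_parabola_prime(x):
    """Derivative of f_parabola, computed by hand: 2|x|."""
    return 2.0 * abs(x)


def f_zero(x):
    """Second solution: identically zero."""
    return 0.0


def f_zero_prime(x):
    return 0.0


# Compare f'(x) with F(x, f(x)) at sample points of [-2, 2].
points = [i / 10 for i in range(-20, 21)]
residual_parabola = max(abs(f_parabola_prime(x) - F(x, f_parabola(x))) for x in points)
residual_zero = max(abs(f_zero_prime(x) - F(x, f_zero(x))) for x in points)

print(residual_parabola, residual_zero)
```

Both residuals vanish up to rounding, and both functions take the value \(0\) at \(x = 0\text{,}\) yet they differ for every \(x \not= 0\text{.}\)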

Example 6.3.6.

Consider \(y' = \varphi(x)\) where \(\varphi(x) := 0\) if \(x \in \Q\) and \(\varphi(x):=1\) if \(x \not\in \Q\text{.}\) In other words, \(F(x,y) = \varphi(x)\) is discontinuous. The equation has no solution regardless of the initial conditions. A solution would have derivative \(\varphi\text{,}\) but \(\varphi\) does not have the intermediate value property at any point (why?). No solution exists by Darboux's theorem.

The examples show that without the Lipschitz condition, a solution might exist but not be unique, and without continuity of \(F\text{,}\) we may not have a solution at all. It is in fact a theorem, the Peano existence theorem, that if \(F\) is continuous a solution exists (but may not be unique).

Remark 6.3.7.

It is possible to weaken what we mean by “solution to \(y' = F(x,y)\)” by focusing on the integral equation \(f(x) = y_0 + \int_{x_0}^x F\bigl(t,f(t)\bigr) \, dt\text{.}\) For example, let \(H\) be the Heaviside function [4], that is \(H(t) := 0\) for \(t < 0\) and \(H(t) := 1\) for \(t \geq 0\text{.}\) Then \(y' = H(t)\text{,}\) \(y(0) = 0\text{,}\) is a common equation. The “solution” is the ramp function \(f(x) := 0\) if \(x < 0\) and \(f(x) := x\) if \(x \geq 0\text{,}\) since this function satisfies \(f(x) = \int_0^x H(t)\, dt\text{.}\) Notice, however, that \(f'(0)\) does not exist, so \(f\) is only a so-called weak solution to the differential equation.
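The remark can be illustrated numerically. The Python sketch below approximates \(\int_0^x H(t)\,dt\) by a midpoint rule (a choice made for this sketch), compares it with the ramp function, and computes the two one-sided difference quotients of the ramp at \(0\text{.}\) All function names are hypothetical.

```python
def heaviside(t):
    """H(t) = 0 for t < 0 and H(t) = 1 for t >= 0."""
    return 0.0 if t < 0 else 1.0


def ramp(x):
    """Ramp function: 0 for x < 0 and x for x >= 0."""
    return 0.0 if x < 0 else x


def integral_from_zero(func, x, steps=1000):
    """Midpoint-rule approximation of the signed integral of func from 0 to x."""
    dt = x / steps
    return sum(func((i + 0.5) * dt) for i in range(steps)) * dt


sample_points = [-1.0, -0.5, 0.0, 0.5, 2.0]
errors = [abs(integral_from_zero(heaviside, x) - ramp(x)) for x in sample_points]

# One-sided difference quotients of the ramp at 0.
eps = 1e-6
right_quotient = (ramp(eps) - ramp(0.0)) / eps
left_quotient = (ramp(-eps) - ramp(0.0)) / (-eps)

print(errors, right_quotient, left_quotient)
```

The integral agrees with the ramp at the sample points, while the two difference quotients at \(0\) disagree, which is consistent with \(f'(0)\) not existing.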

Subsection 6.3.4 Exercises

Exercise 6.3.1.

Let \(I, J \subset \R\) be intervals. Let \(F \colon I \times J \to \R\) be a continuous function of two variables and let \(f \colon I \to J\) be a continuous function. Show that \(F\bigl(x,f(x)\bigr)\) is a continuous function on \(I\text{.}\)

Exercise 6.3.2.

Let \(I, J \subset \R\) be closed bounded intervals. Show that if \(F \colon I \times J \to \R\) is continuous, then \(F\) is bounded.

Exercise 6.3.3.

We proved Picard's theorem under the assumption that \(x_0 = 0\text{.}\) Prove the full statement of Picard's theorem for an arbitrary \(x_0\text{.}\)

Exercise 6.3.4.

Let \(f'(x)=x f(x)\) be our equation. Start with the initial condition \(f(0)=2\) and find the Picard iterates \(f_0,f_1,f_2,f_3,f_4\text{.}\)

Exercise 6.3.5.

Suppose \(F \colon I \times J \to \R\) is a function that is continuous in the first variable, that is, for every fixed \(y\) the function that takes \(x\) to \(F(x,y)\) is continuous. Further, suppose \(F\) is Lipschitz in the second variable, that is, there exists a number \(L\) such that

\begin{equation*} \abs{F(x,y) - F(x,z)} \leq L \abs{y-z} \qquad \text{for all } y,z \in J, x \in I . \end{equation*}

Show that \(F\) is continuous as a function of two variables. Therefore, the hypotheses in the theorem could be made even weaker.

Exercise 6.3.6.

A common type of equation one encounters is the linear first order differential equation, that is, an equation of the form

\begin{equation*} y' + p(x) y = q(x) , \qquad y(x_0) = y_0 . \end{equation*}

Prove Picard's theorem for linear equations. Suppose \(I\) is an interval, \(x_0 \in I\text{,}\) and \(p \colon I \to \R\) and \(q \colon I \to \R\) are continuous. Show that there exists a unique differentiable \(f \colon I \to \R\text{,}\) such that \(y = f(x)\) satisfies the equation and the initial condition. Hint: Assume existence of the exponential function and use the integrating factor formula for existence of \(f\) (prove that it works and then that it is unique):

\begin{equation*} f(x) := e^{-\int_{x_0}^x p(s)\, ds} \left( \int_{x_0}^x e^{\int_{x_0}^t p(s)\, ds} q(t) \,dt + y_0 \right). \end{equation*}

Exercise 6.3.7.

Consider the equation \(f'(x) = f(x)\text{,}\) from Example 6.3.3. Show that given any \(x_0\text{,}\) any \(y_0\text{,}\) and any positive \(h < \nicefrac{1}{2}\text{,}\) we can pick \(\alpha > 0\) large enough that the proof of Picard's theorem guarantees a solution for the initial condition \(f(x_0) = y_0\) in the interval \([x_0-h,x_0+h]\text{.}\)

Exercise 6.3.8.

Consider the equation \(y' = y^{1/3}x\text{.}\)

  1. Show that for the initial condition \(y(1)=1\text{,}\) Picard's theorem applies. Find an \(\alpha > 0\text{,}\) \(M\text{,}\) \(L\text{,}\) and \(h\) that would work in the proof.

  2. Show that for the initial condition \(y(1) = 0\text{,}\) Picard's theorem does not apply.

  3. Find a solution for \(y(1) = 0\) anyway.

Exercise 6.3.9.

Consider the equation \(x y' = 2y\text{.}\)

  1. Show that \(y = Cx^2\) is a solution for every constant \(C\text{.}\)

  2. Show that for every \(x_0 \not= 0\) and every \(y_0\text{,}\) Picard's theorem applies for the initial condition \(y(x_0) = y_0\text{.}\)

  3. Show that \(y(0) = y_0\) is solvable if and only if \(y_0 = 0\text{.}\)

[1] Named for the French mathematician Charles Émile Picard [2] (1856–1941).
[2] https://en.wikipedia.org/wiki/%C3%89mile_Picard
[3] By interior of \([a,b]\) we mean \((a,b)\text{.}\)
[4] Named for the English engineer, mathematician, and physicist Oliver Heaviside [5] (1850–1925).
[5] https://en.wikipedia.org/wiki/Oliver_Heaviside