
Section 8.3 The derivative

Note: 2–3 lectures

Subsection 8.3.1 The derivative

For a function \(f \colon \R \to \R\text{,}\) we defined the derivative at \(x\) as
\begin{equation} \lim_{h \to 0} \frac{f(x+h)-f(x)}{h} . \end{equation}
In other words, there is a number \(a\) (the derivative of \(f\) at \(x\)) such that
\begin{equation} \begin{aligned} \lim_{h \to 0} \abs{\frac{f(x+h)-f(x)}{h} - a} & = \lim_{h \to 0} \abs{\frac{f(x+h)-f(x) - ah}{h}} \\ & = \lim_{h \to 0} \frac{\babs{f(x+h)-f(x) - ah}}{\sabs{h}} = 0. \end{aligned} \end{equation}
Multiplying by \(a\) is a linear map in one dimension: \(h \mapsto ah\text{.}\) Namely, we think of \(a \in L(\R^1,\R^1)\text{,}\) which is the best linear approximation of how \(f\) changes near \(x\text{.}\) We use this interpretation to extend differentiation to more variables.

Definition 8.3.1.

Let \(U \subset \R^n\) be open and \(f \colon U \to \R^m\) a function. We say \(f\) is differentiable at \(x \in U\) if there exists an \(A \in L(\R^n,\R^m)\) such that
\begin{equation} \lim_{\substack{h \to 0\\h\in \R^n}} \frac{\bnorm{f(x+h)-f(x) - Ah}}{\snorm{h}} = 0 . \end{equation}
We will show momentarily that \(A\text{,}\) if it exists, is unique. We write \(Df(x) \coloneqq A\text{,}\) or \(f'(x) \coloneqq A\text{,}\) and we say \(A\) is the derivative of \(f\) at \(x\text{.}\) When \(f\) is differentiable at every \(x \in U\text{,}\) we say simply that \(f\) is differentiable. See Figure 8.4 for an illustration.

Figure 8.4. Illustration of a derivative for a function \(f \colon \R^2 \to \R\text{.}\) The vector \(h\) is shown in the \(x_1x_2\)-plane based at \((x_1,x_2)\text{,}\) and the vector \(Ah \in \R^1\) is shown along the \(y\) direction.

For a differentiable function, the derivative of \(f\) is a function from \(U\) to \(L(\R^n,\R^m)\text{.}\) Compare to the one-dimensional case, where the derivative is a function from \(U\) to \(\R\text{,}\) but we really want to think of \(\R\) here as \(L(\R^1,\R^1)\text{.}\) As in one dimension, the idea is that a differentiable mapping is “infinitesimally close” to a linear mapping, and this linear mapping is the derivative.
Notice the norms in the definition. The norm in the numerator is on \(\R^m\text{,}\) and the norm in the denominator is on \(\R^n\) where \(h\) lives. Normally it is understood that \(h \in \R^n\) from context (the formula makes no sense otherwise); it is not always necessary to explicitly say so. Let us prove, as promised, that the derivative is unique.

Proposition 8.3.2.

Let \(U \subset \R^n\) be open, \(f \colon U \to \R^m\) a function, and \(x \in U\text{.}\) Suppose \(A \in L(\R^n,\R^m)\) and \(B \in L(\R^n,\R^m)\) both satisfy the definition of the derivative of \(f\) at \(x\text{.}\) Then \(A = B\text{.}\)

Proof.

Suppose \(h \in \R^n\text{,}\) \(h \neq 0\text{.}\) Compute
\begin{equation} \begin{split} \frac{\bnorm{(A-B)h}}{\snorm{h}} & = \frac{\bnorm{-\bigl(f(x+h)-f(x) - Ah\bigr) + f(x+h)-f(x) - Bh}}{\snorm{h}} \\ & \leq \frac{\bnorm{f(x+h)-f(x) - Ah}}{\snorm{h}} + \frac{\bnorm{f(x+h)-f(x) - Bh}}{\snorm{h}} . \end{split} \end{equation}
So \(\frac{\snorm{(A-B)h}}{\snorm{h}} \to 0\) as \(h \to 0\text{.}\) Given \(\epsilon > 0\text{,}\) for all nonzero \(h\) in some \(\delta\)-ball around the origin we have
\begin{equation} \epsilon > \frac{\bnorm{(A-B)h}}{\snorm{h}} = \norm{(A-B)\frac{h}{\snorm{h}}} . \end{equation}
For any given \(v \in \R^n\) with \(\snorm{v}=1\text{,}\) if \(h = (\nicefrac{\delta}{2}) \, v\text{,}\) then \(\snorm{h} < \delta\) and \(\frac{h}{\snorm{h}} = v\text{.}\) So \(\bnorm{(A-B)v} < \epsilon\text{.}\) Taking the supremum over all \(v\) with \(\snorm{v} = 1\text{,}\) we get the operator norm \(\snorm{A-B} \leq \epsilon\text{.}\) As \(\epsilon > 0\) was arbitrary, \(\snorm{A-B} = 0\text{,}\) or in other words \(A = B\text{.}\)

Example 8.3.3.

If \(f(x) = Ax\) for a linear mapping \(A\text{,}\) then \(f'(x) = A\text{:}\)
\begin{equation} \frac{\bnorm{f(x+h)-f(x) - Ah}}{\snorm{h}} = \frac{\bnorm{A(x+h)-Ax - Ah}}{\snorm{h}} = \frac{0}{\snorm{h}} = 0 . \end{equation}

Example 8.3.4.

Let \(f \colon \R^2 \to \R^2\) be defined by
\begin{equation} f(x,y) = \bigl(f_1(x,y),f_2(x,y)\bigr) \coloneqq (1+x+2y+x^2,2x+3y+xy). \end{equation}
Let us show that \(f\) is differentiable at the origin and compute the derivative directly using the definition. If the derivative exists, it is in \(L(\R^2,\R^2)\text{,}\) so it can be represented by a \(2\)-by-\(2\) matrix \(\left[\begin{smallmatrix}a&b\\c&d\end{smallmatrix}\right]\text{.}\) Suppose \(h = (h_1,h_2)\text{.}\) We need the following expression to go to zero.
\begin{multline*} \frac{\bnorm{ f(h_1,h_2)-f(0,0) - (ah_1 +bh_2 , ch_1+dh_2)} }{\bnorm{(h_1,h_2)}} = \\ \frac{\sqrt{ {\bigl((1-a)h_1 + (2-b)h_2 + h_1^2\bigr)}^2 + {\bigl((2-c)h_1 + (3-d)h_2 + h_1h_2\bigr)}^2}}{\sqrt{h_1^2+h_2^2}} . \end{multline*}
If we choose \(a=1\text{,}\) \(b=2\text{,}\) \(c=2\text{,}\) \(d=3\text{,}\) the expression becomes
\begin{equation} \frac{\sqrt{ h_1^4 + h_1^2h_2^2}}{\sqrt{h_1^2+h_2^2}} = \sabs{h_1} \frac{\sqrt{ h_1^2 + h_2^2}}{\sqrt{h_1^2+h_2^2}} = \sabs{h_1} . \end{equation}
This expression does indeed go to zero as \(h \to 0\text{.}\) The function \(f\) is differentiable at the origin and the derivative \(f'(0)\) is represented by the matrix \(\left[\begin{smallmatrix}1&2\\2&3\end{smallmatrix}\right]\text{.}\)
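The limit can also be checked numerically. Below is a minimal Python sketch (our own illustration, not part of the text): it evaluates the quotient from the definition for the matrix found above along \(h = (t,t)\text{,}\) where the computation above predicts the quotient equals \(\sabs{h_1} = t\text{.}\)

```python
import math

# The map from Example 8.3.4; note f(0,0) = (1, 0).
def f(x, y):
    return (1 + x + 2*y + x**2, 2*x + 3*y + x*y)

# Candidate derivative at the origin, as computed in the example.
A = [[1, 2], [2, 3]]

def ratio(h1, h2):
    """||f(h) - f(0) - A h|| / ||h|| -- should tend to 0 as h -> 0."""
    fx, fy = f(h1, h2)
    rx = fx - 1 - (A[0][0]*h1 + A[0][1]*h2)
    ry = fy - 0 - (A[1][0]*h1 + A[1][1]*h2)
    return math.hypot(rx, ry) / math.hypot(h1, h2)

# Along h = (t, t), t > 0, the example predicts the quotient is exactly t.
for t in (1e-1, 1e-3, 1e-6):
    print(t, ratio(t, t))
```

The printed pairs agree up to rounding, matching the closed form \(\sabs{h_1}\) derived above.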

Proposition 8.3.5.

Let \(U \subset \R^n\) be open and \(f \colon U \to \R^m\) differentiable at \(p \in U\text{.}\) Then \(f\) is continuous at \(p\text{.}\)

Proof.

Another way to write the differentiability of \(f\) at \(p\) is to consider
\begin{equation} r(h) \coloneqq f(p+h)-f(p) - f'(p) h . \end{equation}
The function \(f\) is differentiable at \(p\) if \(\frac{\snorm{r(h)}}{\snorm{h}}\) goes to zero as \(h \to 0\text{.}\) In that case, \(\snorm{r(h)} = \frac{\snorm{r(h)}}{\snorm{h}} \snorm{h}\) also goes to zero, so \(r(h)\) itself goes to zero. The mapping \(h \mapsto f'(p) h\) is a linear mapping between finite-dimensional spaces, hence continuous, and \(f'(p) h \to 0\) as \(h \to 0\text{.}\) Thus, \(f(p+h)\) must go to \(f(p)\) as \(h \to 0\text{.}\) That is, \(f\) is continuous at \(p\text{.}\)
Differentiation is a linear operator on the space of differentiable functions.

Theorem 8.3.6.

Suppose \(U \subset \R^n\) is open, \(f \colon U \to \R^m\) and \(g \colon U \to \R^m\) are differentiable at \(p \in U\text{,}\) and \(\alpha \in \R\text{.}\) Then \(f+g\) and \(\alpha f\) are differentiable at \(p\text{,}\)
\begin{equation} (f+g)'(p) = f'(p) + g'(p) , \qquad \text{and} \qquad (\alpha f)'(p) = \alpha f'(p) . \end{equation}

Proof.

Let \(h \in \R^n\text{,}\) \(h \neq 0\text{.}\) Then
\begin{multline*} \frac{\bnorm{f(p+h)+g(p+h)-\bigl(f(p)+g(p)\bigr) - \bigl(f'(p) + g'(p)\bigr)h}}{\snorm{h}} \\ \leq \frac{\bnorm{f(p+h)-f(p) - f'(p)h}}{\snorm{h}} + \frac{\bnorm{g(p+h)-g(p) - g'(p)h}}{\snorm{h}} , \end{multline*}
and
\begin{equation} \frac{\bnorm{\alpha f(p+h) - \alpha f(p) - \alpha f'(p)h}}{\snorm{h}} = \sabs{\alpha} \frac{\bnorm{f(p+h)-f(p) - f'(p)h}}{\snorm{h}} . \end{equation}
The limits as \(h\) goes to zero of the right-hand sides are zero by hypothesis. The result follows.
If \(A \in L(\R^n,\R^m)\) and \(B \in L(\R^m,\R^k)\) are linear maps, then they are their own derivative. The composition \(BA \in L(\R^n,\R^k)\) is also its own derivative, and so the derivative of the composition is the composition of the derivatives. As differentiable maps are “infinitesimally close” to linear maps, they have the same property:
Theorem 8.3.7. (Chain rule)

Let \(U \subset \R^n\) be open and \(f \colon U \to \R^m\) differentiable at \(p \in U\text{.}\) Let \(V \subset \R^m\) be open, \(f(U) \subset V\text{,}\) and \(g \colon V \to \R^k\) differentiable at \(f(p)\text{.}\) Then \(F \coloneqq g \circ f\) is differentiable at \(p\) and
\begin{equation} F'(p) = g'\bigl(f(p)\bigr) f'(p) . \end{equation}

Without the points where things are evaluated, we write \(F' = {(g \circ f)}' = g' f'\text{.}\) The derivative of the composition \(g \circ f\) is the composition of the derivatives of \(g\) and \(f\text{:}\) If \(f'(p) = A\) and \(g'\bigl(f(p)\bigr) = B\text{,}\) then \(F'(p) = BA\text{,}\) just as for linear maps.

Proof.

Let \(A \coloneqq f'(p)\) and \(B \coloneqq g'\bigl(f(p)\bigr)\text{.}\) Take a nonzero \(h \in \R^n\) and write \(q \coloneqq f(p)\text{,}\) \(k \coloneqq f(p+h)-f(p)\text{.}\) Let
\begin{equation} r(h) \coloneqq f(p+h)-f(p) - A h . \end{equation}
Then \(r(h) = k-Ah\) or \(Ah = k-r(h)\text{,}\) and \(f(p+h) = q+k\text{.}\) We look at the quantity we need to go to zero:
\begin{equation} \begin{split} \frac{\bnorm{F(p+h)-F(p) - BAh}}{\snorm{h}} & = \frac{\bnorm{g\bigl(f(p+h)\bigr)-g\bigl(f(p)\bigr) - BAh}}{\snorm{h}} \\ & = \frac{\bnorm{g(q+k)-g(q) - B\bigl(k-r(h)\bigr)}}{\snorm{h}} \\ & \leq \frac {\bnorm{g(q+k)-g(q) - Bk}} {\snorm{h}} + \snorm{B} \frac {\bnorm{r(h)}} {\snorm{h}} . \end{split} \end{equation}
We need both terms on the right to go to 0 as \(h\) goes to 0. First, \(\snorm{B}\) is a constant and \(f\) is differentiable at \(p\text{,}\) so the term \(\snorm{B}\frac{\snorm{r(h)}}{\snorm{h}}\) goes to 0. Next, if \(k=0\text{,}\) then \(\frac {\snorm{g(q+k)-g(q) - Bk}} {\snorm{h}} = 0\text{.}\) So suppose that \(k \neq 0\text{.}\) Then
\begin{equation} \frac {\bnorm{g(q+k)-g(q) - Bk}} {\snorm{h}} = \frac {\bnorm{g(q+k)-g(q) - Bk}} {\snorm{k}} \frac {\bnorm{f(p+h)-f(p)}} {\snorm{h}} . \end{equation}
Because \(f\) is continuous at \(p\text{,}\) \(k\) goes to 0 as \(h\) goes to 0. Thus \(\frac {\snorm{g(q+k)-g(q) - Bk}} {\snorm{k}}\) goes to 0, because \(g\) is differentiable at \(q\text{.}\) We have,
\begin{equation} \begin{aligned} \frac {\bnorm{f(p+h)-f(p)}} {\snorm{h}} & \leq \frac {\bnorm{f(p+h)-f(p)-Ah}} {\snorm{h}} + \frac {\snorm{Ah}} {\snorm{h}} \\ & \leq \frac {\bnorm{f(p+h)-f(p)-Ah}} {\snorm{h}} + \snorm{A} . \end{aligned} \end{equation}
As \(f\) is differentiable at \(p\text{,}\) for small enough \(h\text{,}\) the quantity \(\frac{\snorm{f(p+h)-f(p)-Ah}}{\snorm{h}}\) is bounded. Hence, the term \(\frac {\snorm{f(p+h)-f(p)}} {\snorm{h}}\) stays bounded as \(h\) goes to 0. In other words, the term \(\frac {\snorm{g(q+k)-g(q) - Bk}} {\snorm{h}}\) goes to zero as \(h\) goes to 0. Therefore, \(\frac{\snorm{F(p+h)-F(p) - BAh}}{\snorm{h}}\) goes to zero, and \(F'(p) = BA\text{,}\) which is what was claimed.
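The chain rule can be sanity-checked numerically with finite differences. The following Python sketch is illustrative only (the sample maps and the `jacobian` helper are our own choices, not from the text): it approximates \(f'(p)\text{,}\) \(g'\bigl(f(p)\bigr)\text{,}\) and \((g \circ f)'(p)\) by central differences and compares the product \(BA\) with the Jacobian of the composition.

```python
import math

def f(p):   # a sample map f : R^2 -> R^2
    x, y = p
    return (x*y, x + y**2)

def g(q):   # a sample map g : R^2 -> R^2
    u, v = q
    return (math.sin(u) + v, u*v)

def jacobian(F, p, h=1e-6):
    """Central-difference approximation of the Jacobian matrix of F at p."""
    n = len(p)
    cols = []
    for j in range(n):
        pp = list(p); pm = list(p)
        pp[j] += h; pm[j] -= h
        Fp, Fm = F(pp), F(pm)
        cols.append([(a - b) / (2*h) for a, b in zip(Fp, Fm)])
    # transpose so that entry [k][j] approximates dF_k / dx_j
    return [[cols[j][k] for j in range(n)] for k in range(len(cols[0]))]

def matmul(B, A):
    return [[sum(B[i][k]*A[k][j] for k in range(len(A))) for j in range(len(A[0]))]
            for i in range(len(B))]

p = (0.3, -0.7)
A = jacobian(f, p)                      # approximates f'(p)
B = jacobian(g, f(p))                   # approximates g'(f(p))
C = jacobian(lambda x: g(f(x)), p)      # approximates (g o f)'(p)
BA = matmul(B, A)
print(max(abs(C[i][j] - BA[i][j]) for i in range(2) for j in range(2)))
```

The printed maximum entrywise difference between \(C\) and \(BA\) is tiny, as the theorem predicts.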

Subsection 8.3.2 Partial derivatives

There is another way to generalize the derivative from one dimension. We hold all but one variable constant and take the regular one-variable derivative.

Definition 8.3.8.

Let \(f \colon U \to \R\) be a function on an open set \(U \subset \R^n\text{.}\) If the following limit exists, we write
\begin{equation} \frac{\partial f}{\partial x_j} (x) \coloneqq \lim_{h\to 0}\frac{f(x_1,\ldots,x_{j-1},x_j+h,x_{j+1},\ldots,x_n)-f(x)}{h} = \lim_{h\to 0}\frac{f(x+h e_j)-f(x)}{h} . \end{equation}
We call \(\frac{\partial f}{\partial x_j} (x)\) the partial derivative of \(f\) with respect to \(x_j\text{.}\) See Figure 8.5. Here \(h\) is a number, not a vector.
For a mapping \(f \colon U \to \R^m\text{,}\) we write \(f = (f_1,f_2,\ldots,f_m)\text{,}\) where \(f_k\) are real-valued functions. We then take partial derivatives of the components, \(\frac{\partial f_k}{\partial x_j}\text{.}\)

Figure 8.5. Illustration of a partial derivative for a function \(f \colon \R^2 \to \R\text{.}\) The \(yx_2\)-plane where \(x_1\) is fixed is marked in dotted line, and the slope of the tangent line in the \(yx_2\)-plane is \(\frac{\partial f}{\partial x_2}(x_1,x_2)\text{.}\)

Partial derivatives are easier to compute with all the machinery of calculus, and they provide a way to compute the derivative of a function.
Proposition 8.3.9.

Let \(U \subset \R^n\) be open and \(f \colon U \to \R^m\) differentiable at \(p \in U\text{.}\) Then all the partial derivatives \(\frac{\partial f_k}{\partial x_j}(p)\) exist, and in terms of the standard bases of \(\R^n\) and \(\R^m\text{,}\) the linear map \(f'(p)\) is represented by the matrix whose entry in the \(k\)th row and \(j\)th column is \(\frac{\partial f_k}{\partial x_j}(p)\text{.}\) In other words,
\begin{equation} f'(p) \, e_j = \sum_{k=1}^m \frac{\partial f_k}{\partial x_j}(p) \,e_k . \end{equation}
If \(v = \sum_{j=1}^n c_j\, e_j = (c_1,c_2,\ldots,c_n)\text{,}\) then
\begin{equation} f'(p) \, v = \sum_{j=1}^n \sum_{k=1}^m c_j \frac{\partial f_k}{\partial x_j}(p) \,e_k = \sum_{k=1}^m \left( \sum_{j=1}^n c_j \frac{\partial f_k}{\partial x_j}(p) \right) \,e_k . \end{equation}

Proof.

Fix a \(j\) and note that for nonzero \(h\text{,}\)
\begin{equation} \begin{split} \norm{\frac{f(p+h e_j)-f(p)}{h} - f'(p) \, e_j} & = \norm{\frac{f(p+h e_j)-f(p) - f'(p) \, h e_j}{h}} \\ & = \frac{\bnorm{f(p+h e_j)-f(p) - f'(p) \, h e_j}}{\snorm{h e_j}} . \end{split} \end{equation}
As \(h\) goes to 0, the right-hand side goes to zero by differentiability of \(f\text{.}\) Hence,
\begin{equation} \lim_{h \to 0} \frac{f(p+h e_j)-f(p)}{h} = f'(p) \, e_j . \end{equation}
The limit is in \(\R^m\text{.}\) Represent \(f\) in components \(f = (f_1,f_2,\ldots,f_m)\text{.}\) Taking a limit in \(\R^m\) is the same as taking the limit in each component separately. So for every \(k\text{,}\) the partial derivative
\begin{equation} \frac{\partial f_k}{\partial x_j} (p) = \lim_{h \to 0} \frac{f_k(p+h e_j)-f_k(p)}{h} \end{equation}
exists and is equal to the \(k\)th component of \(f'(p)\, e_j\text{,}\) which is the \(j\)th column of \(f'(p)\text{,}\) and we are done.
The converse of the proposition is not true. Just because the partial derivatives exist does not mean that the function is differentiable. See the exercises. However, when the partial derivatives are continuous, we will prove that the converse holds. One of the consequences of the proposition above is that if \(f\) is differentiable on \(U\text{,}\) then \(f' \colon U \to L(\R^n,\R^m)\) is a continuous function if and only if all the \(\frac{\partial f_k}{\partial x_j}\) are continuous functions.
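The proposition can be illustrated numerically: assemble the matrix of one-variable partial derivatives and compare its action on a vector \(v\) with a direct difference quotient of \(t \mapsto f(p+tv)\text{.}\) A minimal Python sketch (the map \(f\) and the sample points are our own illustrative choices):

```python
import math

def f(x1, x2):          # a sample differentiable map f : R^2 -> R^2
    return (math.exp(x1)*x2, x1**2 - x2)

def partial(fk, p, j, h=1e-6):
    """One-variable central difference for d f_k / d x_j at p."""
    pp = list(p); pm = list(p)
    pp[j] += h; pm[j] -= h
    return (fk(*pp) - fk(*pm)) / (2*h)

p = (0.5, 1.25)
# Matrix entries df_k/dx_j from one-variable partial derivatives.
J = [[partial(lambda a, b: f(a, b)[k], p, j) for j in range(2)] for k in range(2)]

v = (0.6, -0.8)
# f'(p) v via the matrix of partials ...
Jv = [J[k][0]*v[0] + J[k][1]*v[1] for k in range(2)]
# ... versus a direct difference quotient of t -> f(p + t v) at t = 0.
t = 1e-6
fp_plus  = f(p[0] + t*v[0], p[1] + t*v[1])
fp_minus = f(p[0] - t*v[0], p[1] - t*v[1])
direct = [(a - b) / (2*t) for a, b in zip(fp_plus, fp_minus)]
print(Jv, direct)
```

The two printed vectors agree to many digits, reflecting the formula \(f'(p)\,v = \sum_k \bigl( \sum_j c_j \frac{\partial f_k}{\partial x_j}(p) \bigr) e_k\text{.}\)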

Subsection 8.3.3 Gradients, curves, and directional derivatives

Let \(U \subset \R^n\) be open and \(f \colon U \to \R\) a differentiable function. We define the gradient as
\begin{equation} \nabla f (x) \coloneqq \sum_{j=1}^n \frac{\partial f}{\partial x_j} (x)\, e_j . \end{equation}
The gradient gives a way to represent the action of the derivative as a dot product: \(f'(x)\,v = \nabla f(x) \cdot v\text{.}\)
Suppose \(\gamma \colon (a,b) \subset \R \to \R^n\) is differentiable. Such a function (and its image) is sometimes called a curve, or a differentiable curve. Write \(\gamma = (\gamma_1,\gamma_2,\ldots,\gamma_n)\text{.}\) For the purposes of computation, we identify \(L(\R^1)\) and \(\R\) as we did when we defined the derivative in one variable. We also identify \(L(\R^1,\R^n)\) with \(\R^n\text{.}\) We treat \(\gamma'(t)\) both as an operator in \(L(\R^1,\R^n)\) and the vector \(\bigl(\gamma_1'(t),\gamma_2'(t),\ldots,\gamma_n'(t)\bigr)\) in \(\R^n\text{.}\) Using Proposition 8.3.9, if \(v\in \R^n\) is \(\gamma'(t)\) acting as a vector, then \(h \mapsto h \, v\) (for \(h \in \R^1 = \R\)) is \(\gamma'(t)\) acting as an operator in \(L(\R^1,\R^n)\text{.}\) We often use this slight abuse of notation when dealing with curves. The vector \(\gamma'(t)\) is called a tangent vector. See Figure 8.6.

Figure 8.6. Differentiable curve and its derivative as a vector (for clarity assuming \(\gamma\) defined on \([a,b]\)). The tangent vector \(\gamma'(t)\) points along the curve.

Suppose \(\gamma\bigl((a,b)\bigr) \subset U\) and let
\begin{equation} g(t) \coloneqq f\bigl(\gamma(t)\bigr) . \end{equation}
The function \(g\) is differentiable. Treating \(g'(t)\) as a number,
\begin{equation} g'(t) = f'\bigl(\gamma(t)\bigr) \gamma'(t) = \sum_{j=1}^n \frac{\partial f}{\partial x_j} \bigl(\gamma(t)\bigr) \frac{d\gamma_j}{dt} (t) = \sum_{j=1}^n \frac{\partial f}{\partial x_j} \frac{d\gamma_j}{dt} . \end{equation}
For convenience, we often leave out the points where we are evaluating, such as above on the far right-hand side. With the notation of the gradient and the dot product, the equation becomes
\begin{equation} g'(t) = (\nabla f) \bigl(\gamma(t)\bigr) \cdot \gamma'(t) = \nabla f \cdot \gamma'. \end{equation}
We use this idea to define derivatives in a specific direction. A direction is simply a vector pointing in that direction. Pick a vector \(u \in \R^n\) such that \(\snorm{u} = 1\text{,}\) and fix \(x \in U\text{.}\) We define the directional derivative as
\begin{equation} D_u f (x) \coloneqq \frac{d}{dt}\Big|_{t=0} \bigl[ f(x+tu) \bigr] = \lim_{h\to 0} \frac{f(x+hu)-f(x)}{h} , \end{equation}
where the notation \(\frac{d}{dt}\big|_{t=0}\) represents the derivative evaluated at \(t=0\text{.}\) When \(u=e_j\) is a standard basis vector, we find \(\frac{\partial f}{\partial x_j} = D_{e_j} f\text{.}\) For this reason, sometimes the notation \(\frac{\partial f}{\partial u}\) is used instead of \(D_u f\text{.}\)
Define \(\gamma\) by
\begin{equation} \gamma(t) \coloneqq x + tu . \end{equation}
Then \(\gamma'(t) = u\) for all \(t\text{.}\) Let us see what happens to \(f\) when we travel along \(\gamma\text{:}\)
\begin{equation} D_u f (x) = \frac{d}{dt}\Big|_{t=0} \bigl[ f(x+tu) \bigr] = (\nabla f) \bigl(\gamma(0)\bigr) \cdot \gamma'(0) = (\nabla f) (x) \cdot u . \end{equation}
In fact, this computation holds whenever \(\gamma\) is any curve such that \(\gamma(0) = x\) and \(\gamma'(0) = u\text{.}\)
Suppose \((\nabla f)(x) \neq 0\text{.}\) By the Cauchy–Schwarz inequality,
\begin{equation} \babs{D_u f(x)} \leq \bnorm{(\nabla f)(x)} . \end{equation}
Equality is achieved when \(u\) is a scalar multiple of \((\nabla f)(x)\text{.}\) That is, when
\begin{equation} u = \frac{(\nabla f)(x)}{\bnorm{(\nabla f)(x)}} , \end{equation}
we get \(D_u f(x) = \bnorm{(\nabla f)(x)}\text{.}\) The gradient points in the direction in which the function grows fastest, in other words, in the direction in which \(D_u f(x)\) is maximal.
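The identity \(D_u f(x) = (\nabla f)(x) \cdot u\) and the maximizing property of the gradient direction can be observed numerically. A small Python sketch with an illustrative \(f\) (all names and sample points are our own, not from the text):

```python
import math

def f(x, y):            # a sample differentiable function R^2 -> R
    return x**2 * y + math.sin(y)

def grad(x, y, h=1e-6):
    """Gradient via central differences in each variable."""
    return ((f(x + h, y) - f(x - h, y)) / (2*h),
            (f(x, y + h) - f(x, y - h)) / (2*h))

def D(u, x, y, h=1e-6):
    """Directional derivative via the one-variable limit in the definition."""
    ux, uy = u
    return (f(x + h*ux, y + h*uy) - f(x - h*ux, y - h*uy)) / (2*h)

x0, y0 = 1.0, 2.0
g = grad(x0, y0)
gnorm = math.hypot(*g)

# Compare D_u f with grad . u over sampled unit vectors u = (cos t, sin t);
# the maximum over directions should be close to |grad f(x0, y0)|.
vals = []
for k in range(360):
    t = 2*math.pi*k/360
    u = (math.cos(t), math.sin(t))
    dot = g[0]*u[0] + g[1]*u[1]
    assert abs(D(u, x0, y0) - dot) < 1e-4
    vals.append(dot)
print(max(vals), gnorm)
```

The sampled maximum of \(D_u f(x)\) matches \(\bnorm{(\nabla f)(x)}\) up to the angular sampling resolution, and it is attained near \(u = \nicefrac{(\nabla f)(x)}{\snorm{(\nabla f)(x)}}\text{.}\)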

Subsection 8.3.4 The Jacobian

Definition 8.3.10.

Let \(U \subset \R^n\) and \(f \colon U \to \R^n\) be a differentiable mapping. Define the Jacobian determinant (named after the German mathematician Carl Gustav Jacob Jacobi, 1804–1851), or simply the Jacobian (the matrix from Proposition 8.3.9 representing \(f'(x)\) is called the Jacobian matrix, or sometimes confusingly also called just “the Jacobian”), of \(f\) at \(x\) as
\begin{equation} J_f(x) \coloneqq \det\bigl( f'(x) \bigr) . \end{equation}
Sometimes \(J_f\) is written as
\begin{equation} \frac{\partial(f_1,f_2,\ldots,f_n)}{\partial(x_1,x_2,\ldots,x_n)} . \end{equation}
This last piece of notation may seem somewhat confusing, but it is quite useful when we need to specify the exact variables and function components used, as we will do, for example, in the implicit function theorem.
The Jacobian determinant \(J_f\) is a real-valued function, and when \(n=1\) it is simply the derivative. From the chain rule and the fact that \(\det(AB) = \det(A)\det(B)\text{,}\) it follows that:
\begin{equation} J_{f \circ g} (x) = J_f\bigl(g(x)\bigr) J_g(x) . \end{equation}
The determinant of a linear mapping tells us what happens to area/volume under the mapping. Similarly, the Jacobian determinant measures how much a differentiable mapping stretches things locally, and whether it flips orientation. In particular, if the Jacobian determinant is nonzero, we would expect the mapping to be locally invertible (and we would be correct, as we will later see).
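The multiplicativity \(J_{f \circ g}(x) = J_f\bigl(g(x)\bigr) J_g(x)\) is easy to test numerically for maps \(\R^2 \to \R^2\text{.}\) A sketch with illustrative sample maps (our own, not from the text), where the Jacobian matrices are approximated by central differences:

```python
import math

def g(p):   # a sample map R^2 -> R^2
    x, y = p
    return (x + y*y, math.exp(x) - y)

def f(q):   # another sample map R^2 -> R^2
    u, v = q
    return (u*v, u - v)

def jac_det(F, p, h=1e-6):
    """det of the finite-difference Jacobian matrix of F : R^2 -> R^2 at p."""
    (a, c) = [(F((p[0]+h, p[1]))[i] - F((p[0]-h, p[1]))[i]) / (2*h) for i in (0, 1)]
    (b, d) = [(F((p[0], p[1]+h))[i] - F((p[0], p[1]-h))[i]) / (2*h) for i in (0, 1)]
    return a*d - b*c

p = (0.2, -0.4)
lhs = jac_det(lambda x: f(g(x)), p)     # J_{f o g}(p)
rhs = jac_det(f, g(p)) * jac_det(g, p)  # J_f(g(p)) * J_g(p)
print(lhs, rhs)
```

The two printed numbers agree to many digits, as the chain rule together with \(\det(AB)=\det(A)\det(B)\) predicts.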

Exercises 8.3.5 Exercises

8.3.1.

Suppose \(\gamma \colon (-1,1) \to \R^n\) and \(\alpha \colon (-1,1) \to \R^n\) are two differentiable curves such that \(\gamma(0) = \alpha(0)\) and \(\gamma'(0) = \alpha'(0)\text{.}\) Suppose \(F \colon \R^n \to \R\) is a differentiable function. Show that
\begin{equation} \frac{d}{dt}\Big|_{t=0} F\bigl(\gamma(t)\bigr) = \frac{d}{dt}\Big|_{t=0} F\bigl(\alpha(t)\bigr) . \end{equation}

8.3.2.

Let \(f \colon \R^2 \to \R\) be given by \(f(x,y) \coloneqq \sqrt{x^2+y^2}\text{,}\) see Figure 8.7. Show that \(f\) is not differentiable at the origin.

Figure 8.7. Graph of \(\sqrt{x^2+y^2}\text{.}\)

8.3.3.

Using only the definition of the derivative, show that the following \(f \colon \R^2 \to \R^2\) are differentiable at the origin and find their derivative.
  1. \(f(x,y) \coloneqq (1+x+xy,x)\text{,}\)
  2. \(f(x,y) \coloneqq \bigl(y-y^{10},x \bigr)\text{,}\)
  3. \(f(x,y) \coloneqq \bigl( {(x+y+1)}^2 , {(x-y+2)}^2 \bigr)\text{.}\)

8.3.4.

Suppose \(f \colon \R \to \R\) and \(g \colon \R \to \R\) are differentiable functions. Using only the definition of the derivative, show that \(h \colon \R^2 \to \R^2\) defined by \(h(x,y) \coloneqq \bigl(f(x),g(y)\bigr)\) is a differentiable function, and find the derivative, at all points \((x,y)\text{.}\)

8.3.5.

Define a function \(f \colon \R^2 \to \R\) by (see Figure 8.8)
\begin{equation} f(x,y) \coloneqq \begin{cases} \frac{xy}{x^2+y^2} & \text{if } (x,y) \neq (0,0), \\ 0 & \text{if } (x,y) = (0,0). \end{cases} \end{equation}
  1. Show that the partial derivatives \(\frac{\partial f}{\partial x}\) and \(\frac{\partial f}{\partial y}\) exist at all points (including the origin).
  2. Show that \(f\) is not continuous at the origin (and hence not differentiable).

Figure 8.8. Graph of \(\frac{xy}{x^2+y^2}\text{.}\)

8.3.6.

Define a function \(f \colon \R^2 \to \R\) by (see Figure 8.9)
\begin{equation} f(x,y) \coloneqq \begin{cases} \frac{x^2y}{x^2+y^2} & \text{if } (x,y) \neq (0,0), \\ 0 & \text{if } (x,y) = (0,0). \end{cases} \end{equation}
  1. Show that the partial derivatives \(\frac{\partial f}{\partial x}\) and \(\frac{\partial f}{\partial y}\) exist at all points.
  2. Show that for all \(u \in \R^2\) with \(\snorm{u}=1\text{,}\) the directional derivative \(D_u f\) exists at all points.
  3. Show that \(f\) is continuous at the origin.
  4. Show that \(f\) is not differentiable at the origin.

Figure 8.9. Graph of \(\frac{x^2y}{x^2+y^2}\text{.}\)

8.3.7.

Suppose \(f \colon \R^n \to \R^n\) is one-to-one, onto, differentiable at all points, and such that \(f^{-1}\) is also differentiable at all points.
  1. Show that \(f'(p)\) is invertible at all points \(p\) and compute \({(f^{-1})}'\bigl(f(p)\bigr)\text{.}\) Hint: Consider \(x = f^{-1}\bigl(f(x)\bigr)\text{.}\)
  2. Let \(g \colon \R^n \to \R^n\) be a function differentiable at \(q \in \R^n\) and such that \(g(q)=q\text{.}\) Suppose \(f(p) = q\) for some \(p \in \R^n\text{.}\) Show \(J_g(q) = J_{f^{-1} \circ g \circ f}(p)\) where \(J_g\) is the Jacobian determinant.

8.3.8.

Suppose \(f \colon \R^2 \to \R\) is differentiable and such that \(f(x,y) = 0\) if and only if \(y=0\) and such that \(\nabla f(0,0) = (0,1)\text{.}\) Prove that \(f(x,y) > 0\) whenever \(y > 0\text{,}\) and \(f(x,y) < 0\) whenever \(y < 0\text{.}\)
As for functions of one variable, \(f \colon U \to \R\) has a relative maximum at \(p \in U\) if there exists a \(\delta >0\) such that \(f(q) \leq f(p)\) for all \(q \in B(p,\delta) \cap U\text{.}\) Similarly for relative minimum.

8.3.9.

Suppose \(U \subset \R^n\) is open and \(f \colon U \to \R\) is differentiable. Suppose \(f\) has a relative maximum at \(p \in U\text{.}\) Show that \(f'(p) = 0\text{,}\) that is, the zero mapping in \(L(\R^n,\R)\text{.}\) Namely, \(p\) is a critical point of \(f\text{.}\)

8.3.10.

Suppose \(f \colon \R^2 \to \R\) is differentiable and \(f(x,y) = 0\) whenever \(x^2+y^2 = 1\text{.}\) Prove that there exists at least one point \((x_0,y_0)\) such that \(\frac{\partial f}{\partial x}(x_0,y_0) = \frac{\partial f}{\partial y}(x_0,y_0) = 0\text{.}\)

8.3.11.

Define \(f(x,y) \coloneqq ( x-y^2 ) ( 2 y^2 - x)\text{.}\) The graph of \(f\) is called the Peano surface (named after the Italian mathematician Giuseppe Peano, 1858–1932).
  1. Show that \((0,0)\) is a critical point, that is, \(f'(0,0) = 0\text{,}\) the zero linear map in \(L(\R^2,\R)\text{.}\)
  2. Show that for every direction, the restriction of \(f\) to a line through the origin in that direction has a relative maximum at the origin. In other words, for every \((x,y)\) such that \(x^2+y^2=1\text{,}\) the function \(g(t) \coloneqq f(tx,ty)\text{,}\) has a relative maximum at \(t=0\text{.}\)
    Hint: While not necessary, Section 4.3 makes this part easier.
  3. Show that \(f\) does not have a relative maximum at \((0,0)\text{.}\)

8.3.12.

Suppose \(f \colon \R \to \R^n\) is differentiable and \(\bnorm{f(t)} = 1\) for all \(t\) (that is, we have a curve in the unit sphere). Show that \(f'(t) \cdot f(t) = 0\) (treating \(f'(t)\) as a vector) for all \(t\text{.}\)

8.3.13.

Define \(f \colon \R^2 \to \R^2\) by \(f(x,y) \coloneqq \bigl(x,y+\varphi(x)\bigr)\) for some differentiable function \(\varphi\) of one variable. Show \(f\) is differentiable and find \(f'\text{.}\)

8.3.14.

Suppose \(U \subset \R^n\) is open, \(p \in U\text{,}\) and \(f \colon U \to \R\text{,}\) \(g \colon U \to \R\text{,}\) \(h \colon U \to \R\) are functions such that \(f(p) = g(p) = h(p)\text{,}\) \(f\) and \(h\) are differentiable at \(p\text{,}\) \(f'(p) = h'(p)\text{,}\) and
\begin{equation} f(x) \leq g(x) \leq h(x) \qquad \text{for all } x \in U . \end{equation}
Show that \(g\) is differentiable at \(p\) and \(g'(p) = f'(p) = h'(p)\text{.}\)

8.3.15.

Prove a version of the mean value theorem for functions of several variables. That is, suppose \(U \subset \R^n\) is open, \(f \colon U \to \R\) differentiable, \(p,q \in U\text{,}\) and the segment \([p,q] \subset U\text{.}\) Prove that there exists an \(x \in [p,q]\) such that \(\nabla f (x) \cdot (q-p) = f(q)-f(p)\text{.}\)

8.3.16.

Define \(f \colon \R^2 \to \R\) by \(f(x,y) \coloneqq \frac{y(x^2+y^2)}{x}\) when \(x \neq 0\) and \(f(0,y) \coloneqq 0\text{.}\) Show that for all \(u \in \R^2\) with \(\snorm{u}=1\text{,}\) we get \(D_u f(0) = 0\text{,}\) that is, the directional derivative exists and is zero in all directions at the origin, but the function is not continuous at the origin (in fact, not continuous whenever \(x=0\)).