Without continuity, the theorem does not hold. Just because partial derivatives exist does not mean that \(f\) is differentiable; in fact, \(f\) may not even be continuous. See the exercises for the last section and also for this section.
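A standard example, and possibly one of those exercises, is the function \(f \colon \R^2 \to \R\) defined by
\begin{equation*}
f(x,y) \coloneqq
\begin{cases}
\dfrac{xy}{x^2+y^2} & \text{if } (x,y) \neq (0,0) , \\
0 & \text{if } (x,y) = (0,0) .
\end{cases}
\end{equation*}
Since \(f(x,0) = 0\) and \(f(0,y) = 0\) for all \(x\) and \(y\text{,}\) both partial derivatives of \(f\) exist at the origin (and everywhere else). Yet \(f(t,t) = 1/2\) for every \(t \neq 0\text{,}\) so \(f\) is not continuous at the origin, let alone differentiable there.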
Proof.
We proved that if \(f\) is differentiable, then the partial derivatives exist. The partial derivatives are the entries of the matrix representing \(f'(x)\text{.}\) If \(f' \colon U \to L(\R^n,\R^m)\) is continuous, then the entries are continuous, and hence the partial derivatives are continuous.
To prove the opposite direction, suppose the partial derivatives exist and are continuous. Fix \(x \in U\text{.}\) If we show that \(f'(x)\) exists, we are done: the entries of the matrix representing \(f'(x)\) are the partial derivatives, and if these entries are continuous functions, then the matrix-valued function \(f'\) is continuous.
We do induction on dimension. First, the conclusion is true when \(n=1\) (exercise; note that \(f\) is vector-valued). In this case, \(f'(x)\) is essentially the derivative from Chapter 4. Suppose the conclusion is true for \(\R^{n-1}\text{:}\) that is, if we restrict to the first \(n-1\) variables, the function is differentiable. When taking the partial derivatives in \(x_1\) through \(x_{n-1}\text{,}\) it does not matter whether we consider \(f\) or \(f\) restricted to the set where \(x_n\) is fixed. In what follows, by a slight abuse of notation, we think of \(\R^{n-1}\) as a subset of \(\R^n\text{,}\) namely the set in \(\R^n\) where \(x_n = 0\text{.}\) In other words, we identify the vectors \((x_1,x_2,\ldots,x_{n-1})\) and \((x_1,x_2,\ldots,x_{n-1},0)\text{.}\)
Fix \(p \in U\) and let
\begin{equation*}
A \coloneqq
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1}(p)
& \ldots &
\frac{\partial f_1}{\partial x_n}(p)
\\
\vdots & \ddots & \vdots
\\
\frac{\partial f_m}{\partial x_1}(p)
& \ldots &
\frac{\partial f_m}{\partial x_n}(p)
\end{bmatrix} ,
\qquad
A' \coloneqq
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1}(p)
& \ldots &
\frac{\partial f_1}{\partial x_{n-1}}(p)
\\
\vdots & \ddots & \vdots
\\
\frac{\partial f_m}{\partial x_1}(p)
& \ldots &
\frac{\partial f_m}{\partial x_{n-1}}(p)
\end{bmatrix} ,
\qquad
v \coloneqq
\begin{bmatrix}
\frac{\partial f_1}{\partial x_n}(p)
\\
\vdots
\\
\frac{\partial f_m}{\partial x_n}(p)
\end{bmatrix} .
\end{equation*}
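To fix ideas, consider the (purely illustrative) case \(n = 2\text{,}\) \(m = 1\text{.}\) Then
\begin{equation*}
A = \begin{bmatrix} \frac{\partial f}{\partial x_1}(p) & \frac{\partial f}{\partial x_2}(p) \end{bmatrix} , \qquad
A' = \begin{bmatrix} \frac{\partial f}{\partial x_1}(p) \end{bmatrix} , \qquad
v = \begin{bmatrix} \frac{\partial f}{\partial x_2}(p) \end{bmatrix} .
\end{equation*}
That is, \(A'\) is the candidate derivative of the restriction of \(f\) to the first variable, and \(v\) carries the contribution of the remaining variable \(x_2\text{.}\)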
Let \(\epsilon > 0\) be given. By the induction hypothesis, there is a \(\delta > 0\) such that for every nonzero \(h' \in \R^{n-1}\) with \(\snorm{h'} < \delta\text{,}\) we have
\begin{equation*}
\frac{\snorm{f(p+h') - f(p) - A' h'}}{\snorm{h'}} < \epsilon .
\end{equation*}
By continuity of the partial derivatives, we may take \(\delta\) small enough so that
\begin{equation*}
\abs{\frac{\partial f_k}{\partial x_n}(p+h)
- \frac{\partial f_k}{\partial x_n}(p)} < \epsilon
\end{equation*}
for all \(k\) and all \(h \in \R^n\) with \(\snorm{h} < \delta\text{.}\)
Suppose \(h = h' + t e_n\) is a vector in \(\R^n\text{,}\) where \(h' \in \R^{n-1}\text{,}\) \(t \in \R\text{,}\) such that \(\snorm{h} < \delta\text{.}\) Then \(\snorm{h'} \leq \snorm{h} < \delta\text{.}\) Note that \(Ah = A' h' + tv\text{:}\) indeed, \(Ah = Ah' + tAe_n\text{,}\) where \(Ah' = A'h'\) as the last component of \(h'\) is zero, and \(Ae_n = v\text{.}\) Estimate:
\begin{equation*}
\begin{split}
\snorm{f(p+h) - f(p) - Ah}
& = \snorm{f(p+h' + t e_n) - f(p+h') - tv + f(p+h') - f(p) - A' h'}
\\
& \leq \snorm{f(p+h' + t e_n) - f(p+h') -tv} + \snorm{f(p+h') - f(p) -
A' h'}
\\
& \leq \snorm{f(p+h' + t e_n) - f(p+h') -tv} + \epsilon \snorm{h'} .
\end{split}
\end{equation*}
As all the partial derivatives exist, by the mean value theorem, for each \(k\) there is some \(\theta_k \in [0,t]\) (or \([t,0]\) if \(t < 0\)), such that
\begin{equation*}
f_k(p+h' + t e_n) - f_k(p+h') =
t \frac{\partial f_k}{\partial x_n}(p+h'+\theta_k e_n).
\end{equation*}
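In detail, the mean value theorem is applied here to the single-variable function \(g_k(s) \coloneqq f_k(p + h' + s e_n)\) on the interval between \(0\) and \(t\text{.}\) By the definition of the partial derivative, \(g_k'(s) = \frac{\partial f_k}{\partial x_n}(p + h' + s e_n)\text{,}\) so
\begin{equation*}
f_k(p+h'+t e_n) - f_k(p+h') = g_k(t) - g_k(0) = t\, g_k'(\theta_k)
\end{equation*}
for some \(\theta_k\) between \(0\) and \(t\text{,}\) which is precisely the equality above.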
We have \(\snorm{h'+\theta_k e_n} \leq \snorm{h} < \delta\text{,}\) and so we can finish the estimate
\begin{equation*}
\begin{split}
\snorm{f(p+h) - f(p) - Ah}
& \leq \snorm{f(p+h' + t e_n) - f(p+h') -tv} + \epsilon \snorm{h'}
\\
& \leq \sqrt{\sum_{k=1}^m {\left(t\frac{\partial f_k}{\partial x_n}(p+h'+\theta_k e_n) -
t \frac{\partial f_k}{\partial x_n}(p)\right)}^2} + \epsilon \snorm{h'}
\\
& \leq \sqrt{m}\, \epsilon \sabs{t} + \epsilon \snorm{h'}
\\
& \leq (\sqrt{m}+1)\epsilon \snorm{h} .
\end{split}
\end{equation*}
As \(\epsilon > 0\) was arbitrary, this shows that \(f\) is differentiable at \(p\) with \(f'(p) = A\text{.}\)
A common application is to prove that a certain function is differentiable. For example, we can show that all polynomials are differentiable, and in fact continuously differentiable, by computing the partial derivatives.
Proof.
Consider the partial derivative of \(p\) in the \(x_n\) variable. Write \(p\) as
\begin{equation*}
p(x) = \sum_{j=0}^d p_j(x_1,\ldots,x_{n-1}) \, x_n^j ,
\end{equation*}
where the \(p_j\) are polynomials in the remaining \(n-1\) variables. Then
\begin{equation*}
\frac{\partial p}{\partial x_n}(x)
= \sum_{j=1}^d p_j(x_1,\ldots,x_{n-1}) \, j x_n^{j-1} ,
\end{equation*}
which is again a polynomial. The same computation applies to the partial derivative in each of the other variables. So the partial derivatives of \(p\) exist and are themselves polynomials. By the continuity of algebraic operations, polynomials are continuous functions. Therefore, \(p\) is continuously differentiable.
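For a concrete illustration (an example of our own choosing), take \(p(x,y) \coloneqq x^2 y + x y^3\text{.}\) Then
\begin{equation*}
\frac{\partial p}{\partial x}(x,y) = 2xy + y^3 , \qquad
\frac{\partial p}{\partial y}(x,y) = x^2 + 3xy^2 ,
\end{equation*}
both of which are polynomials and hence continuous on \(\R^2\text{,}\) so \(p\) is continuously differentiable on all of \(\R^2\text{.}\)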