where \(u(x,t)\) is a function of \(x\) and \(t\text{.}\) We again use the notation \(u_x = \frac{\partial u}{\partial x}\) and \(u_t = \frac{\partial u}{\partial t}\) for convenience. The initial condition \(u(x,0) = f(x)\) is now a function of \(x\) rather than just a number. In these problems, it is useful to think of \(x\) as position and \(t\) as time. The equation describes the evolution of a function of \(x\) as time goes on. Below, the coefficients \(a\text{,}\) \(b\text{,}\) \(c\text{,}\) and the function \(g\) are mostly going to be constant or zero. The method we describe works with nonconstant coefficients, although the computations may get difficult quickly.
The method we use is the method of characteristics. The idea is to find lines along which the equation is an ODE that we then solve. We will see this technique again for second order PDE when we encounter the wave equation in Section 4.8.
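We first consider the equation
\begin{equation*}
\alpha u_x + u_t = 0, \qquad u(x,0) = f(x) ,
\end{equation*}
where \(\alpha\) is a constant. This particular equation, \(\alpha u_x + u_t = 0\text{,}\) is called the transport equation.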
The data will propagate along curves called characteristics. The idea is to change to the so-called characteristic coordinates, which we will call \((\xi,s)\text{.}\) If we change to these coordinates, the equation simplifies. The change of variables for this equation is
\begin{equation*}
\xi = x - \alpha t , \qquad s = t .
\end{equation*}
Let us see what the equation becomes. Remember the chain rule in several variables.
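As a sketch of that computation, with \(\xi = x - \alpha t\) and \(s = t\text{,}\) the chain rule gives
\begin{equation*}
\begin{aligned}
u_t & = u_\xi \xi_t + u_s s_t = -\alpha u_\xi + u_s , \\
u_x & = u_\xi \xi_x + u_s s_x = u_\xi .
\end{aligned}
\end{equation*}
Consequently, \(\alpha u_x + u_t = \alpha u_\xi - \alpha u_\xi + u_s = u_s\text{,}\) so in the characteristic coordinates the transport equation reduces to \(u_s = 0\text{.}\)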
The curves along which \(\xi\) is constant are called the characteristic curves. See Figure 1.20. In this case, the solution does not change along the characteristic as \(\frac{du}{ds} = 0\text{.}\)
In the \((x,t)\) coordinates, the characteristic curves satisfy \(t = \frac{1}{\alpha} ( x- \xi)\text{,}\) and are in fact lines. The slope of characteristic lines is \(\frac{1}{\alpha}\text{,}\) and for each different \(\xi\text{,}\) we get a different characteristic line.
We see why \(\alpha u_x + u_t = 0\) is called the transport equation: Everything travels at some constant speed. This behavior is called convection. An example application is material being moved by a river where the material does not diffuse and is simply carried along. In this setup, \(x\) is the position along the river, \(t\) is the time, and \(u(x,t)\) is the concentration of the material at position \(x\) and time \(t\text{.}\) See Figure 1.21 for an example.
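Since the solution is constant along each characteristic \(\xi = x - \alpha t\text{,}\) it must be the initial profile shifted by \(\alpha t\text{,}\) that is, \(u(x,t) = f(x-\alpha t)\text{.}\) The following minimal sketch checks this numerically with central differences. The speed `ALPHA` and the Gaussian profile `f` are assumptions chosen only for illustration; they are not taken from the text.

```python
import math

ALPHA = 2.0  # assumed transport speed, chosen only for illustration


def f(x):
    """Hypothetical initial profile u(x, 0) = f(x)."""
    return math.exp(-x * x)


def u(x, t):
    """Candidate solution: the initial profile shifted right by ALPHA * t."""
    return f(x - ALPHA * t)


def transport_residual(x, t, h=1e-5):
    """Central-difference estimate of ALPHA * u_x + u_t at (x, t)."""
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    return ALPHA * u_x + u_t


if __name__ == "__main__":
    # The residual should be zero up to discretization and rounding error.
    for x, t in [(0.3, 0.1), (1.0, 0.5), (-0.7, 1.2)]:
        print(x, t, transport_residual(x, t))
```

The printed residuals should be tiny compared with the size of \(u_x\) and \(u_t\) themselves, which is what we expect if the shifted profile solves the transport equation.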
We use a similar idea in the more general case:
\begin{equation*}
a u_x + b u_t + c u = g, \qquad u(x,0) = f(x) .
\end{equation*}
We change coordinates to the characteristic coordinates, which we call \((\xi,s)\text{.}\) These are coordinates where \(a u_x + b u_t\) becomes differentiation in the \(s\) variable.
Along the characteristic curves (where \(\xi\) is constant), we get a new ODE in the \(s\) variable. In the transport equation, we got the simple \(\frac{du}{ds} = 0\text{.}\) In general, we get the linear equation
\begin{equation}
\frac{du}{ds} + c u = g.\tag{1.7}
\end{equation}
We think of everything as a function of \(\xi\) and \(s\text{,}\) although we are thinking of \(\xi\) as a parameter rather than an independent variable. So the equation is an ODE. It is a linear ODE that we can solve using the integrating factor.
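As a sketch of the integrating factor computation, assume that \(c\) is constant and write \(g(s)\) for the right-hand side restricted to the characteristic (a shorthand introduced here only for illustration). Multiplying (1.7) by \(e^{cs}\) gives
\begin{equation*}
\frac{d}{ds} \Bigl[ e^{cs} u \Bigr] = e^{cs} g(s) ,
\qquad \text{so} \qquad
u(s) = e^{-cs} \left( u(0) + \int_0^s e^{c\sigma} g(\sigma) \, d\sigma \right) .
\end{equation*}
When \(c = 0\) and \(g = 0\text{,}\) this reduces to \(u(s) = u(0)\text{,}\) which is what we found for the transport equation.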
To find the characteristics, think of a curve given parametrically \(\bigl(x(s),t(s)\bigr)\text{.}\) We try to have the curve satisfy
\begin{equation*}
\frac{dx}{ds} = a, \qquad \frac{dt}{ds} = b .
\end{equation*}
Why? Because when we think of \(x\) and \(t\) as functions of \(s\text{,}\) we find, using the chain rule,
\begin{equation*}
\frac{du}{ds} + c u =
\underbrace{\left( u_x \frac{dx}{ds} + u_t
\frac{dt}{ds}\right)}_{\frac{du}{ds}} + c u =
a u_x + b u_t + c u = g .
\end{equation*}
So we get the ODE (1.7), which then describes the value of the solution \(u\) of the PDE along this characteristic curve. It is convenient to make sure that \(s=0\) corresponds to \(t=0\text{,}\) that is, \(t(0) = 0\text{.}\) It will also be convenient for \(x(0) = \xi\text{.}\) See Figure 1.22.
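The recipe can also be carried out numerically. The sketch below assumes constant \(a\text{,}\) \(b\text{,}\) and \(c\text{,}\) so that the characteristic through \((\xi,0)\) is \(x = \xi + as\text{,}\) \(t = bs\text{,}\) and integrates (1.7) along it with the classical fourth order Runge-Kutta method. The function name `solve_along_characteristic` and the sample data in the demo are hypothetical, made up for this illustration.

```python
import math


def solve_along_characteristic(a, b, c, g, f, xi, s_end, steps=1000):
    """Follow the characteristic through (x, t) = (xi, 0) and integrate
    du/ds + c*u = g along it with classical RK4.

    a, b, c are constants; g(x, t) and f(x) are callables.
    Returns (x, t, u) at the parameter value s_end.
    """
    h = s_end / steps

    def rhs(s, u):
        # On the characteristic: x = xi + a*s and t = b*s.
        return g(xi + a * s, b * s) - c * u

    s, u = 0.0, f(xi)  # initial data: u = f(xi) at s = 0
    for _ in range(steps):
        k1 = rhs(s, u)
        k2 = rhs(s + h / 2, u + h * k1 / 2)
        k3 = rhs(s + h / 2, u + h * k2 / 2)
        k4 = rhs(s + h, u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return xi + a * s_end, b * s_end, u


if __name__ == "__main__":
    # Assumed sample problem: 2 u_x + u_t + u = 0 with u(x, 0) = cos(x).
    # Along a characteristic, du/ds + u = 0, so u = cos(xi) * exp(-s).
    xi, s_end = 0.3, 1.5
    x, t, u_num = solve_along_characteristic(
        2.0, 1.0, 1.0, lambda x, t: 0.0, math.cos, xi, s_end
    )
    print(x, t, u_num, math.cos(xi) * math.exp(-s_end))
```

Each call gives the value of \(u\) at a single point \((x,t)\text{;}\) to sample the solution elsewhere, one varies \(\xi\) and the final value of \(s\text{.}\)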
\begin{equation*}
x = s + c_1, \qquad t = s+ c_2 ,
\end{equation*}
for some \(c_1\) and \(c_2\text{.}\) At \(s=0\text{,}\) we want \(x=\xi\) and \(t=0\text{.}\) So we let \(c_1 = \xi\) and \(c_2 = 0\text{:}\)
\begin{equation*}
x = s + \xi, \qquad t = s .
\end{equation*}
The ODE is \(\frac{du}{ds} + u = x\text{,}\) and \(x = s+\xi\text{.}\) So, the ODE to solve along the characteristic is
\begin{equation*}
\frac{du}{ds} + u = s+ \xi .
\end{equation*}
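The general solution of this equation, treating \(\xi\) as a parameter, is \(u = C e^{-s}+s+\xi-1\text{,}\) for some \(C\text{,}\) which can depend on \(\xi\text{.}\) At \(s=0\text{,}\) our initial condition is that \(u\) is \(e^{-\xi^2}\text{,}\) since at \(s=0\text{,}\) we have \(x=\xi\text{.}\) Given this initial condition, we find \(C=e^{-\xi^2} - \xi +1\text{.}\) So, as \(\xi = x - t\) and \(s = t\text{,}\)
\begin{equation*}
u = \bigl( e^{-\xi^2} - \xi + 1 \bigr) e^{-s} + s + \xi - 1
= \bigl( e^{-(x-t)^2} - x + t + 1 \bigr) e^{-t} + x - 1 .
\end{equation*}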
\begin{equation*}
x = c_1 e^{s} , \qquad t = s+ c_2 .
\end{equation*}
At \(s=0\text{,}\) we wish to get \(x=\xi\) and \(t=0\) as before. So
\begin{equation*}
x = \xi e^s, \qquad t = s .
\end{equation*}
OK, the ODE we need to solve is
\begin{equation*}
\frac{du}{ds} + 2 u = 0 .
\end{equation*}
This is for a fixed \(\xi\text{.}\) We find \(u = C e^{-2s}\text{.}\) At \(s=0\text{,}\) we want \(u\) to be \(\cos(\xi)\text{,}\) so that is our initial condition for the ODE. Moreover, \(\xi = xe^{-t}\) and \(s=t\text{.}\) Consequently,
\begin{equation*}
u = e^{-2s} \cos(\xi)= e^{-2t} \cos(xe^{-t}) .
\end{equation*}
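It is worth checking such an answer. The characteristic equations \(\frac{dx}{ds} = x\text{,}\) \(\frac{dt}{ds} = 1\) and the ODE \(\frac{du}{ds} + 2u = 0\) correspond to \(a = x\text{,}\) \(b = 1\text{,}\) \(c = 2\text{,}\) and \(g = 0\text{,}\) so the sketch below assumes the PDE is \(x u_x + u_t + 2u = 0\) with \(u(x,0) = \cos(x)\) and estimates the residual with central differences. The sample points and step size are arbitrary choices.

```python
import math


def u(x, t):
    """Closed-form solution obtained above."""
    return math.exp(-2 * t) * math.cos(x * math.exp(-t))


def residual(x, t, h=1e-5):
    """Central-difference estimate of x*u_x + u_t + 2*u at (x, t)."""
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    return x * u_x + u_t + 2 * u(x, t)


if __name__ == "__main__":
    # Residuals should vanish up to discretization and rounding error.
    for x, t in [(1.3, 0.7), (-2.0, 0.2), (0.5, 1.5)]:
        print(x, t, residual(x, t))
    # The initial condition u(x, 0) = cos(x) should hold exactly.
    print(u(0.4, 0.0), math.cos(0.4))
```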
We make a few closing remarks. One thing to keep in mind is that we would get into trouble if the coefficient in front of \(u_t\text{,}\) that is the \(b\text{,}\) is ever zero. Let us consider a quick example of what can go wrong:
This problem has no solution. If we had a solution, it would imply that \(u_x(x,0) = \cos(x)\text{,}\) but \(u_x(x,0) + u(x,0) = \cos(x) + \sin(x) \not= 0\text{.}\) The problem is that the characteristic curve is now the line \(t=0\text{,}\) and the solution is already provided on that line!
As \(b\) ought to then be nonzero, it is convenient to ensure that \(b\) is positive by multiplying the equation by \(-1\) if necessary, so that positive \(s\) means positive \(t\text{.}\)
Another remark is that if \(a\) or \(b\) in the equation are not constants, the computations can quickly get out of hand, as the expressions for the characteristic coordinates become messy and then solving the ODE becomes even messier. In the examples above, \(b\) was always \(1\text{,}\) meaning we got \(s=t\) in the characteristic coordinates. If \(b\) is not constant, your expression for \(s\) will be more complicated.
Finding the characteristic coordinates is really a system of ODE in general if \(a\) depends on \(t\) or if \(b\) depends on \(x\text{.}\) In that case, we would need techniques for systems of ODE to solve it; see Chapter 3 or Chapter 8. In general, if \(a\) and \(b\) are not linear functions or constants, finding closed form expressions for the characteristic coordinates may be impossible.
Finally, the method of characteristics applies to nonlinear first order PDE as well. In the nonlinear case, the characteristics depend not only on the differential equation, but also on the initial data. This leads to not only more difficult computations, but also the formation of singularities where the solution breaks down at a certain point in time. An example application where first order nonlinear PDE come up is traffic flow theory, and you have probably experienced the formation of singularities: traffic jams. But we digress.