Equations of Motion

The Entropy


Time Evolution In Macroscopic Systems.

III: Selected Applications



W.T. Grandy, Jr.

Department of Physics & Astronomy, University of Wyoming

Laramie, Wyoming 82071



Abstract. The results of two recent articles expanding the Gibbs variational principle to encompass all of statistical mechanics, in which the role of external sources is made explicit, are utilized to further explicate the theory. Representative applications to nonequilibrium thermodynamics and hydrodynamics are presented, describing several fundamental processes, including hydrodynamic fluctuations. A coherent description of macroscopic relaxation dynamics is provided, along with an exemplary demonstration of the approach to equilibrium in a simple fluid.



1. Introduction


In his classic exposition of the Elementary Principles of Statistical Mechanics (1902) Willard Gibbs introduced two seminal ideas into the analysis of many-body systems: the notion of ensembles of identical systems in phase space, and his variational principle minimizing an `average index of probability.' With respect to the first, he noted that it was an artifice that ``may serve to give precision to notions of probability," and was not necessary. It now seems clear that this is indeed the case, and our understanding of probability theory has progressed to the point that one need focus only on the single system actually under study, as logic requires.

Gibbs never revealed the reasoning behind his variational principle, and it took more than fifty years to understand the underlying logic. This advance was initiated by Shannon (1948) and developed explicitly by Jaynes (1957a), who recognized the principle as a Principle of Maximum Entropy (PME) and of fundamental importance to probability theory. That is, the entropy of a probability distribution over an exhaustive set of mutually exclusive propositions $\{A_i\}$,
$$ S_I \equiv -k\sum_i P_i\ln P_i , \qquad k > 0 , \tag{1} $$
is to be maximized over all possible distributions subject to constraints expressed as expectation values $\langle A\rangle = \sum_i P_i A_i$. The crucial notion here is that all probabilities are logically based on some kind of prior information, so that $P_i = P(A_i|I)$, and that information here is just those constraints along with any other background information relevant to the problem at hand. It is this maximum of $S_I$ that has been identified in the past with the physical entropy of equilibrium thermodynamics when the constraints involve only constants of the motion.
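As an illustration of maximizing (1) under a single expectation-value constraint, the following minimal sketch treats a six-sided die whose mean is constrained to 4.5. The die, the target mean, and the function name `maxent_distribution` are illustrative assumptions and do not appear in the original; the multiplier is located by simple bisection, with $k = 1$.

```python
import numpy as np

def maxent_distribution(values, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Maximize -sum(P ln P) subject to sum(P) = 1 and sum(P * values) = target_mean.

    The maximizer has the form P_i = exp(-lam * A_i) / Z; lam is found by bisection.
    """
    values = np.asarray(values, dtype=float)

    def mean_for(lam):
        # Shift the exponent for numerical stability; the shift cancels in the ratio.
        w = np.exp(-lam * (values - values.mean()))
        p = w / w.sum()
        return p @ values, p

    # The constrained mean decreases monotonically as lam increases,
    # so bisection on lam is safe.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        m, p = mean_for(mid)
        if m > target_mean:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return mid, p

faces = np.arange(1, 7)                 # hypothetical propositions A_i = 1..6
lam, P = maxent_distribution(faces, 4.5)
entropy = -np.sum(P * np.log(P))        # S_I / k at the maximum
```

Because the constrained mean exceeds the uniform value 3.5, the multiplier comes out negative and the probabilities increase with face value; the resulting entropy is necessarily below $\ln 6$.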

It may be of some value to stress the uniqueness of the information measure (1) in this respect. In recent years a number of other measures have been proposed, generally on an ad hoc basis, and often to address a very specific type of problem. While these may or may not be of value for the intended purpose, their relation to probable inference is highly dubious. That issue was settled years ago when Shore and Johnson (1980) demonstrated that any function different from (1) is inconsistent with accepted methods of inference unless the two have identical maxima under the same constraints; the argument does not rely on their interpretation as information measures.

There is nothing in the principle as formulated by either Gibbs or Jaynes, other than the constraints, to restrict it to time-independent probabilities; indeed, this had already been noted by Boltzmann with respect to some of his ideas on entropy and expressed by Planck in the form $S_B = k\log W$. In turn, the only way probabilities can evolve in time is for $I$ itself to be time dependent, thus freeing the $\{A_i\}$ from restriction to constants of the motion. In two recent articles (Grandy, 2004a,b) this notion has been exploited to extend the variational principle to the entire range of statistical physics.$^{1}$ It was emphasized there how important it is to take into account explicitly any external sources, for they are the only means by which the constraints can change and provide new information about the system. The results of these steps appear to form a sound basis in probability theory for the derivation of macroscopic equations of motion based on the underlying microscopic dynamics.

The subtitle Gibbs chose for his work was, Developed with Especial Reference to The Rational Foundation of Thermodynamics, the continuation of which motivates the present essay. The intention here is not to develop detailed applications, but only to further explicate their logical foundations. To keep the discussion somewhat self contained we begin with a brief review and summary of the previous results, including some additional details of the equilibrium scenario that provide a framework for the nonequilibrium case. At each stage it is the nature of the information, or the constraints on the PME, that establishes the mathematical structure of the theory, demonstrating how the maximum entropy functional presides over all of statistical mechanics in much the same way as the Lagrangian governs all of mechanics.


The Basic Equilibrium Scenario

A brief description of the elementary structure of the PME was given in I, as well as in a number of other places, but we belabor it a bit here to serve as a general guide for its extension; the quantum-mechanical context in terms of the density matrix $\rho$ is adopted for the sake of brevity. Information is specified in terms of expectation values of a number of Hermitian operators $\{f_r\}$, $r = 1,\ldots,m < n$, and the von Neumann information entropy
$$ S_I = -k\,\mathrm{Tr}(\rho\ln\rho) \tag{2} $$
is maximized subject to constraints
$$ \mathrm{Tr}(\rho) = 1 , \qquad \langle f_r\rangle = \mathrm{Tr}(\rho f_r) . \tag{3} $$
Lagrange multipliers $\{\lambda_r\}$ are introduced for each constraint and the result of the variational calculation is the canonical form
$$ \rho = \frac{1}{Z}\,e^{-\lambda\cdot f} , \qquad Z(\lambda_1,\ldots,\lambda_m) = \mathrm{Tr}\left(e^{-\lambda\cdot f}\right) , \tag{4} $$
in terms of the convenient scalar-product notation $\lambda\cdot f \equiv \lambda_1 f_1 + \cdots + \lambda_m f_m$. The multipliers and the partition function $Z$ are identified by substitution into (3):
$$ \langle f_r\rangle = -\frac{\partial}{\partial\lambda_r}\ln Z , \qquad r = 1,\ldots,m , \tag{5} $$
a set of $m$ coupled differential equations. Equation (4) expresses the initial intent of the PME: to construct a prior probability distribution, or initial state, from raw data.
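The relation between (4) and (5) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: a two-level system with the Pauli matrices $\sigma_x$ and $\sigma_z$ standing in for the operators $f_r$, arbitrary multiplier values, $k = 1$, and helper names (`expm_hermitian`, `canonical_state`) that are not from the original.

```python
import numpy as np

def expm_hermitian(A):
    """Matrix exponential of a Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(A)
    return (v * np.exp(w)) @ v.conj().T

def canonical_state(lams, ops):
    """Return (rho, ln Z) for rho = exp(-sum_r lam_r f_r) / Z, Eq. (4)."""
    exponent = -sum(l * f for l, f in zip(lams, ops))
    unnorm = expm_hermitian(exponent)
    Z = np.trace(unnorm).real
    return unnorm / Z, np.log(Z)

# Two non-commuting observables on a two-level system (assumed example).
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
ops = [sx, sz]
lams = np.array([0.3, -0.7])

rho, lnZ = canonical_state(lams, ops)
expect = np.array([np.trace(rho @ f).real for f in ops])   # <f_r> = Tr(rho f_r)

# Central finite difference of ln Z with respect to each multiplier, Eq. (5).
h = 1e-6
grad = np.zeros(2)
for r in range(2):
    step = np.zeros(2)
    step[r] = h
    grad[r] = (canonical_state(lams + step, ops)[1]
               - canonical_state(lams - step, ops)[1]) / (2 * h)

# Entropy in the form ln Z + lam . <f>, to compare with -Tr(rho ln rho).
S_over_k = lnZ + lams @ expect
```

Here `expect` should agree with `-grad` to within finite-difference error, and `S_over_k` should agree with the von Neumann entropy computed directly from the eigenvalues of `rho`.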

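The maximum entropy, which is no longer a function of the probabilities but of the input data only, is found by substituting (4) into (2):$^{2}$
$$ S = k\ln Z + k\lambda\cdot\langle f\rangle . \tag{6} $$
A structure for the statistical theory follows from the analytical properties of $S$ and $Z$, beginning with the total differential $dS = k\lambda\cdot d\langle f\rangle$, as follows from (5). Hence,
$$ \frac{\partial S}{\partial\lambda_r} = 0 , \qquad \frac{\partial S}{\partial\langle f_r\rangle} = k\lambda_r . \tag{7} $$
The operators $f_r$ can also depend on one or more `external' variables $\alpha$, say, so that
$$ \frac{\partial\ln Z}{\partial\alpha} = -\lambda\cdot\left\langle\frac{\partial f}{\partial\alpha}\right\rangle , \tag{8} $$
and because $\ln Z = \ln Z(\lambda_1,\ldots,\lambda_m,\alpha)$ we have the reciprocity relation
$$ \left(\frac{\partial S}{\partial\alpha}\right)_{\{\langle f_r\rangle\}} = k\left(\frac{\partial\ln Z}{\partial\alpha}\right)_{\{\lambda_r\}} , \tag{9} $$
indicating which variables are to be held constant under differentiation. When such external variables are present the total differential becomes
$$ \frac{1}{k}\,dS = \frac{\partial\ln Z}{\partial\alpha}\,d\alpha + \lambda\cdot d\langle f\rangle = \lambda\cdot dQ , \tag{10} $$
where
$$ dQ_r \equiv d\langle f_r\rangle - \langle df_r\rangle , \qquad \langle df_r\rangle \equiv \left\langle\frac{\partial f_r}{\partial\alpha}\right\rangle d\alpha . \tag{11} $$
As in I, changes in entropy are always related to a source of some kind, here denoted by $dQ_r$.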

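With (7) the maximum entropy (6) can be rewritten in the form
$$ \frac{1}{k}S = \frac{1}{k}\frac{\partial S}{\partial\langle f\rangle}\cdot\langle f\rangle + \alpha\left(\frac{\ln Z}{\alpha}\right) . \tag{12} $$
If $(\ln Z/\alpha)$ is independent of $\alpha$ it can be replaced by $\partial\ln Z/\partial\alpha$, and from (9) we can write
$$ S = \langle f\rangle\cdot\frac{\partial S}{\partial\langle f\rangle} + \alpha\,\frac{\partial S}{\partial\alpha} , \tag{13} $$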
which is just Euler's theorem exhibiting S as a homogeneous function of degree 1. Thus, under the stated condition the maximum entropy is an extensive function of the input data.

As always, the sharpness of a probability distribution, and therefore a measure of its predictive power, is provided by the variances and covariances of the fundamental variables. One readily verifies that
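$$ \langle f_m f_n\rangle - \langle f_m\rangle\langle f_n\rangle = -\frac{\partial\langle f_m\rangle}{\partial\lambda_n} = -\frac{\partial\langle f_n\rangle}{\partial\lambda_m} \equiv K_{mn} = K_{nm} , \tag{14} $$
defining the covariance functions whose generalizations appear throughout the theory; they represent the correlation of fluctuations.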
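A quick numerical check of (14) is possible when the $f_r$ commute, so that they can be represented by their values on a common set of states. The sketch below assumes a hypothetical eight-state system with randomly chosen values for two observables; none of these numbers come from the original.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.normal(size=(2, 8))          # f[r, i]: value of observable r in state i (assumed data)
lams = np.array([0.4, -0.2])         # arbitrary Lagrange multipliers

def averages(lams):
    """Expectation values <f_r> and probabilities for P_i = exp(-lam . f_i) / Z."""
    w = np.exp(-lams @ f)
    p = w / w.sum()
    return f @ p, p

mean, p = averages(lams)

# Left-hand side of Eq. (14): <f_m f_n> - <f_m><f_n>
K = (f * p) @ f.T - np.outer(mean, mean)

# Right-hand side: -d<f_m>/d lam_n by central differences
h = 1e-6
K_fd = np.zeros((2, 2))
for n in range(2):
    step = np.zeros(2)
    step[n] = h
    K_fd[:, n] = -(averages(lams + step)[0] - averages(lams - step)[0]) / (2 * h)
```

The two matrices should agree to within finite-difference error, and `K` is symmetric and non-negative, as expected of a covariance matrix. For non-commuting operators the product in (14) would need the operator-ordering care that the Kubo transform supplies later in the text.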

In addition to the maximum property of $S_I$ with respect to variations in the probability distribution, the maximum entropy itself possesses a variational property of some importance. If we vary the entropy in (6) with respect to all parameters in the problem, including $\{\lambda_r\}$ and $\{f_r\}$, we obtain an alternative to the derivation of (10):
$$ dS = \sum_r \lambda_r\,dQ_r = \sum_r \lambda_r\,\mathrm{Tr}(f_r\,d\rho) , \tag{15} $$
where $dQ_r$ is defined in (11). Hence, $S$ is stationary with respect to small changes in the entire problem if the distribution itself is held constant. The difference in the two types of variational result is meaningful, as is readily seen by examining the second variations. For the case of $S$ we compute $d^2S$ from (13) and retain only first-order variations of the variables. If $S$ is to be a maximum with respect to variation of those constraints, then the desired stability or concavity condition is
$$ d^2S \simeq d\lambda\cdot d\langle f\rangle + d\alpha\cdot d\left(\frac{\partial S}{\partial\alpha}\right) < 0 . \tag{16} $$
We return to this presently, but it is precisely the condition employed by Gibbs (1875) to establish all his stability conditions in thermodynamics.

So far there has been no mention of physics, but the foregoing expressions pertain to fixed constraints, and therefore are immediately applicable to the case of thermal equilibrium and constants of the motion. In the simplest case only a single operator is considered, $f_1 = H$, the system Hamiltonian, and (4) becomes the canonical distribution
$$ \rho_0 = \frac{1}{Z_0}\,e^{-\beta H} , \qquad Z_0(\beta) = \mathrm{Tr}\,e^{-\beta H} , \tag{17} $$
where $\beta = (kT)^{-1}$. When $\alpha$ is taken as the system volume $V$, (11) identifies the internal energy $U = \langle H\rangle$, elements of heat $dQ$ and work $dW = \langle dH\rangle$, and the pressure. Because classically the Kelvin temperature is defined as an integrating factor for heat, $T$ must be the absolute temperature and $k$ is Boltzmann's constant. The first term in (11) also expresses the first law of thermodynamics; while this cannot be derived from more primitive dynamic laws, the relation arises here as a result of probable inference. If a second function $f_2 = N$, the total number operator, had been included, the grand canonical distribution would have been obtained in place of (17).

In this application to classical thermodynamics Eq. (13) takes on special significance, for it expresses the entropy as an extensive function of extensive variables, if the required condition is fulfilled. In all but the very simplest models direct calculation of $\ln Z$ is not practicable, so one must pursue an indirect course. There may be various ways to establish this condition, but with $\alpha = V$ the standard procedure is to demonstrate it in the infinite volume limit, where it is found to hold for many common Hamiltonians. But in some situations and for some systems it may not be possible to verify the condition; then the theory is describing something other than classical thermodynamics. The one remaining step needed to complete the derivation of elementary equilibrium thermodynamics is to show that theoretical expectation values are equal to measured values; the necessary conditions are discussed in the Appendix.


Nonequilibrium States

Although Gibbs was silent on exactly why he chose this variational principle, his intent was quite clear: to define and construct a description of the equilibrium state. That is, the PME provides a criterion for that state. To continue in this vein, then, we should seek to extend the principle to construction of an arbitrary nonequilibrium state. The procedure for doing this, and its rationale, were outlined in II, where we noted that the main task in this respect is to gather information that varies in both space and time and incorporate it into a density matrix describing a nonequilibrium state.

To illustrate the method of information gathering, consider a system with a fixed time-independent Hamiltonian and suppose the data to be given over a space-time region $R(\mathbf{x},t)$ in the form of an expectation value of a Heisenberg operator $F(\mathbf{x},t)$, which could, for example, be a density or a current. We are reminded that the full equation of motion for such operators, if they are also explicitly time varying, is
$$ i\hbar\,\dot F = [F,H] + i\hbar\,\partial_t F , \tag{18} $$
and the superposed dot will always denote a total time derivative. When the input data vary continuously over $R$ their sum becomes an integral and there is a distinct Lagrange multiplier for each space-time point. Maximization of the entropy subject to the constraint provided by that information leads to a density matrix describing this macrostate:
$$ \rho = \frac{1}{Z}\exp\left[-\int_R \lambda(\mathbf{x},t)F(\mathbf{x},t)\,d^3x\,dt\right] , \tag{19a} $$
where
$$ Z[\lambda(\mathbf{x},t)] = \mathrm{Tr}\,\exp\left[-\int_R \lambda(\mathbf{x},t)F(\mathbf{x},t)\,d^3x\,dt\right] \tag{19b} $$
is now the partition functional. The Lagrange multiplier function is identified as the solution of the functional differential equation
$$ \langle F(\mathbf{x},t)\rangle \equiv \mathrm{Tr}[\rho F(\mathbf{x},t)] = -\frac{\delta}{\delta\lambda(\mathbf{x},t)}\ln Z , \qquad (\mathbf{x},t)\in R , \tag{20} $$
and is defined only in the region $R$. Note carefully that the data set denoted by $\langle F(\mathbf{x},t)\rangle$ is a numerical quantity that has been equated to an expectation value to incorporate it into a density matrix. Any other operator $J(\mathbf{x},t)$, including $J = F$, is determined at any other space-time point $(\mathbf{x},t)$ as usual by
$$ \langle J(\mathbf{x},t)\rangle = \mathrm{Tr}\left[\rho J(\mathbf{x},t)\right] = \mathrm{Tr}\left[\rho(t)J(\mathbf{x})\right] . \tag{21} $$
That is, the system with fixed $H$ still evolves unitarily from the initial nonequilibrium state (19); although $\rho$ surely will no longer commute with $H$, its eigenvalues nevertheless remain unchanged.

Inclusion of a number of operators $F_k$, each with its own information-gathering region $R_k$ and its own Lagrange multiplier function $\lambda_k$, is straightforward, and if the data are time independent $\rho$ can describe an inhomogeneous equilibrium system as discussed in connection with (II-10). The question is sometimes raised concerning exactly which functions or operators should be included in the description of a macroscopic state, and the short answer is: include all the relevant information available, for the PME will automatically eliminate that which is redundant or contradictory. A slightly longer answer was provided by Jaynes (1957b) in his second paper introducing information-theoretic ideas into statistical mechanics. He defined a density matrix providing a definite probability assignment for each possible outcome of an experiment as sufficient for that experiment. A density matrix that is sufficient for all conceivable experiments on a system is called complete for that system. Both sufficiency and completeness are defined relative to the initial information, and the existence of complete density matrices presumes that all measurable quantities can be represented by Hermitian operators and that all experimental measurements can be expressed in terms of expectation values. But even if one could in principle employ a complete density matrix, it would be extremely awkward and inconvenient to do so in practice, for that would require a much larger function space than necessary. If the system is nonmagnetic and there are no magnetic fields present, then there is no point to including those coordinates in a description of the processes of immediate interest, but only those that are sufficient in the present context. The great self-correcting feature of the PME is that if subsequent predictions are not confirmed by experiment, then this is an indication that some relevant constraints have been overlooked or, even better, that new physics has been uncovered.

The form (19) illustrates how $\rho$ naturally incorporates memory effects while placing no restrictions on spatial or temporal scales. But this density matrix is definitely not a function of space and time; it merely provides an initial nonequilibrium distribution corresponding to data $\langle F(\mathbf{x},t)\rangle$ with $(\mathbf{x},t)\in R$. Lack of any other information outside $R$ (in the future, say) may tend to render $\rho$ less and less reliable, and the quality of predictions may deteriorate. Barring any further knowledge of system behavior this deterioration represents a fading memory, which becomes quite important if the system is actually allowed to relax from this state, for an experimenter carrying out a measurement on an equilibrium sample cannot possibly know the history of everything that has been done to it, so it is generally presumed that it has no memory. Relaxation processes will be discussed in Section 4 below.


Steady State Processes

With an understanding of how to construct a nonequilibrium state it is now possible to move on to the next stage, steady-state systems in which there may be steady currents but all variables remain time independent. This is perhaps the best-understood nonequilibrium scenario, primarily because it shares with equilibrium the property that it is stationary. But in equilibrium the Hamiltonian commutes with the density matrix, $[H,\rho] = 0$, which implies that $\rho$ also commutes with the time evolution operator. In the steady state, though, it is almost certain that $H$ will not commute with the operators in the exponential defining $\rho$, even if the expectation values defining the state are time independent. While this time independence is a necessary condition, an additional criterion is needed to guarantee stationarity, and that is provided by requiring that $[H,\rho] = 0$. In II it was shown that this leads to the additional constraint that only the diagonal parts of the specified operators appear in the steady-state density matrix, providing a theoretical definition of the steady state. By `diagonal part' of an operator we mean that part that is diagonal in the energy representation, so that it commutes with $H$. For present purposes the most useful expression of the diagonal part of an operator is
$$ F_d = F - \lim_{\epsilon\to0^+}\int_{-\infty}^{0} e^{\epsilon t}\,\partial_t F(\mathbf{x},t)\,dt , \qquad \epsilon > 0 , \tag{22} $$
where the time dependence of $F$ is unitary: $F(t) = e^{itH/\hbar}\,F\,e^{-itH/\hbar}$.
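To see how (22) isolates the diagonal part, note that in the eigenbasis of $H$ the matrix elements evolve as $F_{mn}(t) = F_{mn}e^{i\omega_{mn}t}$ with $\omega_{mn} = (E_m - E_n)/\hbar$, so the time integral is elementary and gives $(F_d)_{mn} = F_{mn}\,\epsilon/(\epsilon + i\omega_{mn})$ at finite $\epsilon$. The following sketch evaluates this for randomly generated $4\times4$ Hermitian matrices; the matrices, the value of $\epsilon$, and the function name `diagonal_part` are assumptions made purely for illustration, and a nondegenerate spectrum is assumed.

```python
import numpy as np

def diagonal_part(F, H, eps, hbar=1.0):
    """Evaluate Eq. (22) at finite eps in the eigenbasis of H.

    With F_mn(t) = F_mn exp(i w_mn t), w_mn = (E_m - E_n) / hbar, the integral
    over t in (-inf, 0] gives F_d elements F_mn * eps / (eps + i w_mn).
    """
    E, V = np.linalg.eigh(H)
    F_e = V.conj().T @ F @ V                      # F in the energy representation
    w = (E[:, None] - E[None, :]) / hbar
    Fd_e = F_e * eps / (eps + 1j * w)
    return V @ Fd_e @ V.conj().T                  # back to the original basis

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2                          # random Hermitian "Hamiltonian"
F = (B + B.conj().T) / 2                          # random Hermitian observable

Fd = diagonal_part(F, H, eps=1e-9)
commutator_norm = np.linalg.norm(H @ Fd - Fd @ H)
```

For small `eps` the off-diagonal elements in the energy representation are suppressed by a factor of order $\epsilon/|\omega_{mn}|$, so `Fd` approaches the matrix obtained by simply discarding them, and `commutator_norm` tends to zero.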

The resulting steady-state density matrix is given by (II-12) and (II-13) and is found by a simple modification of (19) above: remove all time dependence, including that in $R$, and replace $F(\mathbf{x},t)$ by $F_d(\mathbf{x})$; in addition, a term $-\beta H$ is included in the exponentials to characterize an earlier equilibrium reference state. Substitution of the resulting $\rho_{ss}$ into (2) provides the maximum entropy of the stationary state:
$$ \frac{1}{k}S_{ss} = \ln Z_{ss}[\beta,\lambda(\mathbf{x})] + \beta\langle H\rangle_{ss} + \int \lambda(\mathbf{x})\langle F_d(\mathbf{x})\rangle_{ss}\,d^3x , \tag{23} $$
and $F$ is often specified to be a current. This is the time-independent entropy of the steady-state process as it was established and has nothing to do with the entropy being passed from source to sink; entropy generation takes place only at the boundaries and not internally. Some applications of $\rho_{ss}$ were given in II and others will be made below. No mention appears in II, however, of conditions for stability of the steady state, so a brief discussion is in order here.

Schlögl (1971) has studied stability conditions for the steady state in some depth through consideration of the quantum version of the information gain in changing from a steady-state density matrix $\rho'$ to another $\rho$,
$$ L(\rho,\rho') = \mathrm{Tr}\left[\rho\left(\ln\rho - \ln\rho'\right)\right] , \tag{24} $$
which is effectively the entropy produced in going from $\rho'$ to $\rho$. He notes that $L$ is a Liapunov function in that it is positive definite, vanishes only if $\rho = \rho'$, and has a positive second-order variation. Pfaffelhuber (1977) demonstrates that the symmetrized version, which is more convenient here,
$$ L^*(\rho,\rho') = \frac{1}{2}\,\mathrm{Tr}\left[\left(\rho - \rho'\right)\left(\ln\rho - \ln\rho'\right)\right] , \tag{25} $$
is an equivalent Liapunov function, and that its first-order variation is given by
$$ -\delta L^* = \frac{1}{2}\,\delta\left(\Delta\lambda\,\Delta\langle F_d\rangle\right) . \tag{26} $$
The notation is that $\Delta\lambda = \lambda - \lambda'$, for example. If $\delta$ is taken as a time variation, then Liapunov's theorem immediately provides a stability condition for the steady state,
$$ -\dot L^* \ge 0 . \tag{27} $$
Remarkably, (27) closely resembles the Gibbs condition (16) for stability of the equilibrium state, but in terms of $-\Delta\dot S \ge 0$, as well as the Glansdorff-Prigogine criterion of phenomenological thermodynamics. But the merit of the present approach is that $L^*$ does not depend directly on the entropy and therefore encounters no ambiguities in defining a $\Delta\dot S$.
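The basic properties of (24) and (25) are easy to confirm numerically. The sketch below uses two randomly generated full-rank $3\times3$ density matrices, which are purely illustrative and not steady-state solutions of any particular model; the function names are likewise assumptions.

```python
import numpy as np

def logm_hermitian(rho):
    """Matrix logarithm of a positive-definite Hermitian matrix."""
    p, v = np.linalg.eigh(rho)
    return (v * np.log(p)) @ v.conj().T

def info_gain(rho, rho_ref):
    """L(rho, rho') = Tr[rho (ln rho - ln rho')], Eq. (24)."""
    return np.trace(rho @ (logm_hermitian(rho) - logm_hermitian(rho_ref))).real

def info_gain_sym(rho, rho_ref):
    """L*(rho, rho') = (1/2) Tr[(rho - rho')(ln rho - ln rho')], Eq. (25)."""
    d_log = logm_hermitian(rho) - logm_hermitian(rho_ref)
    return 0.5 * np.trace((rho - rho_ref) @ d_log).real

def random_state(rng, n=3):
    """A full-rank density matrix built from a random complex matrix (illustrative only)."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    m = a @ a.conj().T + 0.1 * np.eye(n)
    return m / np.trace(m).real

rng = np.random.default_rng(2)
rho, rho_ref = random_state(rng), random_state(rng)
L = info_gain(rho, rho_ref)
L_star = info_gain_sym(rho, rho_ref)
```

Both quantities are positive for distinct states and vanish when the two arguments coincide. Expanding the trace in (25) also shows that $L^*(\rho,\rho') = \tfrac{1}{2}[L(\rho,\rho') + L(\rho',\rho)]$, which the code reproduces.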


Thermal Driving

A variable, and therefore the system itself, is said to be thermally driven if no new variables other than those constrained experimentally are needed to characterize the resulting state, and if the Lagrange multipliers corresponding to variables other than those specified remain constant.$^{3}$ As discussed in I, a major difference with purely dynamic driving is that the thermally-driven density matrix is not constrained to evolve by unitary transformation alone. It was argued in I and II that a general theory of nonequilibrium must necessarily account explicitly for external sources, since it is only through them that the macroscopic constraints on the system can change. With that discussion as background let us suppose that a system is in thermal equilibrium with time-independent Hamiltonian in the past, and then at $t = 0$ a source is turned on smoothly and specified to run continuously, as described by its effect on the expectation value $\langle F(t)\rangle$. That is, $F(t)$ is given throughout the changing interval $[0,t]$ and is specified to continue to change in a known way until further notice. We omit spatial dependence explicitly here in the interest of clarity, noting that the following equations are generalized to arguments $(\mathbf{x},t)$ in (II-41)-(II-49). For convenience we consider only a single driven operator; multiple operators, both driven and otherwise, are readily included. Based on the probability model of I, the PME then provides the density matrix for thermal driving:
$$ \begin{aligned} \rho_t &= \frac{1}{Z_t}\exp\left[-\beta H - \int_0^t \lambda(t')F(t')\,dt'\right] , \\ Z_t[\beta,\lambda(t)] &= \mathrm{Tr}\,\exp\left[-\beta H - \int_0^t \lambda(t')F(t')\,dt'\right] . \end{aligned} \tag{28} $$
The theoretical maximum entropy is obtained explicitly by substitution of (28) into (2),
$$ \frac{1}{k}S_t = \ln Z_t + \beta\langle H\rangle_t + \int_0^t \lambda(t')\langle F(t')\rangle_t\,dt' ; \tag{29} $$
it is the continuously re-maximized information entropy. Equation (29) indicates explicitly that $\langle H\rangle_t$ changes only as a result of changes in, and correlation with, $F$.

The expectation value of another operator at time $t$ is $\langle C\rangle_t = \mathrm{Tr}[\rho_t C]$, and direct differentiation yields
$$ \begin{aligned} \frac{d}{dt}\langle C(t)\rangle_t &= \mathrm{Tr}\left[C(t)\,\partial_t\rho_t + \rho_t\,\dot C(t)\right] \\ &= \langle\dot C(t)\rangle_t - \lambda(t)K^t_{CF}(t,t) , \end{aligned} \tag{30} $$
where the superposed dot denotes a total time derivative. We have here introduced the covariance function
$$ K^t_{CF}(t',t) \equiv \langle\overline{F(t')}\,C(t)\rangle_t - \langle F(t')\rangle_t\langle C(t)\rangle_t = -\frac{\delta\langle C(t)\rangle_t}{\delta\lambda(t')} , \tag{31} $$
where the overline denotes a generalized Kubo transform with respect to the operator $\ln\rho_t$:
$$ \overline{F(t)} \equiv \int_0^1 e^{-u\ln\rho_t}\,F(t)\,e^{u\ln\rho_t}\,du , \tag{32} $$
which arises from the possible noncommutativity of $F(t)$ with itself at different times. The superscript $t$ in $K^t_{CF}$ implies that the density matrix $\rho_t$ is employed everywhere on the right-hand side of the definition, including the Kubo transform; this is necessary to distinguish it from several approximations.
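The role of the Kubo transform (32) in (31) can be illustrated in a stripped-down setting. The sketch below assumes a single scalar multiplier $\lambda$ in place of the function $\lambda(t')$, a density matrix $\rho = e^{-A-\lambda F}/Z$ with randomly generated Hermitian matrices $A$, $F$, $C$, and compares $\langle\overline{F}C\rangle - \langle F\rangle\langle C\rangle$ with a finite-difference estimate of $-\partial\langle C\rangle/\partial\lambda$. All names and matrices are hypothetical; in the eigenbasis of $\rho$ with eigenvalues $p_m$ the transform multiplies $F_{mn}$ by $\int_0^1 (p_n/p_m)^u\,du$.

```python
import numpy as np

def expm_hermitian(A):
    """Matrix exponential of a Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(A)
    return (v * np.exp(w)) @ v.conj().T

def state(lam, A, F):
    """rho = exp(-A - lam F) / Z for Hermitian A and F."""
    m = expm_hermitian(-(A + lam * F))
    return m / np.trace(m).real

def kubo_transform(F, rho):
    """Eq. (32): integral over u in [0, 1] of rho^{-u} F rho^{u}, in the eigenbasis of rho."""
    p, v = np.linalg.eigh(rho)
    x = np.log(p)[None, :] - np.log(p)[:, None]     # x_mn = ln(p_n / p_m)
    nonzero = np.abs(x) > 1e-12
    safe = np.where(nonzero, x, 1.0)                # avoid 0/0 on the diagonal
    weight = np.where(nonzero, np.expm1(x) / safe, 1.0)
    return v @ ((v.conj().T @ F @ v) * weight) @ v.conj().T

def covariance(C, F, rho):
    """K_CF = <Fbar C> - <F><C>, a single-parameter analogue of Eq. (31)."""
    ev = lambda X: np.trace(rho @ X)
    return (ev(kubo_transform(F, rho) @ C) - ev(F) * ev(C)).real

rng = np.random.default_rng(3)

def random_hermitian(n):
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

A, F, C = (random_hermitian(3) for _ in range(3))
lam, h = 0.5, 1e-5
rho = state(lam, A, F)
K = covariance(C, F, rho)
dC_dlam = (np.trace(state(lam + h, A, F) @ C)
           - np.trace(state(lam - h, A, F) @ C)).real / (2 * h)
```

Within finite-difference error `K` equals `-dC_dlam`; replacing `kubo_transform(F, rho)` by `F` itself would generally spoil the agreement when $F$ does not commute with $\rho$.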

In II we introduced a new notation into (30), which at first appears to be only a convenience:
$$ \sigma_C(t) \equiv \frac{d}{dt}\langle C(t)\rangle_t - \langle\dot C(t)\rangle_t = -\lambda(t)K^t_{CF}(t,t) . \tag{33} $$
For $C = F$
$$ \begin{aligned} \sigma_F(t) &\equiv \frac{d}{dt}\langle F(t)\rangle_t - \langle\dot F(t)\rangle_t \\ &= -\lambda(t)K^t_{FF}(t,t) , \end{aligned} \tag{34} $$
which was seen to have the following interpretation: $\sigma_F(t)$ is the rate at which $F$ is driven or transferred by the external source, whereas $d\langle F(t)\rangle_t/dt$ is the total time rate-of-change of $\langle F(t)\rangle_t$ in the system at time $t$, and $\langle\dot F(t)\rangle_t$ is the rate of change produced by internal relaxation. Thus, we can turn the scenario around and take the source as given and predict $\langle F(t)\rangle_t$, which is the more likely experimental arrangement. This reversal of viewpoint is much like that associated with (4) suggesting that one could as well consider $\lambda$ the independent variable, rather than $f$, as was discussed in connection with (II-10); in fact, this is usually what is done in practice in applications of (17), where the temperature is specified. If the source strength is given, then the second line of (34) provides a nonlinear transcendental equation determining the Lagrange multiplier function $\lambda(t)$.

An important reason for eventually including spatial dependence is that we can now derive the macroscopic equations of motion. For example, if $F(t)$ is one of the conserved densities $e(\mathbf{x},t)$ in a simple fluid, such as those in (II-66), and $\mathbf{J}(\mathbf{x},t)$ is the corresponding current density, then the local microscopic continuity equation
$$ \dot e(\mathbf{x},t) + \boldsymbol{\nabla}\cdot\mathbf{J}(\mathbf{x},t) = 0 \tag{35} $$
is satisfied irrespective of the state of the system. When this is substituted into (34) we obtain the macroscopic conservation law
$$ \frac{d}{dt}\langle e(\mathbf{x},t)\rangle_t + \boldsymbol{\nabla}\cdot\langle\mathbf{J}(\mathbf{x},t)\rangle_t = \sigma_e(\mathbf{x},t) . \tag{36} $$
Specification of sources automatically provides the thermokinetic equations of motion, and in II it was shown how all these expressions reduce to those of the steady state when the driving rate is constant.
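The bookkeeping expressed by (36) can be illustrated with a simple one-dimensional finite-volume sketch: when the divergence is taken in flux form on a periodic grid, the total amount of the conserved quantity changes only through the source. The closure $J = -D\,\partial_x\langle e\rangle$ used below is an assumed diffusive constitutive relation chosen only to make the example concrete; it, the grid, and the source profile are not taken from the original.

```python
import numpy as np

n, length = 64, 1.0
dx = length / n
x = (np.arange(n) + 0.5) * dx
D, dt, steps = 0.01, 0.005, 200        # D*dt/dx**2 is about 0.2, inside the explicit stability limit

e = 1.0 + 0.5 * np.sin(2 * np.pi * x)            # initial density profile <e(x, 0)>
sigma = 0.3 * np.exp(-((x - 0.5) / 0.1) ** 2)    # prescribed source strength sigma_e(x)

total_initial = e.sum() * dx
for _ in range(steps):
    # Flux through the right face of each cell: J_{i+1/2} = -D (e_{i+1} - e_i) / dx
    J_right = -D * (np.roll(e, -1) - e) / dx
    # Discrete divergence: (J_{i+1/2} - J_{i-1/2}) / dx, periodic boundaries
    div_J = (J_right - np.roll(J_right, 1)) / dx
    e = e + dt * (-div_J + sigma)                # Eq. (36) stepped forward in time

total_final = e.sum() * dx
injected = steps * dt * sigma.sum() * dx         # time-integrated source
```

The difference `total_final - total_initial` matches `injected` to rounding error, whatever closure is used for the current, because the flux terms cancel in pairs when summed over the periodic grid.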

Everything to this point is nonlinear, but in many applications some sort of approximation becomes necessary, and often sufficient, for extracting the desired physical properties of a particular model. The most common procedure is to linearize the density matrix in terms of the departure from equilibrium, which means that the integrals in (28), for example, are in some sense small. The formalism for this was discussed briefly in II and a systematic exposition can be found elsewhere (Heims and Jaynes, 1962; Grandy, 1988). In linear approximation the expectation value of any operator $C(\mathbf{x},t)$ is given by
$$ \langle C(\mathbf{x},t)\rangle - \langle C(\mathbf{x})\rangle_0 = -\int K_{CF}(\mathbf{x},t;\mathbf{x}',t')\,\lambda(\mathbf{x}',t')\,d^3x'\,dt' , \tag{37} $$
$$ K_{CF}(\mathbf{x},t;\mathbf{x}',t') = \langle\overline{F(\mathbf{x}',t')}\,C(\mathbf{x},t)\rangle_0 - \langle F(\mathbf{x}')\rangle_0\langle C(\mathbf{x})\rangle_0 , \tag{38} $$
where we have re-inserted the spatial dependence. The integration limits in (37) have been omitted deliberately so that the general form applies to any of the preceding scenarios. The subscripts 0 indicate that all expectation values on the right-hand sides of (37) and (38) are to be taken with the equilibrium distribution (17), including the linear covariance function $K_{CF} = K^0_{CF}$ and the Kubo transform (32); $K_{CF}$ is independent of $\lambda$. It may be useful to note that, rather than linearize about equilibrium, the same procedure can also be used to linearize about the steady state.

The space-time transformation properties of the linear covariance function (38) are of some importance in later applications, so it is worth a moment to examine them. We generally presume time and space translation invariance in the initial homogeneous equilibrium system, such that the total energy and number operators of (II-67), as well as the total momentum operator $\mathbf{P}$, commute with one another. In this system these translations are generated, respectively, by the unitary operators
$$ U(t) = e^{-iHt/\hbar} , \qquad U(\mathbf{x}) = e^{-i\mathbf{x}\cdot\mathbf{P}/\hbar} , \tag{39} $$
and $F(\mathbf{x},t) = U^\dagger(\mathbf{x})U^\dagger(t)\,F\,U(t)U(\mathbf{x})$. Translation invariance, along with (32) and cyclic invariance of the trace, provides two further simplifications: the single-operator expectation values are independent of $\mathbf{x}$ and $t$ in an initially homogeneous system, and the arguments of $K_{CF}$ can now be taken as $\mathbf{r} = \mathbf{x}-\mathbf{x}'$, $\tau = t - t'$.

Generally, the operators encountered in covariance functions possess definite transformation properties under space inversion (parity) and time reversal. Under the former $A(\mathbf{r},t)$ becomes $P_A A(-\mathbf{r},t)$, $P_A = \pm1$, and under the latter $T_A A(\mathbf{r},-t)$, $T_A = \pm1$. Under inversion the covariance function (38) behaves as follows:
$$ \begin{aligned} K_{FC}(-\mathbf{r},-\tau) &= K_{CF}(\mathbf{r},\tau) = P_C P_F K_{CF}(-\mathbf{r},\tau) \\ &= T_C T_F K_{CF}(\mathbf{r},-\tau) = P_C P_F T_C T_F K_{CF}(-\mathbf{r},-\tau) , \end{aligned} \tag{40} $$
where the first equality again follows from cyclic invariance. For many operators, including those describing a simple fluid, $PT = +1$ and the full reciprocity relation holds:
$$ K_{CF}(\mathbf{r},\tau) = K_{FC}(\mathbf{r},\tau) . \tag{41} $$
In fact, by changing integration variables in (32) it is easy to show that the nonlinear covariance function (31) also satisfies a reciprocity relation: $K^t_{CF}(\mathbf{x}',t';\mathbf{x},t) = K^t_{FC}(\mathbf{x},t;\mathbf{x}',t')$.

One further property of linear covariance functions will be found useful. Consider the spatial Fourier transform in which we examine the limit $k = |\mathbf{k}| \to 0$,
$$ \lim_{k\to0} K_{ab}(\mathbf{k},\tau) = \lim_{k\to0}\int e^{i\mathbf{k}\cdot\mathbf{r}}\,K_{ab}(\mathbf{r},\tau)\,d^3r = \int K_{ab}(\mathbf{r},\tau)\,d^3r . \tag{42} $$
That is, taking the limit is equivalent to integrating over the entire volume. But this is also the long-wavelength limit, in which the wavelengths of slowly-decaying modes span the entire volume.
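A one-dimensional discretized analogue of (42) makes the statement concrete. The short-ranged function $K(r)$ below is a made-up example, and the transform is a plain Riemann sum rather than an FFT; both choices are assumptions for illustration only.

```python
import numpy as np

n, length = 256, 20.0
dr = length / n
r = (np.arange(n) - n // 2) * dr
K_r = np.exp(-np.abs(r)) * np.cos(2.0 * r)     # hypothetical short-ranged covariance K(r)

def K_k(k):
    """Discretized 1-D transform: sum over r of exp(i k r) K(r) dr."""
    return np.sum(np.exp(1j * k * r) * K_r) * dr

volume_integral = K_r.sum() * dr
ks = [1.0, 0.1, 0.01, 0.0]
transforms = [K_k(k) for k in ks]
```

As `k` decreases the transform approaches `volume_integral`, and at `k = 0` the two coincide exactly, since the phase factor is then unity everywhere.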

Suppose now that $a$ is a locally-conserved density, such as those describing a simple fluid, whose volume integral $A$ is then a conserved quantity commuting with the Hamiltonian in the equilibrium system. In this event (39) implies that the left-hand side of (42) reduces to $K_{Ab}(0,0)$, independent of space, time, and Kubo transform; the covariance function has become a constant correlation function, as in (14), and is just a thermodynamic quantity.



2. Nonequilibrium Thermodynamics


In equilibrium thermodynamics everything starts with the entropy, as in (10), and the same is true here. The instantaneous maximum entropy of thermal driving is exhibited in (29), and with $S_t$ now a function of time one can compute its total time derivative as
$$ \frac{1}{k}\frac{dS_t}{dt} = \left(\frac{\partial\ln Z_t}{\partial\alpha}\right)\dot\alpha + \beta\,\frac{d\langle H\rangle_t}{dt} - \lambda(t)\int_0^t \lambda(t')K^t_{FF}(t,t')\,dt' , \tag{43} $$
the spatial variables again being omitted temporarily. Because $H$ is not explicitly driven its Lagrange multiplier remains the equilibrium parameter $\beta$. With $\alpha = V$, the system volume, the equilibrium expressions (8) and (11) identified the term in $\ln Z$ as a work term. In complete analogy, the first term on the right-hand side of (43) is seen to be a power term when the volume is changing; one identifies the time-varying pressure by writing this term as $\beta P(t)$.

Commonly the volume is held constant and the term containing the Hamiltonian written out explicitly, so that (43) becomes
$$ \begin{aligned} \frac{1}{k}\frac{dS_t}{dt} &= -\beta\lambda(t)K^t_{HF}(t,0) - \lambda(t)\int_0^t \lambda(t')K^t_{FF}(t,t')\,dt' \\ &= \gamma_F(t)\,\sigma_F(t) , \end{aligned} \tag{44} $$
where we have employed (34) and defined a new parameter
$$ \begin{aligned} \gamma_F(t) &\equiv \beta\,\frac{K^t_{HF}(t,0)}{K^t_{FF}(t,t)} + \int_0^t \lambda(t')\,\frac{K^t_{FF}(t,t')}{K^t_{FF}(t,t)}\,dt' \\ &= \left(\frac{\delta S_t}{\delta\langle F(t)\rangle_t}\right)_{\text{thermal driving}} , \end{aligned} \tag{45} $$
as discussed in II. The subscript `thermal driving' reminds us that this derivative is evaluated somewhat differently than in the equilibrium formalism because the expectation values of $H$ and $F$ are not independent here. When the source strength $\sigma_F(t)$ is specified the Lagrange multiplier itself is determined from (34) and $\gamma_F$ is interpreted as a transfer potential governing the transfer of $F$ to or from the system. If two systems can exchange quantities $F_i$ under thermal driving, then the conditions for migrational equilibrium at time $t$ are
$$ \gamma_{F_i}(t)_1 = \gamma_{F_i}(t)_2 . \tag{46} $$

In II it was noted that $S_t$ refers only to the information encoded in the distribution of (28) and cannot refer to the internal entropy of the system. In equilibrium the maximum of the information entropy is the same as the experimental entropy, but that is not necessarily the case here. For example, if the driving is removed at time $t = t_1$, then $S_{t_1}$ in (29) can only provide the entropy of that nonequilibrium state at $t = t_1$; its value will remain the same during subsequent relaxation, owing to unitary time evolution. Although the maximum information (or theoretical) entropy provides a complete description of the system based on all known physical constraints on that system, it cannot describe the ensuing relaxation, for it contains no new information about that process. We return to this in Section 4 below.

Combination of (34) and the second line of (44) strongly suggests the natural expression
$$ \frac{1}{k}\dot S_t = \gamma_F(t)\left(\frac{d}{dt}\langle F(t)\rangle_t - \langle\dot F(t)\rangle_t\right) , \tag{47} $$
in which the first term on the right-hand side represents the total time rate-of-change of entropy $\dot S_{\rm tot}$ arising from the thermal driving of $F(t)$, whereas the second term is the rate-of-change of internal entropy $\dot S_{\rm int}$ owing to relaxation. Thus, the total rate of entropy production in the system can be written
$$ \dot S_{\rm tot}(t) = \dot S_t + \dot S_{\rm int}(t) , \tag{48} $$
where the entropy production of transfer owing to the external source, $\dot S_t$, is given by the first line of (44). This latter quantity is a function only of the driven variable $F(t)$, whereas the internal entropy depends on all variables, driven or not, necessary to describe the nonequilibrium state and is determined by the various relaxation processes taking place in the system. If spatial variation is included the right-hand side of (47) is integrated over the system volume.

It is important to understand very clearly the meaning of Eq.(48), so we re-state more carefully the interpretation of each term. From (44), $\dot S_t$ is the rate of change of the entropy of the macroscopic state of the system due to the source alone; it involves the maximum of the information entropy and is associated entirely with the source. The term $\dot S_{\rm int}$ is the contribution to the rate at which the entropy of that state is changing due to relaxation mechanisms within the system itself. Thus, $\dot S_{\rm tot}$ is the total rate of change of the entropy of the macroscopic state. When the source is removed $\dot S_t=0$, and $\dot S_{\rm tot}$ is the rate at which the entropy of the macroscopic state changes due to internal relaxation processes. Thus, entropy is always associated with a macroscopic state and its rate of change under various processes; the entropy of the surroundings does not enter into this discussion. Equation (48) does not apply to steady-state processes, in which no entropy is generated internally.


Linear Heating

As a specific example it is useful to make contact with classical thermodynamics and choose the driven variable to be the total-energy function for the system, E(t). 4 In the presence of external forces this quantity is not necessarily the Hamiltonian, but can be defined in the same way as H in the isolated system, (II-67), in terms of the energy density operator:
\[ E(t) \equiv \int_V h(x,t)\, d^3x , \tag{49} \]
and the time evolution is no longer unitary. The point is that H does not change in time, only its expectation value.

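In the case of pure heating $\dot\alpha = \dot V = 0$ and (43) and (44) become, respectively,
\[ \frac{1}{k}\,\dot S_t = \gamma_E(t)\,\sigma_E(t) , \tag{50a} \]
\[ \gamma_E(t) = \beta\,\frac{K^t_{HE}(t,0)}{K^t_{EE}(t,t)} + \int_0^t \lambda(t')\,\frac{K^t_{EE}(t,t')}{K^t_{EE}(t,t)}\, dt' . \tag{50b} \]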
The dimension of $\gamma_E(t)$ is $E^{-1}$ (inverse energy), so it is reasonable to interpret this transfer parameter as a time-dependent `inverse temperature' $\beta(t)=[kT(t)]^{-1}$; the temperature must change continuously as heat is added to or removed from the system, though it is difficult to define a measurable quantity like this globally. Hence, in analogy with the equilibrium form $dS=dQ/T$, Eq.(10), the content of (50a) is that
\[ \dot S_t = k\,\gamma_E\,\dot Q , \tag{51} \]
because the rate of external driving is just $\dot Q$ here. A further analogy, this time with (11), follows from the first line of (34), which extends the First Law to
\[ \dot E = \dot Q + \dot W , \tag{52} \]
because any work done in this scenario would change only the internal energy. These last two expressions are remarkably like those advocated by Truesdell (1984) in his development of Rational Thermodynamics, and are what one might expect from naïve time differentiation of the corresponding equilibrium expressions. Indeed, such an extrapolation may provide a useful guide to nonequilibrium relations, but in the end only direct derivation from a coherent theory should be trusted.

In (48) the term $\dot S_{\rm int}$ is positive semi-definite, for it corresponds to the increasing internal entropy of relaxation; this is demonstrated explicitly in Section 4. Combination with (51) then allows one to rewrite (48) as an inequality:
\[ \dot S_{\rm tot} = k\,\gamma_E\,\dot Q + \dot S_{\rm int} \ge k\,\gamma_E\,\dot Q . \tag{53} \]
One hesitates to refer to this expression as an extension of the Second Law, for such a designation is fraught with ambiguity; the latter remains a statement about the entropies of two equilibrium states. Instead, it may be more prudent to follow Truesdell in referring to (53) as the Clausius-Planck inequality.

A linear approximation in (50b), as described by (37) and (38), leads to considerable simplification, after which that expression becomes
\[ \beta(t) \simeq \beta + \int_0^t \lambda(t')\,\frac{K_{EE}(t-t')}{K_{HH}}\, dt' , \tag{54} \]
while recalling that $\beta = \beta(0)$ is the equilibrium inverse temperature. The static covariance function $K_{HH}$ is now just an equilibrium thermodynamic function proportional to the energy fluctuations (and hence to the heat capacity). In this approximation the expectation value of the driven energy function is
\[ \langle E(t)\rangle_t \simeq \langle E\rangle_0 - \int_0^t \lambda(t')\,K_{EE}(t-t')\, dt' , \tag{55} \]
so that if, for example, energy is being transferred into the system ($\sigma_E > 0$), then the integral must be negative. We can then write
\[ \beta(t) \simeq \beta - \left| \int_0^t \lambda(t')\,\frac{K_{EE}(t-t')}{K_{HH}}\, dt' \right| , \tag{56} \]
and $\beta(t)$ is decreasing from the equilibrium value. The physical content of (56) therefore is that $T(t) \simeq T(0)+\Delta T(t)$, as expected. Although it is natural to interpret T(t) as a `temperature', we are cautioned that only at t=0 is that interpretation unambiguous.
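
As a rough numerical illustration of (54) and (55), the sketch below evaluates both integrals by trapezoidal quadrature. The exponential form chosen for the covariance function, the constant negative multiplier, the function names, and all numerical values are illustrative assumptions that do not come from the text; units are arbitrary.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule; returns 0.0 when only one sample is given."""
    return 0.5 * float(np.sum((y[1:] + y[:-1]) * np.diff(x)))

def linear_heating(lam, K_EE, K_HH, beta0, E0, t_grid):
    """Evaluate Eq. (54) for beta(t) and Eq. (55) for <E(t)>_t on t_grid."""
    beta = np.empty(len(t_grid))
    energy = np.empty(len(t_grid))
    for i, t in enumerate(t_grid):
        tp = t_grid[: i + 1]                  # integration variable t'
        integral = trapezoid(lam(tp) * K_EE(t - tp), tp)
        beta[i] = beta0 + integral / K_HH     # Eq. (54)
        energy[i] = E0 - integral             # Eq. (55)
    return beta, energy

# Hypothetical model (assumption): exponentially decaying covariance and a
# constant negative multiplier, so that energy flows into the system.
K_HH, tau, lam0 = 4.0, 2.0, 0.05
K_EE = lambda s: K_HH * np.exp(-np.abs(s) / tau)
lam = lambda tp: -lam0 * np.ones_like(tp)

t_grid = np.linspace(0.0, 10.0, 1001)
beta, energy = linear_heating(lam, K_EE, K_HH, beta0=1.0, E0=0.0, t_grid=t_grid)
```

For this assumed model the integral in (54) can be done in closed form, giving beta(t) = beta0 - lam0*tau*(1 - exp(-t/tau)), so beta falls monotonically while the energy rises, in line with (56). The small value of lam0*tau relative to beta0 is chosen to stay within the spirit of the linear approximation.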

A complementary calculation is also of interest, in which spatial variation is included and the homogeneous system is driven from equilibrium by a source coupled to the energy density h(x,t). In linear approximation the process is described by
\[ \langle h(x,t)\rangle_t - \langle h\rangle_0 = -\int_V d^3x' \int_0^t dt'\, \lambda(x',t')\,K_{hh}(x-x',t-t') , \tag{57a} \]
\[ \sigma_h(x,t) = -\int_V \lambda(x',t)\,K_{hh}(x-x',t=0)\, d^3x' . \tag{57b} \]
After a well-defined period of driving the source is removed, the system is again isolated, and we expect it to relax to equilibrium (see Section 4); the upper limit of integration in (57a) is now a constant, say t1. Presumably this is a reproducible process.

For convenience we take the system volume to be all space and integrate Eqs.(57) over the entire volume, thereby converting the densities into total Hamiltonians. Owing to spatial uniformity and cyclic invariance the covariance function in (57a) is then independent of the time (the evolution being unitary after removal of the source), and we shall denote the volume integral of $\sigma_h(x,t)$ by $\sigma_h(t)$. Combination of the two equations for $t > t_1$ yields
\[ \langle H\rangle - \langle H\rangle_0 = \int_0^{t_1} \sigma_h(t')\, dt' , \qquad t > t_1 , \tag{58a} \]
or
\[ \langle H\rangle = \langle H\rangle_0 + \Delta E , \tag{58b} \]
which is independent of time and identical to (55) at $t=t_1$. The total energy of the new equilibrium state is now known, and the density matrix describing that state can be constructed via the PME.
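
The step from (57) to (58a) can be sketched as follows. The shorthand $\lambda(t')$ for the volume integral of $\lambda(x',t')$ and the symbol $\bar K$ for the volume-integrated, time-independent covariance are notational assumptions introduced only for this sketch; the text does not name them.

```latex
% Assumed shorthand (not in the text):
%   \lambda(t') \equiv \int d^3x'\,\lambda(x',t') ,
%   \bar K \equiv \int d^3x\,K_{hh}(x-x',t-t') , independent of x' and of t-t'.
\begin{align*}
\langle H\rangle-\langle H\rangle_0
  &= -\bar K \int_0^{t_1} \lambda(t')\,dt' , \qquad t>t_1 , \\
\sigma_h(t') &= -\bar K\,\lambda(t') , \\
\Longrightarrow\quad \langle H\rangle-\langle H\rangle_0
  &= \int_0^{t_1}\sigma_h(t')\,dt' .
\end{align*}
```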

The last few paragraphs provide a formal description of slowly heating a pot of water on a stove, but in reality much more is going on in that pot than simply increasing the temperature. Experience tells us that the number density is also varying, though N/V is constant (if we ignore evaporation), and a proper treatment ought to include both densities. But thermal driving of h(x,t) requires that n(x,t) is explicitly not driven, changing only as a result of changes in h, through correlations. The proper density matrix describing this model 5 is
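\[ \rho_t = \frac{1}{Z_t}\exp\left[ -\beta H - \int d^3x' \int_0^t \Bigl[ \lambda_h(x',t')\,h(x',t') + \lambda_n(x',t')\,n(x',t') \Bigr]\, dt' \right] , \tag{59} \]
and the new constraint is expressed by the generalization of (34) to the set of equations
\[ \begin{aligned}
\sigma_h(x,t) &= -\int \lambda_h(x',t)\,K_{hh}(x-x';t-t)\, d^3x' - \int \lambda_n(x',t)\,K_{hn}(x-x';t-t)\, d^3x' , \\
0 &= -\int \lambda_h(x',t)\,K_{nh}(x-x';t-t)\, d^3x' - \int \lambda_n(x',t)\,K_{nn}(x-x';t-t)\, d^3x' ,
\end{aligned} \tag{60} \]
asserting explicitly that $\sigma_n \equiv 0$.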

In this linear approximation $\lambda_n$ is determined by $\lambda_h$ and we can now carry out the spatial Fourier transformations in (60). The source strength driving the heating is thus
\[ \sigma_h(k,t) = -\lambda_h(k,t)\,K_{hh}(k,0)\left[ 1 - \frac{|K_{nh}(k,0)|^2}{K_{hh}(k,0)\,K_{nn}(k,0)} \right] , \tag{61} \]
where the t=0 values in the covariance functions refer to equal times. For Hermitian operators the covariance functions satisfy a Schwarz inequality, so that the ratio in square brackets in this last expression is always less than or equal to unity; hence the driving strength is generally reduced by the no-driving constraint on n(x,t).
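
The elimination of the multiplier for n that leads from (60) to (61) can be written out as a short worked step. It assumes that the spatial integrals in (60) are convolutions, so that they become products after Fourier transformation, and that the cross covariances are complex conjugates of one another for Hermitian operators; all covariance functions below are evaluated at (k,0).

```latex
\begin{align*}
0 &= -\lambda_h K_{nh} - \lambda_n K_{nn}
  \quad\Longrightarrow\quad
  \lambda_n = -\lambda_h\,\frac{K_{nh}}{K_{nn}} , \\
\sigma_h &= -\lambda_h K_{hh} - \lambda_n K_{hn}
  = -\lambda_h K_{hh}\left[ 1 - \frac{K_{hn}K_{nh}}{K_{hh}K_{nn}} \right] , \\
K_{hn} &= K_{nh}^{*}
  \quad\Longrightarrow\quad
  K_{hn}K_{nh} = |K_{nh}|^2 .
\end{align*}
```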

The expression (61) is somewhat awkward as it stands, so it's convenient to introduce a new variable, or operator,
\[ h'(k,t) \equiv h(k,t) - \frac{K_{nh}(k,0)}{K_{nn}(k,0)}\, n(k,t) . \tag{62} \]
Some algebra then yields in place of (61)
\[ \sigma_h(k,t) = -\lambda_h(k,t)\,K_{h'h'}(k,0) . \tag{63} \]
In the linear case, at least, it is actually $h'$ that is the driven variable under the constraint that n is not driven, and the source term has been renormalized.
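
The algebra connecting (61) to (63) can be checked numerically with a classical stand-in. In the sketch below the covariance functions of the text are replaced by ordinary sample covariances of complex random numbers representing a single Fourier mode. That replacement, the synthetic data, and the helper names `cov` and `renormalized_variable` are assumptions made for illustration only; nothing here reproduces the quantum covariance functions themselves.

```python
import numpy as np

def cov(a, b):
    """Sample covariance <a* b> of two complex arrays after removing means."""
    a = a - a.mean()
    b = b - b.mean()
    return np.mean(np.conj(a) * b)

def renormalized_variable(h, n):
    """Classical analogue of Eq. (62): h' = h - (K_nh / K_nn) n."""
    K_nh = cov(n, h)
    K_nn = cov(n, n).real
    return h - (K_nh / K_nn) * n

# Synthetic correlated samples (assumption): h contains a piece proportional to n.
rng = np.random.default_rng(0)
size = 5000
n = rng.normal(size=size) + 1j * rng.normal(size=size)
h = 0.6 * n + rng.normal(size=size) + 1j * rng.normal(size=size)

h_prime = renormalized_variable(h, n)

K_hh = cov(h, h).real
K_nn = cov(n, n).real
K_nh = cov(n, h)

# Square bracket of Eq. (61); bounded by the Schwarz inequality.
bracket = 1.0 - abs(K_nh) ** 2 / (K_hh * K_nn)

# Variance of h' should equal K_hh times the bracket, as Eq. (63) requires.
K_hprime = cov(h_prime, h_prime).real
# By construction h' should carry no correlation with n.
K_n_hprime = cov(n, h_prime)
```

In this analogue the subtraction in (62) removes from h exactly the part correlated with n, which is one way to read the statement that the source term has been renormalized.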

With (63) the two expectation values of interest are
áh¢(k,t)ñt-áh¢ñ0
= ó
õ
t

0 
sh(k,t¢)  Kh¢h¢(k,t-t¢)

Kh