<?xml version="1.0" encoding="UTF-8" standalone="yes"?><oembed><version><![CDATA[1.0]]></version><provider_name><![CDATA[Azimuth]]></provider_name><provider_url><![CDATA[https://johncarlosbaez.wordpress.com]]></provider_url><author_name><![CDATA[John Baez]]></author_name><author_url><![CDATA[https://johncarlosbaez.wordpress.com/author/johncarlosbaez/]]></author_url><title><![CDATA[The Mathematical Origin of&nbsp;Irreversibility]]></title><type><![CDATA[link]]></type><html><![CDATA[<p><i>guest post by <b><a href="http://aei-mpg.academia.edu/MatteoSmerlak/Papers">Matteo Smerlak</a></b></i></p>
<h3> Introduction </h3>
<p>Thermodynamical dissipation and adaptive evolution are two faces of the same Markovian coin!</p>
<p>Consider this. The <a href="http://en.wikipedia.org/wiki/Second_law_of_thermodynamics">Second Law of Thermodynamics</a> states that the entropy of an isolated thermodynamic system can never decrease; <a href="http://en.wikipedia.org/wiki/Landauer%27s_principle">Landauer&#8217;s principle</a> maintains that the erasure of information inevitably causes dissipation; <a href="http://en.wikipedia.org/wiki/Fisher%27s_fundamental_theorem_of_natural_selection">Fisher&#8217;s fundamental theorem of natural selection</a> asserts that any fitness difference within a population leads to adaptation in an evolution process governed by natural selection. Diverse as they are, these statements have two common characteristics: </p>
<p>1. they express the <i>irreversibility</i> of certain natural phenomena, and </p>
<p>2. the dynamical processes underlying these phenomena involve an element of <i>randomness</i>. </p>
<p>Doesn&#8217;t this suggest to you the following question: Could it be that thermal phenomena, forgetful information processing and adaptive evolution are governed by <i>the same stochastic mechanism?</i> </p>
<p>The answer is: yes! The key to this rather profound connection resides in a universal property of <a href="http://en.wikipedia.org/wiki/Markov_process">Markov processes</a> discovered recently in the context of non-equilibrium statistical mechanics, and known as the <a href="http://en.wikipedia.org/wiki/Fluctuation_theorem">&#8216;fluctuation theorem&#8217;</a>. Typically stated in terms of &#8216;dissipated work&#8217; or &#8216;entropy production&#8217;, this result can be seen as an extension of the Second Law of Thermodynamics to <i>small</i> systems, where thermal fluctuations cannot be neglected. But <i>it is actually much more than this</i>: it is the mathematical underpinning of irreversibility itself, be it thermodynamical, evolutionary, or otherwise. To make this point clear, let me start by giving a general formulation of the fluctuation theorem that makes no reference to physics concepts such as &#8216;heat&#8217; or &#8216;work&#8217;.</p>
<h3> The mathematical fact </h3>
<p>Consider a system randomly jumping between states <img src='https://s0.wp.com/latex.php?latex=a%2C+b%2C%5Cdots&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a, b,&#92;dots' title='a, b,&#92;dots' class='latex' /> with (possibly time-dependent) transition rates <img src='https://s0.wp.com/latex.php?latex=%5Cgamma_%7Ba+b%7D%28t%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;gamma_{a b}(t)' title='&#92;gamma_{a b}(t)' class='latex' /> where <img src='https://s0.wp.com/latex.php?latex=a&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a' title='a' class='latex' /> is the state prior to the jump, while <img src='https://s0.wp.com/latex.php?latex=b&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='b' title='b' class='latex' /> is the state after the jump. I&#8217;ll assume that this dynamics defines a (continuous-time) Markov process, namely that the numbers <img src='https://s0.wp.com/latex.php?latex=%5Cgamma_%7Ba+b%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;gamma_{a b}' title='&#92;gamma_{a b}' class='latex' /> are the matrix entries of an <a href="http://math.ucr.edu/home/baez/networks/networks_20.html">infinitesimal stochastic</a> matrix, which means that its off-diagonal entries are non-negative and that its columns sum up to zero. </p>
<p>Now, each possible history <img src='https://s0.wp.com/latex.php?latex=%5Comega%3D%28%5Comega_t%29_%7B0%5Cleq+t%5Cleq+T%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;omega=(&#92;omega_t)_{0&#92;leq t&#92;leq T}' title='&#92;omega=(&#92;omega_t)_{0&#92;leq t&#92;leq T}' class='latex' /> of this process can be characterized by the sequence of occupied states <img src='https://s0.wp.com/latex.php?latex=a_%7Bj%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a_{j}' title='a_{j}' class='latex' /> and by the times <img src='https://s0.wp.com/latex.php?latex=%5Ctau_%7Bj%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;tau_{j}' title='&#92;tau_{j}' class='latex' /> at which the transitions <img src='https://s0.wp.com/latex.php?latex=a_%7Bj-1%7D%5Clongrightarrow+a_%7Bj%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a_{j-1}&#92;longrightarrow a_{j}' title='a_{j-1}&#92;longrightarrow a_{j}' class='latex' /> occur <img src='https://s0.wp.com/latex.php?latex=%281%5Cleq+j%5Cleq+N%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='(1&#92;leq j&#92;leq N)' title='(1&#92;leq j&#92;leq N)' class='latex' />:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Comega%3D%28%5Comega_%7B0%7D%3Da_%7B0%7D%5Coverset%7B%5Ctau_%7B1%7D%7D%7B%5Clongrightarrow%7D+a_%7B1%7D+%5Coverset%7B%5Ctau_%7B2%7D%7D%7B%5Clongrightarrow%7D%5Ccdots+%5Coverset%7B%5Ctau_%7BN%7D%7D%7B%5Clongrightarrow%7D+a_%7BN%7D%3D%5Comega_%7BT%7D%29.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;omega=(&#92;omega_{0}=a_{0}&#92;overset{&#92;tau_{1}}{&#92;longrightarrow} a_{1} &#92;overset{&#92;tau_{2}}{&#92;longrightarrow}&#92;cdots &#92;overset{&#92;tau_{N}}{&#92;longrightarrow} a_{N}=&#92;omega_{T}).' title='&#92;omega=(&#92;omega_{0}=a_{0}&#92;overset{&#92;tau_{1}}{&#92;longrightarrow} a_{1} &#92;overset{&#92;tau_{2}}{&#92;longrightarrow}&#92;cdots &#92;overset{&#92;tau_{N}}{&#92;longrightarrow} a_{N}=&#92;omega_{T}).' class='latex' /></p>
<p>Define the <b>skewness</b> <img src='https://s0.wp.com/latex.php?latex=%5Csigma_%7Bj%7D%28%5Ctau_%7Bj%7D%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;sigma_{j}(&#92;tau_{j})' title='&#92;sigma_{j}(&#92;tau_{j})' class='latex' /> of each of these transitions to be the logarithmic ratio of transition rates:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B%5Csigma_%7Bj%7D%28%5Ctau_%7Bj%7D%29%3A%3D%5Cln%5Cfrac%7B%5Cgamma_%7Ba_%7Bj%7Da_%7Bj-1%7D%7D%28%5Ctau_%7Bj%7D%29%7D%7B%5Cgamma_%7Ba_%7Bj-1%7Da_%7Bj%7D%7D%28%5Ctau_%7Bj%7D%29%7D%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{&#92;sigma_{j}(&#92;tau_{j}):=&#92;ln&#92;frac{&#92;gamma_{a_{j}a_{j-1}}(&#92;tau_{j})}{&#92;gamma_{a_{j-1}a_{j}}(&#92;tau_{j})}}' title='&#92;displaystyle{&#92;sigma_{j}(&#92;tau_{j}):=&#92;ln&#92;frac{&#92;gamma_{a_{j}a_{j-1}}(&#92;tau_{j})}{&#92;gamma_{a_{j-1}a_{j}}(&#92;tau_{j})}}' class='latex' /></p>
<p>Also define the <a href="http://en.wikipedia.org/wiki/Self-information"><b>self-information</b></a> of the system in state <img src='https://s0.wp.com/latex.php?latex=a&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a' title='a' class='latex' /> at time <img src='https://s0.wp.com/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='t' title='t' class='latex' /> by:</p>
<p><img src='https://s0.wp.com/latex.php?latex=i_a%28t%29%3A%3D+-%5Cln%5Cpi_%7Ba%7D%28t%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i_a(t):= -&#92;ln&#92;pi_{a}(t)' title='i_a(t):= -&#92;ln&#92;pi_{a}(t)' class='latex' /></p>
<p>where <img src='https://s0.wp.com/latex.php?latex=%5Cpi_%7Ba%7D%28t%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;pi_{a}(t)' title='&#92;pi_{a}(t)' class='latex' /> is the probability that the system is in state <img src='https://s0.wp.com/latex.php?latex=a&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a' title='a' class='latex' /> at time <img src='https://s0.wp.com/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='t' title='t' class='latex' />, given some prescribed initial distribution <img src='https://s0.wp.com/latex.php?latex=%5Cpi_%7Ba%7D%280%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;pi_{a}(0)' title='&#92;pi_{a}(0)' class='latex' />.  This quantity is also sometimes called the <b>surprisal</b>, as it measures the &#8216;surprise&#8217; of finding out that the system is in state <img src='https://s0.wp.com/latex.php?latex=a&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a' title='a' class='latex' /> at time <img src='https://s0.wp.com/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='t' title='t' class='latex' />.</p>
<p>Then the following identity&#8212;the <b>detailed fluctuation theorem</b>&#8212;holds:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cmathrm%7BProb%7D%5B%5CDelta+i-%5CSigma%3D-A%5D+%3D+e%5E%7B-A%7D%5C%3B%5Cmathrm%7BProb%7D%5B%5CDelta+i-%5CSigma%3DA%5D+%5C%3B&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;mathrm{Prob}[&#92;Delta i-&#92;Sigma=-A] = e^{-A}&#92;;&#92;mathrm{Prob}[&#92;Delta i-&#92;Sigma=A] &#92;;' title='&#92;mathrm{Prob}[&#92;Delta i-&#92;Sigma=-A] = e^{-A}&#92;;&#92;mathrm{Prob}[&#92;Delta i-&#92;Sigma=A] &#92;;' class='latex' /></p>
<p>where </p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B%5CSigma%3A%3D%5Csum_%7Bj%7D%5Csigma_%7Bj%7D%28%5Ctau_%7Bj%7D%29%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{&#92;Sigma:=&#92;sum_{j}&#92;sigma_{j}(&#92;tau_{j})}' title='&#92;displaystyle{&#92;Sigma:=&#92;sum_{j}&#92;sigma_{j}(&#92;tau_{j})}' class='latex' /></p>
<p>is the <b>cumulative skewness</b> along a trajectory of the system, and </p>
<p><img src='https://s0.wp.com/latex.php?latex=%5CDelta+i%3D+i_%7Ba_N%7D%28T%29-i_%7Ba_0%7D%280%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;Delta i= i_{a_N}(T)-i_{a_0}(0)' title='&#92;Delta i= i_{a_N}(T)-i_{a_0}(0)' class='latex' /> </p>
<p>is the <b>variation of self-information</b> between the end points of this trajectory.  </p>
<p>This identity has an immediate consequence: if <img src='https://s0.wp.com/latex.php?latex=%5Clangle%5C%2C%5Ccdot%5C%2C%5Crangle&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;langle&#92;,&#92;cdot&#92;,&#92;rangle' title='&#92;langle&#92;,&#92;cdot&#92;,&#92;rangle' class='latex' /> denotes the average over all realizations of the process, then we have the <b>integral fluctuation theorem</b>: </p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Clangle+e%5E%7B-%5CDelta+i%2B%5CSigma%7D%5Crangle%3D1%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;langle e^{-&#92;Delta i+&#92;Sigma}&#92;rangle=1,' title='&#92;langle e^{-&#92;Delta i+&#92;Sigma}&#92;rangle=1,' class='latex' /></p>
<p>which, by the convexity of the exponential and <a href="http://en.wikipedia.org/wiki/Jensen%27s_inequality">Jensen&#8217;s inequality</a>, implies:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Clangle+%5CDelta+i%5Crangle%3D%5CDelta+S%5Cgeq%5Clangle%5CSigma%5Crangle.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;langle &#92;Delta i&#92;rangle=&#92;Delta S&#92;geq&#92;langle&#92;Sigma&#92;rangle.' title='&#92;langle &#92;Delta i&#92;rangle=&#92;Delta S&#92;geq&#92;langle&#92;Sigma&#92;rangle.' class='latex' /></p>
<p>In short: <i>the mean variation of self-information, aka the variation of Shannon entropy</i> </p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+S%28t%29%3A%3D+%5Csum_%7Ba%7D%5Cpi_%7Ba%7D%28t%29i_a%28t%29+%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ S(t):= &#92;sum_{a}&#92;pi_{a}(t)i_a(t) }' title='&#92;displaystyle{ S(t):= &#92;sum_{a}&#92;pi_{a}(t)i_a(t) }' class='latex' /></p>
<p><i>is bounded from below by the mean cumulative skewness of the underlying stochastic trajectory.</i>  </p>
<p>This is the fundamental mathematical fact underlying irreversibility. To unravel its physical and biological consequences, it suffices to consider the origin and interpretation of the &#8216;skewness&#8217; term in different contexts. (By the way, people usually call <img src='https://s0.wp.com/latex.php?latex=%5CSigma&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;Sigma' title='&#92;Sigma' class='latex' /> the &#8216;entropy production&#8217; or &#8216;dissipation function&#8217;&#8212;but how tautological is that?)</p>
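<p>As a sanity check, here is a minimal Python sketch that verifies the integral fluctuation theorem and the resulting entropy bound by brute-force enumeration of paths. It rests on assumptions that are not part of the text above: it uses a discrete-time analogue of the process (a fixed one-step transition matrix <code>P</code> in place of the rates, with the skewness of a step taken to be the log ratio of the reverse and forward one-step probabilities), and the matrix <code>P</code>, the initial distribution <code>pi0</code> and the number of steps <code>N</code> are made-up illustrative values.</p>

```python
import itertools
import math

# Hypothetical 3-state chain (assumption): P[a][b] is the one-step
# probability of jumping from state a to state b. Every entry is positive,
# so every path and its time reverse have nonzero probability.
P = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.4, 0.1, 0.5],
]
pi0 = [0.7, 0.2, 0.1]   # prescribed initial distribution (made up)
N = 4                   # number of time steps (made up)


def evolve(pi, steps):
    """Push a distribution forward through `steps` applications of P."""
    n = len(pi)
    for _ in range(steps):
        pi = [sum(pi[a] * P[a][b] for a in range(n)) for b in range(n)]
    return pi


def entropy(pi):
    """Shannon entropy, i.e. the mean self-information."""
    return -sum(p * math.log(p) for p in pi)


piT = evolve(pi0, N)

ift = 0.0          # accumulates the average of exp(-Delta i + Sigma)
mean_sigma = 0.0   # accumulates the average of Sigma
for path in itertools.product(range(len(P)), repeat=N + 1):
    prob = pi0[path[0]]
    sigma = 0.0
    for a, b in zip(path, path[1:]):
        prob *= P[a][b]
        sigma += math.log(P[b][a] / P[a][b])   # skewness of the step a to b
    delta_i = math.log(pi0[path[0]]) - math.log(piT[path[-1]])
    ift += prob * math.exp(-delta_i + sigma)
    mean_sigma += prob * sigma

delta_S = entropy(piT) - entropy(pi0)
print(ift, delta_S, mean_sigma)
```

<p>Up to floating-point rounding, the first printed number should be 1, and the second should be no smaller than the third.</p>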
<h3> The physical and biological consequences </h3>
<p>Consider first the standard stochastic-thermodynamic scenario where a physical system is kept in contact with a thermal reservoir at inverse temperature <img src='https://s0.wp.com/latex.php?latex=%5Cbeta&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;beta' title='&#92;beta' class='latex' /> and undergoes thermally induced transitions between states <img src='https://s0.wp.com/latex.php?latex=a%2C+b%2C%5Cdots&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a, b,&#92;dots' title='a, b,&#92;dots' class='latex' />. By virtue of the <a href="http://en.wikipedia.org/wiki/Detailed_balance"><b>detailed balance condition</b></a>:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+e%5E%7B-%5Cbeta+E_%7Ba%7D%28t%29%7D%5Cgamma_%7Ba+b%7D%28t%29%3De%5E%7B-%5Cbeta+E_%7Bb%7D%28t%29%7D%5Cgamma_%7Bb+a%7D%28t%29%2C%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ e^{-&#92;beta E_{a}(t)}&#92;gamma_{a b}(t)=e^{-&#92;beta E_{b}(t)}&#92;gamma_{b a}(t),}' title='&#92;displaystyle{ e^{-&#92;beta E_{a}(t)}&#92;gamma_{a b}(t)=e^{-&#92;beta E_{b}(t)}&#92;gamma_{b a}(t),}' class='latex' /></p>
<p>the skewness <img src='https://s0.wp.com/latex.php?latex=%5Csigma_%7Bj%7D%28%5Ctau_%7Bj%7D%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;sigma_{j}(&#92;tau_{j})' title='&#92;sigma_{j}(&#92;tau_{j})' class='latex' /> of each such transition is <img src='https://s0.wp.com/latex.php?latex=%5Cbeta+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;beta ' title='&#92;beta ' class='latex' /> times the energy difference between the states <img src='https://s0.wp.com/latex.php?latex=a_%7Bj%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a_{j}' title='a_{j}' class='latex' /> and <img src='https://s0.wp.com/latex.php?latex=a_%7Bj-1%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a_{j-1}' title='a_{j-1}' class='latex' />, namely the <i>heat</i> received from the reservoir during the transition. Hence, the mean cumulative skewness <img src='https://s0.wp.com/latex.php?latex=%5Clangle+%5CSigma%5Crangle&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;langle &#92;Sigma&#92;rangle' title='&#92;langle &#92;Sigma&#92;rangle' class='latex' /> is nothing but <img src='https://s0.wp.com/latex.php?latex=%5Cbeta%5Clangle+Q%5Crangle%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;beta&#92;langle Q&#92;rangle,' title='&#92;beta&#92;langle Q&#92;rangle,' class='latex' /> with <img src='https://s0.wp.com/latex.php?latex=Q&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='Q' title='Q' class='latex' /> the total heat received by the system along the process. It follows from the detailed fluctuation theorem that </p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Clangle+e%5E%7B-%5CDelta+i%2B%5Cbeta+Q%7D%5Crangle%3D1&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;langle e^{-&#92;Delta i+&#92;beta Q}&#92;rangle=1' title='&#92;langle e^{-&#92;Delta i+&#92;beta Q}&#92;rangle=1' class='latex' /></p>
<p>and therefore </p>
<p><img src='https://s0.wp.com/latex.php?latex=%5CDelta+S%5Cgeq%5Cbeta%5Clangle+Q%5Crangle&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;Delta S&#92;geq&#92;beta&#92;langle Q&#92;rangle' title='&#92;Delta S&#92;geq&#92;beta&#92;langle Q&#92;rangle' class='latex' /> </p>
<p>which is of course <a href="http://en.wikipedia.org/wiki/Clausius_theorem">Clausius&#8217; inequality</a>. In a computational context where the control parameter is the entropy variation itself (such as in a bit-erasure protocol, where <img src='https://s0.wp.com/latex.php?latex=%5CDelta+S%3D-%5Cln+2&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;Delta S=-&#92;ln 2' title='&#92;Delta S=-&#92;ln 2' class='latex' />), this inequality in turn expresses Landauer&#8217;s principle: it is impossible to decrease the self-information of the system&#8217;s state without dissipating a minimum amount of heat into the environment (in this case <img src='https://s0.wp.com/latex.php?latex=-Q+%5Cgeq+k+T%5Cln2&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='-Q &#92;geq k T&#92;ln2' title='-Q &#92;geq k T&#92;ln2' class='latex' />, the <a href="http://en.wikipedia.org/wiki/Landauer%27s_principle">&#8216;Landauer bound&#8217;</a>). More general situations (several types of reservoirs, <a href="http://en.wikipedia.org/wiki/Maxwell_demon">Maxwell-demon</a>-like feedback controls) can be treated along the same lines, and the various forms of the Second Law derived from the detailed fluctuation theorem. </p>
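<p>To make the thermal case concrete, here is a small Python sketch. It assumes one particular (hypothetical) family of rates obeying detailed balance, made-up energy levels <code>E</code> and inverse temperature <code>beta</code>, and a temperature of 300 K for the Landauer bound; none of these choices comes from the argument above. It checks that the skewness of a jump equals the inverse temperature times the energy gained, and evaluates the bound for erasing one bit.</p>

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)


def rate(E, a, b, beta):
    """Hypothetical rate for the jump a to b (assumption).

    One of many choices satisfying the detailed balance condition
    exp(-beta*E[a]) * rate(a, b) == exp(-beta*E[b]) * rate(b, a).
    """
    return math.exp(-0.5 * beta * (E[b] - E[a]))


def skewness(E, a, b, beta):
    """Log ratio of the reverse rate (b to a) to the forward rate (a to b)."""
    return math.log(rate(E, b, a, beta) / rate(E, a, b, beta))


E = [0.0, 1.5, 0.7]   # made-up energy levels
beta = 2.0            # made-up inverse temperature, in matching units

# Skewness of the jump 0 to 1 versus beta times the energy gained in it.
sigma_01 = skewness(E, 0, 1, beta)
heat_01 = beta * (E[1] - E[0])

# Landauer bound k T ln 2 for one bit at an assumed temperature of 300 K.
landauer_joules = K_B * 300.0 * math.log(2)
print(sigma_01, heat_01, landauer_joules)
```

<p>The first two printed numbers should agree, and the last one comes out at roughly 2.87e-21 joules.</p>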
<p>Now, many would agree that evolutionary dynamics is a wholly different business from thermodynamics; in particular, notions such as &#8216;heat&#8217; or &#8216;temperature&#8217; are clearly irrelevant to Darwinian evolution. However, the stochastic framework of Markov processes <i>is</i> relevant to describe the genetic evolution of a population, and this fact alone has important consequences. As a simple example, consider the time evolution of mutant fixations <img src='https://s0.wp.com/latex.php?latex=x_%7Ba%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='x_{a}' title='x_{a}' class='latex' /> in a population, with <img src='https://s0.wp.com/latex.php?latex=a&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a' title='a' class='latex' /> ranging over the possible genotypes. In a &#8216;symmetric mutation scheme&#8217;, which I understand is biological parlance for &#8216;reversible Markov process&#8217;, meaning one that obeys <a href="http://en.wikipedia.org/wiki/Detailed_balance">detailed balance</a>, the ratio between the <img src='https://s0.wp.com/latex.php?latex=a%5Cmapsto+b&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a&#92;mapsto b' title='a&#92;mapsto b' class='latex' /> and <img src='https://s0.wp.com/latex.php?latex=b%5Cmapsto+a&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='b&#92;mapsto a' title='b&#92;mapsto a' class='latex' /> transition rates is completely determined by the <a href="http://en.wikipedia.org/wiki/Fitness_landscape"><b>fitnesses</b></a> <img src='https://s0.wp.com/latex.php?latex=f_%7Ba%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='f_{a}' title='f_{a}' class='latex' /> and <img src='https://s0.wp.com/latex.php?latex=f_b&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='f_b' title='f_b' class='latex' /> of <img src='https://s0.wp.com/latex.php?latex=a&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a' title='a' class='latex' /> and <img src='https://s0.wp.com/latex.php?latex=b&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='b' title='b' class='latex' />, according to 
</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B%5Cfrac%7B%5Cgamma_%7Ba+b%7D%7D%7B%5Cgamma_%7Bb+a%7D%7D+%3D%5Cleft%28%5Cfrac%7Bf_%7Bb%7D%7D%7Bf_%7Ba%7D%7D%5Cright%29%5E%7B%5Cnu%7D+%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{&#92;frac{&#92;gamma_{a b}}{&#92;gamma_{b a}} =&#92;left(&#92;frac{f_{b}}{f_{a}}&#92;right)^{&#92;nu} }' title='&#92;displaystyle{&#92;frac{&#92;gamma_{a b}}{&#92;gamma_{b a}} =&#92;left(&#92;frac{f_{b}}{f_{a}}&#92;right)^{&#92;nu} }' class='latex' /></p>
<p>where <img src='https://s0.wp.com/latex.php?latex=%5Cnu&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;nu' title='&#92;nu' class='latex' /> is a model-dependent function of the effective population size [Sella2005]. Along a given history of mutant fixations, the cumulative skewness <img src='https://s0.wp.com/latex.php?latex=%5CSigma&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;Sigma' title='&#92;Sigma' class='latex' /> is therefore given by minus the <b>fitness flux</b>: </p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B%5CPhi%3D%5Cnu%5Csum_%7Bj%7D%28%5Cln+f_%7Ba_j%7D-%5Cln+f_%7Ba_%7Bj-1%7D%7D%29.%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{&#92;Phi=&#92;nu&#92;sum_{j}(&#92;ln f_{a_j}-&#92;ln f_{a_{j-1}}).}' title='&#92;displaystyle{&#92;Phi=&#92;nu&#92;sum_{j}(&#92;ln f_{a_j}-&#92;ln f_{a_{j-1}}).}' class='latex' /></p>
<p>The integral fluctuation theorem then becomes the <b>fitness flux theorem</b>: </p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Clangle+e%5E%7B-%5CDelta+i+-%5CPhi%7D%5Crangle%3D1%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;langle e^{-&#92;Delta i -&#92;Phi}&#92;rangle=1}' title='&#92;displaystyle{ &#92;langle e^{-&#92;Delta i -&#92;Phi}&#92;rangle=1}' class='latex' /></p>
<p>discussed recently by Mustonen and L&auml;ssig [Mustonen2010] and implying Fisher&#8217;s fundamental theorem of natural selection as a special case. (Incidentally, the &#8216;fitness flux theorem&#8217; derived in this reference is more general than this; for instance, it does not rely on the &#8216;symmetric mutation scheme&#8217; assumption above.) The ensuing inequality </p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Clangle+%5CPhi%5Crangle%5Cgeq-%5CDelta+S+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;langle &#92;Phi&#92;rangle&#92;geq-&#92;Delta S ' title='&#92;langle &#92;Phi&#92;rangle&#92;geq-&#92;Delta S ' class='latex' /> </p>
<p>shows that a positive fitness flux is &#8220;an almost universal evolutionary principle of biological systems&#8221; [Mustonen2010], with negative contributions limited to time intervals with a systematic loss of adaptation (<img src='https://s0.wp.com/latex.php?latex=%5CDelta+S+%3E+0&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;Delta S &gt; 0' title='&#92;Delta S &gt; 0' class='latex' />). This statement may well be the closest thing to a version of the Second Law of Thermodynamics applying to evolutionary dynamics. </p>
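<p>The relation between cumulative skewness and fitness flux can be checked on a toy example. The Python sketch below is only illustrative: the genotype labels, the fitness values <code>f</code>, the exponent <code>nu</code>, the history, and the specific form of <code>rate</code> are all assumptions, chosen so that the ratio of forward to backward rates equals the fitness ratio raised to the power <code>nu</code>.</p>

```python
import math

f = {"A": 1.0, "B": 1.2, "C": 0.9}   # made-up genotype fitnesses
nu = 50.0                            # made-up value of the exponent nu


def rate(a, b):
    """Hypothetical substitution rate from a to b (assumption).

    Chosen so that rate(a, b) / rate(b, a) == (f[b] / f[a]) ** nu.
    """
    return (f[b] / f[a]) ** (nu / 2)


history = ["A", "B", "C", "B"]       # one possible sequence of fixations

steps = list(zip(history, history[1:]))
# Cumulative skewness: log ratio of reverse to forward rates, summed over jumps.
Sigma = sum(math.log(rate(b, a) / rate(a, b)) for a, b in steps)
# Fitness flux: nu times the summed increments of log-fitness.
Phi = nu * sum(math.log(f[b]) - math.log(f[a]) for a, b in steps)
print(Sigma, Phi)
```

<p>The two printed numbers should be opposite to each other. Since the sum telescopes, the fitness flux of this history depends only on its first and last genotypes.</p>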
<p>It is really quite remarkable that thermodynamical dissipation and Darwinian evolution can be reduced to the same stochastic mechanism, and that notions such as &#8216;fitness flux&#8217; and &#8216;heat&#8217; can arise as two faces of the same mathematical coin, namely the &#8216;skewness&#8217; of Markovian transitions. After all, the phenomenon of life is in itself a direct challenge to thermodynamics, isn&#8217;t it? Whereas thermal phenomena tend to increase the world&#8217;s disorder, life strives to bring about and maintain exquisitely fine spatial and chemical structures, which is why Schr&ouml;dinger famously proposed to <i>define</i> life as <i>negative entropy</i>. Could there be a more striking confirmation of his intuition, and a reconciliation of evolution and thermodynamics in one go, than the fundamental inequality of adaptive evolution <img src='https://s0.wp.com/latex.php?latex=%5Clangle%5CPhi%5Crangle%5Cgeq-%5CDelta+S&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;langle&#92;Phi&#92;rangle&#92;geq-&#92;Delta S' title='&#92;langle&#92;Phi&#92;rangle&#92;geq-&#92;Delta S' class='latex' />?</p>
<p>Surely the detailed fluctuation theorem for Markov processes has other applications, pertaining neither to thermodynamics nor adaptive evolution. Can you think of any?</p>
<h3> Proof of the fluctuation theorem </h3>
<p>I am a physicist, but knowing that many readers of John&#8217;s blog are mathematicians, I&#8217;ll do my best to frame, and prove, the fluctuation theorem as an actual theorem. </p>
<p>Let <img src='https://s0.wp.com/latex.php?latex=%28%5COmega%2C%5Cmathcal%7BT%7D%2Cp%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='(&#92;Omega,&#92;mathcal{T},p)' title='(&#92;Omega,&#92;mathcal{T},p)' class='latex' /> be a probability space and <img src='https://s0.wp.com/latex.php?latex=%28%5C%2C%5Ccdot%5C%2C%29%5E%7B%5Cdagger%7D%3A%5COmega%5Cto+%5COmega&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='(&#92;,&#92;cdot&#92;,)^{&#92;dagger}:&#92;Omega&#92;to &#92;Omega' title='(&#92;,&#92;cdot&#92;,)^{&#92;dagger}:&#92;Omega&#92;to &#92;Omega' class='latex' /> a measurable involution of <img src='https://s0.wp.com/latex.php?latex=%5COmega&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;Omega' title='&#92;Omega' class='latex' />. Denote by <img src='https://s0.wp.com/latex.php?latex=p%5E%7B%5Cdagger%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p^{&#92;dagger}' title='p^{&#92;dagger}' class='latex' /> the pushforward probability measure through this involution, and by </p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+R%3D%5Cln+%5Cfrac%7Bd+p%7D%7Bd+p%5E%5Cdagger%7D+%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ R=&#92;ln &#92;frac{d p}{d p^&#92;dagger} }' title='&#92;displaystyle{ R=&#92;ln &#92;frac{d p}{d p^&#92;dagger} }' class='latex' /></p>
<p>the logarithm of the corresponding Radon-Nikodym derivative (we assume <img src='https://s0.wp.com/latex.php?latex=p%5E%5Cdagger&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p^&#92;dagger' title='p^&#92;dagger' class='latex' /> and <img src='https://s0.wp.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p' title='p' class='latex' /> are mutually absolutely continuous). Then the following lemmas are true, with <img src='https://s0.wp.com/latex.php?latex=%281%29%5CRightarrow%282%29%5CRightarrow%283%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='(1)&#92;Rightarrow(2)&#92;Rightarrow(3)' title='(1)&#92;Rightarrow(2)&#92;Rightarrow(3)' class='latex' />:</p>
<p><b>Lemma 1.</b> The detailed fluctuation relation:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cforall+A%5Cin%5Cmathbb%7BR%7D+%5Cquad++p%5Cbig%28R%5E%7B-1%7D%28-A%29+%5Cbig%29%3De%5E%7B-A%7Dp+%5Cbig%28R%5E%7B-1%7D%28A%29+%5Cbig%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;forall A&#92;in&#92;mathbb{R} &#92;quad  p&#92;big(R^{-1}(-A) &#92;big)=e^{-A}p &#92;big(R^{-1}(A) &#92;big)' title='&#92;forall A&#92;in&#92;mathbb{R} &#92;quad  p&#92;big(R^{-1}(-A) &#92;big)=e^{-A}p &#92;big(R^{-1}(A) &#92;big)' class='latex' /></p>
<p><b>Lemma 2.</b>  The integral fluctuation relation:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B%5Cint_%7B%5COmega%7D+d+p%28%5Comega%29%5C%2Ce%5E%7B-R%28%5Comega%29%7D%3D1+%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{&#92;int_{&#92;Omega} d p(&#92;omega)&#92;,e^{-R(&#92;omega)}=1 }' title='&#92;displaystyle{&#92;int_{&#92;Omega} d p(&#92;omega)&#92;,e^{-R(&#92;omega)}=1 }' class='latex' /></p>
<p><b>Lemma 3.</b>  The non-negativity of the Kullback-Leibler divergence:</p>
<p><img src='https://s0.wp.com/latex.php?latex=D%28p%5C%2C%5CVert%5C%2C+p%5E%7B%5Cdagger%7D%29%3A%3D%5Cint_%7B%5COmega%7D+d+p%28%5Comega%29%5C%2CR%28%5Comega%29%5Cgeq+0.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='D(p&#92;,&#92;Vert&#92;, p^{&#92;dagger}):=&#92;int_{&#92;Omega} d p(&#92;omega)&#92;,R(&#92;omega)&#92;geq 0.' title='D(p&#92;,&#92;Vert&#92;, p^{&#92;dagger}):=&#92;int_{&#92;Omega} d p(&#92;omega)&#92;,R(&#92;omega)&#92;geq 0.' class='latex' /></p>
<p>These are basic facts which anyone can show: <img src='https://s0.wp.com/latex.php?latex=%282%29%5CRightarrow%283%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='(2)&#92;Rightarrow(3)' title='(2)&#92;Rightarrow(3)' class='latex' /> by Jensen&#8217;s inequality, <img src='https://s0.wp.com/latex.php?latex=%281%29%5CRightarrow%282%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='(1)&#92;Rightarrow(2)' title='(1)&#92;Rightarrow(2)' class='latex' /> trivially, and <img src='https://s0.wp.com/latex.php?latex=%281%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='(1)' title='(1)' class='latex' /> follows from <img src='https://s0.wp.com/latex.php?latex=R%28%5Comega%5E%7B%5Cdagger%7D%29%3D-R%28%5Comega%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='R(&#92;omega^{&#92;dagger})=-R(&#92;omega)' title='R(&#92;omega^{&#92;dagger})=-R(&#92;omega)' class='latex' /> and the change of variables theorem, as follows,</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cbegin%7Barray%7D%7Bccl%7D+%5Cdisplaystyle%7B+%5Cint_%7BR%5E%7B-1%7D%28-A%29%7D+d+p%28%5Comega%29%7D+%26%3D%26+%5Cdisplaystyle%7B+%5Cint_%7BR%5E%7B-1%7D%28A%29%7Dd+p%5E%7B%5Cdagger%7D%28%5Comega%29+%7D+%5C%5C+%5C%5C+%26%3D%26+%5Cdisplaystyle%7B+%5Cint_%7BR%5E%7B-1%7D%28A%29%7D+d+p%28%5Comega%29%5C%2C+e%5E%7B-R%28%5Comega%29%7D+%7D+%5C%5C+%5C%5C+%26%3D%26+%5Cdisplaystyle%7B+e%5E%7B-A%7D+%5Cint_%7BR%5E%7B-1%7D%28A%29%7D+d+p%28%5Comega%29%7D+.%5Cend%7Barray%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;begin{array}{ccl} &#92;displaystyle{ &#92;int_{R^{-1}(-A)} d p(&#92;omega)} &amp;=&amp; &#92;displaystyle{ &#92;int_{R^{-1}(A)}d p^{&#92;dagger}(&#92;omega) } &#92;&#92; &#92;&#92; &amp;=&amp; &#92;displaystyle{ &#92;int_{R^{-1}(A)} d p(&#92;omega)&#92;, e^{-R(&#92;omega)} } &#92;&#92; &#92;&#92; &amp;=&amp; &#92;displaystyle{ e^{-A} &#92;int_{R^{-1}(A)} d p(&#92;omega)} .&#92;end{array}' title='&#92;begin{array}{ccl} &#92;displaystyle{ &#92;int_{R^{-1}(-A)} d p(&#92;omega)} &amp;=&amp; &#92;displaystyle{ &#92;int_{R^{-1}(A)}d p^{&#92;dagger}(&#92;omega) } &#92;&#92; &#92;&#92; &amp;=&amp; &#92;displaystyle{ &#92;int_{R^{-1}(A)} d p(&#92;omega)&#92;, e^{-R(&#92;omega)} } &#92;&#92; &#92;&#92; &amp;=&amp; &#92;displaystyle{ e^{-A} &#92;int_{R^{-1}(A)} d p(&#92;omega)} .&#92;end{array}' class='latex' /></p>
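<p>On a finite sample space the three lemmas can be checked directly. The following Python sketch is an illustration under assumptions of my own choosing: a six-point sample space, the involution sending <code>w</code> to <code>5 - w</code>, and a made-up probability vector <code>p</code>. Exact rational arithmetic is used so that the level sets of <code>R</code> can be compared without rounding issues.</p>

```python
from collections import defaultdict
from fractions import Fraction
import math

# Made-up probability measure on the sample space {0, ..., 5} (assumption).
p = [Fraction(k, 15) for k in (1, 2, 3, 3, 4, 2)]


def dagger(w):
    """Hypothetical involution of the sample space: reflect the index."""
    return len(p) - 1 - w


# exp(R(w)) = p(w) / p_dagger(w), where p_dagger(w) = p(dagger(w)).
ratio = [p[w] / p[dagger(w)] for w in range(len(p))]

# Total probability of each level set of R, indexed by the value of exp(R).
mass = defaultdict(Fraction)
for w, r in enumerate(ratio):
    mass[r] += p[w]

# Lemma 1: the level set at -A has exp(-A) times the mass of the one at A.
detailed = all(mass[1 / r] == mass[r] / r for r in list(mass))
# Lemma 2: the average of exp(-R) equals 1.
integral = sum(p[w] / ratio[w] for w in range(len(p)))
# Lemma 3: the average of R is non-negative.
kl = sum(float(p[w]) * math.log(float(ratio[w])) for w in range(len(p)))
print(detailed, integral, kl)
```

<p>Here the average of <code>exp(-R)</code> is just the total mass of the pushforward measure, which is why it comes out as exactly 1.</p>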
<p>But here is the beauty: if </p>
<p>&bull; <img src='https://s0.wp.com/latex.php?latex=%28%5COmega%2C%5Cmathcal%7BT%7D%2Cp%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='(&#92;Omega,&#92;mathcal{T},p)' title='(&#92;Omega,&#92;mathcal{T},p)' class='latex' /> is actually a Markov process defined over some time interval <img src='https://s0.wp.com/latex.php?latex=%5B0%2CT%5D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='[0,T]' title='[0,T]' class='latex' /> and valued in some (say discrete) state space <img src='https://s0.wp.com/latex.php?latex=%5CSigma&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;Sigma' title='&#92;Sigma' class='latex' />, with the instantaneous probability <img src='https://s0.wp.com/latex.php?latex=%5Cpi_%7Ba%7D%28t%29%3Dp%5Cbig%28%5C%7B%5Comega_%7Bt%7D%3Da%5C%7D+%5Cbig%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;pi_{a}(t)=p&#92;big(&#92;{&#92;omega_{t}=a&#92;} &#92;big)' title='&#92;pi_{a}(t)=p&#92;big(&#92;{&#92;omega_{t}=a&#92;} &#92;big)' class='latex' /> of each state <img src='https://s0.wp.com/latex.php?latex=a%5Cin%5CSigma&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a&#92;in&#92;Sigma' title='a&#92;in&#92;Sigma' class='latex' /> satisfying the <b>master equation</b> (aka Kolmogorov equation)</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Cfrac%7Bd%5Cpi_%7Ba%7D%28t%29%7D%7Bdt%7D%3D%5Csum_%7Bb%5Cneq+a%7D%5CBig%28%5Cgamma_%7Bb+a%7D%28t%29%5Cpi_%7Bb%7D%28t%29-%5Cgamma_%7Ba+b%7D%28t%29%5Cpi_%7Ba%7D%28t%29%5CBig%29%2C%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;frac{d&#92;pi_{a}(t)}{dt}=&#92;sum_{b&#92;neq a}&#92;Big(&#92;gamma_{b a}(t)&#92;pi_{b}(t)-&#92;gamma_{a b}(t)&#92;pi_{a}(t)&#92;Big),} ' title='&#92;displaystyle{ &#92;frac{d&#92;pi_{a}(t)}{dt}=&#92;sum_{b&#92;neq a}&#92;Big(&#92;gamma_{b a}(t)&#92;pi_{b}(t)-&#92;gamma_{a b}(t)&#92;pi_{a}(t)&#92;Big),} ' class='latex' /></p>
<p>and</p>
<p>&bull;  the dagger involution is time-reversal, that is <img src='https://s0.wp.com/latex.php?latex=%5Comega%5E%7B%5Cdagger%7D_%7Bt%7D%3A%3D%5Comega_%7BT-t%7D%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;omega^{&#92;dagger}_{t}:=&#92;omega_{T-t},' title='&#92;omega^{&#92;dagger}_{t}:=&#92;omega_{T-t},' class='latex' /></p>
<p>then for a given path</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B%5Comega%3D%28%5Comega_%7B0%7D%3Da_%7B0%7D%5Coverset%7B%5Ctau_%7B1%7D%7D%7B%5Clongrightarrow%7D+a_%7B1%7D+%5Coverset%7B%5Ctau_%7B2%7D%7D%7B%5Clongrightarrow%7D%5Ccdots+%5Coverset%7B%5Ctau_%7BN%7D%7D%7B%5Clongrightarrow%7D+a_%7BN%7D%3D%5Comega_%7BT%7D%29%5Cin%5COmega%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{&#92;omega=(&#92;omega_{0}=a_{0}&#92;overset{&#92;tau_{1}}{&#92;longrightarrow} a_{1} &#92;overset{&#92;tau_{2}}{&#92;longrightarrow}&#92;cdots &#92;overset{&#92;tau_{N}}{&#92;longrightarrow} a_{N}=&#92;omega_{T})&#92;in&#92;Omega}' title='&#92;displaystyle{&#92;omega=(&#92;omega_{0}=a_{0}&#92;overset{&#92;tau_{1}}{&#92;longrightarrow} a_{1} &#92;overset{&#92;tau_{2}}{&#92;longrightarrow}&#92;cdots &#92;overset{&#92;tau_{N}}{&#92;longrightarrow} a_{N}=&#92;omega_{T})&#92;in&#92;Omega}' class='latex' /></p>
<p>the logarithmic ratio <img src='https://s0.wp.com/latex.php?latex=R%28%5Comega%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='R(&#92;omega)' title='R(&#92;omega)' class='latex' /> decomposes into &#8216;variation of self-information&#8217; and &#8216;cumulative skewness&#8217; along <img src='https://s0.wp.com/latex.php?latex=%5Comega&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;omega' title='&#92;omega' class='latex' />:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+R%28%5Comega%29%3D%5Cunderbrace%7B%5CBig%28%5Cln%5Cpi_%7Ba_0%7D%280%29-%5Cln%5Cpi_%7Ba_N%7D%28T%29+%5CBig%29%7D_%7B%5CDelta+i%28%5Comega%29%7D-%5Cunderbrace%7B%5Csum_%7Bj%3D1%7D%5E%7BN%7D%5Cln%5Cfrac%7B%5Cgamma_%7Ba_%7Bj%7Da_%7Bj-1%7D%7D%28%5Ctau_%7Bj%7D%29%7D%7B%5Cgamma_%7Ba_%7Bj-1%7Da_%7Bj%7D%7D%28%5Ctau_%7Bj%7D%29%7D%7D_%7B%5CSigma%28%5Comega%29%7D.%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ R(&#92;omega)=&#92;underbrace{&#92;Big(&#92;ln&#92;pi_{a_0}(0)-&#92;ln&#92;pi_{a_N}(T) &#92;Big)}_{&#92;Delta i(&#92;omega)}-&#92;underbrace{&#92;sum_{j=1}^{N}&#92;ln&#92;frac{&#92;gamma_{a_{j}a_{j-1}}(&#92;tau_{j})}{&#92;gamma_{a_{j-1}a_{j}}(&#92;tau_{j})}}_{&#92;Sigma(&#92;omega)}.}' title='&#92;displaystyle{ R(&#92;omega)=&#92;underbrace{&#92;Big(&#92;ln&#92;pi_{a_0}(0)-&#92;ln&#92;pi_{a_N}(T) &#92;Big)}_{&#92;Delta i(&#92;omega)}-&#92;underbrace{&#92;sum_{j=1}^{N}&#92;ln&#92;frac{&#92;gamma_{a_{j}a_{j-1}}(&#92;tau_{j})}{&#92;gamma_{a_{j-1}a_{j}}(&#92;tau_{j})}}_{&#92;Sigma(&#92;omega)}.}' class='latex' /></p>
<p>This is easy to see if one writes the probability of a path explicitly as</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7Bp%28%5Comega%29%3D%5Cpi_%7Ba_%7B0%7D%7D%280%29%5Cleft%5B%5Cprod_%7Bj%3D1%7D%5E%7BN%7D%5Cphi_%7Ba_%7Bj-1%7D%7D%28%5Ctau_%7Bj-1%7D%2C%5Ctau_%7Bj%7D%29%5Cgamma_%7Ba_%7Bj-1%7Da_%7Bj%7D%7D%28%5Ctau_%7Bj%7D%29%5Cright%5D%5Cphi_%7Ba_%7BN%7D%7D%28%5Ctau_%7BN%7D%2CT%29%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{p(&#92;omega)=&#92;pi_{a_{0}}(0)&#92;left[&#92;prod_{j=1}^{N}&#92;phi_{a_{j-1}}(&#92;tau_{j-1},&#92;tau_{j})&#92;gamma_{a_{j-1}a_{j}}(&#92;tau_{j})&#92;right]&#92;phi_{a_{N}}(&#92;tau_{N},T)}' title='&#92;displaystyle{p(&#92;omega)=&#92;pi_{a_{0}}(0)&#92;left[&#92;prod_{j=1}^{N}&#92;phi_{a_{j-1}}(&#92;tau_{j-1},&#92;tau_{j})&#92;gamma_{a_{j-1}a_{j}}(&#92;tau_{j})&#92;right]&#92;phi_{a_{N}}(&#92;tau_{N},T)}' class='latex' /></p>
<p>where</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Cphi_%7Ba%7D%28%5Ctau%2C%5Ctau%27%29%3D%5Cphi_%7Ba%7D%28%5Ctau%27%2C%5Ctau%29%3D%5Cexp%5CBig%28-%5Csum_%7Bb%5Cneq+a%7D%5Cint_%7B%5Ctau%7D%5E%7B%5Ctau%27%7Ddt%5C%2C+%5Cgamma_%7Ba+b%7D%28t%29%5CBig%29%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;phi_{a}(&#92;tau,&#92;tau&#039;)=&#92;phi_{a}(&#92;tau&#039;,&#92;tau)=&#92;exp&#92;Big(-&#92;sum_{b&#92;neq a}&#92;int_{&#92;tau}^{&#92;tau&#039;}dt&#92;, &#92;gamma_{a b}(t)&#92;Big)}' title='&#92;displaystyle{ &#92;phi_{a}(&#92;tau,&#92;tau&#039;)=&#92;phi_{a}(&#92;tau&#039;,&#92;tau)=&#92;exp&#92;Big(-&#92;sum_{b&#92;neq a}&#92;int_{&#92;tau}^{&#92;tau&#039;}dt&#92;, &#92;gamma_{a b}(t)&#92;Big)}' class='latex' /></p>
<p>is the probability that the process remains in the state <img src='https://s0.wp.com/latex.php?latex=a&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='a' title='a' class='latex' /> between the times <img src='https://s0.wp.com/latex.php?latex=%5Ctau&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;tau' title='&#92;tau' class='latex' /> and <img src='https://s0.wp.com/latex.php?latex=%5Ctau%27&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;tau&#039;' title='&#92;tau&#039;' class='latex' />. It follows from the above lemma that</p>
<p><b>Theorem.</b> Let <img src='https://s0.wp.com/latex.php?latex=%28%5COmega%2C%5Cmathcal%7BT%7D%2Cp%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='(&#92;Omega,&#92;mathcal{T},p)' title='(&#92;Omega,&#92;mathcal{T},p)' class='latex' /> be a Markov process and let <img src='https://s0.wp.com/latex.php?latex=%5CDelta+i%2C%5CSigma%3A%5COmega%5Crightarrow+%5Cmathbb%7BR%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;Delta i,&#92;Sigma:&#92;Omega&#92;rightarrow &#92;mathbb{R}' title='&#92;Delta i,&#92;Sigma:&#92;Omega&#92;rightarrow &#92;mathbb{R}' class='latex' /> be defined as above. Then we have</p>
<p>1. The detailed fluctuation theorem:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cforall+A%5Cin%5Cmathbb%7BR%7D%2C+p%5Cbig%28%28%5CDelta+i-%5CSigma%29%5E%7B-1%7D%28-A%29+%5Cbig%29%3De%5E%7B-A%7Dp+%5Cbig%28%28%5CDelta+i-%5CSigma%29%5E%7B-1%7D%28A%29+%5Cbig%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;forall A&#92;in&#92;mathbb{R}, p&#92;big((&#92;Delta i-&#92;Sigma)^{-1}(-A) &#92;big)=e^{-A}p &#92;big((&#92;Delta i-&#92;Sigma)^{-1}(A) &#92;big)' title='&#92;forall A&#92;in&#92;mathbb{R}, p&#92;big((&#92;Delta i-&#92;Sigma)^{-1}(-A) &#92;big)=e^{-A}p &#92;big((&#92;Delta i-&#92;Sigma)^{-1}(A) &#92;big)' class='latex' /></p>
<p>2.  The integral fluctuation theorem:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cint_%7B%5COmega%7D+d+p%28%5Comega%29%5C%2Ce%5E%7B-%5CDelta+i%28%5Comega%29%2B%5CSigma%28%5Comega%29%7D%3D1&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;int_{&#92;Omega} d p(&#92;omega)&#92;,e^{-&#92;Delta i(&#92;omega)+&#92;Sigma(&#92;omega)}=1' title='&#92;int_{&#92;Omega} d p(&#92;omega)&#92;,e^{-&#92;Delta i(&#92;omega)+&#92;Sigma(&#92;omega)}=1' class='latex' /></p>
<p>3.  The &#8216;Second Law&#8217; inequality:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5CDelta+S%3A%3D%5Cint_%7B%5COmega%7D+d+p%28%5Comega%29%5C%2C%5CDelta+i%28%5Comega%29%5Cgeq+%5Cint_%7B%5COmega%7D+d+p%28%5Comega%29%5C%2C%5CSigma%28%5Comega%29%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;Delta S:=&#92;int_{&#92;Omega} d p(&#92;omega)&#92;,&#92;Delta i(&#92;omega)&#92;geq &#92;int_{&#92;Omega} d p(&#92;omega)&#92;,&#92;Sigma(&#92;omega)} ' title='&#92;displaystyle{ &#92;Delta S:=&#92;int_{&#92;Omega} d p(&#92;omega)&#92;,&#92;Delta i(&#92;omega)&#92;geq &#92;int_{&#92;Omega} d p(&#92;omega)&#92;,&#92;Sigma(&#92;omega)} ' class='latex' /></p>
<p>The same theorem can be formulated for other kinds of Markov processes as well, including diffusion processes (in which case it follows from the <a href="http://en.wikipedia.org/wiki/Girsanov_theorem">Girsanov theorem</a>).</p>
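<p>As a sanity check, the integral fluctuation theorem and the &#8216;Second Law&#8217; inequality can be verified exactly in the simplest possible case. The sketch below is only an illustration under stated assumptions, not part of the original argument: it takes a two-state process with constant rates (the function name, the rate values and the initial distribution are all hypothetical choices). In that special case the jumps alternate between the two states, so the cumulative skewness of a path depends only on its endpoints, and the path average reduces to a sum over pairs of initial and final states.</p>

```python
import math

def two_state_check(k01, k10, pi0, T):
    """Exact path averages for a two-state chain with constant rates.

    k01, k10: jump rates from 0 to 1 and from 1 to 0 (hypothetical values).
    pi0: initial probability of state 0, strictly between 0 and 1.
    T: final time.
    Returns the averages of exp(-Delta_i + Sigma), Delta_i and Sigma.
    """
    k = k01 + k10
    relax = 1.0 - math.exp(-k * T)
    # P[a][b] = probability of being in b at time T given a at time 0.
    P = [[1.0 - (k01 / k) * relax, (k01 / k) * relax],
         [(k10 / k) * relax, 1.0 - (k10 / k) * relax]]
    start = [pi0, 1.0 - pi0]
    end = [start[0] * P[0][b] + start[1] * P[1][b] for b in range(2)]
    # Jumps alternate, so all but the net jump cancel in the sum of log-ratios.
    sigma = [[0.0, math.log(k10 / k01)],
             [math.log(k01 / k10), 0.0]]
    mean_exp, mean_di, mean_sigma = 0.0, 0.0, 0.0
    for a in range(2):
        for b in range(2):
            weight = start[a] * P[a][b]  # probability of endpoints (a, b)
            delta_i = math.log(start[a]) - math.log(end[b])
            mean_exp += weight * math.exp(-delta_i + sigma[a][b])
            mean_di += weight * delta_i
            mean_sigma += weight * sigma[a][b]
    return mean_exp, mean_di, mean_sigma

mean_exp, mean_di, mean_sigma = two_state_check(k01=2.0, k10=0.5, pi0=0.9, T=1.0)
print(mean_exp)              # should equal 1 up to rounding error
print(mean_di - mean_sigma)  # should be nonnegative
```

<p>The first printed number is the left-hand side of the integral fluctuation theorem; the second is the gap in the &#8216;Second Law&#8217; inequality. With time-dependent rates or more than two states the endpoint shortcut no longer applies, and one would have to sum or sample over full paths instead.</p>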
<h3> References </h3>
<p>Landauer&#8217;s principle was introduced here:</p>
<p>&bull; [Landauer1961]  R. Landauer, Irreversibility and heat generation in the computing process, <i>IBM Journal of Research and Development</i> <b>5</b> (1961), 183&#8211;191.</p>
<p>and is now being verified experimentally by various groups worldwide.</p>
<p>The &#8216;fundamental theorem of natural selection&#8217; was derived by Fisher in his book:</p>
<p>&bull; [Fisher1930]  R. Fisher, <i>The Genetical Theory of Natural Selection</i>, Clarendon Press, Oxford, 1930.</p>
<p>His derivation has long been considered obscure, perhaps even wrong, but apparently the theorem is now well accepted. I believe the first Markovian models of genetic evolution appeared here:</p>
<p>&bull; [Fisher1922]  R. A. Fisher, On the dominance ratio, <i>Proc. Roy. Soc. Edinb.</i> <b>42</b> (1922), 321&#8211;341.</p>
<p>&bull; [Wright1931]  S. Wright, Evolution in Mendelian populations, <i>Genetics</i> <b>16</b> (1931), 97&#8211;159.</p>
<p>Fluctuation theorems are reviewed here:</p>
<p>&bull; [Sevick2008]  E. Sevick, R. Prabhakar, S. R. Williams, and D. J. Searles, <a href="http://arxiv.org/abs/0709.3888">Fluctuation theorems</a>, <i>Ann. Rev. Phys. Chem.</i> <b>59</b> (2008), 603&#8211;633.</p>
<p>Two of the key ideas for the &#8216;detailed fluctuation theorem&#8217; discussed here are due to Crooks: </p>
<p>&bull; [Crooks1999]  Gavin Crooks, <a href="http://arxiv.org/abs/cond-mat/9901352">The entropy production fluctuation theorem and the nonequilibrium work relation for free energy differences</a>, <a href="http://dx.doi.org/10.1103/PhysRevE.60.2721"><i>Phys. Rev. E</i></a> <b>60</b> (1999), 2721&#8211;2726.</p>
<p>who identified <img src='https://s0.wp.com/latex.php?latex=%28E_%7Ba%7D%28%5Ctau_%7Bj%7D%29-E_%7Ba%7D%28%5Ctau_%7Bj-1%7D%29%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='(E_{a}(&#92;tau_{j})-E_{a}(&#92;tau_{j-1}))' title='(E_{a}(&#92;tau_{j})-E_{a}(&#92;tau_{j-1}))' class='latex' /> as heat, and Seifert:</p>
<p>&bull; [Seifert2005]  Udo Seifert, <a href="http://arxiv.org/abs/cond-mat/0503686">Entropy production along a stochastic trajectory and an integral fluctuation theorem</a>, <a href="http://dx.doi.org/10.1103/PhysRevLett.95.040602"><i>Phys. Rev. Lett.</i></a> <b>95</b> (2005), 040602.</p>
<p>who understood the relevance of the self-information in this context. </p>
<p>The connection between statistical physics and evolutionary biology is discussed here:</p>
<p>&bull; [Sella2005] G. Sella and A.E. Hirsh, <a href="http://www.pnas.org/content/102/27/9541.full.pdf+html">The application of statistical physics to evolutionary biology</a>, <a href="http://www.pnas.org/content/102/27/9541.short"><i>Proc. Nat. Acad. Sci. USA</i></a> <b>102</b> (2005), 9541&#8211;9546.</p>
<p>and the &#8216;fitness flux theorem&#8217; is derived in </p>
<p>&bull; [Mustonen2010]  V. Mustonen and M. L&auml;ssig, <a href="http://www.pnas.org/content/107/9/4248.full.pdf+html">Fitness flux and ubiquity of adaptive evolution</a>, <a href="http://www.pnas.org/content/107/9/4248.short"><i>Proc. Nat. Acad. Sci. USA</i></a> <b>107</b> (2010), 4248&#8211;4253.</p>
<p>Schr&ouml;dinger&#8217;s famous discussion of the physical nature of life was published here:</p>
<p>&bull; [Schr&ouml;dinger1944]  E. Schr&ouml;dinger, <i>What is Life?</i>, Cambridge University Press, Cambridge, 1944.</p>
]]></html></oembed>