<?xml version="1.0" encoding="UTF-8" standalone="yes"?><oembed><version><![CDATA[1.0]]></version><provider_name><![CDATA[Azimuth]]></provider_name><provider_url><![CDATA[https://johncarlosbaez.wordpress.com]]></provider_url><author_name><![CDATA[John Baez]]></author_name><author_url><![CDATA[https://johncarlosbaez.wordpress.com/author/johncarlosbaez/]]></author_url><title><![CDATA[Information Geometry (Part&nbsp;9)]]></title><type><![CDATA[link]]></type><html><![CDATA[<div align="center">
<a href="http://www.crm.cat/en/Activities/Pages/ActivityDescriptions/Exploratory-Conference-on-the-Mathematics-of-Biodiversity.aspx"><br />
<img width="450" src="https://i2.wp.com/math.ucr.edu/home/baez/barcelona_biodiversity_poster.jpg" /><br />
</a>
</div>
<p>It&#8217;s time to continue this <a href="http://math.ucr.edu/home/baez/information/">information geometry</a> series, because I&#8217;ve promised to give the following talk at a  <a href="http://www.crm.cat/en/Activities/Pages/Exploratory_Program.aspx">conference on the mathematics of biodiversity</a> in early July&#8230; and I still need to do some of the research!  <img src="https://i0.wp.com/math.ucr.edu/home/baez/emoticons/uhh.gif" alt="" /></p>
<blockquote>
<h4>Diversity, information geometry and learning</h4>
<p>As is well known, some measures of biodiversity are formally identical to measures of information developed by Shannon and others.  Furthermore, Marc Harper has shown that the replicator equation in evolutionary game theory is formally identical to a process of Bayesian inference, which is studied in the field of machine learning using ideas from information geometry. Thus, in this simple model, a population of organisms can be thought of as a &#8216;hypothesis&#8217; about how to survive, and natural selection acts to update this hypothesis according to Bayes&#8217; rule.  The question thus arises to what extent natural changes in biodiversity can be usefully seen as analogous to a form of learning. However, some of the same mathematical structures arise in the study of chemical reaction networks, where the increase of entropy, or more precisely decrease of free energy, is not usually considered a form of &#8216;learning&#8217;. We report on some preliminary work on these issues.
</p></blockquote>
<p>So, let&#8217;s dive in!  To some extent I&#8217;ll be explaining these two papers:</p>
<p>&bull; Marc Harper, <a href="http://arxiv.org/abs/0911.1383">Information geometry and evolutionary game theory</a>.</p>
<p>&bull; Marc Harper, <a href="http://arxiv.org/abs/0911.1763">The replicator equation as an inference dynamic</a>.</p>
<p>However, I hope to bring in some more ideas from physics, the study of biodiversity, and the theory of stochastic Petri nets, also known as chemical reaction networks.  So, this series may start to overlap with my <a href="http://math.ucr.edu/home/baez/networks/">network theory</a> posts.  We&#8217;ll see.  We won&#8217;t get far today: for now, I just want to review and expand on what we did <a href="https://johncarlosbaez.wordpress.com/2011/05/26/information-geometry-part-8/">last time</a>.</p>
<h3> The replicator equation </h3>
<p>The <b><a href="http://www.crm.cat/en/Activities/Pages/Exploratory_Program.aspx">replicator equation</a></b> is a simplified model of how populations change.  Suppose we have <img src='https://s0.wp.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='n' title='n' class='latex' /> types of self-replicating entity.  I&#8217;ll call these entities <b>replicators</b>.  I&#8217;ll call the types of replicators <b>species</b>, but they don&#8217;t need to be species in the biological sense.  For example, the replicators could be genes, and the types could be <a href="http://en.wikipedia.org/wiki/Allele">alleles</a>.  Or the replicators could be restaurants, and the types could be restaurant chains.</p>
<p>Let <img src='https://s0.wp.com/latex.php?latex=P_i%28t%29%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='P_i(t),' title='P_i(t),' class='latex' /> or just <img src='https://s0.wp.com/latex.php?latex=P_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='P_i' title='P_i' class='latex' /> for short, be the population of the <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' />th species at time <img src='https://s0.wp.com/latex.php?latex=t.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='t.' title='t.' class='latex' />  Then the replicator equation says</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Cfrac%7Bd+P_i%7D%7Bd+t%7D+%3D+f_i%28P_1%2C+%5Cdots%2C+P_n%29+%5C%2C+P_i+%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;frac{d P_i}{d t} = f_i(P_1, &#92;dots, P_n) &#92;, P_i } ' title='&#92;displaystyle{ &#92;frac{d P_i}{d t} = f_i(P_1, &#92;dots, P_n) &#92;, P_i } ' class='latex' /></p>
<p>So, the population <img src='https://s0.wp.com/latex.php?latex=P_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='P_i' title='P_i' class='latex' /> changes at a rate proportional to <img src='https://s0.wp.com/latex.php?latex=P_i%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='P_i,' title='P_i,' class='latex' /> but the &#8216;constant of proportionality&#8217; need not be constant: it can be any smooth function <img src='https://s0.wp.com/latex.php?latex=f_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='f_i' title='f_i' class='latex' /> of the populations of all the species.  We call <img src='https://s0.wp.com/latex.php?latex=f_i%28P_1%2C+%5Cdots%2C+P_n%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='f_i(P_1, &#92;dots, P_n)' title='f_i(P_1, &#92;dots, P_n)' class='latex' /> the <b><a href="http://en.wikipedia.org/wiki/Fitness_%28biology%29">fitness</a></b> of the <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' />th species.</p>
<p>Of course this model is absurdly general, while still leaving out lots of important effects, like the spatial variation of populations, or the ability of the population of some species to start at zero and become nonzero&#8212;which happens thanks to mutation.  Nonetheless this model is worth taking a good look at.</p>
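<p>To make this concrete, here&#8217;s a minimal sketch in Python of how one might integrate the replicator equation numerically.  It uses the forward Euler method, which is my choice for brevity rather than accuracy.  The function names, the crowding-style fitness, the growth rates and the step size are all made up for illustration:</p>

```python
def replicator_step(P, fitness, dt):
    """One forward-Euler step of dP_i/dt = f_i(P) * P_i."""
    f = fitness(P)
    return [P_i + dt * f_i * P_i for P_i, f_i in zip(P, f)]


def simulate(P0, fitness, dt=0.001, steps=1000):
    """Take `steps` Euler steps starting from P0; return the final populations."""
    P = list(P0)
    for _ in range(steps):
        P = replicator_step(P, fitness, dt)
    return P


# Hypothetical fitness: every species is penalized by the total population
# (a logistic-style crowding term). The numbers are illustrative only.
def crowded_fitness(P):
    total = sum(P)
    rates = [1.0, 0.5, 0.2]
    return [r - 0.01 * total for r in rates]


final = simulate([10.0, 10.0, 10.0], crowded_fitness)
print(final)
```

<p>In this sketch a population that starts at exactly zero stays at zero, since each update only rescales the current population.  That is the lack of mutation showing up in the numerics.</p>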
<p>Using the magic of vectors we can write</p>
<p><img src='https://s0.wp.com/latex.php?latex=P+%3D+%28P_1%2C+%5Cdots+%2C+P_n%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='P = (P_1, &#92;dots , P_n)' title='P = (P_1, &#92;dots , P_n)' class='latex' /></p>
<p>and</p>
<p><img src='https://s0.wp.com/latex.php?latex=f%28P%29+%3D+%28f_1%28P%29%2C+%5Cdots%2C+f_n%28P%29%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='f(P) = (f_1(P), &#92;dots, f_n(P))' title='f(P) = (f_1(P), &#92;dots, f_n(P))' class='latex' /></p>
<p>This lets us write the replicator equation a wee bit more tersely as</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Cfrac%7Bd+P%7D%7Bd+t%7D+%3D+f%28P%29+P%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;frac{d P}{d t} = f(P) P} ' title='&#92;displaystyle{ &#92;frac{d P}{d t} = f(P) P} ' class='latex' /></p>
<p>where on the right I&#8217;m multiplying vectors componentwise, the way your teachers tried to brainwash you into never doing:</p>
<p><img src='https://s0.wp.com/latex.php?latex=f%28P%29+P+%3D+%28f%28P%29_1+P_1%2C+%5Cdots%2C+f%28P%29_n+P_n%29+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='f(P) P = (f(P)_1 P_1, &#92;dots, f(P)_n P_n) ' title='f(P) P = (f(P)_1 P_1, &#92;dots, f(P)_n P_n) ' class='latex' /></p>
<p>In other words, I&#8217;m thinking of <img src='https://s0.wp.com/latex.php?latex=P&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='P' title='P' class='latex' /> and <img src='https://s0.wp.com/latex.php?latex=f%28P%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='f(P)' title='f(P)' class='latex' /> as functions on the set <img src='https://s0.wp.com/latex.php?latex=%5C%7B1%2C+%5Cdots%2C+n%5C%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;{1, &#92;dots, n&#92;}' title='&#92;{1, &#92;dots, n&#92;}' class='latex' /> and multiplying them pointwise.  This will be a nice way of thinking if we want to replace this finite set by some more general space.</p>
<p>Why would we want to do that?  Well, we might be studying lizards with different length tails, and we might find it convenient to think of the set of possible tail lengths as the half-line <img src='https://s0.wp.com/latex.php?latex=%5B0%2C%5Cinfty%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='[0,&#92;infty)' title='[0,&#92;infty)' class='latex' /> instead of a finite set.</p>
<p>Or, just to get started, we might want to study the pathetically simple case where <img src='https://s0.wp.com/latex.php?latex=f%28P%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='f(P)' title='f(P)' class='latex' /> doesn&#8217;t depend on <img src='https://s0.wp.com/latex.php?latex=P.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='P.' title='P.' class='latex' />  Then we just have a fixed function <img src='https://s0.wp.com/latex.php?latex=f&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='f' title='f' class='latex' /> and a time-dependent function <img src='https://s0.wp.com/latex.php?latex=P&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='P' title='P' class='latex' /> obeying</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Cfrac%7Bd+P%7D%7Bd+t%7D+%3D+f+P%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;frac{d P}{d t} = f P} ' title='&#92;displaystyle{ &#92;frac{d P}{d t} = f P} ' class='latex' /></p>
<p>If we&#8217;re physicists, we might write <img src='https://s0.wp.com/latex.php?latex=P&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='P' title='P' class='latex' /> more suggestively as <img src='https://s0.wp.com/latex.php?latex=%5Cpsi&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;psi' title='&#92;psi' class='latex' /> and write the operator multiplying by <img src='https://s0.wp.com/latex.php?latex=f&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='f' title='f' class='latex' /> as <img src='https://s0.wp.com/latex.php?latex=-+H.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='- H.' title='- H.' class='latex' />  Then our equation becomes</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Cfrac%7Bd+%5Cpsi%7D%7Bd+t%7D+%3D+-+H+%5Cpsi+%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;frac{d &#92;psi}{d t} = - H &#92;psi } ' title='&#92;displaystyle{ &#92;frac{d &#92;psi}{d t} = - H &#92;psi } ' class='latex' /></p>
<p>This looks a lot like Schr&ouml;dinger&#8217;s equation, but since there&#8217;s no factor of <img src='https://s0.wp.com/latex.php?latex=%5Csqrt%7B-1%7D%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;sqrt{-1},' title='&#92;sqrt{-1},' class='latex' /> and <img src='https://s0.wp.com/latex.php?latex=%5Cpsi&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;psi' title='&#92;psi' class='latex' /> is real-valued, it&#8217;s more like the heat equation or the &#8216;master equation&#8217;, the basic equation of stochastic mechanics.</p>
<p>For an explanation of Schr&ouml;dinger&#8217;s equation and the master equation, try <a href="http://math.ucr.edu/home/baez/networks/networks_12.html">Part 12</a> of the network theory series.  In that post I didn&#8217;t include a minus sign in front of the <img src='https://s0.wp.com/latex.php?latex=H.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='H.' title='H.' class='latex' />  That&#8217;s no big deal: it&#8217;s just a different convention than the one I want today.  A more serious issue is that in stochastic mechanics, <img src='https://s0.wp.com/latex.php?latex=%5Cpsi&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;psi' title='&#92;psi' class='latex' /> stands for a <i>probability distribution</i>.  This suggests that we should get probabilities into the game somehow.</p>
<h3> The replicator equation in terms of probabilities </h3>
<p>Luckily, that&#8217;s exactly what people usually do!   Instead of talking about the population <img src='https://s0.wp.com/latex.php?latex=P_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='P_i' title='P_i' class='latex' /> of the <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' />th species, they talk about the <i>probability</i> <img src='https://s0.wp.com/latex.php?latex=p_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p_i' title='p_i' class='latex' /> that one of our organisms will belong to the <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' />th species.  This amounts to normalizing our populations:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B++p_i+%3D+%5Cfrac%7BP_i%7D%7B%5Csum_j+P_j%7D+%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{  p_i = &#92;frac{P_i}{&#92;sum_j P_j} } ' title='&#92;displaystyle{  p_i = &#92;frac{P_i}{&#92;sum_j P_j} } ' class='latex' /></p>
<p>Don&#8217;t you love it when notations work out well?  Our big <b>P</b>opulation <img src='https://s0.wp.com/latex.php?latex=P_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='P_i' title='P_i' class='latex' /> has gotten normalized to give little <b>p</b>robability <img src='https://s0.wp.com/latex.php?latex=p_i.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p_i.' title='p_i.' class='latex' /></p>
<p>How do these probabilities <img src='https://s0.wp.com/latex.php?latex=p_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p_i' title='p_i' class='latex' /> change with time?  Now is the moment for that least loved rule of elementary calculus to come out and take a bow: the quotient rule for derivatives!</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Cfrac%7Bd+p_i%7D%7Bd+t%7D+%3D+%5Cleft%28%5Cfrac%7Bd+P_i%7D%7Bd+t%7D+%5Csum_j+P_j+%5Cquad+-+%5Cquad+P_i+%5Csum_j+%5Cfrac%7Bd+P_j%7D%7Bd+t%7D%5Cright%29+%5Cbig%7B%2F%7D+%5Cleft%28++%5Csum_j+P_j+%5Cright%29%5E2+%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;frac{d p_i}{d t} = &#92;left(&#92;frac{d P_i}{d t} &#92;sum_j P_j &#92;quad - &#92;quad P_i &#92;sum_j &#92;frac{d P_j}{d t}&#92;right) &#92;big{/} &#92;left(  &#92;sum_j P_j &#92;right)^2 }' title='&#92;displaystyle{ &#92;frac{d p_i}{d t} = &#92;left(&#92;frac{d P_i}{d t} &#92;sum_j P_j &#92;quad - &#92;quad P_i &#92;sum_j &#92;frac{d P_j}{d t}&#92;right) &#92;big{/} &#92;left(  &#92;sum_j P_j &#92;right)^2 }' class='latex' /></p>
<p>Using our earlier version of the replicator equation, this gives:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Cfrac%7Bd+p_i%7D%7Bd+t%7D+%3D++%5Cleft%28f_i%28P%29+P_i+%5Csum_j+P_j+%5Cquad+-+%5Cquad+P_i+%5Csum_j+f_j%28P%29+P_j+%5Cright%29+%5Cbig%7B%2F%7D+%5Cleft%28++%5Csum_j+P_j+%5Cright%29%5E2+%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;frac{d p_i}{d t} =  &#92;left(f_i(P) P_i &#92;sum_j P_j &#92;quad - &#92;quad P_i &#92;sum_j f_j(P) P_j &#92;right) &#92;big{/} &#92;left(  &#92;sum_j P_j &#92;right)^2 }' title='&#92;displaystyle{ &#92;frac{d p_i}{d t} =  &#92;left(f_i(P) P_i &#92;sum_j P_j &#92;quad - &#92;quad P_i &#92;sum_j f_j(P) P_j &#92;right) &#92;big{/} &#92;left(  &#92;sum_j P_j &#92;right)^2 }' class='latex' /></p>
<p>Using the definition of <img src='https://s0.wp.com/latex.php?latex=p_i%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p_i,' title='p_i,' class='latex' /> this simplifies to:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Cfrac%7Bd+p_i%7D%7Bd+t%7D+%3D++f_i%28P%29+p_i+%5Cquad+-+%5Cquad+%5Cleft%28+%5Csum_j+f_j%28P%29+p_j+%5Cright%29+p_i+%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;frac{d p_i}{d t} =  f_i(P) p_i &#92;quad - &#92;quad &#92;left( &#92;sum_j f_j(P) p_j &#92;right) p_i }' title='&#92;displaystyle{ &#92;frac{d p_i}{d t} =  f_i(P) p_i &#92;quad - &#92;quad &#92;left( &#92;sum_j f_j(P) p_j &#92;right) p_i }' class='latex' /></p>
<p>The stuff in parentheses actually has a nice meaning: it&#8217;s just the <b>mean fitness</b>.  In other words, it&#8217;s the average, or expected, fitness of an organism chosen at random from the whole population.  Let&#8217;s write it like this:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Clangle+f%28P%29+%5Crangle+%3D+%5Csum_j+f_j%28P%29+p_j++%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;langle f(P) &#92;rangle = &#92;sum_j f_j(P) p_j  } ' title='&#92;displaystyle{ &#92;langle f(P) &#92;rangle = &#92;sum_j f_j(P) p_j  } ' class='latex' /></p>
<p>So, we get the <b><a href="http://www.crm.cat/en/Activities/Pages/Exploratory_Program.aspx">replicator equation</a></b> in its classic form:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Cfrac%7Bd+p_i%7D%7Bd+t%7D+%3D+%5CBig%28+f_i%28P%29+-+%5Clangle+f%28P%29+%5Crangle+%5CBig%29+%5C%2C+p_i+%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;frac{d p_i}{d t} = &#92;Big( f_i(P) - &#92;langle f(P) &#92;rangle &#92;Big) &#92;, p_i }' title='&#92;displaystyle{ &#92;frac{d p_i}{d t} = &#92;Big( f_i(P) - &#92;langle f(P) &#92;rangle &#92;Big) &#92;, p_i }' class='latex' /></p>
<p>This has a nice meaning: for the fraction of organisms of the <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' />th type to increase, their fitness must exceed the mean fitness.  If you&#8217;re trying to increase <a href="http://en.wikipedia.org/wiki/Market_share">market share</a>, what matters is not how good you are, but how much <i>better than average</i> you are.  If everyone else is lousy, you&#8217;re in luck.</p>
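<p>If you don&#8217;t trust the algebra, a quick numerical check may help.  The sketch below computes the rate of change of each probability in two ways, once straight from the quotient rule and once from the classic form, and compares them.  The populations and fitness values are invented for the example, and the helper names are mine:</p>

```python
def normalize(P):
    """Turn populations P_i into probabilities p_i = P_i / sum_j P_j."""
    total = sum(P)
    return [x / total for x in P]


def mean_fitness(f, p):
    """Expected fitness of an organism drawn at random: sum_j f_j * p_j."""
    return sum(f_j * p_j for f_j, p_j in zip(f, p))


def classic_rhs(f, p):
    """Right-hand side (f_i - mean fitness) * p_i of the classic form."""
    avg = mean_fitness(f, p)
    return [(f_i - avg) * p_i for f_i, p_i in zip(f, p)]


def quotient_rule_rhs(f, P):
    """dp_i/dt from the quotient rule, using dP_i/dt = f_i * P_i."""
    total = sum(P)
    dP = [f_i * P_i for f_i, P_i in zip(f, P)]
    d_total = sum(dP)
    return [(dP_i * total - P_i * d_total) / total**2
            for dP_i, P_i in zip(dP, P)]


# Illustrative numbers, not taken from any data.
P = [30.0, 50.0, 20.0]
f = [0.3, 0.1, 0.6]      # the fitnesses f_i(P) evaluated at this P
p = normalize(P)

lhs = quotient_rule_rhs(f, P)
rhs = classic_rhs(f, p)
print(lhs)
print(rhs)
```

<p>The two lists agree up to rounding, and their entries sum to zero, which is just the statement that the probabilities keep summing to 1.</p>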
<h3> Entropy </h3>
<p>Now for something a bit new.  Once we&#8217;ve gotten a probability distribution into the game, its entropy is sure to follow:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+S%28p%29+%3D+-+%5Csum_i+p_i+%5C%2C+%5Cln%28p_i%29+%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ S(p) = - &#92;sum_i p_i &#92;, &#92;ln(p_i) } ' title='&#92;displaystyle{ S(p) = - &#92;sum_i p_i &#92;, &#92;ln(p_i) } ' class='latex' /></p>
<p>This says how &#8216;smeared-out&#8217; the overall population is among the various different species.  Alternatively, it says how much <i>information</i> it takes, on average, to say which species a randomly chosen organism belongs to.   For example, if there are <img src='https://s0.wp.com/latex.php?latex=2%5EN&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='2^N' title='2^N' class='latex' /> species, all with equal populations, the entropy <img src='https://s0.wp.com/latex.php?latex=S&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='S' title='S' class='latex' /> works out to <img src='https://s0.wp.com/latex.php?latex=N+%5Cln+2.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='N &#92;ln 2.' title='N &#92;ln 2.' class='latex' />  So in this case, it takes <img src='https://s0.wp.com/latex.php?latex=N&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='N' title='N' class='latex' /> bits of information to say which species a randomly chosen organism belongs to.</p>
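<p>Here is a small Python sketch of this calculation.  It computes the entropy in nats, using the usual convention that a species with probability zero contributes nothing, and then converts to bits by dividing by ln 2.  The choice N = 5 is arbitrary:</p>

```python
import math


def entropy(p):
    """Shannon entropy in nats: S(p) = -sum_i p_i ln(p_i), with 0 ln 0 taken as 0."""
    return -sum(p_i * math.log(p_i) for p_i in p if p_i != 0.0)


N = 5
uniform = [1.0 / 2**N] * 2**N    # 2^N species, all equally common
S = entropy(uniform)             # works out to N * ln 2
bits = S / math.log(2)           # the same quantity measured in bits

print(S, N * math.log(2), bits)
```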
<p>In biology, entropy is one of many ways people measure biodiversity.  For a quick intro to some of the issues involved, try:</p>
<p>&bull; Tom Leinster, <a href="https://johncarlosbaez.wordpress.com/2011/11/07/measuring-biodiversity/">Measuring biodiversity</a>, <i>Azimuth</i>, 7 November 2011.</p>
<p>&bull; Lou Jost, <a href="http://www.loujost.com/Statistics%20and%20Physics/Diversity%20and%20Similarity/JostEntropy%20AndDiversity.pdf">Entropy and diversity</a>, <i>Oikos</i> <b>113</b> (2006), 363&#8211;375.</p>
<p>But we don&#8217;t need to understand this stuff to see how entropy is connected to the replicator equation.  Marc Harper&#8217;s paper explains this in detail:</p>
<p>&bull; Marc Harper, <a href="http://arxiv.org/abs/0911.1763">The replicator equation as an inference dynamic</a>.</p>
<p>and I hope to go through quite a bit of it here.  But not today!  Today I just want to look at a pathetically simple, yet still interesting, example.</p>
<h3> Exponential growth </h3>
<p>Suppose the fitness of each species is independent of the populations of all the species.   In other words, suppose each fitness <img src='https://s0.wp.com/latex.php?latex=f_i%28P%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='f_i(P)' title='f_i(P)' class='latex' /> is actually a constant, say <img src='https://s0.wp.com/latex.php?latex=f_i.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='f_i.' title='f_i.' class='latex' />  Then the replicator equation reduces to</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Cfrac%7Bd+P_i%7D%7Bd+t%7D+%3D+f_i+%5C%2C+P_i+%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;frac{d P_i}{d t} = f_i &#92;, P_i } ' title='&#92;displaystyle{ &#92;frac{d P_i}{d t} = f_i &#92;, P_i } ' class='latex' /></p>
<p>so it&#8217;s easy to solve:</p>
<p><img src='https://s0.wp.com/latex.php?latex=P_i%28t%29+%3D+e%5E%7Bt+f_i%7D+P_i%280%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='P_i(t) = e^{t f_i} P_i(0)' title='P_i(t) = e^{t f_i} P_i(0)' class='latex' /></p>
<p>You don&#8217;t need a detailed calculation to see what&#8217;s going to happen to the probabilities</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+p_i%28t%29+%3D+%5Cfrac%7BP_i%28t%29%7D%7B%5Csum_j+P_j%28t%29%7D%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ p_i(t) = &#92;frac{P_i(t)}{&#92;sum_j P_j(t)}} ' title='&#92;displaystyle{ p_i(t) = &#92;frac{P_i(t)}{&#92;sum_j P_j(t)}} ' class='latex' /></p>
<p>The most fit species present will eventually take over!   If one species, say the <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' />th one, has a fitness greater than the rest, then the population of this species will eventually grow faster than all the rest, at least if its population starts out greater than zero.  So as <img src='https://s0.wp.com/latex.php?latex=t+%5Cto+%2B%5Cinfty%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='t &#92;to +&#92;infty,' title='t &#92;to +&#92;infty,' class='latex' /> we&#8217;ll have</p>
<p><img src='https://s0.wp.com/latex.php?latex=p_i%28t%29+%5Cto+1&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p_i(t) &#92;to 1' title='p_i(t) &#92;to 1' class='latex' /></p>
<p>and</p>
<p><img src='https://s0.wp.com/latex.php?latex=p_j%28t%29+%5Cto+0+%5Cquad+%5Cmathrm%7Bfor%7D+%5Cquad+j+%5Cne+i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p_j(t) &#92;to 0 &#92;quad &#92;mathrm{for} &#92;quad j &#92;ne i' title='p_j(t) &#92;to 0 &#92;quad &#92;mathrm{for} &#92;quad j &#92;ne i' class='latex' /></p>
<p>Thus the probability distribution <img src='https://s0.wp.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p' title='p' class='latex' /> will become more sharply peaked, and <i>its entropy will eventually approach zero</i>.</p>
<p>With a bit more thought you can see that even if more than one species shares the maximum possible fitness, the entropy will eventually decrease, though not approach zero.</p>
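<p>Both cases are easy to try out.  The following Python sketch evaluates the exact solution for constant fitness and prints the entropy at a few times, once with a single fittest species and once with a tie for first place.  The initial populations and fitness values are made up, and subtracting the largest fitness before exponentiating is just a numerical convenience that cancels in the ratio:</p>

```python
import math


def probabilities(P0, f, t):
    """p_i(t) for constant fitness, from P_i(t) = exp(t * f_i) * P_i(0)."""
    # Factoring out exp(t * f_max) leaves the ratios unchanged and
    # avoids overflow when t is large.
    f_max = max(f)
    weights = [math.exp(t * (f_i - f_max)) * P_i for f_i, P_i in zip(f, P0)]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(p):
    """Shannon entropy in nats, with 0 ln 0 taken as 0."""
    return -sum(p_i * math.log(p_i) for p_i in p if p_i != 0.0)


P0 = [5.0, 3.0, 2.0]             # illustrative initial populations

unique_best = [0.2, 0.5, 1.0]    # the third species is strictly fittest
tied_best = [0.2, 1.0, 1.0]      # the second and third species tie

for t in (0.0, 5.0, 20.0, 100.0):
    print(t,
          entropy(probabilities(P0, unique_best, t)),
          entropy(probabilities(P0, tied_best, t)))
```

<p>In the tied case the fittest species keep their initial proportions relative to each other, so in this example the entropy settles at the entropy of the distribution (0.6, 0.4) rather than at zero.</p>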
<p>In other words, <i>the biodiversity will eventually drop</i> as all but the most fit species are overwhelmed.  Of course, this is only true in our simple idealization.  In reality, biodiversity behaves in more complex ways&#8212;in part because species interact, and in part because mutation tends to smear out the probability distribution <img src='https://s0.wp.com/latex.php?latex=p_i.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p_i.' title='p_i.' class='latex' />  We&#8217;re not looking at these effects yet.  They&#8217;re extremely important&#8230; in ways we can only fully understand if we start by looking at what happens when they&#8217;re not present.</p>
<p>In still other words, <i>the population will absorb information from its environment</i>.  This should make intuitive sense: the process of natural selection resembles &#8216;learning&#8217;.  As fitter organisms become more common and less fit ones die out, the environment puts its stamp on the probability distribution <img src='https://s0.wp.com/latex.php?latex=p.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p.' title='p.' class='latex' />  So, this probability distribution should gain information.</p>
<p>While intuitively clear, this last claim also follows more rigorously from thinking of entropy as negative information.  Admittedly, it&#8217;s always easy to get confused by minus signs when relating entropy and information.   A while back I said the entropy</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+S%28p%29+%3D+-+%5Csum_i+p_i+%5C%2C+%5Cln%28p_i%29+%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ S(p) = - &#92;sum_i p_i &#92;, &#92;ln(p_i) } ' title='&#92;displaystyle{ S(p) = - &#92;sum_i p_i &#92;, &#92;ln(p_i) } ' class='latex' /></p>
<p>was the average information required to say which species a randomly chosen organism belongs to.  If this entropy is going down, isn&#8217;t the population <i>losing</i> information?</p>
<p>No, this is a classic sign error.  It&#8217;s like the concept of &#8216;work&#8217; in physics.  We can talk about the work some system does on its environment, or the work done by the environment on the system, and these are almost the same&#8230; <i>except one is minus the other!</i></p>
<p>When you are very ignorant about some system&#8212;say, some rolled dice&mdash;your estimated probabilities <img src='https://s0.wp.com/latex.php?latex=p_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p_i' title='p_i' class='latex' /> for its various possible states are very smeared-out, so the entropy <img src='https://s0.wp.com/latex.php?latex=S%28p%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='S(p)' title='S(p)' class='latex' /> is large.  As you gain information, you revise your probabilities and they typically become more sharply peaked, so <img src='https://s0.wp.com/latex.php?latex=S%28p%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='S(p)' title='S(p)' class='latex' /> goes down.   When you know as much as you possibly can, <img src='https://s0.wp.com/latex.php?latex=S%28p%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='S(p)' title='S(p)' class='latex' /> equals zero.</p>
<p>So, the entropy <img src='https://s0.wp.com/latex.php?latex=S%28p%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='S(p)' title='S(p)' class='latex' /> is the amount of information you have left to learn: the amount of information you <i>lack</i>, not the amount you <i>have</i>.  As you gain information, this goes down.  There&#8217;s no paradox here.</p>
<p>It works the same way with our population of replicators&#8212;at least in the special case where the fitness of each species is independent of the populations of all the species.  The probability distribution <img src='https://s0.wp.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p' title='p' class='latex' /> is like a &#8216;hypothesis&#8217; assigning to each species <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' /> the probability <img src='https://s0.wp.com/latex.php?latex=p_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p_i' title='p_i' class='latex' /> that it&#8217;s the best at self-replicating.   As some replicators die off while others prosper, they gather information about their environment, and this hypothesis gets refined.  So, the entropy <img src='https://s0.wp.com/latex.php?latex=S%28p%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='S(p)' title='S(p)' class='latex' /> drops.</p>
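<p>One way to make the &#8216;hypothesis&#8217; picture tangible, at least for constant fitness, is to notice that advancing the exact solution by a time step has the same form as a Bayesian update.  In the Python sketch below I treat the growth factor exp(dt f_i) as if it were a likelihood.  That reading is an assumption of this sketch, not a construction taken from Harper&#8217;s papers, and the numbers and function names are invented:</p>

```python
import math


def bayes_update(prior, likelihood):
    """Bayes' rule on a finite set: posterior_i is proportional to prior_i * likelihood_i."""
    joint = [p_i * l_i for p_i, l_i in zip(prior, likelihood)]
    evidence = sum(joint)
    return [j / evidence for j in joint]


def exact_probabilities(p0, f, t):
    """p_i(t) from the closed-form solution P_i(t) = exp(t * f_i) * P_i(0)."""
    weights = [math.exp(t * f_i) * p_i for f_i, p_i in zip(f, p0)]
    total = sum(weights)
    return [w / total for w in weights]


f = [0.2, 0.5, 1.0]      # constant fitnesses (illustrative)
p0 = [0.5, 0.3, 0.2]     # initial distribution (illustrative)
dt = 0.1

# Assumption of this sketch: exp(dt * f_i) plays the role of a likelihood.
likelihood = [math.exp(dt * f_i) for f_i in f]

p = list(p0)
for _ in range(50):      # 50 updates of size dt cover a total time of 5.0
    p = bayes_update(p, likelihood)

closed_form = exact_probabilities(p0, f, 5.0)
print(p)
print(closed_form)
```

<p>The repeated updates and the closed-form solution agree up to rounding.  This only covers the special case of constant fitness.</p>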
<h3> Next time </h3>
<p>Of course, to make closer contact with reality, we need to go beyond the special case where the fitness of each species is a constant!   Marc Harper does this, and I want to talk about his work someday, but first I have a few more remarks to make about the pathetically simple special case I&#8217;ve been focusing on. I&#8217;ll save these for next time, since I&#8217;ve probably strained your patience already.</p>
]]></html><thumbnail_url><![CDATA[https://i2.wp.com/math.ucr.edu/home/baez/barcelona_biodiversity_poster.jpg?fit=440%2C330]]></thumbnail_url><thumbnail_height><![CDATA[330]]></thumbnail_height><thumbnail_width><![CDATA[198]]></thumbnail_width></oembed>