<?xml version="1.0" encoding="UTF-8" standalone="yes"?><oembed><version><![CDATA[1.0]]></version><provider_name><![CDATA[Azimuth]]></provider_name><provider_url><![CDATA[https://johncarlosbaez.wordpress.com]]></provider_url><author_name><![CDATA[John Baez]]></author_name><author_url><![CDATA[https://johncarlosbaez.wordpress.com/author/johncarlosbaez/]]></author_url><title><![CDATA[Entropy and Information in Biological Systems (Part&nbsp;1)]]></title><type><![CDATA[link]]></type><html><![CDATA[<p><a href="http://socrates.berkeley.edu/~hartelab/">John Harte</a> is an ecologist who uses maximum entropy methods to predict the distribution, abundance and energy usage of species.  <a href="http://people.mbi.ucla.edu/marcharper/">Marc Harper</a> uses information theory in bioinformatics and evolutionary game theory.   Harper, Harte and I are organizing a workshop on entropy and information in biological systems, and I&#8217;m really excited about it!</p>
<p>It&#8217;ll take place at the <a href="http://www.nimbios.org/">National Institute for Mathematical and Biological Synthesis</a> in Knoxville, Tennessee.    We are scheduling it for Wednesday&#8211;Friday, April 8&#8211;10, 2015.  Once the date is confirmed, I&#8217;ll post an advertisement so you can apply to attend.</p>
<p>Writing the proposal was fun, because we got to pull together lots of interesting people who are applying information theory and entropy to biology in quite different ways.   So, here it is!</p>
<div align="center"><a href="http://www.nimbios.org/"><img width="450" src="https://i1.wp.com/www.utk.edu/tntoday/images/nimbios_logo_lg.jpg" /></a></div>
<h3> Proposal </h3>
<p>Ever since Shannon initiated research on information theory in 1948, there have been hopes that the concept of information could serve as a tool to help systematize and unify work in biology.  The link between information and <i>entropy</i> was noted very early on, and it suggested that a full thermodynamic understanding of biology would naturally involve the information processing and storage that are characteristic of living organisms.  However, the subject is full of conceptual pitfalls for the unwary, and progress has been slower than initially expected.  Premature attempts at &#8216;grand syntheses&#8217; have often misfired.  But applications of information theory and entropy to specific highly focused topics in biology have been increasingly successful, such as:</p>
<p>&bull;  the maximum entropy principle in ecology,<br />
&bull;   Shannon and R&eacute;nyi entropies as measures of biodiversity,<br />
&bull;  information theory in evolutionary game theory,<br />
&bull;  information and the thermodynamics of individual cells.</p>
<p>Because they work in diverse fields, researchers in these specific topics have had little opportunity to trade insights and take stock of the progress so far.  The aim of the workshop is to do just this.  </p>
<p>In what follows, participants&#8217; names are in boldface, while the main goals of the workshop are in italics.</p>
<p><b><a href="http://biology.anu.edu.au/roderick_dewar/">Roderick Dewar</a></b> is a key advocate of the principle of Maximum Entropy Production, which says that biological systems&#8212;and indeed all open, non-equilibrium systems&#8212;act to produce entropy at the maximum rate.  Along with others, he has applied this principle to make testable predictions in a wide range of biological systems, from ATP synthesis [DJZ2006] to respiration and photosynthesis of individual plants [D2010] and plant communities.  He has also sought to derive this principle from ideas in statistical mechanics [D2004, D2009], but it remains controversial.  </p>
<p><i>The first goal of this workshop is to study the validity of this principle</i>.</p>
<p>While they may be related, the principle of Maximum Entropy Production should not be confused with the MaxEnt inference procedure, which says that we should choose the probabilistic hypothesis with the highest entropy subject to the constraints provided by our data.  MaxEnt was first explicitly advocated by Jaynes.  He noted that it is already implicit in the procedures of statistical mechanics, but convincingly argued that it can also be applied to situations where entropy is more &#8216;informational&#8217; than &#8216;thermodynamic&#8217; in character.  </p>
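<p>As a toy illustration of the MaxEnt inference procedure (my own illustrative example, not part of the proposal), here is Jaynes&#8217;s loaded-die problem in Python: among all distributions on the six faces with a prescribed mean, the maximum-entropy one has exponential-family form <i>p<sub>i</sub></i> &#8733; exp(&#955;<i>i</i>), and the multiplier &#955; can be found by bisection. This is a minimal sketch, not production code:</p>

```python
import math

def maxent_die(target_mean, faces=6, tol=1e-10):
    """Maximum-entropy distribution on die faces 1..faces with a given mean.

    The constrained-entropy maximum has the form p_i proportional to
    exp(lam * i); we find the Lagrange multiplier lam by bisection,
    since the mean is monotonically increasing in lam."""
    def mean_for(lam):
        w = [math.exp(lam * i) for i in range(1, faces + 1)]
        z = sum(w)
        return sum(i * wi for i, wi in zip(range(1, faces + 1), w)) / z

    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_for(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    w = [math.exp(lam * i) for i in range(1, faces + 1)]
    z = sum(w)
    return [wi / z for wi in w]

# A die constrained to have mean 4.5 gets weights tilted toward high faces;
# with mean 3.5 (no real constraint) MaxEnt returns the uniform distribution.
p = maxent_die(4.5)
```

<p>The same logic, with more constraints and more categories, underlies the ecological applications: one maximizes entropy subject to whatever macroscopic totals (abundance, energy) the data supply.</p>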
<p>Recently <b><a href="http://socrates.berkeley.edu/~hartelab/">John Harte</a></b> has applied MaxEnt in this way to ecology, using it to make specific testable predictions for the distribution, abundance and energy usage of species across spatial scales and across habitats and taxonomic groups [Harte2008, Harte2009, Harte2011].  <b><a href="http://webapps.lsa.umich.edu/eeb/directory/faculty/aostling/">Annette Ostling</a></b> is an expert on other theories that attempt to explain the same data, such as the &#8216;neutral model&#8217; [AOE2008, ODLSG2009, O2005, O2012]. <b><a href="http://biology.anu.edu.au/roderick_dewar/">Dewar</a></b> has also used MaxEnt in ecology [D2008], and he has argued that it underlies the principle of Maximum Entropy Production.    </p>
<p><i>Thus, a second goal of this workshop is to familiarize all the participants with applications of the MaxEnt method to ecology, compare it with competing approaches, and study whether MaxEnt provides a sufficient justification for the principle of Maximum Entropy Production.</i></p>
<p>Entropy is not merely a predictive tool in ecology: it is also widely used as a measure of biodiversity.  Here Shannon&#8217;s original concept of entropy naturally generalizes to &#8216;R&eacute;nyi entropy&#8217;, which depends on a parameter <img src='https://s0.wp.com/latex.php?latex=%5Calpha+%5Cge+0&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;alpha &#92;ge 0' title='&#92;alpha &#92;ge 0' class='latex' />.  This equals</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+H_%5Calpha%28p%29+%3D+%5Cfrac%7B1%7D%7B1-%5Calpha%7D+%5Clog+%5Csum_i+p_i%5E%5Calpha++%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ H_&#92;alpha(p) = &#92;frac{1}{1-&#92;alpha} &#92;log &#92;sum_i p_i^&#92;alpha  } ' title='&#92;displaystyle{ H_&#92;alpha(p) = &#92;frac{1}{1-&#92;alpha} &#92;log &#92;sum_i p_i^&#92;alpha  } ' class='latex' /></p>
<p>where <img src='https://s0.wp.com/latex.php?latex=p_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p_i' title='p_i' class='latex' /> is the fraction of organisms of the <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' />th type (which could mean species, some other taxon, etc.).    In the limit <img src='https://s0.wp.com/latex.php?latex=%5Calpha+%5Cto+1&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;alpha &#92;to 1' title='&#92;alpha &#92;to 1' class='latex' /> this reduces to the Shannon entropy:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B++H%28p%29+%3D+-+%5Csum_i+p_i+%5Clog+p_i+%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{  H(p) = - &#92;sum_i p_i &#92;log p_i } ' title='&#92;displaystyle{  H(p) = - &#92;sum_i p_i &#92;log p_i } ' class='latex' /></p>
<p>As <img src='https://s0.wp.com/latex.php?latex=%5Calpha&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;alpha' title='&#92;alpha' class='latex' /> increases, we give less weight to rare types of organisms.  <b><a href="http://www.maths.gla.ac.uk/~cc/">Christina Cobbold</a></b> and <b><a href="http://www.maths.ed.ac.uk/~tl/">Tom Leinster</a></b> have described a systematic and highly flexible system of biodiversity measurement, with R&eacute;nyi entropy at its heart [CL2012].    They consider both the case where all we have are the numbers <img src='https://s0.wp.com/latex.php?latex=p_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p_i' title='p_i' class='latex' />, and the more subtle case where we take the distance between different types of organisms into account.  </p>
<p><b><a href="http://math.ucr.edu/home/baez/">John Baez</a></b> has explained the role of R&eacute;nyi entropy in thermodynamics [B2011], and together with <b><a href="http://www.maths.ed.ac.uk/~tl/">Tom Leinster</a></b> and <b><a href="http://users.icfo.es/Tobias.Fritz/">Tobias Fritz</a></b> he has proved other theorems characterizing entropy which explain its importance for information processing [BFL2011].  However, these ideas have not yet been connected to the widespread use of entropy in biodiversity studies.  More importantly, the use of entropy as a measure of biodiversity has not been clearly connected to MaxEnt methods in ecology.  Does the success of MaxEnt methods imply a tendency for ecosystems to maximize biodiversity subject to the constraints of resource availability?  This seems surprising, but a more nuanced statement along these general lines might be correct.    </p>
<p><i>So, a third goal of this workshop is to clarify relations between known characterizations of entropy, the use of entropy as a measure of biodiversity, and the use of MaxEnt methods in ecology.</i></p>
<p>As the amount of data to analyze in genomics continues to surpass the ability of humans to analyze it, we can expect automated experiment design to become ever more important.   In <b><a href="http://thinking.bioinformatics.ucla.edu/">Chris Lee</a></b> and <b><a href="http://people.mbi.ucla.edu/marcharper/">Marc Harper</a></b>’s RoboMendel program [LH2013], a mathematically precise concept of &#8216;potential information&#8217;&#8212;how much information is left to learn&#8212;plays a crucial role in deciding what experiment to do next, given the data obtained so far.  It will be useful for them to interact with <b><a href="http://www.princeton.edu/~wbialek/wbialek.html">William Bialek</a></b>, who has expertise in estimating entropy from empirical data and using it to constrain properties of models [BBS, BNS2001, BNS2002], and <b><a href="http://www2.hawaii.edu/~sstill/">Susanne Still</a></b>, who applies information theory to automated theory building and biology [CES2010, PS2012].</p>
<p>However, there is another link between biology and potential information.  <b><a href="http://people.mbi.ucla.edu/marcharper/">Harper</a></b> has noted that in an ecosystem where the population of each type of organism grows at a rate proportional to its fitness (which may depend on the fraction of organisms of each type), the quantity </p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+I%28q%7C%7Cp%29+%3D+%5Csum_i+q_i+%5Cln%28q_i%2Fp_i%29+%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ I(q||p) = &#92;sum_i q_i &#92;ln(q_i/p_i) } ' title='&#92;displaystyle{ I(q||p) = &#92;sum_i q_i &#92;ln(q_i/p_i) } ' class='latex' /></p>
<p>always decreases if there is an evolutionarily stable state [Harper2009].  Here <img src='https://s0.wp.com/latex.php?latex=p_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p_i' title='p_i' class='latex' /> is the fraction of organisms of the <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' />th genotype at a given time, while <img src='https://s0.wp.com/latex.php?latex=q_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='q_i' title='q_i' class='latex' /> is this fraction in the evolutionarily stable state.  This quantity is often called the Shannon information of <img src='https://s0.wp.com/latex.php?latex=q&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='q' title='q' class='latex' /> &#8216;relative to&#8217; <img src='https://s0.wp.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p' title='p' class='latex' />.  But in fact, it is precisely the same as <b><a href="http://thinking.bioinformatics.ucla.edu/">Lee</a></b> and <b><a href="http://people.mbi.ucla.edu/marcharper/">Harper</a></b>’s potential information!  Indeed, there is a precise mathematical analogy between evolutionary games and processes where a probabilistic hypothesis is refined by repeated experiments.  </p>
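<p>One can watch this happen numerically in a toy case. The sketch below (my own illustration; the Hawk&#8211;Dove payoffs are an assumption chosen so that the mixed state (1/2, 1/2) is an evolutionarily stable state, not an example from the proposal) integrates the replicator dynamics with Euler steps and records the relative entropy, which falls monotonically toward zero:</p>

```python
import math

def kl(q, p):
    """Relative entropy I(q||p) = sum_i q_i ln(q_i / p_i)."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def replicator_step(x, A, dt=0.01):
    """One Euler step of the replicator dynamics
    dx_i/dt = x_i * ((A x)_i - x . A x), renormalized onto the simplex."""
    f = [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]
    avg = sum(xi * fi for xi, fi in zip(x, f))
    x = [xi + dt * xi * (fi - avg) for xi, fi in zip(x, f)]
    s = sum(x)
    return [xi / s for xi in x]

# Hawk-Dove payoffs (V = 2, C = 4): the mixed state q = (1/2, 1/2) is an ESS.
A = [[-1.0, 2.0], [0.0, 1.0]]
q = [0.5, 0.5]

x = [0.9, 0.1]  # start far from the ESS
divergences = []
for _ in range(2000):
    divergences.append(kl(q, x))
    x = replicator_step(x, A)
# divergences decreases at every step: I(q||x) is a Lyapunov function.
```

<p>In the &#8216;learning&#8217; reading, each generation of selection plays the role of a Bayesian update, and the shrinking relative entropy measures how much potential information about the stable state remains to be gained.</p>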
<p><i>Thus, a fourth goal of this workshop is to develop the concept of evolutionary games as &#8216;learning&#8217; processes in which information is gained over time.</i>  </p>
<p>We shall try to synthesize this with <b><a href="http://octavia.zoology.washington.edu/">Carl Bergstrom</a></b> and <b><a href="http://www.matina.org/">Matina Donaldson-Matasci</a></b>’s work on the &#8216;fitness value of information&#8217;: a measure of how much increase in fitness a population can obtain per bit of extra information [BL2004, DBL2010, DM2013].   Following <b><a href="http://people.mbi.ucla.edu/marcharper/">Harper</a></b>, we shall consider not only relative Shannon entropy, but also relative R&eacute;nyi entropy, as a measure of information gain [Harper2011].</p>
<p><i>A fifth and final goal of this workshop is to study the interplay between information theory and the thermodynamics of individual cells and organelles.</i></p>
<p><b><a href="http://www2.hawaii.edu/~sstill/">Susanne Still</a></b> has studied the thermodynamics of prediction in biological systems [BCSS2012].   And in a celebrated related piece of work, <b><a href="http://web.mit.edu/physics/people/faculty/england_jeremy.html">Jeremy England</a></b> used thermodynamic arguments to derive a lower bound for the amount of entropy generated during a process of self-replication of a bacterial cell [England2013].  Interestingly, he showed that <i>E. coli</i> comes within a factor of 3 of this lower bound.   </p>
<p>In short, information theory and entropy methods are becoming powerful tools in biology, from the level of individual cells, to whole ecosystems, to experimental design, model-building, and the measurement of biodiversity. The time is ripe for an investigative workshop that brings together experts from different fields and lets them share insights and methods and begin to tackle some of the big remaining questions.</p>
<h3> Bibliography </h3>
<p>[AOE2008] D. Alonso, A. Ostling and R. Etienne, <a href="http://www-personal.umich.edu/~aostling/papers/alonso2008.pdf">The assumption of symmetry and species abundance distributions</a>, <i>Ecology Letters</i> <b>11</b> (2008), 93&#8211;105.</p>
<p>[TMMABB2012] D. Amodei, W. Bialek, M. J. Berry II, O. Marre, T. Mora and G. Tkacik, <a href="http://arxiv.org/abs/1207.6319">The simplest maximum entropy model for collective behavior in a neural network</a>, arXiv:1207.6319 (2012).</p>
<p>[B2011] J. Baez, <a href="http://arxiv.org/abs/1102.2098">R&eacute;nyi entropy and free energy</a>, arXiv:1102.2098 (2011).</p>
<p>[BFL2011] J. Baez, T. Fritz and T. Leinster, <a href="http://arxiv.org/abs/1106.1791">A characterization of entropy in terms of information loss</a>, <i>Entropy</i> <b>13</b> (2011), 1945&#8211;1957.</p>
<p>[BS2012] J. Baez and M. Stay, <a href="http://arxiv.org/abs/1010.2067">Algorithmic thermodynamics</a>, <i>Math. Struct. Comp. Sci.</i> <b>22</b> (2012), 771&#8211;787.</p>
<p>[BCSS2012] A. J. Bell, G. E. Crooks, S. Still and D. A Sivak, <a href="http://arxiv.org/abs/1203.3271">The thermodynamics of prediction</a>, <i>Phys. Rev. Lett.</i> <b>109</b> (2012), 120604.</p>
<p>[BL2004] C. T. Bergstrom and M. Lachmann, <a href="http://octavia.zoology.washington.edu/publications/BergstromAndLachmann04.pdf">Shannon information and biological fitness</a>, in <i>IEEE Information Theory Workshop 2004</i>, IEEE, 2004, pp. 50&#8211;54.</p>
<p>[BBS] M. J. Berry II, W. Bialek and E. Schneidman, <a href="http://arxiv.org/abs/physics/0212114">An information theoretic approach to the functional classification of neurons</a>, in <i>Advances in Neural Information Processing Systems 15</i>, MIT Press, 2005.</p>
<p>[BNS2001] W. Bialek, I. Nemenman and N. Tishby, <a href="http://www.princeton.edu/~wbialek/our_papers/bnt_01a.pdf">Predictability, complexity and learning</a>, <i>Neural Computation</i> <b>13</b> (2001), 2409&#8211;2463.</p>
<p>[BNS2002] W. Bialek, I. Nemenman and F. Shafee, <a href="http://books.nips.cc/papers/files/nips14/LT22.pdf">Entropy and inference, revisited</a>, in <i>Advances in Neural Information Processing Systems 14</i>, MIT Press, 2002.</p>
<p>[CL2012] C. Cobbold and T. Leinster, <a href="http://www.maths.ed.ac.uk/~tl/mdiss.pdf">Measuring diversity: the importance of species similarity</a>, <i>Ecology</i> <b>93</b> (2012), 477&#8211;489.</p>
<p>[CES2010] J. P. Crutchfield, S. Still and C. Ellison, <a href="http://arxiv.org/abs/0708.1580">Optimal causal inference: estimating stored information and approximating causal architecture</a>, <i>Chaos</i> <b>20</b> (2010), 037111.</p>
<p>[D2004] R. C. Dewar, Maximum entropy production and non-equilibrium statistical mechanics, in <i>Non-Equilibrium Thermodynamics and Entropy Production: Life, Earth and Beyond</i>, eds. A. Kleidon and R. Lorenz, Springer, New York, 2004, 41&#8211;55.</p>
<p>[DJZ2006] R. C. Dewar, D. Jureti&#263; and P. &#381;upanovi&#263;, <a href="http://www.pmfst.hr/~juretic/CPLETT23896.pdf">The functional design of the rotary enzyme ATP synthase is consistent with maximum entropy production</a>, <i>Chem. Phys. Lett.</i> <b>430</b> (2006), 177&#8211;182. </p>
<p>[D2008] R. C. Dewar and A. Port&eacute;, <a href="http://arxiv.org/abs/q-bio/0703061">Statistical mechanics unifies different ecological patterns</a>, <i>J. Theor. Bio.</i> <b>251</b> (2008), 389&#8211;403. </p>
<p>[D2009] R. C. Dewar, <a href="http://www.mdpi.com/1099-4300/11/4/931/pdf">Maximum entropy production as an inference algorithm that translates physical assumptions into macroscopic predictions: don&#8217;t shoot the messenger</a>, <i>Entropy</i> <b>11</b> (2009), 931&#8211;944. </p>
<p>[D2010] R. C. Dewar, <a href="http://rstb.royalsocietypublishing.org/content/365/1545/1429.full">Maximum entropy production and plant optimization theories</a>, <i>Phil. Trans. Roy. Soc. B</i> <b>365</b> (2010) 1429&#8211;1435.</p>
<p>[DBL2010] M. C. Donaldson-Matasci, C. T. Bergstrom and M. Lachmann, <a href="http://arxiv.org/abs/q-bio/0510007">The fitness value of information</a>, <i>Oikos</i> <b>119</b> (2010), 219&#8211;230.</p>
<p>[DM2013] M. C. Donaldson-Matasci, G. DeGrandi-Hoffman, and A. Dornhaus, Bigger is better: honey bee colonies as distributed information-gathering systems, <i>Animal Behaviour</i> <b>85</b> (2013), 585&#8211;592.</p>
<p>[England2013] J. L. England, <a href="http://arxiv.org/abs/1209.1179">Statistical physics of self-replication</a>, <i>J. Chem. Phys.</i> <b>139</b> (2013), 121923.</p>
<p>[ODLSG2009] J. P. O&#8217;Dwyer, J. K. Lake, A. Ostling, V. M. Savage and J. L. Green, <a href="http://www-personal.umich.edu/~aostling/papers/ODwyer2009.pdf">An integrative framework for stochastic, size-structured community assembly</a>, <i>PNAS</i> <b>106</b> (2009), 6170&#8211;6175.</p>
<p>[Harper2009] M. Harper, <a href="http://arxiv.org/abs/0911.1383">Information geometry and evolutionary game theory</a>, arXiv:0911.1383 (2009).</p>
<p>[Harper2011] M. Harper, <a href="http://arxiv.org/abs/0911.1764">Escort evolutionary game theory</a>, <i>Physica D</i> <b>240</b> (2011), 1411&#8211;1415.</p>
<p>[Harte2008] J. Harte, T. Zillio, E. Conlisk and A. Smith, Maximum entropy and the state-variable approach to macroecology, <i>Ecology</i> <b>89</b> (2008), 2700&#8211;2711.</p>
<p>[Harte2009] J. Harte, A. Smith and D. Storch, Biodiversity scales from plots to biomes with a universal species-area curve, <i>Ecology Letters</i> <b>12</b> (2009), 789–797.</p>
<p>[Harte2011] J. Harte, <i>Maximum Entropy and Ecology: A Theory of Abundance, Distribution, and Energetics</i>, Oxford U. Press, Oxford, 2011.</p>
<p>[LH2013] M. Harper and C. Lee, <a href="http://arxiv.org/abs/1210.4808">Basic experiment planning via information metrics: the RoboMendel problem</a>, arXiv:1210.4808 (2012).</p>
<p>[O2005] A. Ostling, <a href="http://www-personal.umich.edu/~aostling/papers/O2005.pdf">Neutral theory tested by birds</a>, <i>Nature</i> <b>436</b> (2005), 635.</p>
<p>[O2012] A. Ostling, <a href="http://www-personal.umich.edu/~aostling/papers/O2012fit.pdf">Do fitness-equalizing tradeoffs lead to neutral communities?</a>, <i>Theoretical Ecology</i> <b>5</b> (2012), 181&#8211;194. </p>
<p>[PS2012] D. Precup and S. Still, <a href="http://www2.hawaii.edu/~sstill/StillPrecup2011.pdf">An information-theoretic approach to curiosity-driven reinforcement learning</a>, <i>Theory in Biosciences</i> <b>131</b> (2012), 139&#8211;148.</p>
]]></html><thumbnail_url><![CDATA[https://i1.wp.com/www.utk.edu/tntoday/images/nimbios_logo_lg.jpg?fit=440%2C330]]></thumbnail_url><thumbnail_height><![CDATA[111]]></thumbnail_height><thumbnail_width><![CDATA[440]]></thumbnail_width></oembed>