<?xml version="1.0" encoding="UTF-8" standalone="yes"?><oembed><version><![CDATA[1.0]]></version><provider_name><![CDATA[Azimuth]]></provider_name><provider_url><![CDATA[https://johncarlosbaez.wordpress.com]]></provider_url><author_name><![CDATA[John Baez]]></author_name><author_url><![CDATA[https://johncarlosbaez.wordpress.com/author/johncarlosbaez/]]></author_url><title><![CDATA[Networks in Climate&nbsp;Science]]></title><type><![CDATA[link]]></type><html><![CDATA[<p>What follows is a draft of a talk I&#8217;ll be giving at the <a href="http://nips.cc/Conferences/2014/Program/schedule.php?Session=Conference%20Sessions">Neural Information Processing Seminar</a> on December 10th.  The actual talk may contain more stuff&#8212;for example, more work that Dara Shayda has done.  But I&#8217;d love comments <i>now,</i> so I&#8217;m posting this now and hoping you can help out.</p>
<p>You can click on any of the pictures to see where it came from or get more information.</p>
<h3> Preliminary throat-clearing</h3>
<p>I&#8217;m very flattered to be invited to speak here.  I was probably invited because of my abstract mathematical work on networks and category theory.  But when I got the invitation, instead of talking about something I understood, I thought I&#8217;d learn about something a bit more practical and talk about that.  That was a bad idea.  But I&#8217;ll try to make the best of it.</p>
<p>I&#8217;ve been trying to learn climate science.  There&#8217;s a subject called &#8216;complex networks&#8217; where people do statistical analyses of large graphs like the World Wide Web or Facebook and draw conclusions from them.  People are trying to apply these ideas to climate science.  So that&#8217;s what I&#8217;ll talk about.   I&#8217;ll be reviewing a lot of other people&#8217;s work, but also describing some work by a project I&#8217;m involved in, the <a href="http://www.azimuthproject.org/azimuth/show/HomePage">Azimuth Project</a>.</p>
<p>The Azimuth Project is an all-volunteer project involving scientists and programmers, many outside academia, who are concerned about environmental issues and want to use their skills to help.  This talk is based on the work of many people in the Azimuth Project, including Jan Galkowski, Graham Jones, Nadja Kutz, Daniel Mahler, Blake Pollard, Paul Pukite, Dara Shayda, David Tanzer, David Tweed, Steve Wenner and others.  Needless to say, I&#8217;m to blame for all the mistakes.</p>
<h3> Climate variability and El Ni&ntilde;o</h3>
<p>Okay, let&#8217;s get started.</p>
<p>You&#8217;ve probably heard about the &#8216;global warming pause&#8217;.  Is this a real thing?  If so, is it due to &#8216;natural variability&#8217;, heat going into the deep oceans, some combination of both, a massive failure of our understanding of climate processes, or something else?</p>
<p>Here is a chart of global average air temperatures at sea level, put together by NASA&#8217;s Goddard Institute for Space Studies:</p>
<div align="center">
<a href="http://data.giss.nasa.gov/gistemp/graphs_v3/"><br />
<img width="400" src="https://i2.wp.com/math.ucr.edu/home/baez/climate_networks/gistemp_1880-2013.jpg" alt="" /></a></div>
<p>You can see a lot of fluctuations, including a big dip after 1940 and a tiny dip after 2000.  That tiny dip is the so-called &#8216;global warming pause&#8217;.   What causes these fluctuations?  That&#8217;s a big, complicated question.</p>
<p>One cause of temperature fluctuations is a kind of cycle whose extremes are called El Ni&ntilde;o and La Ni&ntilde;a.</p>
<div align="center"><a href="http://www.ncdc.noaa.gov/sotc/global/2012/13"><img width="440" src="https://i0.wp.com/math.ucr.edu/home/baez/climate_networks/ENSO_global_temperature_anomalies.png" alt="" /></a></div>
<p>A lot of things happen during an El Ni&ntilde;o.    For example, in 1997 and 1998, a big El Ni&ntilde;o, we saw all these events:</p>
<p><a href="https://www.shrimpnews.com/FreeReportsFolder/WeatherFolder/ElNino.html"><img width="440" src="https://i1.wp.com/math.ucr.edu/home/baez/climate_networks/ElNinoMap1998.jpg" alt="" /></a></p>
<p>El Ni&ntilde;o is part of an irregular cycle that happens every 3 to 7 years, called the <strong>El Ni&ntilde;o Southern Oscillation</strong> or <strong>ENSO</strong>.  Two strongly correlated signs of an El Ni&ntilde;o are:</p>
<p>1) Increased sea surface temperatures in a patch of the Pacific called the Ni&ntilde;o 3.4 region.  The <b>temperature anomaly</b> in this region&#8212;how much warmer it is than usual for that time of year&#8212;is called the <b><a href="http://www.azimuthproject.org/azimuth/show/ENSO#Nino3.4">Ni&ntilde;o 3.4 index</a></b>.</p>
<div align="center">
<img width="450" src="https://i2.wp.com/math.ucr.edu/home/baez/climate_networks/nino3.4_region.jpg" /></div>
<p>2) An increase in air pressures on the <i>western</i> side of the Pacific compared to those further <i>east</i>.  This is measured by the <a href="http://www.azimuthproject.org/azimuth/show/ENSO#SOI"><b>Southern Oscillation Index</b> or <b>SOI</b></a>.</p>
<p>You can see the correlation here:</p>
<div align="center"><a href="http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensocycle/soi.shtml"><img width="440" src="https://i0.wp.com/math.ucr.edu/home/baez/ecological/el_nino/soi_nino34.gif" /></a></div>
<p>El Ni&ntilde;os are important because they can cause billions of dollars of economic damage.  They also seem to bring heat stored in the deeper waters of the Pacific into the atmosphere.  So, one reason for the &#8216;global warming pause&#8217; may be that we haven&#8217;t had a strong El Ni&ntilde;o <a href="http://ggweather.com/enso/oni.htm">since 1998</a>.   The global warming pause might end with the next El Ni&ntilde;o.  For a while it seemed we were due for a big one this fall, but that hasn&#8217;t happened.</p>
<h3> Teleconnections</h3>
<p>The ENSO cycle is just one of many cycles involving <strong>teleconnections</strong>: strong correlations between weather at distant locations, typically thousands of kilometers.  People have systematically looked for these teleconnections using <a href="https://en.wikipedia.org/wiki/Principal_component_analysis">principal component analysis</a> of climate data, and also other techniques.</p>
<p>The ENSO cycle shows up automatically when you do this kind of study.  It stands out as the biggest source of climate variability on time scales greater than a year and less than a decade.  Some others include:</p>
<p>&bull; The <a href="http://www.ncdc.noaa.gov/teleconnections/pna.php">Pacific-North America Oscillation</a>.<br />
&bull; The <a href="http://www.ncdc.noaa.gov/teleconnections/pdo/">Pacific Decadal Oscillation</a>.<br />
&bull; The <a href="http://www.ncdc.noaa.gov/teleconnections/nao.php">North Atlantic Oscillation</a>.<br />
&bull; The <a href="http://www.ncdc.noaa.gov/teleconnections/ao.php">Arctic Oscillation</a>.</p>
<p>For example, the Pacific Decadal Oscillation is a longer-period relative of the ENSO, centered in the north Pacific:</p>
<div align="center">
<a href="http://jisao.washington.edu/pdo/"><img width="440" src="https://i0.wp.com/math.ucr.edu/home/baez/climate_networks/pacific_decadal_oscillation.jpg" /></a></div>
<h3> Complex network theory</h3>
<p>Recently people have begun to study teleconnections using ideas from &#8216;complex network theory&#8217;.</p>
<p>What&#8217;s that?  In complex network theory, people often start with a <strong>weighted graph</strong>: that is, a set <img src='https://s0.wp.com/latex.php?latex=N&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='N' title='N' class='latex' /> of <strong>nodes</strong> and for any pair of nodes <img src='https://s0.wp.com/latex.php?latex=i%2C+j+%5Cin+N%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i, j &#92;in N,' title='i, j &#92;in N,' class='latex' /> a <strong>weight</strong> <img src='https://s0.wp.com/latex.php?latex=A_%7Bi+j%7D%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='A_{i j},' title='A_{i j},' class='latex' /> which can be any nonnegative real number.</p>
<p>Why is this called a weighted graph?  It&#8217;s really just a matrix of nonnegative real numbers!</p>
<p>The reason is that we can turn any weighted graph into a graph by drawing an edge from node <img src='https://s0.wp.com/latex.php?latex=j&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='j' title='j' class='latex' /> to node <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' /> whenever <img src='https://s0.wp.com/latex.php?latex=A_%7Bi+j%7D+%3E0.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='A_{i j} &gt;0.' title='A_{i j} &gt;0.' class='latex' />   This is a <strong>directed</strong> graph, meaning that we should draw an arrow pointing from <img src='https://s0.wp.com/latex.php?latex=j&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='j' title='j' class='latex' /> to <img src='https://s0.wp.com/latex.php?latex=i.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i.' title='i.' class='latex' />  We could have an edge from <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' /> to <img src='https://s0.wp.com/latex.php?latex=j&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='j' title='j' class='latex' /> but not vice versa!   Note that we can also have an edge from a node to itself.</p>
<p>Conversely, if we have any directed graph, we can turn it into a weighted graph by choosing the weight <img src='https://s0.wp.com/latex.php?latex=A_%7Bi+j%7D+%3D+1&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='A_{i j} = 1' title='A_{i j} = 1' class='latex' /> when there&#8217;s an edge from <img src='https://s0.wp.com/latex.php?latex=j&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='j' title='j' class='latex' /> to <img src='https://s0.wp.com/latex.php?latex=i%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i,' title='i,' class='latex' /> and <img src='https://s0.wp.com/latex.php?latex=A_%7Bi+j%7D+%3D+0&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='A_{i j} = 0' title='A_{i j} = 0' class='latex' /> otherwise.</p>
<p>For example, we can make a weighted graph where the nodes are web pages and <img src='https://s0.wp.com/latex.php?latex=A_%7Bi+j%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='A_{i j}' title='A_{i j}' class='latex' /> is the number of links from the web page <img src='https://s0.wp.com/latex.php?latex=j&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='j' title='j' class='latex' /> to the web page <img src='https://s0.wp.com/latex.php?latex=i.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i.' title='i.' class='latex' /></p>
<p>People in complex network theory like examples of this sort: large weighted graphs that describe connections between web pages, or people, or cities, or neurons, or other things.  The goal, so far, is to compute numbers from weighted graphs in ways that describe interesting properties of these complex networks&#8212;and then formulate and test hypotheses about the complex networks we see in real life.</p>
<h3> The El Ni&ntilde;o basin</h3>
<p>Here&#8217;s a very simple example of what we can do with a weighted graph.  For any node <img src='https://s0.wp.com/latex.php?latex=i%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i,' title='i,' class='latex' /> we can sum up the weights of edges going into <img src='https://s0.wp.com/latex.php?latex=i%3A&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i:' title='i:' class='latex' /></p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Csum_%7Bj+%5Cin+N%7D+A_%7Bi+j%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;sum_{j &#92;in N} A_{i j} ' title='&#92;sum_{j &#92;in N} A_{i j} ' class='latex' /></p>
<p>This is called the <strong>degree</strong> of the node <img src='https://s0.wp.com/latex.php?latex=i.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i.' title='i.' class='latex' />  For example, if lots of people have web pages with lots of links to yours, your webpage will have a high degree.  If lots of people like you on Facebook, <em>you</em> will have a high degree.</p>
<p>So, the degree is some measure of how &#8216;important&#8217; a node is.</p>
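<p>To make these definitions concrete, here is a minimal Python sketch.  It assumes the convention used above, where the entry in row i and column j is the weight of the edge from node j to node i.  The matrix entries and the function names <code>directed_edges</code> and <code>degree</code> are made up for illustration.</p>

```python
# Weighted graph stored as a matrix: A[i][j] is the weight of the edge
# from node j to node i.  The numbers are invented for illustration.
A = [
    [0.0, 2.0, 1.0],
    [0.0, 0.0, 3.0],
    [0.5, 0.0, 0.0],
]

def directed_edges(A):
    """List (source, target) pairs: an edge runs from j to i whenever A[i][j] is positive."""
    n = len(A)
    return [(j, i) for i in range(n) for j in range(n) if A[i][j] > 0]

def degree(A, i):
    """Sum of the weights of all edges going into node i, i.e. the i-th row sum."""
    return sum(A[i])

edges = directed_edges(A)
degrees = [degree(A, i) for i in range(len(A))]
print(edges)    # [(1, 0), (2, 0), (2, 1), (0, 2)]
print(degrees)  # [3.0, 3.0, 0.5]
```

<p>In this toy example nodes 0 and 1 tie for the highest degree, while node 2 receives only one weak edge.</p>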
<p>People have constructed climate networks where the nodes are locations on the Earth&#8217;s surface, and the weight <img src='https://s0.wp.com/latex.php?latex=A_%7Bi+j%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='A_{i j}' title='A_{i j}' class='latex' /> measures how correlated the weather is at the <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' />th and <img src='https://s0.wp.com/latex.php?latex=j&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='j' title='j' class='latex' />th location.  Then, the degree says how &#8216;important&#8217; a given location is for the Earth&#8217;s climate&#8212;in some vague sense.</p>
<p>For example, in <a href="https://www.pik-potsdam.de/members/kurths/publikationen/2009/complex-networks.pdf">Complex networks in climate dynamics</a>, Donges <em>et al</em> take surface air temperature data on a grid and compute the correlation between grid points.</p>
<p>More precisely, let <img src='https://s0.wp.com/latex.php?latex=T_i%28t%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='T_i(t)' title='T_i(t)' class='latex' /> be the temperature at the <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' />th grid point at month <img src='https://s0.wp.com/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='t' title='t' class='latex' /> after the average for that month in all years under consideration has been subtracted off, to eliminate some seasonal variations.    They compute the Pearson correlation <img src='https://s0.wp.com/latex.php?latex=A_%7Bi+j%7D&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='A_{i j}' title='A_{i j}' class='latex' /> of <img src='https://s0.wp.com/latex.php?latex=T_i%28t%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='T_i(t)' title='T_i(t)' class='latex' /> and <img src='https://s0.wp.com/latex.php?latex=T_j%28t%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='T_j(t)' title='T_j(t)' class='latex' /> for each pair of grid points <img src='https://s0.wp.com/latex.php?latex=i%2C+j.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i, j.' title='i, j.' class='latex' />  The <a href="https://en.wikipedia.org/w/index.php?title=Pearson_product-moment_correlation_coefficient">Pearson correlation</a> is the simplest measure of linear correlation, normalized to range between -1 and 1.</p>
<p>We could construct a weighted graph this way, and it would be symmetric, or undirected:</p>
<p><img src='https://s0.wp.com/latex.php?latex=A_%7Bi+j%7D+%3D+A_%7Bj+i%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='A_{i j} = A_{j i} ' title='A_{i j} = A_{j i} ' class='latex' /></p>
<p>However, Donges <em>et al</em> prefer to work with a graph rather than a weighted graph. So, they create a graph where there is an edge from <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' /> to <img src='https://s0.wp.com/latex.php?latex=j&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='j' title='j' class='latex' /> (and also from <img src='https://s0.wp.com/latex.php?latex=j&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='j' title='j' class='latex' /> to <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' />) when <img src='https://s0.wp.com/latex.php?latex=%7CA_%7Bi+j%7D%7C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='|A_{i j}|' title='|A_{i j}|' class='latex' /> exceeds a certain threshold, and no edge otherwise.</p>
<p>They can adjust this threshold so that any desired fraction of pairs <img src='https://s0.wp.com/latex.php?latex=i%2C+j&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i, j' title='i, j' class='latex' /> actually have an edge between them.  After some experimentation they chose this fraction to be 0.5%.</p>
<p><a href="https://www.pik-potsdam.de/members/kurths/publikationen/2009/complex-networks.pdf"><br />
<img width="440" src="https://i2.wp.com/math.ucr.edu/home/baez/climate_networks/area_weighted_connectivity_pearson_correlation_sea_surface_air_temperature_donges.jpg" alt="" /></a></p>
<p>A certain patch dominates the world!  This is the <strong>El Ni&ntilde;o basin</strong>.  The Indian Ocean comes in second.</p>
<p>(Some details, which I may not say:</p>
<p>The <strong>Pearson correlation</strong> is the <strong>covariance</strong></p>
<p><img src='https://s0.wp.com/latex.php?latex=%5CBig%5Clangle+%5Cleft%28+T_i+-+%5Clangle+T_i+%5Crangle+%5Cright%29+%5Cleft%28+T_j+-+%5Clangle+T_j+%5Crangle+%5Cright%29+%5CBig%5Crangle+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;Big&#92;langle &#92;left( T_i - &#92;langle T_i &#92;rangle &#92;right) &#92;left( T_j - &#92;langle T_j &#92;rangle &#92;right) &#92;Big&#92;rangle ' title='&#92;Big&#92;langle &#92;left( T_i - &#92;langle T_i &#92;rangle &#92;right) &#92;left( T_j - &#92;langle T_j &#92;rangle &#92;right) &#92;Big&#92;rangle ' class='latex' /></p>
<p>normalized by dividing by the standard deviation of <img src='https://s0.wp.com/latex.php?latex=T_i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='T_i' title='T_i' class='latex' /> and the standard deviation of <img src='https://s0.wp.com/latex.php?latex=T_j.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='T_j.' title='T_j.' class='latex' /></p>
<p>The reddest shade of red in the above picture shows nodes that are connected to 5% or more of the other nodes.  These nodes are connected to at least 10 times as many nodes as average.)</p>
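<p>The construction just described can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not the code used by Donges <em>et al</em>: the toy temperature series are invented, the function names are hypothetical, and the threshold is chosen simply by keeping the strongest pairs until the desired fraction of pairs is linked.</p>

```python
import math

def monthly_anomalies(series):
    """Subtract from each monthly value the mean of that calendar month over all years."""
    means = []
    for m in range(12):
        vals = series[m::12]
        means.append(sum(vals) / len(vals))
    return [x - means[t % 12] for t, x in enumerate(series)]

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

def threshold_network(anoms, density):
    """Link the pairs with the largest |correlation| so that about `density` of all pairs get an edge."""
    n = len(anoms)
    scored = sorted(
        ((abs(pearson(anoms[i], anoms[j])), i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    keep = max(1, round(density * len(scored)))
    return {(i, j) for _, i, j in scored[:keep]}

# Toy data, invented for illustration: 4 grid points, 36 months.
months = 36
season = [10.0 * math.cos(2 * math.pi * t / 12) for t in range(months)]
wiggle_a = [((7 * t) % 11 - 5) / 10 for t in range(months)]
wiggle_b = [((5 * t * t) % 13 - 6) / 10 for t in range(months)]
temps = [
    [s + a for s, a in zip(season, wiggle_a)],
    [s + 0.5 * a for s, a in zip(season, wiggle_a)],
    [s + b for s, b in zip(season, wiggle_b)],
    [s + b - a for s, a, b in zip(season, wiggle_a, wiggle_b)],
]
anoms = [monthly_anomalies(T) for T in temps]
network = threshold_network(anoms, density=0.2)  # the text uses 0.005 on a real grid
print(network)
```

<p>Grid points 0 and 1 share the same fluctuations once the seasonal cycle is removed, so theirs is the one link that survives the cut.</p>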
<p>The Pearson correlation detects linear correlations.  A more flexible measure is <a href="https://en.wikipedia.org/wiki/Mutual_information">mutual information</a>: how many bits of information knowing the temperature at time <img src='https://s0.wp.com/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='t' title='t' class='latex' /> at grid point <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' /> tells you about the temperature at the same time at grid point <img src='https://s0.wp.com/latex.php?latex=j.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='j.' title='j.' class='latex' /></p>
<p>Donges <em>et al</em> create a climate network this way as well, putting an edge between nodes if their mutual information exceeds a certain cutoff.  They choose this cutoff so that 0.5% of node pairs have an edge between them, and get the following map:</p>
<p><a href="https://www.pik-potsdam.de/members/kurths/publikationen/2009/complex-networks.pdf"><br />
<img width="500" src="https://i0.wp.com/math.ucr.edu/home/baez/climate_networks/area_weighted_connectivity_mutual_information_sea_surface_air_temperature_donges.jpg" alt="" /></a></p>
<p>The result is almost indistinguishable in the El Ni&ntilde;o basin.  So, this feature is not just an artifact of focusing on linear correlations.</p>
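<p>To see why mutual information is more flexible, here is a minimal Python sketch of one simple way to estimate it: bin both series and compare the joint histogram with the product of the marginal histograms.  The equal-width binning, the number of bins and the function name are my assumptions; the text does not say which estimator Donges <em>et al</em> used.</p>

```python
import math
from collections import Counter

def mutual_information_bits(x, y, bins=4):
    """Estimate the mutual information of two series in bits, using equal-width bins."""
    def digitize(v):
        lo, hi = min(v), max(v)
        width = (hi - lo) / bins or 1.0  # guard against a constant series
        return [min(int((a - lo) / width), bins - 1) for a in v]

    bx, by = digitize(x), digitize(y)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum of p(a,b) * log2( p(a,b) / (p(a) p(b)) ) over the occupied bins.
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )

# Invented example: y depends on x, but not linearly.
x = [-2, -1, 0, 1, 2] * 4
y = [a * a for a in x]
mi_nonlinear = mutual_information_bits(x, y)

# Invented example: every combination occurs equally often, so x says nothing about y.
mi_independent = mutual_information_bits([0, 0, 1, 1], [0, 1, 0, 1])
print(mi_nonlinear, mi_independent)
```

<p>In the first example the Pearson correlation of x and y is zero by symmetry, yet the estimated mutual information is clearly positive.  In the second it is zero.</p>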
<h3> El Ni&ntilde;o breaks climate links</h3>
<p>We can also look at how climate networks change with time&#8212;and in particular, how they are affected by El Ni&ntilde;os.  This is the subject of a 2008 paper by Tsonis and Swanson, <a href="https://pantherfile.uwm.edu/aatsonis/www/publications/2008-06_Tsonis-AA_TopologyandPredictabilityofElNinoandLaNinaNetworks-2.pdf">Topology and predictability of El Ni&ntilde;o and La Ni&ntilde;a networks</a>.</p>
<p>They create a climate network in a way that&#8217;s similar to the one I just described. The main differences are that they:</p>
<ol>
<li>separately create climate networks for El Ni&ntilde;o and La Ni&ntilde;a time periods;</li>
<li>
<p>create a link between grid points when their Pearson correlation has absolute value greater than 0.5;</p>
</li>
<li>
<p>only use temperature data from November to March in each year, claiming that summertime introduces spurious links.</p>
</li>
</ol>
<p>They get this map for La Ni&ntilde;a conditions:</p>
<p><a href="https://pantherfile.uwm.edu/aatsonis/www/publications/2008-06_Tsonis-AA_TopologyandPredictabilityofElNinoandLaNinaNetworks-2.pdf"><br />
<img width="450" src="https://i2.wp.com/math.ucr.edu/home/baez/climate_networks/climate_backbone_la_nina_tsonis.jpg" alt="" /></a></p>
<p>and this map for El Ni&ntilde;o conditions:</p>
<p><a href="https://pantherfile.uwm.edu/aatsonis/www/publications/2008-06_Tsonis-AA_TopologyandPredictabilityofElNinoandLaNinaNetworks-2.pdf"><br />
<img width="450" src="https://i2.wp.com/math.ucr.edu/home/baez/climate_networks/climate_backbone_el_nino_tsonis.jpg" alt="" /></a></p>
<p>They conclude that &#8220;El Ni&ntilde;o breaks climate links&#8221;.</p>
<p>This may seem to contradict what I just said a minute ago.  But it doesn&#8217;t!   While the El Ni&ntilde;o basin is a region where the surface air temperatures are <em>highly correlated</em> to temperatures at many other points, when an El Ni&ntilde;o actually occurs it <em>disrupts</em> correlations between temperatures at different locations worldwide&#8212;and even in the El Ni&ntilde;o basin!</p>
<p>For the rest of the talk I want to focus on a third claim: namely, that El Ni&ntilde;os can be <em>predicted</em> by means of an <em>increase</em> in correlations between temperatures <em>within</em> the El Ni&ntilde;o basin and temperatures <em>outside</em> this region.  This claim was made in a recent paper by Ludescher <em>et al</em>.  I want to examine it somewhat critically.</p>
<h3>  Predicting El Ni&ntilde;os</h3>
<p>People really want to <em>predict</em> El Ni&ntilde;os, because they have huge effects on agriculture, especially around the Pacific Ocean.  However, it&#8217;s generally regarded as very hard to predict El Ni&ntilde;os more than 6 months in advance.   There is also a <strong>spring barrier</strong>: it&#8217;s harder to predict El Ni&ntilde;os through the spring of any year.</p>
<p>It&#8217;s controversial how much of the unpredictability in the ENSO cycle is due to chaos intrinsic to the Pacific Ocean system, and how much is due to noise from outside the system.  Both may be involved.</p>
<p>There are many teams trying to predict El Ni&ntilde;os, some using physical models of the Earth&#8217;s climate, and others using machine learning techniques.  There is a kind of competition going on, which you can see at a <a href="http://www.pmel.noaa.gov/tao/elnino/forecasts.html">National Oceanic and Atmospheric Administration website</a>.</p>
<p>The most recent predictions give a sense of how hard this job is:</p>
<p><a href="http://iri.columbia.edu/our-expertise/climate/forecasts/enso/current/"><img width="440" src="https://i1.wp.com/math.ucr.edu/home/baez/climate_networks/2014-11-20-Nino34-predictions.jpg" alt="" /></a></p>
<p>When the 3-month running average of the Ni&ntilde;o 3.4 index exceeds 0.5&deg;C for 5 months, we officially declare that there is an <strong>El Ni&ntilde;o</strong>.</p>
<p>As you can see, it&#8217;s hard to be sure if there will be an El Ni&ntilde;o early next year!  However, the consensus forecast is <em>yes, a weak El Ni&ntilde;o</em>.  This is the best we can do, now.  Right now multi-model ensembles have better predictive skill than any one model.</p>
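<p>The official criterion mentioned above is easy to turn into code.  Here is a minimal Python sketch, assuming that &#8216;for 5 months&#8217; means five <em>consecutive</em> values of the 3-month running average above 0.5&deg;C.  The index values and function names are made up for illustration.</p>

```python
def running_mean_3(index):
    """Centered 3-month running average; the first and last months are dropped."""
    return [(index[k - 1] + index[k] + index[k + 1]) / 3 for k in range(1, len(index) - 1)]

def el_nino_months(index, threshold=0.5, min_run=5):
    """Flag smoothed months lying in a run of at least `min_run` consecutive values above `threshold`."""
    smooth = running_mean_3(index)
    flags = [False] * len(smooth)
    start = None
    for k, v in enumerate(smooth + [float("-inf")]):  # the sentinel closes a trailing run
        if v > threshold:
            if start is None:
                start = k
        elif start is not None:
            if k - start >= min_run:
                for m in range(start, k):
                    flags[m] = True
            start = None
    return flags

# Invented monthly Nino 3.4 values in degrees Celsius.
nino34 = [0.0, 0.2, 0.6, 0.9, 1.1, 1.2, 1.0, 0.8, 0.6, 0.0, -0.2, 0.7, 0.8, 0.1]
flags = el_nino_months(nino34)
print(flags)
```

<p>Here the long warm spell counts as an El Ni&ntilde;o, while the brief warm blip at the end does not, because its running average stays above the threshold for only one month.</p>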
<h3> The work of Ludescher <i>et al</i></h3>
<p>The Azimuth Project has carefully examined a 2014 paper by Ludescher <em>et al</em> called <a href="http://www.climatelinc.eu/fileadmin/UG_ADVANCED/Publications/BIU_-_Avi__Halvin__et_al-Very_early_warning_of_next_El_Nino.pdf">Very early warning of next El Niño</a>, which uses a climate network for El Ni&ntilde;o prediction.</p>
<p>They build their climate network using correlations between daily surface air temperatures at points inside the El Niño basin and at certain points outside this region, as shown here:</p>
<p><a href="http://www.climatelinc.eu/fileadmin/UG_ADVANCED/Publications/BIU_-_Avi__Halvin__et_al-Very_early_warning_of_next_El_Nino.pdf"><br />
<img width="450" src="https://i0.wp.com/math.ucr.edu/home/baez/ecological/el_nino/ludescher_el_nino_cooperativity_1a.jpg" alt="" /></a></p>
<p>The red dots are the points in their version of the El Ni&ntilde;o basin.</p>
<p>(Next I will describe Ludescher&#8217;s procedure.  I may omit some details in the actual talk, but let me include them here.)</p>
<p>The main idea of Ludescher <em>et al</em> is to construct a climate network that is a weighted graph, and to say an El Ni&ntilde;o will occur if the average weight of edges between points <em>in</em> the El Ni&ntilde;o basin and points <em>outside</em> this basin exceeds a certain threshold.</p>
<p>As in the other papers I mentioned, Ludescher <em>et al</em> let <img src='https://s0.wp.com/latex.php?latex=T_i%28t%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='T_i(t)' title='T_i(t)' class='latex' /> be the surface air temperature at the <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' />th grid point at time <img src='https://s0.wp.com/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='t' title='t' class='latex' /> minus the average temperature at that location at that time of year in all years under consideration, to eliminate the most obvious seasonal effects.</p>
<p>They consider a <strong>time-delayed covariance</strong> between temperatures at different grid points:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Clangle+T_i%28t%29+T_j%28t+-+%5Ctau%29+%5Crangle+-+%5Clangle+T_i%28t%29+%5Crangle+%5Clangle+T_j%28t+-+%5Ctau%29+%5Crangle++&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;langle T_i(t) T_j(t - &#92;tau) &#92;rangle - &#92;langle T_i(t) &#92;rangle &#92;langle T_j(t - &#92;tau) &#92;rangle  ' title='&#92;langle T_i(t) T_j(t - &#92;tau) &#92;rangle - &#92;langle T_i(t) &#92;rangle &#92;langle T_j(t - &#92;tau) &#92;rangle  ' class='latex' /></p>
<p>where <img src='https://s0.wp.com/latex.php?latex=%5Ctau&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;tau' title='&#92;tau' class='latex' /> is a time delay, and the angle brackets denote a running average over the last year, that is:</p>
<p><img src='https://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7B+%5Clangle+f%28t%29+%5Crangle+%3D+%5Cfrac%7B1%7D%7B365%7D+%5Csum_%7Bd+%3D+0%7D%5E%7B364%7D+f%28t+-+d%29+%7D+&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;displaystyle{ &#92;langle f(t) &#92;rangle = &#92;frac{1}{365} &#92;sum_{d = 0}^{364} f(t - d) } ' title='&#92;displaystyle{ &#92;langle f(t) &#92;rangle = &#92;frac{1}{365} &#92;sum_{d = 0}^{364} f(t - d) } ' class='latex' /></p>
<p>where <img src='https://s0.wp.com/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='t' title='t' class='latex' /> is the time in days.</p>
<p>They normalize this to define a correlation <img src='https://s0.wp.com/latex.php?latex=C_%7Bi%2Cj%7D%5Et%28%5Ctau%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='C_{i,j}^t(&#92;tau)' title='C_{i,j}^t(&#92;tau)' class='latex' /> that ranges from -1 to 1.</p>
<p>Next, for any pair of nodes <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' /> and <img src='https://s0.wp.com/latex.php?latex=j%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='j,' title='j,' class='latex' /> and for each time <img src='https://s0.wp.com/latex.php?latex=t%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='t,' title='t,' class='latex' /> they determine the maximum, the mean and the standard deviation of <img src='https://s0.wp.com/latex.php?latex=%7CC_%7Bi%2Cj%7D%5Et%28%5Ctau%29%7C%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='|C_{i,j}^t(&#92;tau)|,' title='|C_{i,j}^t(&#92;tau)|,' class='latex' /> as the delay <img src='https://s0.wp.com/latex.php?latex=%5Ctau&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;tau' title='&#92;tau' class='latex' /> ranges from -200 to 200 days.</p>
<p>They define the <b>link strength</b> <img src='https://s0.wp.com/latex.php?latex=S_%7Bi%2Cj%7D%28t%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='S_{i,j}(t)' title='S_{i,j}(t)' class='latex' /> as the difference between the maximum and the mean value of <img src='https://s0.wp.com/latex.php?latex=%7CC_%7Bi%2Cj%7D%5Et%28%5Ctau%29%7C%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='|C_{i,j}^t(&#92;tau)|,' title='|C_{i,j}^t(&#92;tau)|,' class='latex' /> divided by its standard deviation.</p>
<p>Finally, they let <img src='https://s0.wp.com/latex.php?latex=S%28t%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='S(t)' title='S(t)' class='latex' /> be the <b>average link strength</b>, calculated by averaging <img src='https://s0.wp.com/latex.php?latex=S_%7Bi+j%7D%28t%29&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='S_{i j}(t)' title='S_{i j}(t)' class='latex' /> over all pairs <img src='https://s0.wp.com/latex.php?latex=i%2Cj&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i,j' title='i,j' class='latex' /> where <img src='https://s0.wp.com/latex.php?latex=i&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='i' title='i' class='latex' /> is a grid point <em>inside</em> their El Niño basin and <img src='https://s0.wp.com/latex.php?latex=j&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='j' title='j' class='latex' /> is a grid point <em>outside</em> this basin, but still in their larger rectangle.</p>
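<p>The procedure above can be sketched in Python as follows.  This is only my reading of the steps, not the code of Ludescher <em>et al</em>.  I assume that the covariance is normalized by the running standard deviations of the two windows, and that a negative delay is handled by swapping the roles of the two grid points, so that only past data is needed.  The toy series and function names are invented.</p>

```python
import math

def window(series, t, shift, width):
    """Values series[t - shift - d] for d = 0 .. width - 1: a running window ending at t - shift."""
    return [series[t - shift - d] for d in range(width)]

def correlation(a, b):
    """Running covariance <ab> - <a><b>, divided by the product of the running standard deviations."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum(x * y for x, y in zip(a, b)) / n - ma * mb
    va = sum(x * x for x in a) / n - ma * ma
    vb = sum(y * y for y in b) / n - mb * mb
    return cov / math.sqrt(va * vb)

def link_strength(Ti, Tj, t, width=365, max_delay=200):
    """(max - mean) / standard deviation of |C(tau)| as tau runs from -max_delay to max_delay."""
    values = []
    for tau in range(-max_delay, max_delay + 1):
        if tau >= 0:
            c = correlation(window(Ti, t, 0, width), window(Tj, t, tau, width))
        else:  # assumption: a negative delay swaps the roles of i and j
            c = correlation(window(Tj, t, 0, width), window(Ti, t, -tau, width))
        values.append(abs(c))
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return (max(values) - mean) / std

def average_link_strength(inside, outside, t, **kw):
    """Average the link strength over all pairs (i inside the basin, j outside it)."""
    pairs = [(Ti, Tj) for Ti in inside for Tj in outside]
    return sum(link_strength(Ti, Tj, t, **kw) for Ti, Tj in pairs) / len(pairs)

# Invented data: Ti at time t equals Tj three days earlier, so |C| peaks at tau = 3.
base = [((37 * t * t + 11 * t) % 101) / 101 - 0.5 for t in range(70)]
Ti = base[5:65]
Tj = base[8:68]
s = link_strength(Ti, Tj, t=59, width=30, max_delay=10)  # small sizes so the toy runs instantly
print(s)
```

<p>The caller must make sure that t is at least width + max_delay - 1, since otherwise Python&#8217;s negative indices would silently wrap around to the end of the series.</p>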
<p>Here is what they get:</p>
<p><a href="http://www.climatelinc.eu/fileadmin/UG_ADVANCED/Publications/BIU_-_Avi__Halvin__et_al-Very_early_warning_of_next_El_Nino.pdf"><br />
<img width="440" src="https://i0.wp.com/math.ucr.edu/home/baez/ecological/el_nino/ludescher_el_nino_cooperativity_2a.jpg" alt="" /></a></p>
<p>The blue peaks are El Ni&ntilde;os: episodes where the Ni&ntilde;o 3.4 index is over 0.5&deg;C for at least 5 months.</p>
<p>The red line is their &#8216;average link strength&#8217;.  Whenever this exceeds a certain threshold <img src='https://s0.wp.com/latex.php?latex=%5CTheta+%3D+2.82%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='&#92;Theta = 2.82,' title='&#92;Theta = 2.82,' class='latex' /> and the Ni&ntilde;o 3.4 index is not <i>already</i> over 0.5&deg;C, they predict an El Ni&ntilde;o will start in the following calendar year.</p>
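<p>Read literally, this rule fits in a few lines of Python.  The sketch below takes the simplest reading: any sample where the average link strength is above the threshold while the Ni&ntilde;o 3.4 index is not above 0.5&deg;C raises an alarm for the next calendar year.  Other readings are possible, the sample values are invented, and the function name is hypothetical.</p>

```python
THETA = 2.82  # threshold quoted in the text

def predicted_el_nino_years(samples, theta=THETA, nino_limit=0.5):
    """samples: (year, average link strength, Nino 3.4 index) triples.

    An alarm is raised when the link strength is above theta and the
    Nino 3.4 index is not already above nino_limit; the El Nino is then
    predicted to start in the following calendar year.
    """
    return sorted({year + 1 for year, s, nino in samples if s > theta and nino <= nino_limit})

# Invented samples, purely for illustration.
samples = [
    (1996, 2.70, -0.3),  # below threshold: no alarm
    (1996, 2.90, -0.2),  # alarm for 1997
    (1997, 3.00, 1.5),   # El Nino already under way: no alarm
    (1998, 2.60, -1.0),  # below threshold: no alarm
    (2001, 2.85, 0.1),   # alarm for 2002
]
predictions = predicted_el_nino_years(samples)
print(predictions)  # [1997, 2002]
```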
<p>Ludescher <em>et al</em> chose their threshold for El Ni&ntilde;o prediction by training their algorithm on climate data from 1948 to 1980, and tested it on data from 1981 to 2013.  They claim that with this threshold, their El Ni&ntilde;o predictions were correct 76% of the time, and their predictions of no El Ni&ntilde;o were correct in 86% of all cases.</p>
<p>On this basis they claimed&#8212;when their paper was published in February 2014&#8212;that the Ni&ntilde;o 3.4 index would exceed 0.5 by the end of 2014 with probability 3/4.</p>
<p>The latest data as of <a href="http://math.ucr.edu/home/baez/climate_networks/enso_evolution-status-fcsts-web_2014_12_01.pdf">1 December 2014</a> seems to say: <i> yes, it happened!</i></p>
<h3> Replication and critique</h3>
<p>Graham Jones of the Azimuth Project <a href="https://johncarlosbaez.wordpress.com/2014/07/08/el-nino-project-part-4/">wrote code implementing Ludescher <em>et al&#8217;s</em> algorithm</a>, as best as we could understand it, and got results close to theirs, though not identical.  The code is open-source; one goal of the Azimuth Project is to do science &#8216;in the open&#8217;.</p>
<p>More interesting than the small discrepancies between our calculation and theirs is the question of whether &#8216;average link strengths&#8217; between points in the El Ni&ntilde;o basin and points outside are really helpful in predicting El Ni&ntilde;os.</p>
<p>Steve Wenner, a statistician helping the Azimuth Project, <a href="https://johncarlosbaez.wordpress.com/2014/07/23/el-nino-project-part-6/">noted some ambiguities in Ludescher <em>et al</em>&#8217;s El Ni&ntilde;o prediction rules</a> and disambiguated them in a number of ways.  For each way he used Fisher&#8217;s exact test to compute the <img src='https://s0.wp.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p' title='p' class='latex' />-value of the null hypothesis that Ludescher <em>et al</em>&#8217;s El Ni&ntilde;o prediction does not improve the odds that what they predict will occur.</p>
<p>The best he got (that is, the lowest <img src='https://s0.wp.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p' title='p' class='latex' />-value) was 0.03.  This is just a bit more significant than the conventional 0.05 threshold for rejecting a null hypothesis.</p>
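<p>To make the idea concrete, here is a minimal sketch of a one-sided Fisher&#8217;s exact test on a 2&times;2 table of yearly predictions versus outcomes.  It is not Wenner&#8217;s code: the function name is made up, and the counts in the example are invented purely for illustration.</p>
<pre>
```python
from math import comb


def fisher_exact_one_sided(hits, false_alarms, misses, correct_negatives):
    """One-sided p-value for a 2x2 table of predictions versus outcomes.

    Gives the probability of seeing at least `hits` correct 'yes'
    predictions if predictions were unrelated to outcomes, with the
    row and column totals held fixed (hypergeometric distribution).
    """
    predicted_yes = hits + false_alarms
    actual_yes = hits + misses
    total = hits + false_alarms + misses + correct_negatives
    max_hits = min(predicted_yes, actual_yes)
    denominator = comb(total, predicted_yes)
    p_value = 0.0
    for k in range(hits, max_hits + 1):
        # Probability of exactly k hits under the null hypothesis.
        ways = comb(actual_yes, k) * comb(total - actual_yes, predicted_yes - k)
        p_value += ways / denominator
    return p_value


# Invented counts for 33 years of yes-or-no predictions (NOT real data).
p = fisher_exact_one_sided(hits=6, false_alarms=2, misses=3, correct_negatives=22)
print("one-sided p-value:", round(p, 4))
```
</pre>
<p>With so few years in the table, moving one or two years from one cell to another can change the p-value a lot, which is why the ambiguities in the prediction rules matter.</p>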
<p>Do high average link strengths between points in the El Ni&ntilde;o basin and points elsewhere in the Pacific really increase the chance that an El Ni&ntilde;o is coming?  It is hard to tell from the work of Ludescher <em>et al</em>.</p>
<p>One reason is that they treat El Ni&ntilde;o as a binary condition, either on or off depending on whether the Ni&ntilde;o 3.4 index for a given month exceeds 0.5 or not.  This is not the usual definition of El Ni&ntilde;o, but the real problem is that they are only making a single yes-or-no prediction each year for 65 years: does an El Ni&ntilde;o occur during this year, or not?  31 of these years (1950-1980) are used for training their algorithm, leaving just 33 retrodictions (1981-2013) and one actual prediction (2014).</p>
<p>So, there is a serious problem with small sample size.</p>
<p>We can learn a bit by taking a different approach, and simply running some linear regressions between the average link strength and the Ni&ntilde;o 3.4 index for each month.  There are 766 months from 1950 to 2013, so this gives us more data to look at.  Of course, it&#8217;s possible that the relation between average link strength and Ni&ntilde;o is highly nonlinear, so a linear regression may not be appropriate.  But it is at least worth looking at!</p>
<p>Daniel Mahler and Dara Shayda of the Azimuth Project did this and found the following interesting results.</p>
<h3> Simple linear models</h3>
<p>Here is a scatter plot showing the Ni&ntilde;o 3.4 index as a function of the average link strength <em>in the same month</em>:</p>
<p><a href="http://azimuth.mathforge.org/discussion/1523/crunch-time/?Focus=13487#Comment_13487"><br />
<img width="450" src="https://i0.wp.com/math.ucr.edu/home/baez/climate_networks/mahler_nino3.4_versus_link_strength_no_lag.png" alt="" /></a></p>
<p>(Click on these scatter plots for more information.)</p>
<p>The <a href="http://en.wikipedia.org/wiki/Coefficient_of_determination">coefficient of determination</a>, <img src='https://s0.wp.com/latex.php?latex=R%5E2%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='R^2,' title='R^2,' class='latex' /> is 0.0175.   In simple terms, this means that the average link strength in a given month explains just 1.75% of the variance of the Ni&ntilde;o 3.4 index.  That&#8217;s quite low!</p>
<p>Here is a scatter plot showing the Ni&ntilde;o 3.4 index as a function of the average link strength <em>six months earlier</em>:</p>
<p><a href="http://azimuth.mathforge.org/discussion/1523/crunch-time/?Focus=13487#Comment_13487"><br />
<img width="450" src="https://i1.wp.com/math.ucr.edu/home/baez/climate_networks/mahler_nino3.4_versus_link_strength_6_months_earlier.png" alt="" /></a></p>
<p>Now <img src='https://s0.wp.com/latex.php?latex=R%5E2&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='R^2' title='R^2' class='latex' /> is 0.088.  So, the link strength explains 8.8% of the variance in the Ni&ntilde;o 3.4 index 6 months later.  This is still not much&#8212;but interestingly, it&#8217;s much more than when we try to relate them at the same moment in time!  And the <img src='https://s0.wp.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='p' title='p' class='latex' />-value is less than <img src='https://s0.wp.com/latex.php?latex=2.2+%5Ccdot+10%5E%7B-16%7D%2C&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='2.2 &#92;cdot 10^{-16},' title='2.2 &#92;cdot 10^{-16},' class='latex' /> so the effect is statistically significant.</p>
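<p>For readers who want to try this kind of calculation, here is a minimal sketch of how <em>R</em><sup>2</sup> for a one-variable linear fit can be computed with and without a time lag.  The function names and the toy series are assumptions made for illustration; this is not the code Mahler and Shayda used, and the numbers it prints have nothing to do with the real climate data.</p>
<pre>
```python
import math


def r_squared(x, y):
    """R^2 of the least-squares line y ~ a + b*x.

    For a single predictor this equals the squared Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def lagged_r_squared(predictor, target, lag):
    """R^2 when predictor[t] is used to explain target[t + lag].

    Both series are monthly and of equal length; lag is in months.
    """
    x = predictor[:len(predictor) - lag]
    y = target[lag:]
    return r_squared(x, y)


# Toy series (NOT real data): 'nino' is 'link' delayed by 6 months.
link = [math.sin(0.3 * t) for t in range(120)]
nino = [0.0] * 6 + [0.5 * value for value in link[:-6]]

print("R^2, same month:    ", lagged_r_squared(link, nino, 0))
print("R^2, 6 months later:", lagged_r_squared(link, nino, 6))
```
</pre>
<p>In this artificial example the lagged fit is essentially perfect, because the second series was built as a delayed copy of the first.  Real data are nowhere near this clean.</p>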
<p>Of course, we could also try to use Ni&ntilde;o 3.4 to predict <em>itself</em>.   Here is  the Ni&ntilde;o 3.4 index plotted against the Ni&ntilde;o 3.4 index six months earlier:</p>
<p><a href="http://azimuth.mathforge.org/discussion/1523/crunch-time/?Focus=13487#Comment_13487"><img width="450" src="https://i2.wp.com/math.ucr.edu/home/baez/climate_networks/mahler_nino3.4_versus_nino3.4_6_months_earlier.png" alt="" /></a></p>
<p>Now <img src='https://s0.wp.com/latex.php?latex=R%5E2+%3D+0.162.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='R^2 = 0.162.' title='R^2 = 0.162.' class='latex' />  So, this is better than using the average link strength!</p>
<p>That doesn&#8217;t sound good for average link strength.  But now let&#8217;s try to predict Ni&ntilde;o 3.4 using <em>both</em> itself <em>and</em> the average link strength 6 months earlier.   Here is a scatter plot showing that:</p>
<p><a href="http://azimuth.mathforge.org/discussion/1523/crunch-time/?Focus=13526#Comment_13526"><img width="450" src="https://i1.wp.com/math.ucr.edu/home/baez/climate_networks/mahler_nino3.4_versus_link_strength_and_nino3.4_6_months_earlier.png" alt="" /></a></p>
<p>Here the <img src='https://s0.wp.com/latex.php?latex=x&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='x' title='x' class='latex' /> axis is an optimally chosen linear combination of average link strength and Ni&ntilde;o 3.4: one that maximizes <img src='https://s0.wp.com/latex.php?latex=R%5E2&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='R^2' title='R^2' class='latex' />.</p>
<p>In this case we get <img src='https://s0.wp.com/latex.php?latex=R%5E2+%3D+0.22.&#038;bg=ffffff&#038;fg=000&#038;s=0' alt='R^2 = 0.22.' title='R^2 = 0.22.' class='latex' /></p>
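<p>The &#8216;optimally chosen linear combination&#8217; is just what ordinary least squares with two predictors produces.  Here is a minimal sketch, assuming nothing beyond the textbook normal equations; the function name and the short example series are invented for illustration and are not the real data.</p>
<pre>
```python
def two_predictor_r_squared(x1, x2, y):
    """Least-squares fit of y ~ a + b1*x1 + b2*x2.

    Returns (b1, b2, r2).  The combination b1*x1 + b2*x2 is the linear
    combination of the two predictors that maximizes R^2.
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    syy = sum(c * c for c in dy)
    # Solve the 2x2 normal equations; det is zero if predictors are collinear.
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    # Explained sum of squares divided by total sum of squares.
    r2 = (b1 * s1y + b2 * s2y) / syy
    return b1, b2, r2


# Invented values (NOT real data): index now, link strength now, index later.
nino_now = [0.2, -0.5, 1.1, 0.4, -0.9, 0.0, 0.7, -0.3]
link_now = [2.1, 2.6, 2.0, 2.4, 2.9, 2.5, 2.2, 2.7]
nino_later = [0.5, -0.8, 0.9, 0.1, -1.2, -0.2, 0.6, -0.6]

b1, b2, r2 = two_predictor_r_squared(nino_now, link_now, nino_later)
print("weights:", b1, b2, "R^2:", r2)
```
</pre>
<p>Adding a second predictor can never lower <em>R</em><sup>2</sup> on the data used for the fit, so the interesting question is how much it raises it.</p>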
<h3> Conclusions</h3>
<p>What can we conclude from this?</p>
<p>Using a linear model, the average link strength in a given month accounts for only about 9% of the variance of the Ni&ntilde;o 3.4 index 6 months in the future.  That sounds bad, and indeed it is.</p>
<p>However, there are more interesting things to say than this!</p>
<p>Both the Ni&ntilde;o 3.4 index and the average link strength can be computed from the surface air temperature of the Pacific during some window in time.  The Ni&ntilde;o 3.4 index explains 16% of its own variance 6 months into the future; the average link strength explains about 9%, and taken together they explain 22%.  So, these two variables contain a fair amount of <em>independent</em> information about the Ni&ntilde;o 3.4 index 6 months in the future.</p>
<p>Furthermore, they explain a surprisingly large amount of its variance for just 2 variables.</p>
<p>For comparison, Mahler used a random forest variant called <a href="http://scikit-learn.org/dev/modules/generated/sklearn.ensemble.ExtraTreesRegressor.html">ExtraTreesRegressor</a> to predict the Ni&ntilde;o 3.4 index 6 months into the future from much larger collections of data.  Out of the 778 months available he trained the algorithm on the first 400 and tested it on the remaining 378.</p>
<p>The result: using a <a href="https://9d8c7e6f260445992e26252c839fa61414632ec6-www.googledrive.com/host/0B4cyIPgV_Vxrb2wxUnFteXVwWHM">full world-wide grid of surface air temperature values</a> at a given moment in time explains only 23% of the variance of the Ni&ntilde;o 3.4 index 6 months into the future.   A <a href="https://2bd877097dcd3604697f17379c3f2232c9730610-www.googledrive.com/host/0B4cyIPgV_VxrbDFJS1dTVWxhV28">full grid of surface air pressure values</a> does considerably better, but still explains only 34% of the variance.  Using <em><a href="http://www.googledrive.com/host/0B4cyIPgV_VxrX0lxSUxHU2VLN28/window-pressure-anom-predict.html">twelve months</a></em> of the full grid of pressure values only gets to around 37%.</p>
<p>From this viewpoint, explaining 22% of the variance with just two variables doesn&#8217;t look so bad!</p>
<p>Moreover, while the Ni&ntilde;o 3.4 index is maximally correlated with itself at the <em>same moment in time</em>, for obvious reasons, the average link strength is maximally correlated with the Ni&ntilde;o 3.4 index <em>10 months into the future</em>:</p>
<p><a href="https://5619417f7fb3a489ed01c7f329cbd1e9b70a10d6-www.googledrive.com/host/0B4cyIPgV_VxrX0lxSUxHU2VLN28/link-anom.html"><img width="450" src="https://i1.wp.com/math.ucr.edu/home/baez/climate_networks/mahler_link_strength_nino3.4_correlation.png" alt="" /></a></p>
<p>(The lines here occur at monthly intervals.)</p>
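<p>A plot like this comes from scanning over lags and computing a correlation at each one.  Here is a minimal sketch of such a scan; the function names and the toy series are assumptions for illustration, and this is not the code behind the figure.</p>
<pre>
```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_lag(predictor, target, max_lag):
    """Return (lag, correlation) maximizing corr(predictor[t], target[t + lag])."""
    results = []
    for lag in range(max_lag + 1):
        x = predictor[:len(predictor) - lag]
        y = target[lag:]
        results.append((lag, pearson(x, y)))
    return max(results, key=lambda pair: pair[1])


# Toy series (NOT real data): 'nino' is 'link' delayed by 10 months.
link = [math.sin(0.3 * t) + 0.5 * math.sin(0.11 * t) for t in range(240)]
nino = [0.0] * 10 + link[:-10]

lag, corr = best_lag(link, nino, max_lag=24)
print("best lag in months:", lag, "correlation:", corr)
```
</pre>
<p>On real data the peak is much broader and lower than in this toy example, so the location of the maximum should be read with some caution.</p>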
<p>However, we have not tried to determine if the average link strength as Ludescher <em>et al</em> define it is <em>optimal</em> in this respect.  Graham Jones has shown that simplifying their definition of this quantity doesn&#8217;t change it much.  Maybe modifying their definition could improve it.  <b>There seems to be a real phenomenon at work here, but I don&#8217;t think we know exactly what it is!</b></p>
<p>My talk has avoided discussing physical models of the ENSO, because I wanted to focus on very simple, general ideas from complex network theory.  However, it seems obvious that really understanding the ENSO requires a lot of ideas from meteorology, oceanography, physics, and the like.  I am <em>not</em> advocating a &#8216;purely network-based approach&#8217;.</p>
]]></html><thumbnail_url><![CDATA[https://i2.wp.com/math.ucr.edu/home/baez/climate_networks/gistemp_1880-2013.jpg?fit=440%2C330]]></thumbnail_url><thumbnail_height><![CDATA[]]></thumbnail_height><thumbnail_width><![CDATA[]]></thumbnail_width></oembed>