<h1>Microsoft is building fast, low-power neural networks with FPGAs</h1>
<p>By <a href="http://search.gigaom.com/author/dharrisstructure/">Derrick Harris</a>, <a href="http://gigaom.com">Gigaom</a></p>
<p>Microsoft on Monday <a href="http://research.microsoft.com/pubs/240715/CNN%20Whitepaper.pdf">released a white paper</a> detailing a current effort to run convolutional neural networks, the deep learning technique responsible for record-setting computer vision algorithms, on FPGAs rather than GPUs.</p>
<p>Microsoft claims its new FPGA designs deliver greatly improved processing speed over earlier versions while consuming a fraction of the power of GPUs. If this type of work catches on, it could represent a big shift in deep learning, a field that for the past few years has centered on GPUs as the computing architecture of choice.</p>
<p>If there's a major caveat to Microsoft's efforts, it might have to do with raw performance. While Microsoft's research shows FPGAs consuming about one-tenth the power of high-end GPUs (25W compared with 235W), GPUs still process images at a much higher rate. Nvidia's Tesla K40 GPU can do between 500 and 824 images per second on one popular benchmark dataset, the white paper claims, while Microsoft predicts its preferred FPGA chip, the Altera Arria 10, will be able to process about 233 images per second on the same dataset.</p>
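<p>Put in performance-per-watt terms, those figures actually favor the FPGA. A quick back-of-the-envelope check, using only the numbers quoted above:</p>
<pre><code># Efficiency comparison from the figures quoted in the article.
gpu_watts, fpga_watts = 235.0, 25.0
gpu_rate = (500.0, 824.0)   # Tesla K40 images/sec range on the benchmark
fpga_rate = 233.0           # predicted images/sec for the Altera Arria 10

gpu_eff = tuple(r / gpu_watts for r in gpu_rate)
fpga_eff = fpga_rate / fpga_watts

print(f"GPU:  {gpu_eff[0]:.1f}-{gpu_eff[1]:.1f} images/sec per watt")  # ~2.1-3.5
print(f"FPGA: {fpga_eff:.1f} images/sec per watt")                     # ~9.3
</code></pre>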
<p>However, the paper's authors note that per-chip performance isn't the whole story, because a cluster of FPGAs could match a single GPU's throughput while still consuming much less power: "In the future, we anticipate further significant gains when mapping our design to newer FPGAs . . . and when combining a large number of FPGAs together to parallelize both evaluation and training."</p>
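<p>The cluster argument is simple arithmetic. A sketch, assuming throughput scales roughly linearly across FPGAs, which the paper only anticipates rather than demonstrates:</p>
<pre><code>import math

# How many 25W FPGAs (233 images/sec each) match one 235W Tesla K40?
# Linear scaling across FPGAs is an assumption, not a measured result.
gpu_rate, gpu_watts = 824.0, 235.0   # upper end of the quoted K40 range
fpga_rate, fpga_watts = 233.0, 25.0

n = math.ceil(gpu_rate / fpga_rate)  # 4 FPGAs
print(f"{n} FPGAs: {n * fpga_rate:.0f} images/sec at {n * fpga_watts:.0f}W "
      f"vs. one GPU: {gpu_rate:.0f} images/sec at {gpu_watts:.0f}W")
</code></pre>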
<p>In <a href="http://blogs.technet.com/b/inside_microsoft_research/archive/2015/02/23/machine-learning-gets-big-boost-from-ultra-efficient-convolutional-neural-network-accelerator.aspx">a Microsoft Research blog post</a>, processor architect Doug Burger wrote, "We expect great performance and efficiency gains from scaling our [convolutional neural network] engine to Arria 10, conservatively estimated at a throughput increase of 70% with comparable energy used."</p>
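<p>Burger's estimate is consistent with the white paper's prediction. A rough check, assuming the 233 images-per-second Arria 10 figure describes the same workload as the 70% scaling estimate:</p>
<pre><code># Implied throughput of the current FPGA design if Arria 10 is 70% faster.
arria10_rate = 233.0                  # predicted images/sec on Arria 10
current_rate = arria10_rate / 1.7     # ~137 images/sec on the current part
print(f"Implied current throughput: ~{current_rate:.0f} images/sec")
</code></pre>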
<p><img src="https://gigaom2.files.wordpress.com/2015/02/fpgacnn.jpg?quality=80&amp;strip=all&amp;w=804" alt="fpgacnn" data-attribution="Microsoft Research" class="aligncenter size-full wp-image-916470" /></p>
<p>This is not Microsoft's first rodeo when it comes to deploying FPGAs within its data centers; in fact, it's an outgrowth of an earlier project. Last summer, the company <a href="https://gigaom.com/2014/06/16/why-microsoft-is-building-programmable-chips-that-specialize-in-search/">detailed a research project called Catapult</a> in which it improved the speed and performance of Bing's search-ranking algorithms by adding FPGA co-processors to each server in a rack. The company intends to port production Bing workloads onto the Catapult architecture later this year.</p>
<p>There have also been other attempts to port deep learning algorithms onto FPGAs, including <a href="https://gigaom.com/2014/08/14/researchers-hope-deep-learning-algorithms-can-run-on-fpgas-and-supercomputers/">one by State University of New York at Stony Brook professors</a> and <a href="https://gigaom.com/2014/09/22/baidu-is-trying-to-speed-up-image-search-using-fpgas/">another by Chinese search giant Baidu</a>. Ironically, Baidu chief scientist and deep learning expert Andrew Ng is a big proponent of GPUs, and the company <a href="https://gigaom.com/2014/12/18/baidu-claims-deep-learning-breakthrough-with-deep-speech/">claims a massive GPU-based deep learning system</a> as well as <a href="https://gigaom.com/2015/01/14/baidu-has-built-a-supercomputer-for-deep-learning/">a GPU-based supercomputer designed for computer vision</a>. But this needn't be an either/or situation: companies could still use GPUs to maximize performance while training their models, and then port the trained models to FPGAs for production workloads.</p>
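<p>As a concrete illustration of that hand-off, here is a minimal sketch: train in floating point (typically on GPUs), then quantize the learned weights to the fixed-point format an FPGA datapath would consume. The quantization scheme and bit widths below are illustrative assumptions, not details from Microsoft's paper.</p>
<pre><code>import numpy as np

def quantize_fixed_point(weights, frac_bits=8, total_bits=16):
    """Quantize float weights to signed 16-bit fixed-point, 8 fractional bits.

    Illustrative only: real FPGA deployments pick formats per layer.
    """
    scale = 2 ** frac_bits
    lo, hi = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    return np.clip(np.round(weights * scale), lo, hi).astype(np.int16), scale

# Pretend these weights came out of GPU-based training.
trained = np.random.randn(3, 3).astype(np.float32)
quantized, scale = quantize_fixed_point(trained)
print("max quantization error:", np.max(np.abs(trained - quantized / scale)))
</code></pre>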
<p>Expect to hear more about the future of deep learning architectures and applications at <a href="https://events.gigaom.com/structuredata-2015/">Gigaom's Structure Data conference</a> March 18 and 19 in New York, which features experts from Facebook, Microsoft and elsewhere. Our <a href="https://events.gigaom.com/structure-intelligence-2015/">Structure Intelligence conference</a>, September 22-23 in San Francisco, will dive even deeper into deep learning, as well as the broader field of artificial intelligence algorithms and applications.</p>