Talk: Machine learning on a cluster of autonomous GPUs to solve the deconvolution problem in Electron Paramagnetic Resonance Imaging

Presented by Yann Le Du in Scientific track 2011 on 2011/08/28 from 14:30 to 15:00

Yann Le Du, Mariem El Afrit, Laboratoire de Chimie de la Matière Condensée de Paris (LCMCP), CNRS, Chimie-ParisTech.

In the context of exobiology, we use Electron Paramagnetic Resonance Imaging (EPRI) to determine the distribution of organic matter in terrestrial and Martian rock samples. In theory, EPRI can resolve structures down to a few hundred micrometers, but only if we properly solve the specific inverse problem involved: a deconvolution.

EPRI proceeds in two steps: first, extract the linear density of organic matter from EPR scans taken in many directions through the sample; second, combine these densities to reconstruct the 2D or 3D distribution. The first step is a deconvolution problem, and the many existing methods (mainly based on Fourier analysis) have drawbacks that make them suboptimal for our practical use: they need heavy manual adjustment, and their results are often unphysical (negative densities). Working closely with expert EPRI users, who also helped us design special phantoms, we therefore developed and coded a system based on machine learning: a special breed of neural network (reservoir computing) coupled to a genetic algorithm that evolves the neural weights of a swarm of reservoirs.
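To make the idea concrete, here is a minimal pure-Python sketch of a reservoir whose readout is evolved by a genetic algorithm. It is not the authors' code. Everything in it is an assumption made for illustration: the Lorentzian line shape, the two-block test density, the reservoir size, the choice to evolve only the readout weights of a single reservoir (rather than a swarm), and all function names. The readout is clipped at zero, which is one simple way to rule out negative densities.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def blur(density, kernel):
    """Forward model (assumed): convolve a density with a symmetric line shape."""
    half = len(kernel) // 2
    out = []
    for i in range(len(density)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(density):
                acc += density[j] * weight
        out.append(acc)
    return out

def run_reservoir(w_in, w_rec, signal):
    """Drive a fixed random recurrent network and record its state at each step."""
    n = len(w_in)
    state = [0.0] * n
    states = []
    for u in signal:
        state = [
            math.tanh(w_in[i] * u + sum(w_rec[i][j] * state[j] for j in range(n)))
            for i in range(n)
        ]
        states.append(state)
    return states

def readout(states, w_out):
    """Linear readout clipped at zero, so predicted densities are never negative."""
    return [max(0.0, sum(w * s for w, s in zip(w_out, state))) for state in states]

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def evolve(states, target, pop_size=20, generations=40, sigma=0.1):
    """Elitist genetic algorithm over readout weights (a simplification)."""
    n = len(states[0])

    def fitness(w_out):
        return mse(readout(states, w_out), target)

    population = [[rng.gauss(0.0, 0.5) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness)
        elite = population[: pop_size // 4]  # survivors are kept unchanged
        history.append(fitness(elite[0]))
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover followed by Gaussian mutation.
            children.append(
                [(x if rng.random() < 0.5 else y) + rng.gauss(0.0, sigma)
                 for x, y in zip(a, b)]
            )
        population = elite + children
    return min(population, key=fitness), history

# Hypothetical test case: two blocks of "organic matter" along one scan line.
density = [0.0] * 60
for i in range(15, 25):
    density[i] = 1.0
for i in range(35, 40):
    density[i] = 0.5

# Assumed Lorentzian line shape, normalised to unit area.
kernel = [1.0 / (1.0 + (k / 2.0) ** 2) for k in range(-6, 7)]
total = sum(kernel)
kernel = [k / total for k in kernel]
spectrum = blur(density, kernel)

# Small random reservoir; the weights are scaled down to keep the dynamics stable.
n_units = 20
w_in = [rng.uniform(-1.0, 1.0) for _ in range(n_units)]
w_rec = [[rng.uniform(-1.0, 1.0) * 0.5 / math.sqrt(n_units) for _ in range(n_units)]
         for _ in range(n_units)]

states = run_reservoir(w_in, w_rec, spectrum)
best_weights, history = evolve(states, density)
estimate = readout(states, best_weights)
print("best error, first vs last generation:", history[0], history[-1])
```

Because the elite individuals survive each generation unchanged, the best error recorded in `history` can only stay level or decrease. Training on a single phantom, as done here, only shows the mechanics; a usable system would need many training examples and a separate validation set.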

The code runs on the Hybrid Processing Units for Science (HPU4Science) cluster located at the Laboratoire de Chimie de la Matière Condensée de Paris (LCMCP), which we assembled entirely from consumer-grade computer parts. The cluster is composed of a central data storage machine and a heterogeneous ensemble of 6 decentralized nodes. Each node comprises a Core2 Quad or i7 CPU and 3 to 7 NVIDIA Graphics Processing Units (GPUs), including the GF110 series. Each of the 28 GPUs independently explores a different region of the parameter space of the same problem. Our application shows a sustained real performance of 15.6 TFLOPS. The HPU4Science cluster cost $36,090, giving a cost performance of 432.3 MFLOPS/$.
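The cost-performance figure follows directly from the two numbers quoted above. The short check below reproduces it; the variable names are ours, and the per-GPU average is a derived illustration that assumes the load is spread evenly, which a heterogeneous cluster will not do exactly.

```python
sustained_flops = 15.6e12   # 15.6 TFLOPS, as quoted in the abstract
cluster_cost_usd = 36090    # total hardware cost, as quoted in the abstract
gpu_count = 28

# Cost performance in MFLOPS per dollar.
mflops_per_dollar = sustained_flops / cluster_cost_usd / 1e6
print(round(mflops_per_dollar, 1))  # 432.3

# Average sustained throughput per GPU in GFLOPS (assumes an even split).
gflops_per_gpu = sustained_flops / gpu_count / 1e9
print(round(gflops_per_gpu, 1))  # 557.1
```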

On the software side, we make heavy use of Python, both for the general operation of the cluster and for the computational code that runs on the GPUs through PyCUDA (coupled to some C kernels that we wrote for special tasks). We also use Sage together with Cython for many computations that help us reason about the mathematics behind our code, and the whole project is written in the literate programming paradigm, using noweb.

This talk is meant to demonstrate, on a practical case, how consumer-grade computer hardware coupled with Python and its surrounding mathematical and GPU libraries can be used to tackle a difficult yet very elementary scientific problem: how to go from formulating the problem, to choosing the right hardware and software, all the way to programming the parallelized algorithms using the appropriate development tools and methodologies.
