##### Abstract

Authors: A.Invernizzi, P.Dagna

Keywords: CUDA, MAGMA, SWIG, NumPy

Proposal Submission Type: Poster

Abstract

Python is a general-purpose, high-level programming language. Its design philosophy emphasizes programmer productivity and code readability through a minimalistic syntax and semantics. Despite its simplicity, Python can be used successfully for scientific computing, offering a good balance between computational performance and development time. Developing scientific code in Python is made easier by a large number of science-related modules. GPGPU appears to be the new frontier of scientific computing, and the PyCUDA and PyOpenCL projects aim to make GPGPU computing available in Python.

The pyMagma package was developed to offer a matrix algebra library on the GPU in Python. Matrix algebra operations underlie many scientific algorithms (image processing, numerical discretization schemes, graph theory, statistical inference…) and are in many cases computationally intensive.

The operations implemented in pyMagma use algorithms developed by the MAGMA (Matrix Algebra on GPU and Multicore Architectures) research project. The MAGMA project aims to provide the fastest possible linear algebra libraries on hybrid multicore CPU and GPU architectures by exploiting their massive parallelism and minimizing communication latencies. The functionality offered by pyMagma includes Cholesky, LU and QR factorization, matrix inversion, linear system solvers, eigenvalue solvers, and matrix-vector and matrix-matrix products. Some of these functions support multi-GPU architectures, making it possible to handle larger data sets in a reasonable computing time.

pyMagma is built on the MAGMA library and uses NumPy arrays as its working data structures. NumPy is a well-known, fundamental package for scientific computing with Python; among other things, it provides a powerful N-dimensional array object well suited to scientific computing. pyMagma functions are designed to operate on NumPy arrays. The wrapper interface between the modules written in C and Python was built with SWIG, a software development tool that connects programs written in C and C++ with a variety of high-level programming languages and makes it easy to build interfaces for Python.

Even though the linear algebra operations offered by the NumPy and SciPy modules are already optimized, computing time can be reduced further by using the pyMagma library. Cholesky decomposition, for example, decomposes a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose. The computational complexity of the commonly used algorithms is O(n^3). The pyMagma function that computes the Cholesky decomposition is available with multi-GPU support. Benchmarks performed with increasing matrix size show that the benefits of pyMagma become more pronounced with large data sets: for a square matrix of 20480x20480 elements, the pyMagma Cholesky implementation is 20 times faster than the analogous numpy.linalg.cholesky function.

Tests and comparative performance results were obtained on a machine with two NVIDIA Tesla M2050 cards, each with 3 GB of dedicated memory, and two hexa-core Intel(R) Xeon(R) X5660 CPUs at 2.80 GHz with a total of 24 GB of RAM. We used the MAGMA library v.1.1, the Enthought Python Distribution v.7.2 (based on Python 2.7), and the CUDA SDK/Toolkit v.4.1.
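As a point of reference for the CPU baseline mentioned above, the following is a minimal sketch of how numpy.linalg.cholesky can be timed and checked for increasing matrix sizes. It is not the benchmark code used for the reported results: the helper names (`random_spd_matrix`, `time_cholesky`), the matrix sizes and the use of real symmetric test matrices are assumptions made for illustration, and the sketch targets a current Python 3 / NumPy installation rather than the Python 2.7 distribution listed above. No pyMagma call is shown, since its function signatures are not given here.

```python
import time

import numpy as np


def random_spd_matrix(n, seed=0):
    """Build a random real symmetric positive-definite n x n matrix.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    m = rng.standard_normal((n, n))
    # m @ m.T is symmetric positive semi-definite; adding n * I makes it
    # strictly positive-definite, so the Cholesky factorization exists.
    return m @ m.T + n * np.eye(n)


def time_cholesky(n, seed=0):
    """Return the test matrix, its lower Cholesky factor and the wall time."""
    a = random_spd_matrix(n, seed)
    start = time.perf_counter()
    lower = np.linalg.cholesky(a)  # returns L such that a = L @ L^H
    elapsed = time.perf_counter() - start
    return a, lower, elapsed


if __name__ == "__main__":
    # Small sizes chosen so the sketch runs quickly; the abstract's
    # benchmark goes up to 20480 x 20480.
    for n in (256, 512, 1024):
        a, lower, elapsed = time_cholesky(n)
        # For a real matrix the conjugate transpose is the plain transpose.
        assert np.allclose(lower @ lower.T, a)
        print(f"n={n:5d}  numpy.linalg.cholesky: {elapsed:.4f} s")
```

Since the cost grows as O(n^3), doubling n should multiply the run time by roughly eight once the matrices are large enough for the factorization itself to dominate.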

In conclusion, the pyMagma package gives scientific Python developers the opportunity to take advantage of hardware accelerators while continuing to work in a familiar environment, without having to deal with parallel programming, because everything is handled behind Python function calls. The workflow adopted (which combines CUDA, SWIG and an external C library) can also serve as a successful example of how to extend the Python interpreter with high-performance modules.