Talk Beyond NumPy: Numexpr, Blosc and CArray

Presented by Francesc Alted in Advanced tutorial track 2012 on 2012/08/23 from 11:00 to 12:30
Abstract

NumPy is the de facto standard for numerical computation in Python. It provides a compact container for data arrays and supports element-by-element arithmetic, array broadcasting, type casting, and several other outstanding features.

Unfortunately, NumPy cannot efficiently evaluate complex expressions of the form 4.4*x**3+3.2*y**2. The reason is that it has to create a temporary array for each intermediate result of the expression, in the order established by the Python interpreter and, as it turns out, this order is rather pessimal for modern computers.
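To make the temporaries problem concrete, here is a minimal sketch using only NumPy. The function names (eval_naive, eval_blocked) and the block size are hypothetical choices for illustration, not part of any library. The blocked version only mimics the general idea of working on cache-sized chunks; it is not how any particular tool is implemented, and a pure-Python loop like this one is not guaranteed to be faster.

```python
import numpy as np

def eval_naive(x, y):
    # Conceptually, NumPy runs one operator at a time, and each step
    # produces a full-size intermediate array:
    #   t1 = x**3, t2 = 4.4*t1, t3 = y**2, t4 = 3.2*t3, result = t2 + t4
    # For large inputs these intermediates do not fit in the CPU caches.
    return 4.4 * x**3 + 3.2 * y**2

def eval_blocked(x, y, block_size=4096):
    # Illustrative alternative for 1-D arrays: evaluate the whole
    # expression on small blocks so the intermediates stay small.
    # block_size is an assumed value, not a tuned one.
    out = np.empty_like(x)
    for start in range(0, x.size, block_size):
        stop = start + block_size
        xb = x[start:stop]
        yb = y[start:stop]
        out[start:stop] = 4.4 * xb**3 + 3.2 * yb**2
    return out

x = np.linspace(0.0, 1.0, 100_000)
y = np.linspace(1.0, 2.0, 100_000)
same = np.allclose(eval_naive(x, y), eval_blocked(x, y))
```

Both functions compute the same values; the difference lies only in how much intermediate data is alive at any one time.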

During my tutorial, I'll briefly explain what exactly the problem is (it is described in [1]), and will introduce a series of tools for evaluating complex expressions from Python at almost C speed. Among the tools I'll talk about, there will be:

  • Numexpr, a memory-efficient computing kernel [2]
  • Blosc, a compressor that can run faster than a pure memcpy() [3]
  • carray, a library that can make use of the above tools [4]
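As a small taste of the first tool in the list, the sketch below evaluates the expression from the abstract with numexpr's evaluate() function. The wrapper name poly and the fallback to plain NumPy when numexpr is not installed are assumptions made so the snippet runs anywhere; they are not part of the tutorial material.

```python
import numpy as np

try:
    import numexpr as ne
except ImportError:
    # Assumption for portability: fall back to plain NumPy if numexpr is missing.
    ne = None

def poly(x, y):
    # Hypothetical helper: evaluate 4.4*x**3 + 3.2*y**2 element by element.
    if ne is not None:
        # numexpr receives the whole expression as a string and evaluates
        # it in a single call instead of one NumPy operation at a time.
        return ne.evaluate("4.4*x**3 + 3.2*y**2", local_dict={"x": x, "y": y})
    return 4.4 * x**3 + 3.2 * y**2

x = np.linspace(0.0, 1.0, 1_000_000)
y = np.linspace(1.0, 2.0, 1_000_000)
result = poly(x, y)
```

Timing poly against the plain NumPy expression on arrays of this size (for example with timeit) is a simple way to see whether the difference matters on a given machine.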

After this, I'll propose some exercises that will allow the attendees to better grasp how to achieve very high performance in their computations by using the above tools in combination with NumPy.

Attendees should bring their laptops with numexpr, python-blosc and carray installed. Matplotlib can also be useful, but is not strictly necessary.

[1] http://www.pytables.org/docs/CISE-12-2-ScientificPro.pdf
[2] http://code.google.com/p/numexpr
[3] http://blosc.pytables.org
[4] https://github.com/FrancescAlted/carray
