Talk Highly efficient computations in Python: well beyond NumPy

Presented by Francesc Alted in Advanced tutorial track 2010 on 2010/07/08 from 10:30 to 12:00 in room Dussane

NumPy is the de facto standard for numerical computation in Python. It provides a compact container for representing data arrays and supports element-by-element arithmetic, array broadcasting, type casting, and several other outstanding features.

Unfortunately, NumPy cannot efficiently evaluate complex expressions of the form "4.4*x**3+3.2*y**2". The reason is that it has to create a temporary array for each intermediate result of the above expression, in the order established by the Python interpreter and, as it turns out, this order is rather pessimal for modern computers.
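To make the point about temporaries concrete, here is a minimal sketch in Python. The function names and the block size are illustrative assumptions, not part of the tutorial material. The first function spells out the intermediate arrays that NumPy allocates for the expression; the second evaluates the same expression block by block, so that each temporary is only as long as one block.

```python
import numpy as np

def eval_with_temporaries(x, y):
    # Roughly the steps performed for 4.4*x**3 + 3.2*y**2, in the order
    # the Python interpreter evaluates them. Each line allocates a new
    # array as long as the inputs.
    t1 = x**3
    t2 = 4.4 * t1
    t3 = y**2
    t4 = 3.2 * t3
    return t2 + t4

def eval_in_blocks(x, y, block_size=4096):
    # Hypothetical blocked evaluation: the same arithmetic, but the
    # temporaries are at most block_size elements long.
    # block_size is an assumed value, not a tuned one.
    out = np.empty_like(x)
    for start in range(0, len(x), block_size):
        stop = start + block_size
        xb = x[start:stop]
        yb = y[start:stop]
        out[start:stop] = 4.4 * xb**3 + 3.2 * yb**2
    return out

x = np.linspace(0.0, 1.0, 10_000)
y = np.linspace(1.0, 2.0, 10_000)
direct = 4.4 * x**3 + 3.2 * y**2
```

Both functions should return the same values as the direct NumPy expression; whether the blocked variant is actually faster depends on the array size and the machine.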

During my tutorial, I'll briefly explain what exactly the problem is, and will introduce a series of tools for evaluating complex expressions from Python at almost C speed. Among the tools I'll talk about are:

  • Numexpr, a memory-efficient computing kernel
  • Intel's Vector Math Library, for accelerating vector calculations
  • Blosc, a compressor that can run faster than a pure memcpy()
  • PyTables, a library that can make use of all the above tools
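As a rough illustration of the first tool in the list, the sketch below passes the example expression to Numexpr as a string through `numexpr.evaluate`. The wrapper function `evaluate_poly` is a hypothetical name, and the fallback to plain NumPy when Numexpr is not installed is an assumption made only so the snippet runs anywhere.

```python
import numpy as np

try:
    import numexpr as ne  # optional dependency
except ImportError:
    ne = None

def evaluate_poly(x, y):
    # Hypothetical helper: use Numexpr when it is available,
    # otherwise fall back to the plain NumPy expression.
    if ne is not None:
        return ne.evaluate("4.4*x**3 + 3.2*y**2",
                           local_dict={"x": x, "y": y})
    return 4.4 * x**3 + 3.2 * y**2

a = np.linspace(0.0, 1.0, 1_000)
b = np.linspace(1.0, 2.0, 1_000)
result = evaluate_poly(a, b)
```

The result should match the plain NumPy expression up to floating-point rounding.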

After this, I'll propose some exercises that will allow the attendees to better grasp how to achieve very high performance in their computations by using the above tools in combination with NumPy.

Attendees should bring their laptops with PyTables 2.2rc2 or higher installed. Matplotlib can also be useful, but is not strictly necessary.