Talk: High-throughput structural bioinformatics using Python and p3d

Presented by Christian Fufezan in Scientific track 2010 on 2010/07/10 from 16:30 to 16:45 in room Dussane
Abstract

Knowledge-based approaches profit from the increasing amount of available data, from rapidly evolving microchips and the resulting rapid decrease in the cost of computation, and, last but not least, from the ability to easily develop computational tools that help analyze this tremendous amount of data.

The rapid development of these tools is possible thanks to high-level scripting languages such as Python. Python offers not only abstraction layers for network communication (e.g. to retrieve data) but also modules that allow seamless use of multiple cores and processors, which is essential for high-throughput computational tools.
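
As a rough illustration of the multi-core point, the following sketch uses the standard-library `multiprocessing` module to spread a per-structure analysis over several worker processes. The function names `count_c_alpha` and `analyze_all` are hypothetical and are not part of p3d; the only assumption about the data is the standard PDB convention that the atom name occupies columns 13-16 of an ATOM record.

```python
from multiprocessing import Pool


def count_c_alpha(pdb_text):
    """Count C-alpha atoms in PDB-format text (atom name in columns 13-16)."""
    return sum(
        1
        for line in pdb_text.splitlines()
        if line.startswith("ATOM") and line[12:16].strip() == "CA"
    )


def analyze_all(pdb_texts, processes=2):
    """Apply count_c_alpha to every structure, spread over worker processes."""
    with Pool(processes=processes) as pool:
        # map() splits the inputs among the workers and keeps the input order.
        return pool.map(count_c_alpha, pdb_texts)
```

When run as a script, `analyze_all` is best called from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely on every platform.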

This work presents p3d, a Python module for structural bioinformatics. The module allows scripts for high-throughput analysis of protein structures to be developed rapidly. p3d was designed to take advantage of Python's strengths and to follow its philosophy (Fufezan & Specht, 2009).

p3d offers:

  • a BSP tree that allows fast spatial access (e.g. locating all atoms within a given radius or bounding box),
  • atoms that are pooled in sets, which allows combinatorial set operations (e.g. atoms that are nitrogens and part of tryptophan residues but not in chain A),
  • atoms that are treated as vectors (so simple vector operations can be performed on a per-atom basis), and
  • functions that combine the above items in a human-readable query function (e.g. "Oxygens of a ligand and within 3 Å of proteinogenic carbonyl groups").
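
The ideas in this list can be sketched in plain Python. The example below does not use p3d's actual API: the `Atom` class, its fields, and `within_radius` are hypothetical stand-ins, and the radius search is a brute-force scan rather than a BSP tree. It only illustrates how set operations, vector-like atoms, and a spatial filter might combine.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen makes atoms hashable, so they can live in sets
class Atom:
    element: str
    residue: str
    chain: str
    x: float
    y: float
    z: float

    def __sub__(self, other):
        """Difference vector between two atom positions."""
        return (self.x - other.x, self.y - other.y, self.z - other.z)

    def distance_to(self, other):
        """Euclidean length of the difference vector."""
        return math.sqrt(sum(c * c for c in self - other))


atoms = {
    Atom("N", "TRP", "A", 0.0, 0.0, 0.0),
    Atom("N", "TRP", "B", 1.0, 0.0, 0.0),
    Atom("C", "TRP", "B", 2.0, 0.0, 0.0),
    Atom("O", "HOH", "B", 9.0, 0.0, 0.0),
}

# Pool atoms into sets by property.
nitrogens = {a for a in atoms if a.element == "N"}
tryptophans = {a for a in atoms if a.residue == "TRP"}
chain_a = {a for a in atoms if a.chain == "A"}

# "Nitrogens that are part of tryptophan residues but not in chain A".
selection = (nitrogens & tryptophans) - chain_a


def within_radius(center, candidates, radius):
    """Brute-force radius search; a spatial tree would avoid the full scan."""
    return {
        a for a in candidates
        if a != center and a.distance_to(center) <= radius
    }


# All atoms within 3 Å of the selected nitrogen.
neighbours = within_radius(next(iter(selection)), atoms, 3.0)
```

The brute-force search costs one distance calculation per atom and query, which is why a spatial index such as a BSP tree matters once many queries run against large structures.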

A short overview of the functionality and recent applications of p3d is given.