CellProfiler is an open-source system for analyzing large collections of images, with an emphasis on images of cells from large-scale biological experiments (i.e., "high content screening"). Its goal is to give experimental biologists access to powerful analysis algorithms in a user-friendly system, in order to support quantitative analysis of images. Users build pipelines from modules that break up complex tasks into simple steps, such as loading images, identifying objects, measuring particular features for each object, and writing results to a database or spreadsheet. Modules for many common image processing techniques are included, allowing for very flexible analysis. Support for cluster and distributed processing allows analysis of very large collections of images.
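The pipeline idea described above can be illustrated with a minimal sketch in plain Python. It is not CellProfiler's actual API: the `Workspace` dictionary, the module functions, and `run_pipeline` are all hypothetical names invented for illustration, and the "image" is a tiny hard-coded grid rather than a loaded file.

```python
# Hypothetical sketch of a module pipeline; none of these names come from
# CellProfiler itself.
from typing import Callable, Dict, List, Optional

Workspace = Dict[str, object]          # shared state passed between modules
Module = Callable[[Workspace], None]   # a module reads and writes the workspace


def load_image(ws: Workspace) -> None:
    # Stand-in for loading an image from disk: a tiny grayscale grid.
    ws["image"] = [[0, 0, 9, 9],
                   [0, 0, 9, 0],
                   [7, 0, 0, 0]]


def identify_foreground(ws: Workspace) -> None:
    # Stand-in for object identification: a fixed intensity threshold.
    ws["mask"] = [[px > 5 for px in row] for row in ws["image"]]


def measure_area(ws: Workspace) -> None:
    # Count foreground pixels (True counts as 1 when summed).
    ws["foreground_area"] = sum(sum(row) for row in ws["mask"])


def export_results(ws: Workspace) -> None:
    # Stand-in for writing to a spreadsheet: build one CSV-style line.
    ws["report"] = "foreground_area,%d" % ws["foreground_area"]


def run_pipeline(modules: List[Module],
                 ws: Optional[Workspace] = None) -> Workspace:
    # Run each module in order against the same workspace.
    ws = {} if ws is None else ws
    for module in modules:
        module(ws)
    return ws


result = run_pipeline(
    [load_image, identify_foreground, measure_area, export_results]
)
print(result["report"])
```

Because each step only reads from and writes to the shared workspace, steps can be reordered, replaced, or inserted without changing the others, which is the property the modular design is meant to provide.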
The CellProfiler project was started in 2004 at the Whitehead Institute for Biomedical Research and MIT, motivated by the need for flexible tools to analyze the high-throughput experiments being performed in the laboratory of David Sabatini. Its first public release was in 2005, and a technical article describing the software was published in late 2006. Since then, it has been downloaded 20,000 times and cited more than 250 times in the scientific literature.
CellProfiler 1.0 was written in MATLAB, but several factors prompted us to move to Python when it was time to develop version 2.0:

- a desire to grow the developer community,
- good support and fast bug-fixing from the NumPy and SciPy communities,
- portability and multi-platform compilation tools (py2exe, py2app),
- better-looking GUIs with less work using wxPython,
- a simpler, more consistent language for development, and
- better support for interfaces to third-party code (C, Java).
CellProfiler relies heavily on third-party libraries and tools. Image and data analysis are implemented with NumPy and SciPy (particularly scipy.ndimage), along with several low-level algorithms written in Cython, for performance reasons. The user interface is built in wxPython, with matplotlib for plotting. CellProfiler has interfaces to Java, via a Cython wrapper of the Java Native Interface, which is used for loading microscope images (Bio-Formats) and to support third-party analysis tools (ImageJ plugins). We use py2app and py2exe to distribute binaries to end users.
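As a rough illustration of the kind of analysis that NumPy and scipy.ndimage make possible, the sketch below identifies objects in a synthetic image and measures a few per-object features. It is not taken from the CellProfiler code base: the image, the fixed threshold of 5.0, and the variable names are assumptions chosen only to keep the example small.

```python
import numpy as np
from scipy import ndimage

# Synthetic 8x8 "image": two bright blobs on a dark background (assumed data).
image = np.zeros((8, 8), dtype=float)
image[1:3, 1:3] = 10.0   # 2x2 blob
image[5:8, 4:7] = 20.0   # 3x3 blob

# Identify objects: threshold, then label connected foreground regions.
mask = image > 5.0                       # illustrative fixed threshold
labels, n_objects = ndimage.label(mask)  # default 4-connectivity in 2D
index = np.arange(1, n_objects + 1)      # label 0 is background

# Measure per-object features.
areas = np.bincount(labels.ravel())[1:]                    # pixels per object
mean_intensity = ndimage.mean(image, labels, index=index)  # mean per object
centroids = ndimage.center_of_mass(image, labels, index)   # (row, col) per object

for i in range(n_objects):
    print(index[i], areas[i], mean_intensity[i], centroids[i])
```

Real microscope images would need a data-driven threshold and some way to separate touching objects; the fixed threshold here only keeps the labeling and measurement steps easy to follow.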
Future development of CellProfiler targets three primary areas: a simplified user interface, new and faster analysis algorithms, and moving useful algorithms out of CellProfiler and into external open-source projects. A simpler interface and faster algorithms will allow more users to take advantage of CellProfiler and will improve the experience for existing users. We are also adding methods for new types of biological image analysis problems, such as the Worm Toolbox for analyzing images of C. elegans. Finally, several pieces of code in CellProfiler (such as morphological operators, nonlinear filters, and the JNI wrapper) could be contributed upstream, e.g., to SciPy or scikits.image, or spun off into independent projects. This would simplify our code and development while improving our contribution to the open-source and scientific Python communities.