Scientific names pose a challenge for modern biologists: a single species may have been given different names, or differently spelled names, while in some cases the same name has been used for different species. Yet these names are the only identifiers available for connecting the diverse biological data that are collected and stored separately. We need computational tools to make these connections quickly and repeatably.
Taxonome is a set of open-source tools for reading, standardising and combining species-level data. At its core is an algorithm for matching scientific names to a given set of accepted names, overcoming the hurdles of synonyms, homonyms and spelling variations. Once the names are standardised, data from different sources can be combined to answer interesting biological questions. The tools can be used either through an object-oriented Python API or through a cross-platform GUI application, which is also written in Python. Other features include functions to standardise distribution information, and integration with a range of web services to retrieve relevant information.
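To give a flavour of the matching problem, here is a minimal sketch using only Python's standard library. It is not Taxonome's API or its actual algorithm: the function `match_name`, the `ACCEPTED` and `SYNONYMS` tables and the `cutoff` value are all hypothetical, chosen for illustration. It tries an exact match, then a synonym lookup, then a fuzzy match to catch spelling variations. Homonyms are not handled here; resolving them would need extra information such as the author citation.

```python
import difflib

# Hypothetical reference data, for illustration only.
ACCEPTED = {"Bellis perennis", "Quercus robur", "Solanum lycopersicum"}
SYNONYMS = {"Lycopersicon esculentum": "Solanum lycopersicum"}


def match_name(name, accepted=ACCEPTED, synonyms=SYNONYMS, cutoff=0.85):
    """Return the accepted name for `name`, or None if nothing matches."""
    name = " ".join(name.split())  # collapse stray whitespace

    # 1. The name is already an accepted name.
    if name in accepted:
        return name

    # 2. The name is a known synonym of an accepted name.
    if name in synonyms:
        return synonyms[name]

    # 3. Fuzzy match against every known name to catch spelling variants.
    candidates = sorted(accepted) + sorted(synonyms)
    close = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    if not close:
        return None
    best = close[0]
    # If the closest name is itself a synonym, resolve it to its accepted name.
    return synonyms.get(best, best)
```

With names standardised this way, records from separate datasets can be keyed on the accepted name and joined. The cutoff is a trade-off: a lower value catches more misspellings but risks matching a name to the wrong species.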
I'll describe how Taxonome grew out of my own research, the tools on which it's built, and some of the results that were made possible by combining datasets of plant traits.