Recent comparative genomic and other large-scale bioinformatics studies have increasingly used gene annotations, functional classifications, and complementary data from the emerging “-omics” disciplines. Such analyses have a better chance of uncovering hidden patterns in complex, multidimensional, and heterogeneous biological systems data. On the other hand, inferences from such studies are extremely sensitive to data samples and data quality, and are more difficult to compare or replicate because the supplementary data sources differ and are at times not publicly available. As a contribution toward the unification and integration of good-quality data from heterogeneous bioinformatics resources, we present here an integrated data bank, PANDITplus. It is built as an extension of PANDIT, the database of PFAM alignments and phylogenetic trees for known protein domains and families spanning lineages from the three domains of life. PANDITplus is a relational database containing information on functional categories, metabolic pathways, protein–protein interactions, disease associations, gene expression, and three-dimensional structure, as well as estimates of selective pressures from evolutionary analyses. A user-friendly interface enables customized queries and fast data access. We recommend PANDITplus as a common bioinformatics platform for testing evolutionary hypotheses that go beyond inferences from molecular data alone by incorporating supplementary gene information. Equally, PANDITplus provides an excellent resource for the development, testing, and comparison of statistical models of substitution and of probabilistic dependencies between a molecular sequence and its various attributes. The database may be accessed via http://www.panditplus.org.
Comparative genomics, evolution, phylogenetics, gene family, protein