As genome sequences of pathogenic fungi become increasingly available, our primary challenge is to integrate existing knowledge about gene function with those sequences. The genomes of more than 30 fungi have been sequenced; however, our knowledge of the gene content of these fungi is limited. Biologists are most confident in gene annotations that have been reviewed by teams of curators. In searching for a gene’s function, curators can review the results of sequence similarity searches and other computational analyses and incorporate knowledge from the scientific literature. This is a time-consuming process, though, and it is becoming clear that the rate of genome sequence acquisition is outpacing our ability to assign gene functions manually. While computational techniques for predicting gene structure are well defined, methods for predicting gene function remain limited. Improved computational tools for protein functional annotation are being developed; however, these tools usually consider only one or a few properties of a protein in their classification schemes. More sophisticated approaches are needed that can approximate the decision-making logic used by human curators and that can utilize more of our knowledge about protein function.
- AAPFC: Automated Annotation of Protein Functions (Texas A&M)