Two factors dominate current molecular biology: the amount of raw data is increasing very rapidly, and successful applications in biomedical research require carefully curated and annotated databases. The quality of the experimental data, especially nucleic acid sequences, is satisfactory; however, annotations depend on features inferred from the data rather than measured directly, for instance the identification of genes in genome sequences. It is essential that these inferences be as accurate as possible, and this requires human intervention. With the recognition of the importance of accurate database annotation, and of the need for individuals with particular constellations of skills to carry it out, annotators are emerging as specialists within the profession of bioinformatics. This book compiles information about annotation: its current status, what is required to improve it, what skills must be brought to bear on database curation, and hence what the proper training for annotators is. The book should be essential reading for everyone working on biological databases, both biologists and computer scientists. It will also be of interest to all users of such databases, including molecular biologists, geneticists, protein chemists, clinicians and drug developers.
Preface.
List of Contributors.
1. Annotation and Databases: Status and Prospects (M. Hoebeke, H. Chiapello, J.-F. Gibrat, Ph. Bessieres and J. Garnier).
I: THE DATABANKS.
2. Survey of Sequence Databases: Archival Projects (M. Magrane, M. Garcia-Pastor and R. Apweiler).
3. Survey of Sequence Databases: Derived Databases (M. Pruess, N. Mulder and R. Apweiler).
4. Databanks of Macromolecular Structure (H.J. Bernstein and F.C. Bernstein).
5. Taxonomy: a Moving Target for Sequence Data (M.I. Krichevsky).
7. Genomics and Proteomics: Design and Sources of Annotation (K. Mayer and G. Mannhaupt).
8. Annotation of Protein Sequences (W.C. Barker and C.H. Wu).
9. Issues in the Annotation of Protein Structures (G.J. Swaminathan, J. Tate, R. Newman, A. Hussain, J. Ionides, K. Henrick and S. Velankar).
10. Classification of Protein Function (A.M. Lesk, H. Parkinson and J.C. Whisstock).
III: DATABASE DESIGN AND INTEGRATION.
11. Information Flow and Data Integration of Databanks (C.H. Wu and W.C. Barker).
12. Models of Database Interconnectivity (G.J.L. Kemp).
13. The European Bioinformatics Institute Macromolecular Structure Relational Database Technology (H. Boutselakis, D. Dimitriopoulos, K. Henrick, J. Ionides, M. John, P.A. Keller, P. McNeil, J. Pineda and A. Suarez-Uruena).
IV: CONCLUSIONS AND PROSPECTS.
14. Looking Around, Looking Ahead (A.M. Lesk).
Index.