JCSG Target Selection Strategy in PSI-2
JCSG is a large-scale center within the Protein Structure Initiative (PSI) Network: The target selection strategy at JCSG is aligned with the main scientific mission of the PSI: making the three-dimensional atomic-level structures of most proteins easily obtainable from knowledge of their corresponding DNA sequences. This mission is divided into a primary goal of determining structures of representatives of large protein families and a secondary goal of solving multiple representatives from specific families of significant biomedical importance. Because the main scientific goal of the PSI can be achieved only through the synergistic development of high-throughput experimental structure determination methods and theoretical protein structure prediction algorithms, an important goal of target selection is to provide examples for the development and verification of the latter. Target selection at JCSG is closely coordinated with the other PSI centers through the Target Selection Committee, which is part of the PSI-BIG4.
Understanding the Central Machinery of Life: Within these overall goals, JCSG focuses on determining structures of proteins from families with a broad phylogenetic distribution, especially proteins conserved between prokaryotic and eukaryotic organisms. Such proteins perform the most fundamental functions in living organisms, and mutations or deletions of proteins from this group are usually lethal or lead to serious diseases. Building a full molecular catalog of structures from the central machinery of life would add significantly to our understanding of life and of the molecular mechanisms of fundamental biological processes. In addition, studying the evolution of the structures and functions of these proteins would allow us to understand the mechanisms of change, adaptation and divergence in living organisms.
Aiming at the full structural coverage of a simple model organism: In the first phase of PSI, JCSG made significant progress in genome-wide structural coverage of the hyperthermophilic bacterium Thermotoga maritima. Because of the efforts of JCSG and other structural genomics centers, T. maritima is now one of the genomes with the most complete structural coverage. A continuously updated report on the structural coverage of T. maritima is available online at http://ffas.burnham.org/ffas-cgi/cgi/tm_cov.pl.
Primary goal of PSI - structural coverage of large protein families: The bioinformatics groups of the PSI centers, in collaboration with a broad community of researchers from the sequence and structure analysis fields, are working on the development of a structure-centric definition of a protein family. The first community-wide meeting on this topic will take place in Bethesda, MD on June 26-27, 2006 (http://www.nigms.nih.gov/News/Meetings/PSI-TargetSelection2006.htm). In the meantime, 1269 PFAM families without structural coverage have been selected as PSI structure determination targets and divided between the four large-scale PSI centers. JCSG was assigned 271 families from this group and has so far solved 22 of them. An additional group of 397 new families has been identified by the bioinformatics groups of the PSI centers and will be divided between the centers in May 2006. Many of the families from this group have homologs in both prokaryotes and eukaryotes and/or in T. maritima, and thus also fit the specific research goals of JCSG.
Secondary goal of PSI - fine-grained structural coverage of specific protein families: Multiple proteins from selected protein families are targeted to provide more detailed information about structural divergence within each family. The main reason for this is to gain more information about biomedically important protein families, but it also provides material for the improvement of modeling methods. JCSG aims to spend approximately 30% of its effort on the fine-grained structural coverage goals.
JCSG target selection strategy - optimizing for success: The Protein Structure Initiative has revolutionized many aspects of structural biology research, among them access to data on structure determination, including information on failed attempts. This in turn has allowed large-scale data mining and learning to identify the physicochemical features of proteins that correlate with success in structure determination. Our analysis suggests that in most protein families only a small percentage of proteins can be crystallized successfully without extensive sequence modification. In selecting individual protein targets for structure determination, JCSG attempts to identify the proteins most likely to yield well-diffracting crystals. Such proteins are distributed relatively broadly across organisms, and although some organisms have a much higher percentage of crystallizable proteins, a large number of target genomes is needed to ensure an optimal choice of targets from every protein family of interest (see JCSG target genomes).
Target feasibility categories based on the analysis of TargetDB
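The selection idea described above, choosing for each family the homolog most likely to crystallize from a pool drawn from many genomes, can be illustrated with a minimal sketch. This is not JCSG's actual procedure. The `Candidate` record, the `score` field (a stand-in for some predicted crystallization feasibility between 0 and 1), the `min_score` cutoff, and all family, genome and protein names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    family: str      # protein family label (hypothetical)
    genome: str      # source organism label (hypothetical)
    protein_id: str  # protein identifier (hypothetical)
    score: float     # assumed feasibility estimate in [0, 1], not a real JCSG metric


def pick_best_per_family(candidates, min_score=0.0):
    """Return {family: Candidate} holding the highest-scoring candidate
    of each family, ignoring candidates whose score is below min_score."""
    best = {}
    for c in candidates:
        if c.score < min_score:
            continue
        current = best.get(c.family)
        if current is None or c.score > current.score:
            best[c.family] = c
    return best


# Toy pool: each family has homologs in several genomes.
candidates = [
    Candidate("FAM_A", "genome_1", "p1", 0.20),
    Candidate("FAM_A", "genome_2", "p2", 0.75),
    Candidate("FAM_B", "genome_1", "p3", 0.10),
    Candidate("FAM_B", "genome_3", "p4", 0.55),
    Candidate("FAM_C", "genome_2", "p5", 0.05),
]

selected = pick_best_per_family(candidates, min_score=0.5)
for family, c in sorted(selected.items()):
    print(family, c.genome, c.protein_id, c.score)
```

In this toy pool, FAM_B gets an acceptable target only because a third genome contributes a homolog, and FAM_C has none at all. That is the reason the text gives for drawing on a large number of target genomes.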
JCSG production pipeline
Strategy: The full-genome analysis of T. maritima has provided a sufficient flow of targets through the pipeline to facilitate its assembly and testing. This high volume of targets has helped develop not only the automation of the individual process stages but also global pipeline processes such as target tracking, process optimization, and information management.