JCSG
TECHNOLOGIES
JCSG is
deeply committed to the development of new technologies that facilitate
high throughput structural genomics. The areas of development include
hardware, software, new experimental methods, and adaptation of existing
technologies to advance genome research. In the hardware arena, our commitment
is to the development of technologies that accelerate structure solution by
increasing throughput rates at every stage of the production pipeline.
Therefore, one major area of hardware development has been the implementation
of robotics. In the software arena, we have developed enterprise resource
software that tracks successes, failures, and sample histories from target
selection to PDB deposition, annotation and target management tools, and
helper applications aimed at facilitating and automating multiple steps
in the pipeline.
PROTEIN
PRODUCTION
Microexpression System:
Small-scale expression provides enhanced screening capability, as many more
clones can be evaluated to identify targets, as well as truncations or mutations,
which either fail to express, or express in the insoluble fraction. A low-cost,
high-velocity incubating commercial shaker has been adapted for high-throughput
E. coli expression screening to accurately predict large-scale protein
behavior. Cultures (~750 µL) are grown in deep-well 96-well blocks
to achieve optical densities (O.D.) of up to 10-20, which enables evaluation
of expression and solubility via small-scale purification by IMAC. Moreover,
this screening strategy can be adapted for SeMet or 15N/13C-labeled expression.
Of the soluble targets produced in the micro-expression device, 97% correlate
with successful expression in large-scale fermentation. This device is suited
for both nanocrystallization trials and NMR screening for protein folding.
Cloning
Robotics: A
large number of expression clones must be generated within the pipeline
to accommodate the number of targets, expression systems and variants
for each gene targeted. Many options for creating such expression clones
were evaluated, including recombinatorial (Gateway/Echo) and topoisomerase
treated systems. To maximize flexibility and minimize cost, we chose to
automate a conventional cloning approach. We developed a robotic platform,
which incorporates liquid and plate handling, with thermocyclers and a
plate reader, and demonstrated the capacity to provide up to 384 validated
expression clones per week, which is sufficient to meet our pipeline needs.
To date, over 2500 total expression clones have been generated with this
system by a single operator.
Large-scale
bacterial expression:
Protein expression has primarily been performed in E. coli. To
allow expression at a scale sufficient for crystallization trials, we
developed a parallel fermentation system (GNFermentor) for 96-culture
high-density cell growth that produces 2-4 g of cell pellet. Pre-induction
O.D. values vary only 5% between individual cultures, highlighting the
importance of the tightly regulated expression system (arabinose) that
we employ. To date, over 30,000 individual samples have been processed
through this system, demonstrating its robust nature.
Baculovirus
expression: We
have implemented a small-scale (10 mL) baculovirus expression screening platform
and a large-scale (10 liter) expression platform for expression of over
50 mouse proteins to date. The small-scale screening platform integrates
two Tecan robotics platforms to perform transfection, viral amplification
and expression screening. It can run parallel, small-scale baculovirus
screens of 96 different constructs, which is of great value
when trying to identify which expression constructs produce the maximum
amount of soluble protein, especially for eukaryotic targets. We have also
demonstrated SeMet incorporation (80%) with the large-scale system.
Automated
affinity purification: Processing
of the resulting cell pellets through affinity purification is performed
with custom automation (GNFuge). Fermentation tubes are directly processed
in the GNFuge, for the steps of lysis, removal of cell debris and affinity
purification. The resulting affinity purified proteins can then be processed
by secondary purification or can be advanced directly to crystallization
screening.
Secondary
purification: Purification
beyond affinity steps is achieved using standard commercial instrumentation,
which has been configured for automated large-scale purification. By integrating
a custom valve configuration and an air sensor with the Akta Purifier systems
(Pharmacia), we can achieve automatic loading and processing of up to 12
samples, without the limitations on initial sample volume imposed by commercial
autosamplers. With three such systems online, our demonstrated capacity
for secondary purification is approximately 48-96 proteins per week at a
10-50 mg scale.
BIOPHYSICAL
CHARACTERIZATION
Biophysical characterization of samples is a critical component of our pipeline
process that provides guidance for target strategies, and metrics for evaluating
the various pipeline components. However, performing such characterization
on a large number of targets has serious implications for pipeline throughput.
The JCSG has devoted significant effort towards developing HT approaches
to protein characterization and the gathering and tracking of this information
for thousands of samples. The volume of data is enormous and has emphasized
the need for active target management to take advantage of such knowledge
as it arises. These biophysical data are also of tremendous value to the
scientific community and for collaborative functional studies.
Multiparametric Biophysical Protein Characterization:
Biophysical parameters currently collected for each target are:
| Parameter | Methodology |
| Toxicity during expression | Final optical density |
| Cofactor binding | UV/Vis absorbance scan |
| Protein concentration | Bradford |
| Protein purity | SDS-PAGE |
| Isoelectric point | IEF gel electrophoresis |
| Protein fingerprinting | Tryptic Mass Spectrometry |
| Thermostability | Differential Scanning Calorimetry |
| Polydispersity/Native Mw | Analytic Size Exclusion Chromatography |
| Metal binding | X-ray Absorption Fine-Structure Spectroscopy |
1D
1H NMR Fold Screening: One example of biophysical testing
is presented here. 1D 1H NMR screening is used to characterize
the folded state of protein targets prior to crystallization trials in
order to prioritize targets that will undergo extensive crystallization
efforts, to identify targets suitable for NMR structure determination
and to design truncations. For efficient screening, we have identified
conditions suitable for both NMR screening and initial crystallization
trials. Following purification, proteins are concentrated to slightly
greater than 0.5 mM in screening buffer without D2O. D2O
is then added and samples are immediately flash frozen and stored at -80 °C
until they are screened. After screening, the same samples are then
used for initial crystallization trials; samples prepared both with and
without D2O crystallize in the same/similar conditions.
Pre-saturation spectra are recorded at 285 K using a Bruker Avance600
spectrometer, from 20 seconds to 1 hour, depending on protein concentration,
with an average of 5 minutes per sample. The resulting spectra are then
graded for quality, with an ‘A’ spectrum indicative of a well-folded
protein and a ‘D’ spectrum indicative of an unfolded protein; additional
comments on observed higher order structures are also recorded. If the
protein is well-folded, with an ‘A’ or ‘A-B’ spectrum,
it is suitable for structure efforts. If the protein appears to be unfolded,
with a ‘C-D’ or ‘D’ spectrum, the target may be
set up for crystal trials, but will likely enter salvage pathways or be dropped.
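The grading rule above can be sketched as a small lookup. This is only an illustration in Python: the function name `triage` and the returned strings are hypothetical, and the text does not say how intermediate grades (for example ‘B’ or ‘C’) are handled, so the sketch reports that no rule is stated for them.

```python
def triage(grade: str) -> str:
    """Map a 1D 1H NMR spectrum grade to the next step described in the text.

    Only the grades named in the text are handled; anything else is
    reported as having no stated rule (an assumption of this sketch).
    """
    g = grade.strip().upper()
    if g in {"A", "A-B"}:
        # Well-folded protein: suitable for structure efforts.
        return "structure efforts"
    if g in {"C-D", "D"}:
        # Apparently unfolded: may still be set up for crystal trials,
        # but will likely enter salvage pathways or be dropped.
        return "crystal trials possible; likely salvage or drop"
    return "no rule stated in the text"


if __name__ == "__main__":
    for grade in ("A", "A-B", "B", "C-D", "D"):
        print(grade, "->", triage(grade))
```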
CRYSTALLIZATION
Nano-drop crystallization:
Nano-drop crystallization technologies were first developed by the members
of the JCSG. Despite many researchers being skeptical that nanoliter volumes
would yield diffraction quality crystals, we have routinely utilized these
technologies in PSI-1 to screen crystallization conditions and generate
diffraction-quality crystals. A combination of fully automated crystallization
and imaging robotics has been a key part of our pipeline since its inception
and has greatly contributed to our ability to process large numbers of
targets. Both custom and commercial instrumentation are currently in use
for our crystallization trials. Through a contractual agreement, GNF maintains
access, for very low volume (50 nL) experiments, to the custom crystallization
robotics developed at GNF and located at Syrrx. A commercial system (Apogent)
is also located at GNF for larger volume experiments (400 nL). The JCSG
facilities at TSRI also include an Innovadyne dispenser.
Crystal
imaging:
Purchase of a new, fully-integrated crystallization setup and imaging system
(Robodesign) has been approved using funds from the JCSG, as well as IAVI
and TSRI. The Robodesign system will provide 100 nL dispensing capability
and fully automated plate setup and imaging. Currently, imaging at GNF is
performed using two custom robotic platforms located in constant temperature
4°C and 20°C rooms with capacity for 1536 plates. Plates are assigned
an imaging schedule and are automatically screened, typically at 7, 14 and
28 days. To date, over 3,000,000 images have been generated from these imagers.
The TSRI facility utilizes a Veeco imager and plates are manually tracked
for imaging. The new Robodesign platform to be installed at TSRI will have
capacity for 4000 plates at up to 6 temperatures and will utilize a fully
automated imaging schedule and image analysis software package.
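As a small illustration of the imaging schedule described above (plates screened typically at 7, 14 and 28 days), the following Python sketch computes the imaging dates for a plate from its setup date. The names `IMAGING_DAYS` and `imaging_schedule` are hypothetical and are not taken from the JCSG software.

```python
from datetime import date, timedelta

# Typical imaging time points, in days after plate setup (from the text).
IMAGING_DAYS = (7, 14, 28)


def imaging_schedule(setup: date, days=IMAGING_DAYS) -> list:
    """Return the dates on which a plate set up on `setup` is due for imaging."""
    return [setup + timedelta(days=d) for d in days]


if __name__ == "__main__":
    for due in imaging_schedule(date(2005, 1, 1)):
        print(due.isoformat())
```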
CRYSTAL SCREENING FOR DIFFRACTION QUALITY
To fulfill the demands of the
JCSG HT structure determination pipeline, it was clear at the outset that
an automated crystal screening capability would be a vital asset. The JCSG
pipeline is currently producing in excess of 500 crystals per month for
diffraction screening. X-ray screening forms a critical feedback loop, which
is used by the CC to identify promising targets and crystallization conditions.
Manual mounting and dismounting of crystal samples at the beam line is a
labor-intensive task, which wastes significant beam time and is prone to
human error. SDC has co-developed a completely automated crystal screening
system in close collaboration with the core Structural Molecular Biology
group at SSRL, which meets the needs of both JCSG and the wider structural
biology community. The key features are:
Compact
crystal cassette:
Secure crystal transport and storage is accomplished via a compact, cylindrical,
aluminum crystal cassette, which holds 96 crystals. Crystals are mounted
on standard Hampton Research sample pins. Two cassettes can fit inside a
standard vapor shipping dewar and twenty cassettes can be held inside a
Taylor-Wharton HC-35 storage dewar. JCSG crystals are shipped exclusively
using these cassettes. This system has been very robust and reliable. Kits
of cassettes with loading and handling tools have been fabricated and distributed
to SSRL users.

Stanford Auto-Mounter (SAM):
Individual crystals are mounted onto the beam line for screening using the
SAM system. Three sample cassettes are held under liquid nitrogen in a dispensing
dewar, which is located close to the goniometer, inside the experimental
hutch. A commercial Epson ES553S 4-axis robot, outfitted with a pneumatically
operated cryo-tong, removes samples from the cassette and places them on
the goniometer. The SAM system also allows sorting of crystals from one
cassette to another. Thus, the most promising crystals can be consolidated
into a single cassette prior to data collection. The sorting facility is
now in a prototype stage and will be developed into a full user system in
the near future. SDC has fully integrated the SAM system with the existing
macromolecular crystallography beam line environment by implementing a user-interface
within the BLU-ICE data collection software. The system also communicates
with the JCSG database via a “beam line report”, which is an
Excel spreadsheet describing the crystals in each shipment.

Sample
visualization and loop alignment system:
Reliable centering of the sample with the X-ray beam is an essential step
for automatic screening and requires good sample illumination and imaging.
A high-quality visualization system was developed by SDC on BL11-1 at SSRL
and replicated on all other beam lines. The system is composed of a Navitar
12x lens system, with a large depth of field. The lens system is coupled
to an Optronics CCD camera and images are digitized via an Axis 2400 www-based
image server. A bright, diffuse backlight provides high contrast images
for loop alignment. However, the long working distance creates shadows inside
the loop, which sometimes make it difficult to visualize the actual crystal.
In the future, we plan to upgrade the lighting system. SDC has developed
a software protocol, which uses standard edge detection techniques to align
the sample and its loop with the X-ray beam. Since a fairly large beam (0.25 x 0.25 mm)
is used for crystal screening, this approximate alignment of the actual
crystal is adequate for automated screening. The entire alignment procedure
takes ~30 seconds with >95% reliability. Each crystal is mounted and
aligned with the X-ray beam. A visual JPEG image of the crystal and a corresponding
diffraction image (typically 15 seconds exposure) are collected at two crystal
orientations, 90° apart. A cassette of 96 crystals can be screened without
human intervention in ~5 hours.

DIFFRACTION
DATA COLLECTION
The majority of the JCSG data collection has been conducted on the macromolecular
crystallography beam lines at SSRL. The SSRL storage ring, the Stanford
Positron Electron Asymmetric Ring (SPEAR), was recently upgraded to 3rd
generation synchrotron capabilities and now offers increased brightness
and higher operating ring current. All protein crystallography beam lines
have benefited from the upgrade and typical exposure times have been significantly
reduced. During the SPEAR-3 upgrade from April 2003 to March 2004 and also
during shorter SSRL maintenance shutdowns, JCSG data were collected at the
Advanced Light Source (ALS) and the Advanced Photon Source (APS). A program
proposal provided time at APS (distributed over: SBC-CAT, BIO-CARS and NE-CAT)
and a Memorandum of Understanding provided regular access at ALS. During
these shutdown periods, the SAM system was used with an X-ray microsource
generator to pre-screen crystals before trips to remote beamlines.
Automated MAD data collection
with BLU-ICE: Over the last 4 years, JCSG has contributed
to the ongoing development of the BLU-ICE data collection software at SSRL.
In addition to the new crystal screening capabilities (described above),
BLU-ICE now supports completely automated execution of MAD data collection.
Suitable energies for the MAD experiment are derived automatically from
a Kramers-Kronig analysis of the fluorescence scan. The energies are imported
directly into the Data Collection Tab in BLU-ICE. All wavelength changes
are conducted automatically and the X-ray beam intensity is optimized at
each change. In addition, hardware upgrades on the wiggler side-station
beam lines now support MAD experiments. The experimental table is mounted
on a reproducible slide that can track the deflection of the X-ray beam
at different energies. A dose mode exposure time normalizes the beam intensity
across all wavelengths and data collection is paused automatically if the
storage ring beam is lost.

Remote data collection:
With the SAM system in full operation, the complete diffraction experiment
can be initiated remotely. Thus, JCSG can capitalize on remote-access developments
which were mainly funded through an NIH-NCRR grant for the creation
of a Crystallography Collaboratory at SSRL. The only time a staff member
is required at the beam line is to change one of the three crystal cassettes,
or if manual hardware maintenance is required. Live video feeds from the
beam line are now incorporated into BLU-ICE, which further helps diagnose
problems remotely. As a result, it is now possible to run and monitor the
beamline from a remote location, such as an office or at home. These features
greatly reduce the personnel requirements for JCSG data collection experiments.
DATA PROCESSING
AND STRUCTURE DETERMINATION
SDC has developed tools to automate the analysis of crystallographic data.
The system includes an electronic notebook, which records all diffraction
experiments, and Xsolve, a Linux-based parallel processing environment.
Xsolve:
Xsolve can execute all crystallographic data processing and MAD structure
determination steps. Xsolve also prepares a standard set of files for
upload to the Structure Solution Tracking System (SSTS), which provides
a direct interface to the JCSG database. Xsolve allows parallel processing
of structure determination tasks using a variety of established crystallographic
applications. The Xsolve system has a flexible and open architecture so
that new versions of applications can readily be upgraded and newly emerging
programs can easily be incorporated. In this way, SDC can quickly capitalize
on developments made by the wider crystallographic community. Xsolve performs
all processing steps including initial indexing of a diffraction image,
integration, scaling, phase determination, phase improvement and initial
model building. The system has been optimized to provide high quality
results for direct upload to the JCSG central database.
Customized scripts:
SDC has also developed several in-house scripts to prototype new programs
and allow rapid data processing at various remote synchrotron sources.
These scripts are made available to regular users at SSRL. One script
provides automatic data reduction and structure solution via XDS and Solve,
and another provides an easy interface to structure determination via
SHELX and Solve.
Molecular Replacement
pipeline: The JCSG has also developed a highly parallelized
Molecular Replacement (MR) pipeline that facilitates all steps in MR structure
solution, including homology detection, model preparation, MR searches
and automated refinement and rebuilding. Processed diffraction data are
fed into the MR system directly from Xsolve. Search models are based on
sequence alignments generated using the profile-profile alignment method
implemented in the FFAS03 system. In collaboration with the research groups
at Burnham and UCSD, the JCSG team has used improved alignment and modeling
tools and massive computer power to push MR beyond the traditional limits.
In general, MR solutions are seldom attempted (and are even less often
successful) against templates with less than 35% sequence identity. To
date, the JCSG MR pipeline has been successfully applied to over 26 cases with
less than 35% sequence identity, 10 cases with less than 30% and several
cases where sequence identity was close to 15%. Our analysis shows that
fold recognition models have a significantly higher success rate, especially
when the unknown structure and the search model share less than 35% sequence
identity. Using MOLREP and EPMR, 3 out of 26 MR targets under 35% sequence
identity could only be solved with models derived from fold recognition
methods and 6 showed significantly better statistics and behavior in subsequent
refinement.
MODEL BUILDING,
REFINEMENT, AND QUALITY CONTROL
As the JCSG structure solution rate has increased, a bottleneck has developed
at the model building and refinement stages. A collaboration with Anastassis
Perrakis and the ARP/wARP development team is improving the initial models
built by Xsolve, and an internal methods development effort at SDC is addressing
subsequent model completion. A network of JCSG scientists was established
to perform structure refinement. In order to ensure uniform quality standards
for all JCSG structures, a formal internal Quality Control (QC) step was
introduced prior to structure deposition in the PDB. From the early structures
submitted for QC analysis, a detailed set of refinement guidelines was
developed, which has standardized the refinement protocol for all JCSG
structures. All JCSG refinement is carried out with the latest version
of Refmac. TLS parameters, a riding hydrogen model and NCS restraints
are evaluated for impact on the R-free. Experimental phase restraints
are always included when available. Whatcheck, ADIT and PDB deposition
tools and Molprobity are used to validate the structure. Missing atoms
and unknown ligands are treated in a uniform way. Residue numbering is
standardized and PDB REMARK cards are generated. Finally, before PDB deposition,
all other crystals and datasets from the same target are checked for any
“added value,” such as a new crystal or dataset with improved
resolution or a bound ligand. Through the implementation of these refinement
guidelines, both the quality and the refinement time for JCSG structures
have improved and the PDB deposition process has been streamlined. QC
has become an integral part of the pipeline and is no longer simply a
stage related to the preparation of files for deposition to the PDB. As
a result of these extensive efforts, the average quality of the JCSG structures
is significantly better than the average for both the PDB as a whole and
for the PSI structural genomics centers.
Validation
Suite: Prior to deposition in the Protein Data Bank, the
quality of JCSG structural models is validated through the JCSG
Validation Suite, which groups under a single web interface the programs
ProCheck, SFCheck, WhatCheck, Errat, DDQ, Prove, and Wasp.
STRUCTURE DEPOSITION
IN THE PROTEIN DATA BANK
The final step of the JCSG pipeline involves deposition into the PDB.
Coordinates of structures that passed the QC are combined with database-derived
information about the history of the targets and specific protocols used
in structure determination and parsed to two mmCIF files to deposit the
coordinates directly with the PDB. The first mmCIF file contains all data
needed to generate the release version of the PDB coordinate file. The
second file contains the structure factors, the unmerged reflection intensities
for all datasets used for refinement and phasing, and the experimental
phases and density modified experimental phases. The structure deposition
process is largely automated and uses mmCIF-writers (command line scripts)
to generate the two mmCIF files directly from data captured in the JCSG
database. The process still requires some manual oversight, mostly for
checking completeness and internal consistency of the annotations; however,
the entire process takes less than 3 hours. The data required to complete
the PDB deposition are now captured in the JCSG database, and software
is currently under development to complete the automation.
COMPUTATIONAL
TARGET ANALYSIS AND FUNCTIONAL ANNOTATION
In collaboration with UCSD, Burnham and ANL bioinformatics groups, JCSG
has developed a unified protein structure and sequence analysis system
that includes predictions about the function of proteins solved by the
experimental pipeline. Elements of the system include structure similarity
analysis performed by DALI, CE and FATCAT structure alignment programs,
distant homology analysis performed by the FFAS profile-profile alignment
program, and genome context and pathway analysis performed by the SEED
system. These annotations are manually analyzed and subjected to internal
discussions using a unique system of interactive annotation pages developed
at JCSG. Through application of this system, functional annotations of
over half of the proteins solved by JCSG, including several previously
unannotated “hypothetical proteins,” have been established
with high reliability and have now been entered into public databases.
In addition, a functional annotation page has been created for each target,
which instantly allows JCSG scientists to curate and update biological
information generated during the structure determination process.
Protein Sequence Comparative
Analysis System (PSCA): Access to target annotations can
be accomplished through the PSCA system. Annotations from public databases,
links, and preprocessed target information are available through a tabbed
user interface. Data such as fold similarity, sequence similarity, domain
organization or physicochemical properties are periodically precalculated,
which greatly speeds up access to a large collection of data for each target.
Manual Annotation
System: The JCSG Bioinformatics Core has developed a collaborative
annotation system which allows multiple users to annotate targets. Users
can access the annotation history of each target, modify existing annotation
(grading the level of accuracy of the annotation), include links to other
resources, update bibliographical references, or upload supporting materials
including experimental data.
Reports:
JCSG has developed a number of automated reporting tools that greatly
facilitate the work of JCSG researchers by extracting and summarizing
large amounts of raw data from the JCSG database. Information is displayed
in web-accessible tables or Excel spreadsheets that can be downloaded
for local access.
PUBLICATION AND
DATA DISSEMINATION
Public tracking system
and website: The central JCSG database provides high-level
tracking of targets and production metrics. Integral to this database
is an extensive body of bioinformatics data on individual targets. The
public tracking system provides access to the data contained in the JCSG
database and allows the extraction and filtering of specific subsets according
to user-defined criteria. The JCSG website is the main public outreach
and data dissemination tool. The website also plays a crucial role as
an internal data dissemination and communication tool between the JCSG
cores as well as being one of the entry points for experimental data deposition
in the JCSG database. Some of the innovative visualization tools available
via the JCSG website include a graphical view of the complete history of
every target in the JCSG pipeline.
Customized
tracking lists: The public tracking interface at www.jcsg.org
and the XML target list deposited weekly to TargetDB are generated automatically
from the database; however, they highlight only a small fraction of the
total data collected by JCSG. Users can register to obtain e-mail alerts
on individual targets and create personalized views of the JCSG database
that focus on groups of proteins of interest.
Structure
Notes: JCSG structures are shared with the scientific
community not only through deposition in the PDB, but also through publication
of a "structure note." Structure notes are short papers describing
the annotation, biology, structure and functional implications of each
protein. The process of collecting all relevant data, from all stages
of the JCSG pipeline has been streamlined through the central JCSG database,
which includes information on the sequence, annotation, cloning, purification,
crystallization, data collection, structure solution, tracing, refinement
and structural evaluation. The structure note automatically captures any
functional information in the JCSG annotation system (see above). The
paper introduction, for example, includes annotation information, with
a brief biological background taken and curated from the PFAM, Interpro,
SwissProt, BRENDA, and SEED databases. Methodological and experimental
data, as well as all crystallographic statistics, are automatically harvested
from the JCSG database and assembled into purification, crystallization,
structure solution and refinement paragraphs. The structure description
and the preparation of figures are done manually using PYMOL. Structures
are analyzed, compared and evaluated for biological significance using
a plethora of structure analysis tools including structural homology searches
(DALI, CE, FATCAT), and extensive literature searches.
Downloadable datasets:
JCSG has created a unique repository of X-ray crystallographic datasets
for the structures it has solved and deposited in PDB. This archive contains
the experimental and analysis data from data collection, data reduction,
phasing, density modification, model building and refinement. These datasets
are available as test data to the crystallographic methods development
community.
DATABASE AND LIMS
DEVELOPMENT
A dedicated database was developed by the JCSG programming team. The computational
development was carried out in parallel with the development of the physical
production pipeline. Currently, the JCSG database connects all experimental
elements in the pipeline. It interactively analyzes data at each stage
and provides up to date information to facilitate the optimal course of
action for each individual target.
Tracking Database:
The central JCSG tracking database was developed from scratch in Oracle
and contains 130 tables that describe 28 production stages and track
424 parameters. The interface, written mostly in Perl, includes 50 custom
scripts, 100 user interfaces, and 19 different reports that are prepared
daily in both XML and Excel formats; altogether these comprise about 360,000
lines of code.
Laboratory Information
Management System: The JCSG database contains a Laboratory
Information Management System (LIMS) capable of tracking every step from target
activation to structure solution, refinement and deposition. This system
has submenus specifically tailored to the needs of each core. The LIMS
collects information, tracks materials, provides data entry and
visualization interfaces, and functions as a central hub that directs the
flow of information within JCSG.
Automated PCR
primer generation tool: Once the target is selected, a primer
generation tool calls the cDNA sequence, creates primers with the correct
Tm for the selected experimental conditions, and submits them via pre-scripted
form to our vendor. In this way, human error as well as the tedium of
cut-and-paste are removed from the process.
PIPELINE DATA-FLOW
ANALYSIS AND DATA MINING
The ability to mine data from a consistent process is invaluable for optimizing
our pipeline. Since our targets are processed using similar methods and
materials, often in parallel, more insightful comparisons can be made
than from extracting equivalent data from the literature. Furthermore,
the large number of targets processed, as well as their diverse nature,
makes identification of general principles more valid.
Analysis of PCR amplification
success rates: The feedback from analysis of success rates
was used to improve the primer generation system. As a result, a scoring
function that selects primers with optimal GC clamps within the specified
melting temperature and length range was added to the system. In its present
form, the optimized system is capable of generating primer sets with success
rates as high as 98%.
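A primer scoring function of this kind might look like the following Python sketch. It is an illustration only, not the JCSG tool: the Wallace rule (Tm = 2(A+T) + 4(G+C)) stands in for whatever melting temperature model the real system uses, the clamp score simply counts G/C bases among the last two 3' bases, and all names, length limits, and temperature limits are assumed values.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    at = sum(primer.count(b) for b in "AT")
    gc = sum(primer.count(b) for b in "GC")
    return 2 * at + 4 * gc


def gc_clamp_score(primer: str) -> int:
    """Number of G/C bases among the last two 3' bases (0, 1 or 2)."""
    return sum(b in "GC" for b in primer[-2:])


def pick_forward_primer(template, min_len=18, max_len=30, tm_min=58, tm_max=64):
    """Pick a forward primer anchored at the 5' end of `template`.

    Candidates are prefixes of the template whose length and Wallace Tm
    fall inside the given ranges. The best candidate has the strongest
    GC clamp; ties go to the shortest primer. Returns None if nothing fits.
    """
    template = template.upper()
    candidates = []
    for n in range(min_len, min(max_len, len(template)) + 1):
        primer = template[:n]
        if tm_min <= wallace_tm(primer) <= tm_max:
            candidates.append(primer)
    if not candidates:
        return None
    return max(candidates, key=lambda p: (gc_clamp_score(p), -len(p)))


if __name__ == "__main__":
    seq = "ATGAAAGCTGGTCTGAACGGCATCCTGGAA"  # made-up sequence for illustration
    print(pick_forward_primer(seq))
```

A reverse primer could be chosen the same way from the reverse complement of the 3' end of the target sequence.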
Analysis of crystallization
screens: The realization that a significant number of
coarse screen crystallization conditions never yielded any crystals, whereas
in other cases proteins crystallized under many different conditions,
led to the development of a minimal crystallization screen. Our large
number of crystallization trials (>500,000) and our consistent processing
approach allowed us to analyze and optimize our crystallization strategy.
Redundancy in the commercial conditions, particularly in the high molecular
weight PEGs, skews the statistics on relative efficacy of different crystallization
conditions. In review of our Tier 1 screening using the 480 available
screening conditions, we defined a small subset of 67 conditions which
optimally samples crystallization space and would have encompassed 84%
of the proteins which ultimately crystallized. This subset was expanded
slightly to 96 conditions (GNF96) and forms our basic screen to test whether
a particular protein construct will readily crystallize. Results to date
from 340,000 individual crystallization trials show that the minimal coarse
screen (GNF96) is highly effective in identifying targets which readily
crystallize and in providing crystal leads for fine screen optimization.
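The text does not say how the 67-condition subset was selected. One common way to pick a small set of conditions that still covers most crystallizing proteins is a greedy set-cover heuristic, sketched below in Python as an assumption rather than as the JCSG method. The function name, the input layout (a mapping from condition to the set of proteins that crystallized in it), and the toy data are all hypothetical.

```python
def greedy_minimal_screen(hits, target_fraction=0.84):
    """Greedily choose conditions until `target_fraction` of proteins are covered.

    hits: dict mapping a condition name to the set of proteins that
          crystallized under that condition.
    Returns (chosen_conditions, covered_proteins).
    """
    all_proteins = set().union(*hits.values())
    covered, chosen = set(), []
    while all_proteins and len(covered) / len(all_proteins) < target_fraction:
        # Take the condition that adds the most not-yet-covered proteins.
        best = max(hits, key=lambda c: len(hits[c] - covered))
        gain = hits[best] - covered
        if not gain:
            break  # no remaining condition adds anything new
        chosen.append(best)
        covered |= gain
    return chosen, covered


if __name__ == "__main__":
    toy_hits = {
        "c1": {"p1", "p2", "p3"},
        "c2": {"p3", "p4"},
        "c3": {"p1"},
        "c4": set(),  # a condition that never yielded crystals
    }
    chosen, covered = greedy_minimal_screen(toy_hits, target_fraction=1.0)
    print(chosen, sorted(covered))
```

In this toy input, the condition that never yielded crystals and the one that only duplicates another condition's hits are never selected, which mirrors the redundancy the analysis describes.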