e-Science Tools for Protein Crystallography Overview

Time magazine christened the first century of the new millennium "the biotech century" (1). The subsequent publication of the human genome (2,3), the explosion of genomic data from other organisms (4) and the creation and growth of structural genomics consortia in the USA, Europe and Japan (5,6) seem to have already confirmed this bold prediction. However, determining the molecular and cellular function of every protein encoded in a genome remains a major challenge for biology. Furthermore, managing the volume and diversity of data generated by the high-throughput (HT) approaches used in these research programs is a major challenge for information technology. In this research we address both issues.


At the University of Queensland (UQ), we have pioneered an HT program of research (illustrated above) that links gene expression analysis by cDNA microarrays (to guide target selection) with parallel processing of proteins for structural studies (7). The combination of expression profiling and HT structural analysis provides a powerful way to identify the functions of proteins: the former suggests the cellular role of a protein (e.g. involvement in a pathway), while the latter suggests its molecular (biochemical) function (e.g. an enzymatic function). At the same time, this strategy remains cost-effective because the technically more tractable structures are pursued first.

This project will build on the established structural pipeline, capitalizing on the large number of genes that we have already identified as being involved in macrophage biology, adding new structural and cellular functional information and enabling efficient management, analysis and interpretation of the large amounts of data generated.

Project Aims

The specific aims of this research are:

  1. Target selection, small-scale screening and large-scale production of macrophage proteins;
  2. Structure determination and structure-function analysis of macrophage proteins;
  3. Functional characterisation of protein targets in macrophages;
  4. Development of a secure "knowledgebase" and services for managing, tracking, validating, analysing, assimilating and interpreting the proteomic data and results.


UQ has established robotic equipment that underpins the HT pipeline. A 2006 LIEF grant will further enhance the automation for crystallisation and crystallography. The pipeline includes gene expression profiling, bioinformatics-based selection of proteins suitable for structure determination by X-ray crystallography, HT cloning of the corresponding cDNAs in expression vectors, HT recombinant protein expression screening to identify proteins that express in soluble form in bacteria, large-scale protein expression and purification, HT crystallisation and structure determination.


The Biomek 2000 is used for automated 96-well recombinatorial cloning, small-scale expression and batch purification; the Caliper LabChip robot analyses DNA or protein from samples in 96-well plates; shaking incubators and a dedicated warm room are used for bacterial expression by auto-induction at different temperatures; Akta FPLCs with autosamplers allow large-scale purification and sampling from 96-well plates; custom-built crystallisation rooms are used for crystallisation; nanolitre crystallisation is performed using a Mosquito robot (hanging or sitting drop) or Topaz Chips (free interface diffusion); and trials are monitored on a deCode Genetics Crystal Monitor. The Crystallography Facility includes a high-brilliance FR-E SuperBright generator and an HU-2R with chromium anode (for sulfur SAD phasing); the combination of high brilliance and dual wavelength is unique in Australia. Together, the infrastructure and pipeline provide an integrated approach to accelerating structural biology.

This equipment underpins the structural research (aims 1 and 2) and supports the cell biology (aim 3). The data generated are captured in a "knowledgebase" (aim 4) developed by the e-Research Group, as described on the Prototypes page.
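As a rough illustration of the kind of tracking that aim 4 calls for, the sketch below follows protein targets through the pipeline stages named above and counts how many are still active at each stage. It is a minimal sketch only and does not describe the knowledgebase built by the e-Research Group: the names Stage, Target, advance, drop and count_by_stage are hypothetical, and the assumption that every target passes through the stages strictly in the listed order is a simplification.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    """Pipeline stages, in the order the overview lists them."""
    EXPRESSION_PROFILING = 1
    TARGET_SELECTION = 2
    CLONING = 3
    EXPRESSION_SCREENING = 4
    LARGE_SCALE_PURIFICATION = 5
    CRYSTALLISATION = 6
    STRUCTURE_DETERMINATION = 7


@dataclass
class Target:
    """Hypothetical record for one protein target (not the real schema)."""
    name: str
    stage: Stage = Stage.EXPRESSION_PROFILING
    active: bool = True
    history: list = field(default_factory=list)

    def advance(self, note=""):
        """Record a pass at the current stage and move to the next one."""
        if not self.active:
            raise ValueError(f"{self.name} has been dropped from the pipeline")
        if self.stage == Stage.STRUCTURE_DETERMINATION:
            raise ValueError(f"{self.name} is already at the final stage")
        self.history.append((self.stage, "passed", note))
        self.stage = Stage(self.stage + 1)

    def drop(self, note=""):
        """Record a failure at the current stage and stop tracking progress."""
        self.history.append((self.stage, "failed", note))
        self.active = False


def count_by_stage(targets):
    """Count the active targets currently sitting at each stage."""
    counts = {stage: 0 for stage in Stage}
    for target in targets:
        if target.active:
            counts[target.stage] += 1
    return counts


if __name__ == "__main__":
    soluble = Target("example_target_1")
    soluble.advance("selected from microarray data")
    soluble.advance("passed bioinformatics filter")

    insoluble = Target("example_target_2")
    insoluble.drop("hypothetical failure reason")

    for stage, n in count_by_stage([soluble, insoluble]).items():
        print(f"{stage.name}: {n}")
```

Keeping a per-target history rather than only the current stage makes it possible to report where targets are lost, which matters in a pipeline that deliberately pursues the more tractable proteins first.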

This research is funded by ARC Discovery Project DP0770465.

References:

  1. Isaacson W. The Biotech Century. Time, Jan 11 1999.
  2. Venter C et al. (2001) The sequence of the human genome. Science 291:1304-51.
  3. Lander ES et al. (2001) Initial sequencing and analysis of the human genome. Nature 409:860-921.
  4. Bernal A et al. (2001) Genomes OnLine Database (GOLD). Nucleic Acids Res 29:126-7.
  5. Nat Struct Biol supplement Nov 7 2000.
  6. Kouranov A et al. (2006) The RCSB PDB information portal for structural genomics. Nucleic Acids Res 34:D302-5.
  7. Walsh C et al. (2001) Structural genomics: protein structures for the masses? Australian Biochemist 32:13-6.