

WCG Status Update

WCG Post

HPF1 Results Update

HPF1 has been finished for some months, and we can now report that the first high-profile paper written using grid results is out. (Lars' thesis and a few things I've written came out last year, but this is the first paper that anyone will read ;-) ) Lars, David and I wrote the paper many months ago, but the many steps to publication take time. We think this is the first wcgrid paper, and the first of many.

Before we get to the paper, I want to thank the crunchers, because this wouldn't have been possible without you. Your cycles are invaluable to us and to the scientific community. So far it's been a great relationship between crunchers and scientists, and we hope it continues.

Here are the main points:
1. This paper is about methods for combining previous functional information (encoded in a database called the Gene Ontology database) with our structure predictions.
2. To determine the success rate and develop the method, we chose to apply it first to yeast, a key model organism. Much of the biology we know today comes from studies carried out in model organisms.
3. Biologists across the world now have access to the data via the yeast resource center. Mike Riffle tells me that the site is widely used (high use stats for this type of resource).
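
To give a feel for point 1, here is a minimal sketch of one generic way to combine prior functional information with structure-prediction scores: multiply a per-superfamily prior by a per-superfamily structure score and renormalize. This is an illustration under stated assumptions, not the method from the paper. The function name, the superfamily labels and all the numbers are made up for the example.

```python
def combine_scores(structure_scores, function_prior):
    """Combine structure-match scores with a function-derived prior.

    Hypothetical illustration only (not the paper's actual method).
    structure_scores: dict mapping superfamily -> score from structure prediction
    function_prior:   dict mapping superfamily -> prior weight derived from
                      existing functional annotation (e.g. Gene Ontology terms)
    Returns a dict mapping superfamily -> normalized combined probability.
    """
    # Multiply each structure score by the matching prior (0 if no prior is given).
    unnormalized = {
        sf: score * function_prior.get(sf, 0.0)
        for sf, score in structure_scores.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no superfamily has both a structure score and a prior")
    # Renormalize so the combined values sum to 1.
    return {sf: value / total for sf, value in unnormalized.items()}


# Made-up numbers: the structure prediction alone slightly prefers sf_a,
# but the functional annotation strongly favors sf_b.
structure_scores = {"sf_a": 0.5, "sf_b": 0.4, "sf_c": 0.1}
function_prior = {"sf_a": 0.1, "sf_b": 0.8, "sf_c": 0.1}

posterior = combine_scores(structure_scores, function_prior)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

In this toy case the functional prior tips an ambiguous structure prediction toward a single superfamily, which is the general idea behind using previous experimental annotation to overcome uncertainty in de novo predictions.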

Our experience with yeast has been positive. We have a few more algorithmic details to work out (in how we integrate the data from the grid with other bio sources to maximize utility to biologists), and we're on to writing up the human results. The next paper will detail results from all 150 proteomes (spanning the tree of life). The concluding sentence in the paper alludes to this work: "The information content in the predicted structures may be further leveraged by integration with other data such as global quantitative measurements of mRNA, protein expression levels, DNA-protein, and protein-protein interactions. Such datasets are available for yeast and several other organisms as part of ongoing functional genomics efforts, and integration of these data types with the predicted structures should contribute to the annotation of protein functions."

Here is one of the cooler figures from the paper:

Here is the summary from the paper:

The three-dimensional structure of a protein can reveal much about that protein's evolutionary relationships and functions. Such information about all the proteins in an organism - the proteome - would offer a more global view of these relationships, but solving each structure individually would be a formidable task. In this study, we have parsed all Saccharomyces cerevisiae proteins into nearly 15,000 distinct domains and then used de novo structure prediction methods together with worldwide distributed computing to predict structures for all domains lacking sequence similarity to proteins of known structure. To overcome the uncertainties in de novo structure prediction, we combined these predictions with data on the biological process, function, and localization of the proteins from previous experimental studies to assign the domains to families of evolutionarily related proteins. Our genome-wide domain predictions and superfamily assignments provide the basis for the generation of experimentally testable hypotheses about the mechanism of action for a large number of yeast proteins.

Read the whole paper here.