
 

WCG Status Update

WCG Post

HPF2 Update - November 2009

Greetings WCG Volunteers,

As the first World Community Grid project, we'd like to celebrate the WCG's anniversary with a recap of the contributions to protein science that your work has made. Over the past few years, WCG volunteers have provided over 50,000 CPU years (as calculated by the WCG) and folded tens of thousands of protein sequences. Often very little is known about the sequences we've folded, and WCG protein structure predictions provide the only available annotations for scientists studying these proteins. Biologists from different disciplines have used our structure predictions to make informed decisions about experiments and to infer protein functions and molecular processes.

In the early stages of our project, we focused our predictions on proteins of particular interest. The yeast proteome was the original target because of the vast amount of other experimental data available for it.

Malmström L, Riffle M, Strauss CEM, Chivian D, Davis TN, Bonneau R, Baker D. Superfamily Assignments for the Yeast Proteome through Integration of Structure Prediction with the Gene Ontology. PLoS Biol. (2007) Apr;5(4):e76.

We predicted protein structures to further annotate this genome and complement the array of protein interaction and molecular function information on this heavily studied model organism. Our results confirmed the feasibility of extending our approach to other, less studied and larger proteomes.

A cross section of organisms (including human, mouse, fly, E. coli, worm, and other unique organisms) has been processed completely, and protein sequences of unknown structure have been folded by the WCG. Our database has grown to include over a million protein sequences, and WCG predictions are complemented by known structures and a host of other structure and sequence metrics. We regularly receive special requests for predictions for proteins of various kinds (including but not limited to those related to HIV infection, the development of malaria, and particular bacterial enzymatic processes).


A few high-profile uses of our database include:

Bonneau, R, Facciotti, MT, Reiss, DJ, Madar, A, Baliga, NS, et al. A predictive model for transcriptional control of physiology in a free living cell. (2007) Cell. Dec 131:1354-1365.
Here we used our structure predictions to find transcription factors, the proteins that turn genes on and off. These predicted transcription factors proved critical (and accurate) in building the genome-wide circuit for this organism. The general application here is environmental bioengineering and systems biology.

Mike Boxem, Zoltan Maliga, Niels J. Klitgord, Na Li, Irma Lemmens, Miyeko Mana, Lorenzo De Lichtervelde, Joram Mul, Diederik van de Peut, Maxime Devos, Nicolas Simonis, Anne-Lore Schlaitz, Murat Cokol, Muhammed A. Yildirim, Tong Hao, Changyu Fan, Chenwei Lin, Mike Tipsword, Kevin Drew, Matilde Galli, Kahn Rhrissorrakrai, David Drechsel, David E. Hill, Richard Bonneau, Kristin C. Gunsalus, Frederick P. Roth, Fabio Piano, Jan Tavernier, Sander van den Heuvel, Anthony A. Hyman, Marc Vidal. A Protein Domain-Based Interactome Network for C. elegans Early Embryogenesis. (2008) Cell, 134(3) pp. 534 - 545.
Here our predictions were used to map the boundaries between functional parts of proteins. This allows for a whole new way of looking at how proteins interact and co-function to form a working system that the cell relies on. The general application here is broad, as this describes a dataset all types of biologists will use.

Andersen-Nissen E, Smith KD, Bonneau R, Strong RK, Aderem A. A conserved surface on Toll-like receptor 5 recognizes bacterial flagellin. (2007) J Exp Med. Feb 19;204(2):393-403.
Here we predicted the structure of key immune proteins. The prediction allowed us to re-engineer a key immune receptor, yielding a better animal model of innate immune responses (key to figuring out several aspects of our response to bacterial infection). This publication has direct application to immunology and fighting infectious disease.


Recently, we've been working towards a paper that will describe our new methods, highlight our successes, and publicize the already open access to our database. This year we've received an average of 6,300 unique visitors a month. That's over 200 users a day (including weekends)! With the publication of our new methods we expect a significant increase in exposure and are preparing to provide multiple means of user-friendly access for the sometimes complex data. This will include using BioNetBuilder.

Iliana Avila-Campillo, Kevin Drew, John Lin, David J. Reiss, Richard Bonneau. BioNetBuilder, an automatic network interface. (2007) Bioinformatics. Feb 1;23(3):392-3.

Future work will undoubtedly involve the refinement of our protein structure annotations. We're investigating methods for incorporating evolutionary information into our predictions, and overhauling parts of the pipeline that are outdated. There is significant room for improvement in our methods for selecting native-state conformations from structure predictions and for assigning family annotations. With the WCG we've been able to cast a wide net, and now we're focused on improving our algorithms and classifiers. WCG predictions will continue to provide data for our ever-improving experiments and value to the scientific community.

Here at the Bonneau Lab, we thank you for your dedication to science and ask that you keep crunching!
--
Patrick Winters
Bonneau Lab