

WCG Status Update

WCG Post

HPF2 Progress and News
Hello crunchers,

Much has happened since the last status report, and the most exciting thing to report is how fast the grid is running. We are barely keeping up with you: your machines are folding proteins faster than we can provide new ones. We do have a few tough nuts to crack coming up, though, so running out of work shouldn't be a problem. First off, we've tackled many disease-causing organisms, including the ones that cause malaria and anthrax. Another set of proteins, drawn from many organisms, is the GOS data set, which came from "sequencing the sea". All of these are finished.

Currently we are working on the genome of Plasmodium vivax, which causes a terrible form of malaria. We discussed many of these organisms in the last two status updates, but I just wanted to give you an idea of what you're working on. Up next will be phytoplankton, which are responsible for much of the oxygen in our atmosphere. A better understanding of these organisms will help us understand the impacts of climate change on our earth. It's the only earth we've got, so we'd better take care of it.

After that come some big ones: Arabidopsis and rice. Both are plants with very large genomes (Arabidopsis = 32000 genes, rice = 61000 genes) and will therefore take quite a long time. Finally, we are planning on folding Trypanosoma cruzi, which is another disease-causing agent.

I think we've discussed this before, but we did a rough calculation of how long it would take to do a single genome on a state-of-the-art compute cluster without the help of the grid, and we estimated a few years. With the grid we will have finished P. vivax (for example), if all goes well, in under a few months. Your participation in this project is allowing us to reach new levels of science and provide valuable information so we can tackle some of the human race's grand challenges. We can't thank you enough.

| organism | description | status |
| --- | --- | --- |
| Plasmodium falciparum | causes the most deadly form of malaria | finished |
| B. anthracis | causative agent of anthrax | finished |
| Gram-negative pathogens | responsible for many food-borne illnesses and sexually transmitted diseases | finished |
| Bacillus subtilis | model organism for studying evolution and other pathogenic organisms | finished |
| GOS | new antibiotics, new industrial enzymes, new organisms that bind toxic metals | finished |
| Plasmodium vivax | recently sequenced genome; causes malaria, usually not deadly but a truly awful disease | current |
| Phytoplankton | responsible for a large portion of the oxygen in our atmosphere and interesting for its impacts on climate change | not done |
| rice | major food source for a large portion of the world's population | not done |
| Arabidopsis | model organism for studying plants | not done |
| Trypanosoma cruzi | causes Chagas disease in Central and South America | not done |

The last bit of news is a recent publication from our lab in the prestigious, high-impact journal Cell. The work demonstrates one of the many reasons why what we are doing with HPF is so important: deriving functional information from predicted structures. The method in the paper, "A Predictive Model for Transcriptional Control of Physiology in a Free Living Cell" (Bonneau et al. 2007), used structure predictions and the functional information derived from them to great benefit. Just as a little background, the field of Systems Biology has many goals, but one specific question it tries to answer is "Given an organism in a particular environment, can we predict its behavior?" Now, saying we predict behavior is a little vague. A more descriptive goal is to predict the network of protein interactions that regulates biological processes.

That's a bit of a mouthful, so let me explain further. Proteins work together to produce some sort of outcome (a biological process); let's say growth. The outcome needs to be regulated or controlled so that it does not get out of hand. (Growth is particularly important to regulate because uncontrolled growth leads to cancer.) The proteins that produce growth in the cell are controlled through interactions with other proteins, some of which activate them and others of which inactivate them. Still other proteins activate the activators, and there are much more complicated relationships as well. The interactions may change depending on the environment the cell is in. If the cell is getting a signal from its environment to grow, the proteins that control growth will turn on and activate the proteins that are responsible for growth. All of these interactions can be viewed as a network, or graph, where a protein is a node and an interaction with another protein is an edge.
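
For readers who like to see things in code, here is a minimal sketch of that network picture in Python. The protein names and the +1/-1 labels for "activates" and "inactivates" are made up purely for illustration; they are not taken from the paper or from HPF data.

```python
# Each edge is (regulator, target, effect).
# effect is +1 for "activates" and -1 for "inactivates".
# All protein names below are hypothetical placeholders.
EDGES = [
    ("growth_signal_sensor", "activator_A", +1),
    ("activator_A", "growth_protein", +1),
    ("repressor_R", "growth_protein", -1),
]


def build_network(edges):
    """Store the graph as {regulator: {target: effect}}; every protein is a node."""
    network = {}
    for regulator, target, effect in edges:
        network.setdefault(regulator, {})[target] = effect
        network.setdefault(target, {})  # make sure pure targets are nodes too
    return network


def regulators_of(network, protein):
    """Return {regulator: effect} for every edge that points at `protein`."""
    return {
        regulator: targets[protein]
        for regulator, targets in network.items()
        if protein in targets
    }


network = build_network(EDGES)
print(regulators_of(network, "growth_protein"))
```

Asking for the regulators of the hypothetical growth_protein returns one activator and one repressor, which is the "some activate, others inactivate" situation described above.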

Now, how do we know what the network looks like, and how can we predict what it will look like in future scenarios? The network is usually constructed from experimental data by determining which proteins interact with which other proteins in a given environment. Function predictions can be integrated with the experimental data as well, and they are very valuable where experimental methods are lacking. Knowing a protein's function tells us what its likely interactors are and what processes it is involved in. We can then use this network and a little bit of math to estimate which proteins will be activated or deactivated in a new environment. Finally, one major result of the paper is that these networks are global, meaning that to understand the network and its outcomes we need to understand nearly every protein in the cell. A major focus of the HPF project is to provide predictions for a global set of proteins. The predictions made by the HPF project are providing the information needed to expand our knowledge of these complex biological networks and are moving the field of Systems Biology forward.
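
To give a feel for the "little bit of math", here is a toy sketch in Python. It is not the method from the paper: it assumes a simple rule where a protein is switched on when the summed weights of its currently active regulators are positive. The protein names, the weights, and the rule itself are all assumptions made for illustration.

```python
# Signed, weighted regulatory edges: (regulator, target, weight).
# Positive weights activate, negative weights inactivate.
# Names and weights are hypothetical.
EDGES = [
    ("nutrient_sensor", "activator_A", 1.0),
    ("activator_A", "growth_protein", 1.0),
    ("stress_sensor", "repressor_R", 1.0),
    ("repressor_R", "growth_protein", -2.0),
]


def predict_active(edges, environment_inputs, max_rounds=10):
    """Toy prediction of which proteins are active in a given environment.

    `environment_inputs` are proteins switched on directly by the environment.
    Each round, a target is active if the summed weight of its active
    regulators is positive. Rounds repeat until the active set stops changing.
    """
    inputs = set(environment_inputs)
    active = set(inputs)
    for _ in range(max_rounds):
        totals = {}
        for regulator, target, weight in edges:
            totals.setdefault(target, 0.0)
            if regulator in active:
                totals[target] += weight
        new_active = inputs | {t for t, total in totals.items() if total > 0}
        if new_active == active:
            break
        active = new_active
    return active


# Environment 1: nutrients only. Environment 2: nutrients plus stress.
print(sorted(predict_active(EDGES, {"nutrient_sensor"})))
print(sorted(predict_active(EDGES, {"nutrient_sensor", "stress_sensor"})))
```

In this toy network the hypothetical growth_protein turns on when only the nutrient signal is present and stays off when the stress signal is added, because the repressor outweighs the activator. Real inference methods fit such weights from experimental data rather than assuming them.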

Again, we appreciate your compute power and so do many researchers in a wide range of fields.


Bonneau Lab