Charting the Proteome – Part 1, setting the scene

By Paul Ko Ferrigno

Mapping the human genome has opened a wealth of possibilities for improving human health. Some of these possibilities are being turned into reality – for example, the identification of gene expression changes that predict a patient’s chance of getting breast cancer, what treatment she should receive (breast cancer patients are predominantly, but not exclusively, women) and what her likely ‘outcome’ is. The great news is that the combination of this genetic information with a large body of other research now means that instead of 80% of patients dying of breast cancer within 5 years of diagnosis, 80% of diagnosed patients now survive beyond that 5-year mark. Even so, many questions still surround the application of genome-wide research to breast cancer diagnosis and treatment.

One of the major limitations of relying on genetic information is that it tells only part of the story, and is inherently prone to both false negative results (missing a key piece of information) and false positive results (over-interpreting a piece of information). A friend and early business mentor, Cassie Doherty, had a useful analogy that explains why this is so. The sequence of the human genome is like the owner’s manual to a high performance car: the DNA sequence is a list of genes, just as the car manual lists all the components and gives you some insight into how it all fits together, and what should happen if you turn the ignition key. When you turn that key, the engine will roar into ‘life’, just as a gene, when it is turned on, makes messenger RNA (mRNA). But the sound of that idling engine and the smoke coming out of those chrome tailpipes only tell you what might happen: the car is still motionless. In the same way, the smoking gun of mRNA expressed from a gene does not tell you what is going to happen, only what might happen: the mRNA does not directly affect any biological process. It isn’t until you put the car into gear, engage the clutch and depress the accelerator that the car finally moves, and that car manual blueprint finally becomes glorious reality. In the case of the mRNA, it isn’t until the mRNA is translated into protein, and the protein is correctly modified by the addition of various chemical groups and then assembled with other proteins into the right complex, that the altered biology of the cell/tissue/system/human-being will become…well, glorious reality.
Genomics is the term used to describe research that uses the totality of genetic information about an organism. In genomics, false negatives arise when hypotheses are based on genetic information alone, which cannot account for whether or not a protein is made, whether it has been correctly or incorrectly chemically modified, or whether it has been assembled into a protein complex and, if so, what its partners are. False positives occur, for example, when gene-level information (such as a mutation) is inferred to lead to a change in a protein when in fact that protein is never even made.
Clearly, an understanding of life at the molecular level needs much more than a genomic approach. The term ‘proteomics’ is used to describe research that would use the totality of protein-level information – clearly a daunting task. The first task would be to define the proteome: the catalogue of all the proteins made in a cell (or tissue, or system, or organism). If there are, say, 25,000 genes in the human genome (this is pretty close to the actual number), and each of those gives rise to 4 alternative splice variants as the mRNA molecules are made (this is predicted to be close to the average number of splice variants for each gene), that would be 100,000 proteins – but how many of these are made in a typical cell, and how does the proteome differ between, say, a liver cell and a brain cell? And then there are those chemical modifications, and those multi-protein complexes – how many different complexes might any given protein be able to enter into?
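For readers who like to see the arithmetic, here is a minimal sketch of that back-of-the-envelope estimate. The function name and the optional modification_states_per_protein parameter are our own illustrative assumptions, not an established model: only the 25,000 genes and 4 splice variants come from the text above, and the "2 states per protein" figure is purely hypothetical, included to show how quickly the count grows once chemical modifications are considered.

```python
def estimate_protein_count(genes: int,
                           splice_variants_per_gene: int,
                           modification_states_per_protein: int = 1) -> int:
    """Rough count of distinct protein forms under a simple multiplicative model.

    This ignores which proteins are actually made in a given cell type,
    so it says nothing about the proteome of any real cell.
    """
    return genes * splice_variants_per_gene * modification_states_per_protein


# Figures quoted in the text: 25,000 genes, 4 splice variants each.
baseline = estimate_protein_count(25_000, 4)
print(f"{baseline:,}")  # 100,000

# Hypothetical illustration only: if every protein existed in just two
# modification states (say, modified and unmodified), the count doubles.
with_modifications = estimate_protein_count(25_000, 4, modification_states_per_protein=2)
print(f"{with_modifications:,}")  # 200,000
```

Even this toy calculation leaves out protein complexes and differences between cell types, which is exactly why defining the proteome is so much harder than counting genes.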
Avacta Life Sciences doesn’t have the answers to these questions – in fact no-one does. But our next few blogs will form a series that begins to look at some of the first answers from recent literature, culminating in the recently published “human proteome map” papers from Nature. In the meantime, we continue our work to provide non-antibody tools that can be used for the study of human proteins – starting one protein at a time, looking at post-translationally-modified isoforms and at protein complexes, but quickly building to multiplexed assays capable of looking at the dynamics of protein interactions. Watch this space!
Beck M, Schmidt A, Malmstroem J, Claassen M, Ori A, Szymborska A, Herzog F, Rinner O, Ellenberg J, Aebersold R. The quantitative proteome of a human cell line. Mol Syst Biol. 2011 Nov 8;7:549. doi: 10.1038/msb.2011.82.
Nagaraj N, Wisniewski JR, Geiger T, Cox J, Kircher M, Kelso J, Pääbo S, Mann M. Deep proteome and transcriptome mapping of a human cancer cell line. Mol Syst Biol. 2011 Nov 8;7:548. doi: 10.1038/msb.2011.81.
Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck FH, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S, Timm J, Mintzlaff S, Abraham C, Bock N, Kietzmann S, Goedde A, Toksöz E, Droege A, Krobitsch S, Korn B, Birchmeier W, Lehrach H, Wanker EE. A human protein-protein interaction network: a resource for annotating the proteome. Cell. 2005 Sep 23;122(6):957–968.