Progress on the human proteome

Chair Gilbert S Omenn discusses recent advances from the Human Proteome Organization’s Human Proteome Project

The Human Proteome Project (HPP) was launched by the global Human Proteome Organization (HUPO) in 2011 on the foundation of the HUPO Protein Standards Initiative and the Plasma, Brain, Cardiovascular, Glycoproteomics, Liver, and Kidney/Urine initiatives. Now there are 25 teams organised by chromosome and mitochondria in the Chromosome-centric C-HPP, 22 teams in the Biology and Disease-driven HPP, and three resource pillars – Antibody Profiling, Mass Spectrometry, and Bioinformatics (https://hupo.org/). The two overarching goals are 1) making proteomics complementary to genomics throughout life sciences and biomedical research; and 2) progressively completing the protein parts list for humans with highly credible evidence for at least one expressed protein product from each predicted protein-coding gene and characterising the functions and interactions of the protein and its sequence variants, splice isoforms, post-translational modifications, and complexes. This global project will enhance our understanding of human biology at the cellular level and lay a foundation for medical and public health applications.

Chromosome-centric HPP 

The Chromosome-centric HPP1 is now chaired by Young-Ki Paik of Korea, Lydie Lane of Switzerland, and Chris Overall of Canada. Its chromosome-by-chromosome approach to annotating protein information is analogous to that of the Human Genome Project: it distributes the enormous proteome-wide task geographically and addresses chromosome-specific properties such as cis-regulatory phenomena and co-expression of proteins from genes in amplicons. According to neXtProt, there are 19,467 human protein-coding genes (excluding 588 dubious/uncertain genes labelled Protein Evidence (PE) level 5). Since 2012, the number of curated high-confidence proteins with PE1 evidence has increased from 13,664 to 16,518 in the 2016-02 version of neXtProt.2 The next update, with input from PeptideAtlas for mass spectrometry-based studies, will be released in February 2017.

Our strategy of searching for protein expression in tissues with high expression of the corresponding transcripts (PE2, transcript expression without sufficient protein-level evidence) was implemented effectively this year by the French/Swiss consortium for Chromosome 2 and Chromosome 14 for sperm, with a complementary effort from China for testis. The Human Protein Atlas3 (the HPP Antibody Pillar, led by Mathias Uhlen and Emma Lundberg of Sweden) reported in 2015 that human testis/the male reproductive tract has a far higher number of tissue-specific and tissue-enriched transcripts (879) than any other tissue (with 50X and 5X higher expression). The new results published in the fourth annual C-HPP special issue of the Journal of Proteome Research (November 2016) presented 253 previously missing proteins. Once these and other findings have been subjected to our standardised reanalysis by PeptideAtlas and incorporated into the 2017-02 neXtProt, we expect to have at least 16,873 PE1 proteins, corresponding to 87% of the predicted proteins. Other strategies that will help identify missing proteins include enhanced methods for sample preparation and protein solubilisation from membranes, more sensitive instruments to detect proteins with low abundance, consideration of life stages, and a search for exogenous inducers of protein expression such as infection, inflammation, or pharmaceuticals. The C-HPP is organising a ‘top 50’ challenge to identify the most tractable ~50 proteins from each chromosome for detection, followed by confirmation with targeted proteomics.
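The coverage figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the counts given in the text. The function and variable names are illustrative and are not part of any neXtProt or HPP tool.

```python
# Counts quoted in the article (neXtProt); no new data is introduced here.
PREDICTED_GENES = 19_467  # predicted protein-coding genes, excluding PE5 entries

PE1_COUNTS = {
    "2012 baseline": 13_664,
    "neXtProt 2016-02": 16_518,
    "neXtProt 2017-02 (expected minimum)": 16_873,
}


def coverage_percent(pe1_count, total=PREDICTED_GENES):
    """Return PE1 proteins as a percentage of predicted protein-coding genes."""
    return 100.0 * pe1_count / total


for label, count in PE1_COUNTS.items():
    remaining = PREDICTED_GENES - count
    print(f"{label}: {count:,} PE1 ({coverage_percent(count):.1f}%), "
          f"{remaining:,} still without PE1 evidence")
```

On these figures, coverage rises from roughly 70% in 2012 to about 85% in the 2016-02 release, and the expected 2017-02 minimum rounds to the 87% quoted in the text.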

Biology and Disease-driven HPP

Meanwhile, the B/D-HPP,4 led by Jennifer van Eyk of the US and Fernando Corrales of Spain, has presented a strategy of identifying lists or panels of proteins that can be recommended to the broad research community for organ-specific disease-oriented studies. From bibliometric analyses of the published literature, Lam et al.5 have identified top 50 lists of the most-reported, hence ‘popular proteins,’ for cardiovascular, cerebral, hepatic, renal, pulmonary, and intestinal research, each with selected reaction monitoring (SRM)-MS assays available.

In fact, the B/D-HPP achieved a major success with the publication in Cell (July 2016) of the comprehensive Human SRM Atlas,6 a combined effort from the labs of Ruedi Aebersold in Zurich and Rob Moritz in Seattle, with 166,000 uniquely mapping (proteotypic) peptides for >99% of the predicted human proteins. There are matching spectral libraries, algorithm-based informative transitions, and labelled synthetic peptides available. The HPP strongly encourages life sciences and biomedical researchers to adopt these SRM tools and methods or learn enough about them to request such assays from collaborators or proteomics service laboratories.
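To illustrate what 'uniquely mapping' means for a peptide, here is a toy sketch in Python. It is not the SRM Atlas pipeline. It assumes a simple in-silico trypsin digest (cleavage after K or R unless the next residue is P), and the two protein sequences are invented for illustration. Real peptide selection also weighs factors such as detectability by mass spectrometry, which this sketch ignores.

```python
from collections import defaultdict

# Hypothetical toy sequences, invented for this example only.
TOY_PROTEINS = {
    "toy_A": "MAGLSTKVLEEAPRGGSDLLK",
    "toy_B": "MSTNQWKVLEEAPRTTYYDEK",
}


def tryptic_peptides(sequence, min_length=6):
    """Cleave after K or R unless followed by P; drop peptides shorter than min_length."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide with no K/R at its end
    return [p for p in peptides if len(p) >= min_length]


def uniquely_mapping(proteins, min_length=6):
    """Map each peptide that occurs in exactly one protein to that protein's name."""
    owners = defaultdict(set)
    for name, sequence in proteins.items():
        for peptide in tryptic_peptides(sequence, min_length):
            owners[peptide].add(name)
    return {pep: next(iter(names)) for pep, names in owners.items() if len(names) == 1}


for peptide, protein in sorted(uniquely_mapping(TOY_PROTEINS).items()):
    print(peptide, "->", protein)
```

In this toy set the peptide VLEEAPR occurs in both sequences, so it is excluded. Only peptides found in a single protein are kept as candidates for a targeted assay.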

The Bioinformatics resource pillar, led by Eric Deutsch of the US, proposed Guidelines for Mass Spectrometry Data Analysis,7 which were applied in 2012 and made more stringent in 2016. The guidelines and the authors’ checklist have been adopted by the Journal of Proteome Research (JPR) for the December 2017 special issue from the HPP. Investigators, including those who are not part of the HPP teams, are welcome to use the guidelines and to submit their original manuscripts to the JPR (deadline 31 May 2017).

Focus on cancer

The theme of the December 2016 HPP Workshop in Rio de Janeiro was collaboration across the C-HPP and the B/D-HPP. For example, the Cancer B/D-HPP, including the NCI Clinical Proteomic Tumor Analysis Consortium, and C-HPP teams focused on brain, liver, colon, gastric, and breast cancers are sharing methods, findings, and annotations. We recommend use of the CPTAC Assay Portal (https://assays.cancer.gov). The SRM Atlas is particularly well suited to standardising mass spec-based cancer assays. In addition, sensitive reagents for protein identification can contribute to multi-omic single cell analyses of heterogeneity within and across tumours, which eventually will play a key role in precision oncology.

References

1          Paik Y K, et al. Progress in chromosome-centric HPP. J Proteome Res 2016;15:3945-50

2          Omenn G S, et al. Metrics for the Human Proteome Project. J Proteome Res 2016;15:3951-60

3          Uhlen M, et al. Tissue-based map of the human proteome. Science 2015;347:1260419

4          Van Eyk J E, et al. Biology and disease-driven HPP. J Proteome Res 2016;15:3979-87

5          Lam M, et al. Popular proteins for targeted proteomics. J Proteome Res 2016;15:4126-34

6          Kusebauch U, et al. Human SRM Atlas. Cell 2016;166:766-78

7          Deutsch E W, et al. Guidelines for mass spectrometry. J Proteome Res 2016;15:3961-70

Gilbert S Omenn, MD, PhD

Chair

Human Proteome Project

Human Proteome Organization

www.thehpp.org

This article first appeared in issue 13 of Horizon 2020 Projects: Portal.