Enrichment-based proteogenomics identifies microproteins, missing proteins, and novel smORFs in Saccharomyces cerevisiae.

08:00 EDT 13th June 2018 | BioPortfolio

Summary of "Enrichment-based proteogenomics identifies microproteins, missing proteins, and novel smORFs in Saccharomyces cerevisiae."

Microproteins are peptides of 100 amino acids (AA) or fewer, encoded by small open reading frames (smORFs). Microproteins have been shown to participate in and regulate a wide range of cellular functions. However, annotating and identifying microproteins is challenging, in part owing to their low molecular weight, low abundance, and hydrophobicity. These factors have left smORFs unannotated in genome processing and have made their identification at the protein level difficult. Large-scale enrichment of microproteins in proteogenomics has made it possible to efficiently identify microproteins and discover unannotated smORFs in Saccharomyces cerevisiae. Here, we integrated four microprotein-specific enrichment strategies to enhance coverage. We identified 117 microproteins, verified 31 missing proteins (MPs), and discovered 3 novel smORFs. The 31 MPs were confirmed by spectrum quality checking. The three novel smORFs (YKL104W-A, YHR052C-B, and YHR054C-B) were retained after spectrum quality checking, peptide synthesis, homologue matching, and related checks. This study not only demonstrates that potential smORF candidates remain to be annotated in an extensively studied organism, but also presents an efficient strategy for the discovery of small MPs. All MS datasets have been deposited in ProteomeXchange with identifier PXD008586 (Username:; Password: UNEbNk3j).
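To make the size criterion above concrete, the following is a minimal sketch of how candidate smORFs (open reading frames encoding at most 100 amino acids) might be located in a DNA sequence. It is an illustration only and not the authors' pipeline: the summary does not describe how smORFs were predicted. The function name `find_smorfs`, the parameter `max_codons`, the forward-strand-only scan, and the use of ATG as the sole start codon are all assumptions made for this example.

```python
# Illustrative sketch, not the method used in the study.
# Assumptions: forward strand only, ATG as the only start codon,
# the three standard stop codons, and a 100-codon size cutoff.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_smorfs(dna, max_codons=100):
    """Return (start, end, n_codons) for each small ORF on the forward strand.

    start is the 0-based index of the ATG, end is the index just past the
    stop codon, and n_codons counts the coding codons (stop excluded),
    which equals the length of the encoded peptide in amino acids.
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        start = None  # position of the first ATG since the last stop codon
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None:
                if codon == START_CODON:
                    start = pos
            elif codon in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons <= max_codons:
                    hits.append((start, pos + 3, n_codons))
                start = None  # begin looking for the next ORF in this frame
    return sorted(hits)


if __name__ == "__main__":
    # A toy sequence encoding a 3-residue peptide (Met-Lys-Phe) plus a stop.
    print(find_smorfs("ATGAAATTTTAG"))
```

A real proteogenomic search would also scan the reverse strand, consider alternative start codons, and translate the hits into a protein database for matching against MS spectra; those steps are omitted here.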


Journal Details

This article was published in the following journal.

Name: Journal of proteome research
ISSN: 1535-3907

