Microproteins are peptides of 100 amino acids (AA) or fewer, encoded by small open reading frames (smORFs). Microproteins have been shown to participate in and regulate a wide range of cellular functions. However, annotating and identifying them is challenging, in part because of their low molecular weight, low abundance, and hydrophobicity. These factors have left smORFs unannotated during genome processing and have made microproteins difficult to identify at the protein level. Large-scale enrichment of microproteins in proteogenomics has made it possible to efficiently identify microproteins and discover unannotated smORFs in Saccharomyces cerevisiae. Here, we integrated four microprotein-specific enrichment strategies to enhance coverage. We identified 117 microproteins, verified 31 missing proteins (MPs), and discovered 3 novel smORFs. The 31 MPs were confirmed by spectrum quality checking. The three novel smORFs (YKL104W-A, YHR052C-B, and YHR054C-B) were retained after spectrum quality checking, peptide synthesis, homologue matching, and other checks. This study not only demonstrates that potential smORF candidates remain to be annotated in an extensively studied organism, but also presents an efficient strategy for the discovery of small MPs. All MS datasets have been deposited to the ProteomeXchange with identifier PXD008586 (Username: firstname.lastname@example.org; Password: UNEbNk3j).
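To make the size definition above concrete, here is a minimal Python sketch that scans a DNA string for candidate smORFs encoding at most 100 amino acids. It is an illustration only, not the pipeline used in the study: the function name `find_smorfs`, the forward-strand-only scan, the ATG-only start codon, and the standard stop codons are all simplifying assumptions.

```python
# Hypothetical illustration: forward-strand smORF scan, not the authors' method.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard nuclear stop codons (assumed)


def find_smorfs(dna, max_aa=100, min_aa=1):
    """Return (start, end, n_aa) tuples for forward-strand ORFs.

    start/end are 0-based, end-exclusive DNA coordinates that include the
    stop codon; n_aa counts codons from ATG up to, but not including, the stop.
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if min_aa <= n_aa <= max_aa:
                    hits.append((start, i + 3, n_aa))
                start = None  # look for the next ORF in this frame
    return hits
```

A real annotation workflow would also scan the reverse strand, consider alternative start codons, and require protein-level evidence such as the MS spectra described above.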
This article was published in the following journal.
Name: Journal of Proteome Research
The systematic study of annotated genomic information to global protein expression in order to determine the relationship between genomic sequences and both expressed proteins and predicted protein sequences.