Open modification searching (OMS) is a powerful search strategy that identifies peptides carrying any type of modification: a very wide precursor mass window allows a modified spectrum to match its unmodified variant. A drawback of this strategy, however, is a large increase in search time. Although an open search can be performed with existing spectral library search engines by simply setting a wide precursor mass window, none of these tools has been optimized for OMS, which leads to excessive runtimes and suboptimal identification results. Here we present ANN-SoLo, a tool for fast and accurate open spectral library searching. ANN-SoLo uses approximate nearest neighbor indexing to speed up OMS by selecting only a limited number of the most relevant library spectra to compare to an unknown query spectrum. This approach is combined with two further components: a cascade search strategy that maximizes the number of identified unmodified and modified spectra while strictly controlling the false discovery rate, and a shifted dot product score that sensitively matches modified spectra to their unmodified counterparts. ANN-SoLo achieves state-of-the-art performance in terms of both speed and number of identifications. On a previously published human cell line data set, ANN-SoLo confidently identifies more spectra than SpectraST or MSFragger and achieves a speedup of an order of magnitude compared to SpectraST. ANN-SoLo is implemented in Python and C++. It is freely available under the Apache 2.0 license at https://github.com/bittremieux/ANN-SoLo.
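To make the idea of a shifted dot product concrete, the following is a minimal sketch and not ANN-SoLo's actual implementation. It assumes singly charged fragments, spectra given as lists of (m/z, intensity) pairs, a hypothetical fragment tolerance `tol`, and a simple greedy peak assignment; the function names `normalize` and `shifted_dot` are illustrative only. A query peak may match a library peak either directly or after subtracting the precursor mass difference, so fragments that carry the modification still contribute to the score.

```python
import math


def normalize(peaks):
    """Scale intensities so the spectrum has unit Euclidean norm."""
    norm = math.sqrt(sum(intensity * intensity for _, intensity in peaks))
    return [(mz, intensity / norm) for mz, intensity in peaks]


def shifted_dot(query, library, mass_shift, tol=0.02):
    """Greedy shifted dot product between two peak lists.

    mass_shift is the precursor mass difference (query minus library).
    Assumption: fragments are singly charged, so a modified fragment is
    displaced by the full mass_shift on the m/z axis.
    """
    query, library = normalize(query), normalize(library)
    candidates = []
    for qi, (q_mz, q_int) in enumerate(query):
        for li, (l_mz, l_int) in enumerate(library):
            direct = abs(q_mz - l_mz) <= tol
            shifted = abs(q_mz - mass_shift - l_mz) <= tol
            if direct or shifted:
                candidates.append((q_int * l_int, qi, li))
    # Take the strongest peak pairs first; each peak is used at most once.
    candidates.sort(reverse=True)
    used_query, used_library, score = set(), set(), 0.0
    for product, qi, li in candidates:
        if qi not in used_query and li not in used_library:
            used_query.add(qi)
            used_library.add(li)
            score += product
    return score


# Toy example: two of three fragments carry a +16.0 modification.
library_spectrum = [(100.0, 1.0), (200.0, 1.0), (300.0, 1.0)]
query_spectrum = [(100.0, 1.0), (216.0, 1.0), (316.0, 1.0)]

plain_score = shifted_dot(query_spectrum, library_spectrum, mass_shift=0.0)
open_score = shifted_dot(query_spectrum, library_spectrum, mass_shift=16.0)
```

In this toy case the plain dot product only credits the single unshifted peak, whereas the shifted variant also pairs the two displaced fragments, which is why a score of this kind can match modified spectra to their unmodified counterparts.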
This article was published in the Journal of Proteome Research.
Chemical identification often relies on matching measured chemical properties and/or spectral "fingerprints" of unknowns against their precompiled libraries. Chromatography, absorption spectroscopy, a...
Sparse spectral clustering (SSC) has become one of the most popular clustering approaches in recent years. However, its high computational complexity prevents its application to large-scale datasets s...
A Rabbit myosin standard, like that used to create the empirical statistical model, was randomly and independently sampled by liquid chromatography micro electrospray ionization and tandem mass spectr...
ADP-ribosylation is a technically challenging PTM which has just emerged into the field of PTM-specific proteomics. But this fragile modifier requires special treatment on both a data acquisition and ...
We report the development and availability of a mass spectral reference library for oligosaccharides in human milk. This represents a new variety of spectral library that includes consensus spectra of...
The goal of this clinical research study is to evaluate the use of an imaging technology called spectral diagnosis. Researchers want to find out if a special spectral-diagnosis probe can ...
This study is designed to evaluate and compare in-tissue performance of OCT scans on the new Optos P200TE, versus the predicate Optos Spectral OCT/SLO device.
The purpose of the current study is to examine the effects of health-related internet use on affect, health anxiety and symptom severity in individuals with pathological levels of health a...
This research study aims to improve the standard exam called Focused Assessment with Sonography in Trauma (FAST). The FAST exam is an ultrasound test used to identify an abdominal bleed. T...
The purpose of this study is to analyze macular retinal thickness and macular volume using the spectral domain - optical coherence tomography (SD-OCT) in normal eyes and in eyes with vario...
A branch of computer or library science relating to the storage, locating, searching, and selecting, upon demand, relevant data on a given subject.
Controlled vocabulary thesaurus produced by the NATIONAL LIBRARY OF MEDICINE. It consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity.
Collection and analysis of data pertaining to operations of a particular library, library system, or group of independent libraries, with recommendations for improvement and/or ordered plans for further development.
The use of automatic machines or processing devices in libraries. The automation may be applied to library administrative activities, office procedures, and delivery of library services to users.
A form of GENE LIBRARY containing the complete DNA sequences present in the genome of a given organism. It contrasts with a cDNA library which contains only sequences utilized in protein coding (lacking introns).