The US National Library of Medicine and the National Institutes of Health manage PubMed.gov, which comprises more than 29 million records, papers and reports from the biomedical literature, including MEDLINE, life science and medical journals, articles, reviews, reports and books.
BioPortfolio aims to cross reference relevant information on published papers, clinical trials and news associated with selected topics - speciality.
For example, view all recent relevant publications on Epigenetics and associated publications and clinical trials.
Plasmids are ubiquitous in bacterial genomes, and have been shown to be involved in important evolutionary processes, in particular the acquisition of antimicrobial resistance. However, separating chromosomal contigs from plasmid contigs and assembling the latter is a challenging problem.
We present SPar-K (Signal Partitioning with K-means), a method to search for archetypical chromatin architectures by partitioning a set of genomic regions characterized by chromatin signal profiles around ChIP-seq peaks and other kinds of functional sites. This method efficiently deals with problems of data heterogeneity, limited misalignment of anchor points and unknown orientation of asymmetric patterns.
Computational prediction of protein structure from sequence is broadly viewed as a foundational problem of biochemistry and one of the most difficult challenges in bioinformatics. Once every two years the Critical Assessment of protein Structure Prediction (CASP) experiments are held to assess the state of the art in the field in a blind fashion, by presenting predictor groups with protein sequences whose structures have been solved but have not yet been made publicly available. The first CASP was organized...
Metagenomic and metatranscriptomic analyses can provide an abundance of information related to microbial communities. However, straightforward analysis of these data does not provide optimal results; the data types must be integrated to thoroughly investigate these microbiomes and their environmental interactions.
Simulated genomes with pre-defined and random genomic variants can be very useful for benchmarking genomic and bioinformatics analyses. Here we introduce simuG, a lightweight tool for simulating the full spectrum of genomic variants (SNPs, INDELs, CNVs, inversions, and translocations) for any organism (including human). The simplicity and versatility of simuG make it a unique general-purpose genome simulator for a wide range of simulation-based applications.
Interpretation of ubiquitous protein sequence data has become a bottleneck in biomolecular research, due to a lack of structural and other experimental annotation data for these proteins. Prediction of protein interaction sites from sequence may be a viable substitute. We therefore recently developed a sequence-based random-forest method for protein-protein interface prediction, which yielded significantly higher performance than other methods on both homomeric and heteromeric protein-protein interacti...
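To make the idea of a simulated genome with random variants concrete, here is a minimal sketch that introduces random SNPs into a reference sequence and records them for later benchmarking. It is not simuG's implementation or interface; the function name `simulate_snps`, its parameters and the returned tuple layout are hypothetical, and only single-base substitutions are covered (no INDELs, CNVs, inversions or translocations).

```python
import random


def simulate_snps(sequence, n_snps, seed=0):
    """Return (mutated_sequence, variants) with n_snps random substitutions.

    Hypothetical helper for illustration only; not part of simuG.
    Each variant is a tuple (0-based position, reference base, alternate base).
    """
    rng = random.Random(seed)
    # Only mutate unambiguous bases; skip N or other symbols.
    positions = [i for i, base in enumerate(sequence) if base in "ACGT"]
    if n_snps > len(positions):
        raise ValueError("more SNPs requested than mutable positions")
    chosen = sorted(rng.sample(positions, n_snps))
    bases = list(sequence)
    variants = []
    for pos in chosen:
        ref = bases[pos]
        # The alternate allele must differ from the reference base.
        alt = rng.choice([b for b in "ACGT" if b != ref])
        bases[pos] = alt
        variants.append((pos, ref, alt))
    return "".join(bases), variants


reference = "ACGTACGTAC"
mutated, truth = simulate_snps(reference, n_snps=3, seed=1)
print(mutated)
print(truth)
```

Because the list of introduced variants is returned alongside the sequence, it can serve as the ground truth against which a variant caller's output is scored.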
In modern microscopy the field of view is often increased by obtaining an image mosaic, where multiple sub-images are taken side by side and combined post-acquisition. Mosaic imaging often leads to long imaging times that can increase the probability of sample deformation during the acquisition due to, e.g., changes in the environment, damage caused by the radiation used to probe the sample, or biologically induced deterioration. Here we propose a technique, based on local phase correlation, to detect the d...
The recent development of the Hi-C technique, a biochemical method to study 3-dimensional genome architecture, has provided a large amount of information describing the spatial organization of chromosomes in different cell types and species. While multiple tools are available for analysis and comparison of Hi-C data of different cell types, there are almost no resources for systematic interspecies comparison.
Traditional drug discovery approaches identify a target for a disease and find a compound that binds to the target. In this approach, structures of compounds are considered as the most important features because it is assumed that similar structures will bind to the same target. Therefore, structural analogs of the drugs that bind to the target are selected as drug candidates. However, even though compounds are not structural analogs, they may achieve the desired response. A new drug discovery method based ...
Accurate genotyping of DNA from a single cell is required for applications such as de novo mutation detection, linkage analysis and lineage tracing. However, achieving high precision genotyping in the single cell environment is challenging due to the errors caused by whole genome amplification. Two factors make genotyping from single cells using single nucleotide polymorphism (SNP) arrays challenging: the lack of a comprehensive single cell dataset with a reference genotype and the absence of genotyping too...
Traditional drug discovery and development are often time-consuming and high-risk. Repurposing/repositioning of approved drugs offers a relatively low-cost and high-efficiency approach towards rapid development of efficacious treatments. The emergence of large-scale, heterogeneous biological networks has offered unprecedented opportunities for developing in silico drug repositioning approaches. However, capturing highly non-linear, heterogenous network structures by most existing approaches for drug reposit...
Data splitting is a fundamental step for building classification models with spectral data, especially in biomedical applications. This approach is performed following pre-processing and prior to model construction, and consists of dividing the samples into at least training and test sets; herein, the training set is used for model construction and the test set for model validation. Some of the most-used methodologies for data splitting are the random selection (RS) and the Kennard-Stone (KS) algorithms; he...
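The two splitting strategies named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the paper: samples are assumed to be numeric feature vectors compared by Euclidean distance, and the function names `kennard_stone` and `random_split` are hypothetical.

```python
import math
import random


def kennard_stone(samples, n_train):
    """Return indices of n_train samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(samples)")
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - {first, second}
    while len(selected) < n_train:
        # Add the sample whose nearest already-selected neighbour is farthest away.
        nxt = max(
            sorted(remaining),
            key=lambda k: min(dist[k][s] for s in selected),
        )
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


def random_split(n_samples, n_train, seed=0):
    """Random selection: shuffle the indices and cut them into train and test."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:]


spectra = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0), (9.0, 0.0)]
train_ks = kennard_stone(spectra, n_train=3)
test_ks = [i for i in range(len(spectra)) if i not in train_ks]
print(train_ks, test_ks)
print(random_split(len(spectra), n_train=3))
```

Kennard-Stone is deterministic and spreads the training set across the sampled space, whereas random selection depends on the seed and may leave regions of that space unrepresented in a small training set.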
Cell fate determination is a continuous process in which one cell type diversifies to other cell types following a hierarchical path. Advancements in single-cell technologies provide the opportunity to reveal the continuum of cell progression which forms a structured continuous tree. Computational algorithms, which are usually based on a priori assumptions on the hidden structures, have previously been proposed as a means of recovering pseudo-trajectory along cell differentiation process. However, there sti...
MyelinJ is a free, user-friendly ImageJ macro for high-throughput analysis of fluorescent micrographs such as 2D-myelinating cultures and statistical analysis using R. MyelinJ can analyse single images or complex experiments with multiple conditions, where the ggpubr package in R is automatically used for statistical analysis and the production of publication-quality graphs. The main outputs are percentage (%) neurite density and % myelination. % neurite density is calculated using the normalise local contra...
The detection of potential biomarkers of Alzheimer's disease (AD) is crucial for its early prediction, diagnosis, and treatment. Voxelwise genome-wide association study (VGWAS) is a commonly used method in imaging genomics and usually applied to detect AD biomarkers in imaging and genetic data. However, existing VGWAS methods entail large computational cost and disregard spatial correlations within imaging data. A novel method is proposed to solve these issues.
High throughput technologies are widely employed in modern biomedical research. They yield measurements of a large number of biomolecules in a single experiment. The number of experiments usually is much smaller than the number of measurements in each experiment. The simultaneous measurements of biomolecules provide a basis for a comprehensive, systems view for describing relevant biological processes. Often it is necessary to determine correlations between the data matrices under different conditions or pa...
Genomic scanning approaches that detect one locus at a time are subject to many problems in genome-wide association studies (GWAS) and quantitative trait locus (QTL) mapping. The problems include large matrix inversion, over-conservativeness for tests after Bonferroni correction and difficulty in evaluation of the total genetic contribution to a trait's variance. Targeting these problems, we take a further step and investigate a multiple locus model that detects all markers simultaneously in a single model.
Alternative polyadenylation (polyA) sites near the 3' end of a pre-mRNA create multiple mRNA transcripts with different 3' untranslated regions (3' UTRs). The sequence elements of a 3' UTR are essential for many biological activities such as mRNA stability, sub-cellular localization, protein translation, protein binding and translation efficiency. Moreover, numerous studies in the literature have reported the correlation between diseases and the shortening (or lengthening) of 3' UTRs. As alternative polyA s...
One of the main goals in systems biology is to learn molecular regulatory networks from quantitative profile data. In particular, Gaussian Graphical Models (GGMs) are widely used network models in bioinformatics where variables (e.g. transcripts, metabolites or proteins) are represented by nodes, and pairs of nodes are connected with an edge according to their partial correlation. Reconstructing a GGM from data is a challenging task when the sample size is smaller than the number of variables. The main chal...
SPLATCHE3 simulates genetic data under a variety of spatially explicit evolutionary scenarios, extending previous versions of the framework. The new capabilities include long-distance migration, spatially and temporally heterogeneous short-scale migrations, alternative hybridization models, simulation of serial samples of genetic data and a large variety of DNA mutation models. These implementations have been applied independently to various studies, but are grouped together in the current version.
We introduce Tibanna, an open-source software tool for automated execution of bioinformatics pipelines on Amazon Web Services (AWS). Tibanna accepts reproducible and portable pipeline standards including Common Workflow Language (CWL), Workflow Description Language (WDL) and Docker. It adopts a strategy of isolation and optimization of individual executions, combined with a serverless scheduling approach. Pipelines are executed and monitored using local commands or the Python Application Programming Interfa...
Dihydrouridine (D) is a common RNA posttranscriptional modification found in eukaryotes, bacteria and a few archaea. The modification can promote the conformational flexibility of individual nucleotide bases, and its levels are increased in cancerous tissues. Therefore, it is necessary to detect D in RNA to further understand its functional roles. Since wet-lab experimental techniques for this purpose are time-consuming and laborious, it is urgent to develop computational models to identify D modification sites ...
Protein tunnels and channels are key transport pathways that allow ligands to pass between proteins' external and internal environments. These functionally important structural features warrant detailed attention. It is difficult to study the ligand binding and unbinding process experimentally, while molecular dynamics simulations can be time-consuming and computationally demanding.
Unsupervised clustering is important in disease subtyping, among other genomic applications. As genomic data has become more multifaceted, how to cluster across data sources for more precise subtyping is an ever more important area of research. Many of the methods proposed so far, including iCluster and Cluster of Cluster Assignments, make an unreasonable assumption of a common clustering across all data sources, and those that do not are fewer and tend to be computationally intensive.