A novel numerical model for protein sequences analysis based on spherical coordinates and multiple physicochemical properties of amino acids.

How to characterize short protein sequences to make an effective connection to their functions is an unsolved problem. Here we propose to map the physicochemical properties of each amino acid onto unit spheres so that each protein sequence can be represented quantitatively. We demonstrate the usefulness of this representation by applying it to the prediction of cell-penetrating peptides. We show that its combination with traditional composition features yields the best performance across different datasets, among several methods compared. For the convenience of users, a web server has been established for automatic calculations of the proposed features at


Name: Biopolymers
ISSN: 1097-0282
Pages: e23282


