Abstract
Motivation: Partially and wholly unstructured proteins have now been identified in all kingdoms of life, and are more common in eukaryotic organisms. This intrinsic disorder is related to certain critical functions. Apart from their fundamental interest, unstructured regions in proteins may prevent crystallization. Predicting disordered regions is therefore important for understanding protein function, and may also help to devise genetic constructs.
Results: In this paper we present a computational tool for detecting unstructured regions in proteins, based on two properties of unfolded fragments: (i) disordered regions have a biased composition and (ii) they usually contain no hydrophobic clusters, or only small ones. To quantify these two properties, we first calculate the amino acid distributions in structured and unstructured regions. Using these distributions, we calculate for a given sequence fragment the probability that it is part of a structured region and the probability that it is part of an unstructured region. For each amino acid, the distance to the nearest hydrophobic cluster is also computed. Using these three values along a protein sequence, unstructured regions can be predicted with very simple rules. The method requires only the primary sequence and no multiple alignment, which makes it suitable for orphan proteins.
Availability: http://genomics.eu.org/
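
The procedure outlined in the Results can be illustrated with a minimal Python sketch. It is not the implementation of the tool itself, and everything specific in it is an assumption made for illustration: the hydrophobic alphabet (VILFMYW), the definition of a hydrophobic cluster as a contiguous run of at least three such residues, the window length, the thresholds, and the function names. For brevity, the two fragment probabilities are collapsed into a single log-odds score.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("VILFMYW")  # assumption: one possible hydrophobic alphabet


def composition(sequences, pseudocount=1.0):
    """Amino acid frequencies over a set of sequences, with additive smoothing."""
    counts = Counter(aa for s in sequences for aa in s if aa in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def window_log_odds(seq, p_dis, p_ord, window=11):
    """Mean log-odds (unstructured vs. structured) of the window around each residue."""
    half = window // 2
    # Residues outside the 20-letter alphabet contribute a neutral score of 0.
    per_res = [math.log(p_dis[aa] / p_ord[aa]) if aa in p_dis else 0.0 for aa in seq]
    scores = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(seq), i + half + 1)
        scores.append(sum(per_res[lo:hi]) / (hi - lo))
    return scores


def cluster_distances(seq, min_len=3):
    """Distance from each residue to the nearest hydrophobic cluster.

    Assumption: a cluster is a contiguous run of >= min_len hydrophobic residues.
    If the sequence has no cluster, every distance is set to len(seq).
    """
    positions = []
    i = 0
    while i < len(seq):
        j = i
        while j < len(seq) and seq[j] in HYDROPHOBIC:
            j += 1
        if j - i >= min_len:
            positions.extend(range(i, j))
        i = max(j, i + 1)
    if not positions:
        return [len(seq)] * len(seq)
    return [min(abs(k - p) for p in positions) for k in range(len(seq))]


def predict_disorder(seq, p_dis, p_ord, window=11, min_distance=4):
    """Simple rule: unstructured-like composition AND far from any cluster."""
    scores = window_log_odds(seq, p_dis, p_ord, window)
    dists = cluster_distances(seq)
    return [s > 0 and d >= min_distance for s, d in zip(scores, dists)]
```

In use, `composition` would be called once on residues annotated as unstructured and once on residues annotated as structured, and the two resulting distributions passed to `predict_disorder` together with the query sequence. The thresholds above are placeholders and would need to be calibrated on annotated data.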