COMPLEXITY, vol. 20, no. 2, pp. 12-21, 2014 (SCI-Expanded)
In this study, the Menzerath-Altmann model, a distribution model from quantitative linguistics, was adopted to demonstrate the language-like behavior of the protein length distribution in proteomes. A total of 10 proteomes from completely sequenced representative organisms (from the archaea, bacteria, and eukarya domains) were examined. The results showed that the protein length distribution over the complete set of proteins in each proteome, or at least over a wide range of lengths, can be described reasonably well by the model without considering any complex underlying mechanisms. Examination of the model parameters confirmed the evolutionary trend, and the parameters were observed to be related to organismal complexity. © 2014 Wiley Periodicals, Inc.
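As an illustration of what fitting such a model involves, the sketch below uses the form in which the Menzerath-Altmann law is commonly written, y = a * x^b * exp(-c * x), and estimates its parameters by linear least squares on the log-transformed equation. The abstract does not state the exact parametrization or fitting procedure used in the study, so the functional form, the function names, and the synthetic data here are all assumptions made for illustration only; no values from the paper are reproduced.

```python
import numpy as np


def menzerath_altmann(x, a, b, c):
    """Commonly cited Menzerath-Altmann form: y = a * x**b * exp(-c * x).

    Assumption: the study may have used a different parametrization.
    """
    x = np.asarray(x, dtype=float)
    return a * np.power(x, b) * np.exp(-c * x)


def fit_menzerath_altmann(x, y):
    """Estimate (a, b, c) via least squares on ln y = ln a + b ln x - c x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # The log transform is only defined for strictly positive values.
    mask = (x > 0) & (y > 0)
    x, y = x[mask], y[mask]

    # Columns correspond to the coefficients ln a, b, and c.
    design = np.column_stack([np.ones_like(x), np.log(x), -x])
    coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)
    ln_a, b, c = coef
    return float(np.exp(ln_a)), float(b), float(c)


# Hypothetical, noise-free synthetic data (not taken from any proteome):
# x plays the role of protein length, y the role of a frequency.
lengths = np.arange(50, 2001, 50)
counts = menzerath_altmann(lengths, a=0.5, b=1.5, c=0.005)

a_hat, b_hat, c_hat = fit_menzerath_altmann(lengths, counts)
print(a_hat, b_hat, c_hat)
```

On real length histograms the log transform changes how errors are weighted, so a nonlinear fit on the untransformed counts could be a reasonable alternative; the log-linear version is shown only because it needs nothing beyond NumPy.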