Menzerath-Altmann law for distinct word distribution analysis in a large text


PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, vol.392, no.12, pp.2775-2780, 2013 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 392 Issue: 12
  • Publication Date: 2013
  • Doi Number: 10.1016/j.physa.2013.02.012
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.2775-2780
  • Keywords: Language, Distinct word distribution, Menzerath-Altmann law, Generalized gamma distribution, GAMMA DISTRIBUTION, SELF-ORGANIZATION, GENOMES, SIZE
  • Eskisehir Osmangazi University Affiliated: Yes


The empirical law uncovered by Menzerath and formulated by Altmann, known as the Menzerath-Altmann law (henceforth the MA law), describes the statistical distribution behavior of human language at various organizational levels. Building on previous studies of organizational regularities in language, we propose that the distribution of distinct (or different) words in a large text can be described effectively by the MA law. The validity of the proposition is demonstrated by examining two text corpora written in languages that do not belong to the same language family (English and Turkish). The results show not only that distinct word distribution behavior can be predicted accurately by the MA law, but also that this result appears to be language-independent. This result is important for quantitative linguistic studies and may also have significance for other naturally occurring organizations that display analogous organizational behavior. We also demonstrate explicitly that the MA law is a special case of the probability function of the generalized gamma distribution. (c) 2013 Elsevier B.V. All rights reserved.
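
The special-case relationship stated at the end of the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes one common form of the MA law, y = A * x**b * exp(-c * x), and Stacy's parametrization of the generalized gamma density with scale a and shapes d and p. The paper's own notation and sign conventions may differ. The function names and parameter values are hypothetical and chosen only for illustration. Under these assumptions, setting p = 1 reduces the generalized gamma density to the MA form with b = d - 1, c = 1/a, and A equal to the normalizing constant.

```python
import math


def generalized_gamma_pdf(x, a, d, p):
    """Stacy's generalized gamma density (assumed parametrization).

    a > 0 is the scale; d > 0 and p > 0 are shape parameters.
    """
    norm = p / (a ** d * math.gamma(d / p))
    return norm * x ** (d - 1) * math.exp(-((x / a) ** p))


def menzerath_altmann(x, A, b, c):
    """One common form of the MA law (assumed): y = A * x**b * exp(-c * x)."""
    return A * x ** b * math.exp(-c * x)


def ma_params_from_gamma(a, d):
    """MA parameters (A, b, c) that match the generalized gamma with p = 1."""
    A = 1.0 / (a ** d * math.gamma(d))
    return A, d - 1.0, 1.0 / a


# Illustrative values only; they are not fitted to any corpus.
a, d = 2.0, 3.0
A, b, c = ma_params_from_gamma(a, d)
for x in (0.5, 1.0, 4.0):
    gg = generalized_gamma_pdf(x, a, d, 1.0)
    ma = menzerath_altmann(x, A, b, c)
    # With p = 1 the two expressions agree up to floating-point error.
    assert math.isclose(gg, ma)
```

Keeping p as a free parameter gives the more general family, so a fit that uses the generalized gamma form can in principle accommodate data that the plain MA form, under the assumptions above, cannot.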