Menzerath-Altmann law for distinct word distribution analysis in a large text


EROĞLU S.

PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, vol.392, no.12, pp.2775-2780, 2013 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 392 Issue: 12
  • Publication Date: 2013
  • DOI: 10.1016/j.physa.2013.02.012
  • Journal Name: PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.2775-2780
  • Keywords: Language, Distinct word distribution, Menzerath-Altmann law, Generalized gamma distribution, GAMMA DISTRIBUTION, SELF-ORGANIZATION, GENOMES, SIZE
  • Affiliated with Eskişehir Osmangazi University: Yes

Abstract

The empirical law uncovered by Menzerath and formulated by Altmann, known as the Menzerath-Altmann law (henceforth the MA law), describes the statistical distribution behavior of human language at various organizational levels. Building on previous studies of organizational regularities in language, we propose that the distribution of distinct (or different) words in a large text can be described effectively by the MA law. The validity of this proposition is demonstrated by examining two text corpora written in languages that do not belong to the same language family (English and Turkish). The results show not only that distinct word distribution behavior can be predicted accurately by the MA law, but also that this result appears to be language-independent. This finding is important for quantitative linguistic studies and may also be significant for other naturally occurring organizations that display analogous organizational behavior. We also explicitly demonstrate that the MA law is a special case of the probability function of the generalized gamma distribution. (c) 2013 Elsevier B.V. All rights reserved.
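
The abstract does not state either formula, so the following is only a sketch of how the special-case relationship can be read. It assumes the commonly used three-parameter form of the MA law and Stacy's parameterization of the generalized gamma density; the symbols A, b, c, \alpha, d and p are assumptions introduced here for illustration, and the paper's own notation and sign conventions may differ.

```latex
% Assumed form of the MA law (sign conventions vary in the literature):
%   y(x) = A x^{b} e^{-c x}
% Assumed generalized gamma density (Stacy's parameterization), x > 0:
\begin{align*}
  f(x;\alpha,d,p) &= \frac{p}{\alpha^{d}\,\Gamma(d/p)}\; x^{d-1}\, e^{-(x/\alpha)^{p}} \\
% Setting p = 1 reduces it to the ordinary gamma density:
  f(x;\alpha,d,1) &= \frac{1}{\alpha^{d}\,\Gamma(d)}\; x^{d-1}\, e^{-x/\alpha} \\
% which has the MA form y(x) = A x^{b} e^{-c x} with
  A &= \frac{1}{\alpha^{d}\,\Gamma(d)}, \qquad b = d-1, \qquad c = \frac{1}{\alpha}.
\end{align*}
```

Under these assumptions the correspondence requires b > -1 and c > 0, so that the MA curve can be normalized as a probability density.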