Randomness versus specifics for word-frequency distributions
2016 (English)
In: Physica A: Statistical Mechanics and its Applications, ISSN 0378-4371, E-ISSN 1873-2119, Vol. 444, 828-837 p.
Article in journal (Refereed), Published, Text
The text-length dependence of real word-frequency distributions can be connected to the general properties of a random book. This finding has strong implications when deciding between two conceptually different views on word-frequency distributions: the specific 'Zipf's view' and the non-specific 'Randomness view'. The text-length transformation of a random book has an exact scaling property precisely for the power-law index gamma = 1, as opposed to the Zipf exponent gamma = 2, and the implication of this exact scaling property is discussed. However, a real text has gamma > 1, and as a consequence gamma increases when a real text is shortened. The connections to the predictions from RGF (Random Group Formation) and to the infinite-length limit of a meta-book are also discussed. The difference between 'curve-fitting' and 'predicting' word-frequency distributions is stressed. Finally, the question of randomness versus specifics for the distribution of outcomes in sufficiently complex systems has a much wider relevance than just the word-frequency example analyzed in the present work.
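As a rough illustration of the text-length transformation mentioned in the abstract, the sketch below shortens a text by keeping a uniformly random subset of its word tokens and then tabulates how many distinct words occur exactly k times. This is only one plausible reading of "random book" shortening, not the authors' procedure; the function names and the synthetic `toy_book` corpus are hypothetical, and no attempt is made to estimate gamma or to reproduce any result from the paper.

```python
import random
from collections import Counter


def frequency_distribution(tokens):
    """Return a Counter mapping k -> number of distinct words occurring exactly k times."""
    word_counts = Counter(tokens)          # word -> number of occurrences
    return Counter(word_counts.values())   # occurrence count k -> number of words


def shorten_randomly(tokens, fraction, seed=0):
    """Keep a uniformly random subset of the tokens (sampling without replacement).

    Assumption: this random thinning stands in for the shortening of a random book.
    """
    rng = random.Random(seed)
    keep = round(len(tokens) * fraction)
    return rng.sample(tokens, keep)


def toy_book(n_types=200, seed=1):
    """Build a synthetic token list in which the word of rank r occurs about n_types / r times.

    Purely illustrative input; it is not data from the paper.
    """
    rng = random.Random(seed)
    tokens = []
    for rank in range(1, n_types + 1):
        tokens.extend([f"w{rank}"] * max(1, n_types // rank))
    rng.shuffle(tokens)
    return tokens


if __name__ == "__main__":
    book = toy_book()
    for fraction in (1.0, 0.5, 0.25):
        part = book if fraction == 1.0 else shorten_randomly(book, fraction)
        dist = frequency_distribution(part)
        # Print the total length, the vocabulary size, and the number of words seen once.
        print(len(part), sum(dist.values()), dist[1])
```

Comparing the tabulated distributions at several fractions shows how the shape changes with text length, which is the kind of dependence the abstract refers to; fitting a power-law index to them would be a separate step that this sketch leaves out.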
Place, publisher, year, edition, pages
Elsevier, 2016. Vol. 444, 828-837 p.
Keywords
Word-frequency distributions, Zipf's law, Random Group Formation, Maximum entropy
National Category
Probability Theory and Statistics; Physical Sciences
Identifiers
URN: urn:nbn:se:umu:diva-114601
DOI: 10.1016/j.physa.2015.10.082
ISI: 000366785900075
OAI: oai:DiVA.org:umu-114601
DiVA: diva2:902601