A Paradoxical Property of the Monkey Book
2011 (English). In: Journal of Statistical Mechanics: Theory and Experiment, ISSN 1742-5468, P07013. Article in journal (Refereed), Published
A 'monkey book' is a book consisting of a random sequence of letters and blanks, where a group of letters bounded by a blank on each side is defined as a word. We compare the word-distribution statistics of a monkey book with those of real books and show that they are quite distinct from those of a typical real book. In particular, the monkey book obeys Heaps' power law to an extraordinarily good approximation, whereas the word distributions of real books deviate from Heaps' law in a characteristic way. This discrepancy is traced to the different properties of a 'spiked' distribution and its smooth envelope. The somewhat counter-intuitive conclusion is that a monkey book obeys Heaps' power law precisely because its word-frequency distribution is not a smooth power law, contrary to the expectation, based on simple mathematical arguments, that if one is a power law, so is the other.
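The construction described in the abstract can be tried out numerically. The following is a minimal sketch, not the authors' procedure: it generates a monkey book and fits a Heaps exponent, i.e. the slope of log N against log M, where N is the number of distinct words among the first M words. The four-letter alphabet, the blank probability of 0.2, the text length, and the function names `monkey_book`, `heaps_curve` and `loglog_slope` are all assumptions chosen for illustration.

```python
import math
import random


def monkey_book(n_chars, alphabet="abcd", p_blank=0.2, seed=0):
    """Return the words of a random text of n_chars characters."""
    rng = random.Random(seed)
    chars = [
        " " if rng.random() < p_blank else rng.choice(alphabet)
        for _ in range(n_chars)
    ]
    # str.split() treats a run of blanks as one separator, so empty
    # "words" are dropped (an assumption of this sketch).
    return "".join(chars).split()


def heaps_curve(words, n_points=20):
    """Sample (words read M, distinct words N) at evenly spaced M."""
    step = max(1, len(words) // n_points)
    seen = set()
    curve = []
    for m, word in enumerate(words, start=1):
        seen.add(word)
        if m % step == 0:
            curve.append((m, len(seen)))
    return curve


def loglog_slope(curve):
    """Least-squares slope of log N against log M."""
    xs = [math.log(m) for m, _ in curve]
    ys = [math.log(n) for _, n in curve]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


words = monkey_book(200_000)
curve = heaps_curve(words)
print("words:", len(words), "distinct:", len(set(words)))
print("fitted Heaps exponent:", round(loglog_slope(curve), 3))
```

In this model every word of a given length has the same probability, so the word frequencies fall into discrete steps rather than following a smooth curve. Running the same `heaps_curve` and `loglog_slope` on the word list of a real text would give one rough way to set the two cases side by side.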
Place, publisher, year, edition, pages
Institute of Physics, 2011. P07013
Keywords
analysis of algorithms, growth processes
Identifiers
URN: urn:nbn:se:umu:diva-47758
DOI: 10.1088/1742-5468/2011/07/P07013
OAI: oai:DiVA.org:umu-47758
DiVA: diva2:444386