The meta book and size-dependent properties of written language
2009 (English). In: New Journal of Physics, ISSN 1367-2630, E-ISSN 1367-2630, Vol. 11, 123015
Article in journal (Refereed), Published
Evidence is given for a systematic text-length dependence of the power-law index $\gamma$ of a single book. The estimated $\gamma$ values are consistent with a monotonic decrease from 2 to 1 with increasing text length. A direct connection to an extended Heaps' law is explored. The infinite-book limit is, as a consequence, proposed to be given by $\gamma = 1$ instead of the value $\gamma = 2$ expected if Zipf's law were ubiquitously applicable. In addition, we explore the idea that the systematic text-length dependence can be described by a meta book concept, which is an abstract representation reflecting the word-frequency structure of a text. According to this concept, the word-frequency distribution of a text of a certain length written by a single author has the same characteristics as a text of the same length pulled out from an imaginary complete infinite corpus written by the same author.
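The two quantities the abstract refers to, the word-frequency distribution of a text and the growth of the number of distinct words with text length (the Heaps' law curve), can be measured with a few lines of Python. This is a minimal sketch and not the authors' procedure: the function names are hypothetical, the text is assumed to be already tokenized into words, and taking the first M tokens is used here only as a simple stand-in for "a text of length M".

```python
from collections import Counter


def frequency_distribution(tokens):
    """Map k -> number of distinct words that occur exactly k times.

    A power-law index gamma would be estimated from how this histogram
    decays with k; the fitting step is deliberately left out here.
    """
    word_counts = Counter(tokens)          # word -> number of occurrences
    return Counter(word_counts.values())   # occurrences k -> number of words


def vocabulary_growth(tokens, lengths):
    """Return (M, N) pairs: N distinct words among the first M tokens.

    Assumption: a prefix of the token list stands in for a shorter text
    by the same author.
    """
    return [(m, len(set(tokens[:m]))) for m in lengths]


if __name__ == "__main__":
    # Toy input; a real analysis would use the tokens of a full book.
    tokens = "the cat saw the dog and the dog saw the cat".split()
    print(dict(frequency_distribution(tokens)))
    print(vocabulary_growth(tokens, [1, 4, len(tokens)]))
```

Applying `frequency_distribution` to prefixes of increasing length M of the same book is one way to look for the size dependence described above, since the shape of the histogram can then be compared across text lengths.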
Place, publisher, year, edition, pages
Bristol: Institute of Physics Publishing (IOPP), 2009. Vol. 11, 123015
Keywords
Quantitative linguistics, word frequencies, size dependencies, meta book
Identifiers
URN: urn:nbn:se:umu:diva-27643
DOI: 10.1088/1367-2630/11/12/123015
ISI: 000272703100001
OAI: oai:DiVA.org:umu-27643
DiVA: diva2:276882