The Dependence of Frequency Distributions on Multiple Meanings of Words, Codes and Signs
Umeå University, Faculty of Science and Technology, Department of Physics.
2018 (English). In: Physica A: Statistical Mechanics and its Applications, ISSN 0378-4371, E-ISSN 1873-2119, Vol. 490, p. 554-564. Article in journal (Refereed), Published
Abstract [en]

The dependence of frequency distributions on the multiple meanings of words in a text is investigated by deleting letters. When the words are coded with fewer letters, the number of meanings per coded word increases. This increase is measured and used as an input to a predictive theory. For a text written in English, the word-frequency distribution is broad and fat-tailed, whereas if the words are represented only by their first letter the distribution becomes exponential. Both distributions are well predicted by the theory, as is the whole sequence obtained by consecutively representing the words by their first L = 6, 5, 4, 3, 2, 1 letters. Texts written in Chinese characters are compared with the same texts written in letter codes, and the similarity of the corresponding frequency distributions is interpreted as a consequence of the multiple meanings of Chinese characters. This further implies that the difference in shape between the word-frequency distributions of an English text written in letters and a Chinese text written in Chinese characters is due to the coding and not to the language per se.
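
The letter-deletion coding described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names and the toy sentence are hypothetical, and the number of distinct original words sharing one coded word is used here only as a rough stand-in for the "number of meanings per coded word"; the paper's own measurement may differ. The predictive theory itself is not implemented.

```python
from collections import Counter, defaultdict


def code_words(words, L):
    """Represent each word by its first L letters (shorter words stay whole)."""
    return [w[:L] for w in words]


def frequency_distribution(tokens):
    """Map k -> number of distinct tokens that occur exactly k times."""
    occurrences = Counter(tokens)
    return Counter(occurrences.values())


def meanings_per_code(words, L):
    """Average number of distinct original words that share one coded word.

    Assumption: distinct original words are a proxy for 'meanings'.
    """
    groups = defaultdict(set)
    for w in words:
        groups[w[:L]].add(w)
    return sum(len(s) for s in groups.values()) / len(groups)


# Hypothetical toy text; a real study would use a full-length book.
text = "the theory then predicts that the thin tail thins there"
words = text.lower().split()

for L in (6, 5, 4, 3, 2, 1):
    coded = code_words(words, L)
    print(
        f"L={L}",
        f"distinct codes={len(set(coded))}",
        f"words per code={meanings_per_code(words, L):.2f}",
        f"N(k)={dict(frequency_distribution(coded))}",
    )
```

As L decreases, fewer distinct codes remain and each code stands for more original words, which is the increase the abstract says is measured and fed into the theory.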

Place, publisher, year, edition, pages
2018. Vol. 490, p. 554-564
Keywords [en]
Word-frequency distributions, Multiple meanings, Random Group Formation, Maximum entropy, Codes
National Category
Other Physics Topics; Specific Languages
Identifiers
URN: urn:nbn:se:umu:diva-140251
DOI: 10.1016/j.physa.2017.08.133
OAI: oai:DiVA.org:umu-140251
DiVA, id: diva2:1146719
Available from: 2017-10-03. Created: 2017-10-03. Last updated: 2018-06-09. Bibliographically approved

Authority records
Minnhagen, Petter
