Minnhagen, Petter
Publications (10 of 51)
Yan, X., Yang, S. y., Kim, B. J. & Minnhagen, P. (2018). Benford's Law and the First Letter of Words. Physica A: Statistical Mechanics and its Applications, 512, 305-315
2018 (English). In: Physica A: Statistical Mechanics and its Applications, ISSN 0378-4371, E-ISSN 1873-2119, Vol. 512, p. 305-315. Article in journal (Other academic). Published
Abstract [en]

A universal First-Letter Law (FLL) is derived and described. It predicts the percentages of first letters for words in novels. The FLL is akin to Benford’s law (BL) of first digits, which predicts the percentages of first digits in a data collection of numbers. Both are universal in the sense that FLL depends only on the number of letters in the alphabet, whereas BL depends only on the number of digits in the base of the number system. The existence of these types of universal laws appears counter-intuitive. Nonetheless, both describe data very well. Relations to some earlier works are given. FLL predicts that an English author on average starts about 16 out of 100 words with the letter ‘t’. This is corroborated by data, even though an author can freely write anything. Fuller implications and the applicability of FLL remain for the future.
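
For orientation, the first-digit side of the comparison can be sketched in a few lines. The block below uses the standard textbook form of Benford's law, P(d) = log_b(1 + 1/d) for first digit d in base b. It does not implement the FLL, whose formula is not given in the abstract. The function name `benford_first_digit_probs` is a hypothetical helper chosen for this illustration.

```python
import math


def benford_first_digit_probs(base: int = 10) -> dict[int, float]:
    """Benford's-law probability of each first digit 1..base-1.

    Uses the standard form P(d) = log_base(1 + 1/d); the only input
    is the base, i.e. the number of digits in the number system.
    """
    if base < 2:
        raise ValueError("base must be at least 2")
    return {d: math.log(1 + 1 / d, base) for d in range(1, base)}


# Percentages of first digits in base 10.
probs = benford_first_digit_probs(10)
for digit, p in probs.items():
    print(f"first digit {digit}: {100 * p:.1f}%")
```

The probabilities sum to one for any base because the logarithms telescope, which is one way to see that the law has no free parameter besides the base.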

Place, publisher, year, edition, pages
Elsevier, 2018
Keywords
First-Letter Law, Benford’s law, universal frequency ladder, Random Group Formation, maximum entropy
National Category
Language Technology (Computational Linguistics)
Research subject
Linguistics; Theoretical Physics
Identifiers
urn:nbn:se:umu:diva-143257 (URN); 10.1016/j.physa.2018.08.133 (DOI); 000446151000026 (ISI)
Available from: 2017-12-19. Created: 2017-12-19. Last updated: 2018-11-06. Bibliographically approved
Yan, X. & Minnhagen, P. (2018). The Dependence of Frequency Distributions on Multiple Meanings of Words, Codes and Signs. Physica A: Statistical Mechanics and its Applications, 490, 554-564
2018 (English). In: Physica A: Statistical Mechanics and its Applications, ISSN 0378-4371, E-ISSN 1873-2119, Vol. 490, p. 554-564. Article in journal (Refereed). Published
Abstract [en]

The dependence of frequency distributions on multiple meanings of words in a text is investigated by deleting letters. By coding the words with fewer letters, the number of meanings per coded word increases. This increase is measured and used as an input in a predictive theory. For a text written in English, the word-frequency distribution is broad and fat-tailed, whereas if the words are represented only by their first letter the distribution becomes exponential. Both distributions are well predicted by the theory, as is the whole sequence obtained by consecutively representing the words by the first L = 6, 5, 4, 3, 2, 1 letters. Texts written in Chinese characters are compared with the same texts written in letter-codes, and the similarity of the corresponding frequency distributions is interpreted as a consequence of the multiple meanings of Chinese characters. This further implies that the difference in shape between the word-frequency distributions of an English text written in letters and a Chinese text written in Chinese characters is due to the coding and not to the language per se.
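
The letter-deletion coding described in the abstract can be illustrated with a short sketch: each word is replaced by its first L letters, so distinct words collapse onto the same code as L shrinks. The sample sentence, the whitespace tokenization, and the function name `code_frequencies` are assumptions made for this illustration and are not taken from the paper.

```python
from collections import Counter


def code_frequencies(words, num_letters):
    """Count how often each code occurs when every word is
    represented by its first `num_letters` letters."""
    return Counter(word[:num_letters] for word in words)


# Toy text (hypothetical); a real analysis would use a full novel.
words = "the theory then predicts that the text thins".split()

# Shortening the code merges more words into each coded word.
for num_letters in (6, 5, 4, 3, 2, 1):
    counts = code_frequencies(words, num_letters)
    print(num_letters, len(counts), counts.most_common(1))
```

With fewer letters, the number of distinct codes drops while the count of the most common code grows, which is the coding effect the abstract uses as input to its theory.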

Keywords
Word-frequency distributions, Multiple meanings, Random Group Formation, Maximum entropy, Codes
National Category
Other Physics Topics; Specific Languages
Identifiers
urn:nbn:se:umu:diva-140251 (URN); 10.1016/j.physa.2017.08.133 (DOI)
Available from: 2017-10-03. Created: 2017-10-03. Last updated: 2018-06-09. Bibliographically approved
Do Yi, S., Noh, J. D., Minnhagen, P., Song, M.-Y., Chon, T.-S. & Kim, B. J. (2017). Human bipedalism and body-mass index. Scientific Reports, 7, Article ID 3688.
2017 (English). In: Scientific Reports, ISSN 2045-2322, E-ISSN 2045-2322, Vol. 7, article id 3688. Article in journal (Refereed). Published
Abstract [en]

Body-mass index, abbreviated as BMI and given by M/H^2 with the mass M and the height H, has been widely used as a proxy for the general health status of a human individual. We generalise BMI to the form M/H^p and seek the value of p for populations of animal species, including humans. We compare values of p for several different datasets for human populations with the ones obtained for other animal populations of fish, whales, and land mammals. All non-human animal populations analyzed in our work are shown to have p ≈ 3. In contrast, human populations are different: as young infants grow to become toddlers and keep growing, a sudden change of p is observed at about one year after birth. Infants younger than one year exhibit a value of p significantly larger than two, while children between one and five years old show p ≈ 2, sharply different from other animal species. The observation implies the importance of the upright posture of human individuals. We also propose a simple mechanical model for a human body and suggest that standing and walking upright should put a clear division between bipedal humans (p ≈ 2) and other animals (p ≈ 3).
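
One common way to estimate an exponent p in a relation of the form M ∝ H^p is a least-squares fit of log M against log H. The sketch below assumes that approach; the abstract does not state the fitting procedure actually used in the paper. The data are synthetic and noise-free, generated only to show that the fit recovers a known exponent, and `fit_exponent` is a hypothetical helper name.

```python
import math


def fit_exponent(heights, masses):
    """Least-squares slope of log(mass) versus log(height).

    If mass = c * height**p holds exactly, the slope equals p.
    """
    xs = [math.log(h) for h in heights]
    ys = [math.log(m) for m in masses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic heights in metres (not data from the paper).
heights = [0.8, 0.9, 1.0, 1.1, 1.2]
masses_p2 = [16.0 * h ** 2 for h in heights]  # constructed with p = 2
masses_p3 = [16.0 * h ** 3 for h in heights]  # constructed with p = 3

print(fit_exponent(heights, masses_p2))
print(fit_exponent(heights, masses_p3))
```

On real population data the points scatter around the line, so the fitted slope comes with an uncertainty that this sketch does not estimate.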

Place, publisher, year, edition, pages
Nature Publishing Group, 2017
National Category
Other Physics Topics
Identifiers
urn:nbn:se:umu:diva-137634 (URN); 10.1038/s41598-017-03961-w (DOI); 000403413700050 (ISI)
Available from: 2017-07-18. Created: 2017-07-18. Last updated: 2018-06-09. Bibliographically approved
Yan, X. & Minnhagen, P. (2016). Randomness versus specifics for word-frequency distributions. Physica A: Statistical Mechanics and its Applications, 444, 828-837
2016 (English). In: Physica A: Statistical Mechanics and its Applications, ISSN 0378-4371, E-ISSN 1873-2119, Vol. 444, p. 828-837. Article in journal (Refereed). Published
Abstract [en]

The text-length dependence of real word-frequency distributions can be connected to the general properties of a random book. It is pointed out that this finding has strong implications when deciding between two conceptually different views on word-frequency distributions, i.e. the specific 'Zipf's-view' and the non-specific 'Randomness-view', as is discussed. It is also noted that the text-length transformation of a random book has an exact scaling property precisely for the power-law index gamma = 1, as opposed to the Zipf exponent gamma = 2, and the implication of this exact scaling property is discussed. However, a real text has gamma > 1, and as a consequence gamma increases when a real text is shortened. The connections to the predictions from the RGF (Random Group Formation) and to the infinite-length limit of a meta-book are also discussed. The difference between 'curve-fitting' and 'predicting' word-frequency distributions is stressed. It is pointed out that the question of randomness versus specifics for the distribution of outcomes in sufficiently complex systems has a much wider relevance than just the word-frequency example analyzed in the present work.

Place, publisher, year, edition, pages
Elsevier, 2016
Keywords
Word-frequency distributions, Zipf's law, Random Group Formation, Maximum entropy
National Category
Probability Theory and Statistics; Physical Sciences
Identifiers
urn:nbn:se:umu:diva-114601 (URN); 10.1016/j.physa.2015.10.082 (DOI); 000366785900075 (ISI)
Available from: 2016-02-11. Created: 2016-01-25. Last updated: 2018-06-07. Bibliographically approved
Yan, X., Minnhagen, P. & Jensen, H. J. (2016). The likely determines the unlikely. Physica A: Statistical Mechanics and its Applications, 456, 112-119
2016 (English). In: Physica A: Statistical Mechanics and its Applications, ISSN 0378-4371, E-ISSN 1873-2119, Vol. 456, p. 112-119. Article in journal (Refereed). Published
Abstract [en]

We point out that the functional form describing the frequency of sizes of events in complex systems (e.g. earthquakes, forest fires, bursts of neuronal activity) can be obtained from maximal likelihood inference, which, remarkably, only involves a few available observed measures such as the number of events, the total event size and the extremes. Most importantly, the method is able to predict with high accuracy the frequency of the rare extreme events. Being able to predict the few, often high-impact, events from the frequent small events is of great general importance. For a data set of wind speed we are able to predict the frequency of gales with good precision. We analyse several examples ranging from the shortest length of a recruit to the number of Chinese characters which occur only once in a text.

Keywords
Complex systems, Frequency distributions, Maximum entropy, Predictions, Real data
National Category
Physical Sciences
Identifiers
urn:nbn:se:umu:diva-123037 (URN); 10.1016/j.physa.2016.03.027 (DOI); 000376693500011 (ISI)
Available from: 2016-08-22. Created: 2016-06-27. Last updated: 2018-06-07. Bibliographically approved
Yan, X. & Minnhagen, P. (2015). Maximum Entropy, Word-Frequency, Chinese Characters, and Multiple Meanings. PLoS ONE, 10(5), Article ID e0125592.
2015 (English). In: PLoS ONE, ISSN 1932-6203, E-ISSN 1932-6203, Vol. 10, no 5, article id e0125592. Article in journal (Refereed). Published
Abstract [en]

The word-frequency distribution of a text written by an author is well accounted for by a maximum entropy distribution, the RGF (random group formation) prediction. The RGF-distribution is completely determined by the a priori values of the total number of words in the text (M), the number of distinct words (N) and the number of repetitions of the most common word (k_max). It is here shown that this maximum entropy prediction also describes a text written in Chinese characters. In particular, it is shown that although the same Chinese text written in words and in Chinese characters has quite differently shaped distributions, they are nevertheless both well predicted by their respective three a priori characteristic values. It is pointed out that this is analogous to the change in the shape of the distribution when translating a given text to another language. Another consequence of the RGF-prediction is that taking a part of a long text will change the input parameters (M, N, k_max) and consequently also the shape of the frequency distribution. This is explicitly confirmed for texts written in Chinese characters. Since the RGF-prediction has no system-specific information beyond the three a priori values (M, N, k_max), any specific language characteristic has to be sought in systematic deviations between the RGF-prediction and the measured frequencies. One such systematic deviation is identified and, through a statistical information theoretical argument and an extended RGF-model, it is proposed that this deviation is caused by multiple meanings of Chinese characters. The effect is stronger for Chinese characters than for Chinese words. The relations between Zipf's law, the Simon-model for texts and the present results are discussed.
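
The three a priori inputs named in the abstract (total number of words, number of distinct words, and number of repetitions of the most common word) are simple to extract from a text. The sketch below computes only these inputs and does not implement the RGF prediction itself. Lower-casing and whitespace splitting are simplifying assumptions, and `rgf_inputs` is a hypothetical helper name.

```python
from collections import Counter


def rgf_inputs(text):
    """Return (M, N, k_max) for a text.

    M     : total number of words
    N     : number of distinct words
    k_max : number of occurrences of the most common word
    Tokenization here is naive (lower-case, split on whitespace).
    """
    words = text.lower().split()
    counts = Counter(words)
    total_words = len(words)
    distinct_words = len(counts)
    k_max = max(counts.values()) if counts else 0
    return total_words, distinct_words, k_max


sample = "the cat and the dog and the bird"
print(rgf_inputs(sample))

# Taking only part of the text changes all three inputs.
print(rgf_inputs(" ".join(sample.split()[:4])))
```

The second call illustrates the point made in the abstract that a part of a longer text has different input values, and hence a differently shaped predicted distribution.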

National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:umu:diva-106610 (URN); 10.1371/journal.pone.0125592 (DOI); 000356768100060 (ISI); 25955175 (PubMedID)
Available from: 2015-07-28. Created: 2015-07-24. Last updated: 2018-06-07. Bibliographically approved
Bokma, F., Baek, S. K. & Minnhagen, P. (2014). 50 years of inordinate fondness. Systematic Biology, 63(2), 251-256
2014 (English). In: Systematic Biology, ISSN 1063-5157, E-ISSN 1076-836X, Vol. 63, no 2, p. 251-256. Article in journal (Refereed). Published
Place, publisher, year, edition, pages
Oxford University Press, 2014
Keywords
Taxa, species, distributions, maximum entropy, random group formation
National Category
Evolutionary Biology
Identifiers
urn:nbn:se:umu:diva-81886 (URN); 10.1093/sysbio/syt067 (DOI); 000332044900010 (ISI)
Available from: 2013-10-22. Created: 2013-10-22. Last updated: 2018-06-08. Bibliographically approved
Yi, S. D., Kim, B. J. & Minnhagen, P. (2013). Allometric exponent and randomness. New Journal of Physics, 15, 043001
2013 (English). In: New Journal of Physics, ISSN 1367-2630, E-ISSN 1367-2630, Vol. 15, p. 043001. Article in journal (Refereed). Published
Abstract [en]

An allometric height-mass exponent gamma gives an approximate power-law relation ⟨M⟩ ∝ H^gamma between the average mass ⟨M⟩ and the height H for a sample of individuals. The individuals in the present study are humans but could be any biological organism. The sampling can be for a specific age of the individuals or for an age interval. The body mass index is often used for practical purposes when characterizing humans and it is based on the allometric exponent gamma = 2. It is shown here that the actual value of gamma is to a large extent determined by the degree of correlation between mass and height within the sample studied: no correlation between mass and height means gamma = 0, whereas if there were a precise relation between mass and height such that all individuals had the same shape and density, then gamma = 3. The connection is demonstrated by showing that the value of gamma can be obtained directly from three numbers characterizing the spreads of the relevant random Gaussian statistical distributions: the spread of the height and mass distributions together with the spread of the mass distribution for the average height. Possible implications for allometric relations in general are discussed.

Place, publisher, year, edition, pages
Institute of Physics Publishing (IOPP), 2013
National Category
Physical Sciences
Identifiers
urn:nbn:se:umu:diva-70338 (URN); 10.1088/1367-2630/15/4/043001 (DOI); 000317035700001 (ISI)
Available from: 2013-05-14. Created: 2013-05-14. Last updated: 2018-06-08. Bibliographically approved
Baek, S. K., Mäkilä, H., Minnhagen, P. & Kim, B. J. (2013). Residual discrete symmetry of the five-state clock model. Physical Review E. Statistical, Nonlinear, and Soft Matter Physics, 88(1), Article ID 012125.
2013 (English). In: Physical Review E. Statistical, Nonlinear, and Soft Matter Physics, ISSN 1539-3755, E-ISSN 1550-2376, Vol. 88, no 1, article id 012125. Article in journal (Refereed). Published
Abstract [en]

It is well known that the q-state clock model can exhibit a Kosterlitz-Thouless (KT) transition if q is equal to or greater than a certain threshold, which has been believed to be five. However, recent numerical studies indicate that the helicity modulus does not vanish in the high-temperature phase of the five-state clock model as predicted by the KT scenario. By performing Monte Carlo calculations under the fluctuating twist boundary condition, we show that this is because the five-state clock model does not have the fully continuous U(1) symmetry even in the high-temperature phase, while the six-state clock model does. We suggest that the upper transition of the five-state clock model is actually a weaker cousin of the KT transition, so that it is q ≥ 6 that exhibits the genuine KT behavior.

Keywords
Phase transitions, Kosterlitz-Thouless, clock-models, helicity modulus
National Category
Physical Sciences
Research subject
Theoretical Physics
Identifiers
urn:nbn:se:umu:diva-81890 (URN); 10.1103/PhysRevE.88.012125 (DOI); 000322084800002 (ISI); 23944432 (PubMedID)
Available from: 2013-10-22. Created: 2013-10-22. Last updated: 2018-06-08. Bibliographically approved
Lee, S. H., Bernhardsson, S., Holme, P., Kim, B. J. & Minnhagen, P. (2012). Neutral theory of chemical reaction networks. New Journal of Physics, 14, 033032
2012 (English). In: New Journal of Physics, ISSN 1367-2630, E-ISSN 1367-2630, Vol. 14, p. 033032. Article in journal (Refereed). Published
Abstract [en]

To what extent do the characteristic features of a chemical reaction network reflect its purpose and function? In general, one argues that correlations between specific features and specific functions are key to understanding a complex structure. However, specific features may sometimes be neutral and uncorrelated with any system-specific purpose, function or causal chain. Such neutral features are caused by chance and randomness. Here we compare two classes of chemical networks: one that has been subjected to biological evolution (the chemical reaction network of metabolism in living cells) and one that has not (the atmospheric planetary chemical reaction networks). Their degree distributions are shown to share the very same neutral system-independent features. The shape of the broad distributions is to a large extent controlled by a single parameter, the network size. From this perspective, there is little difference between atmospheric and metabolic networks; they are just different sizes of the same random assembling network. In other words, the shape of the degree distribution is a neutral characteristic feature and has no functional or evolutionary implications in itself; it is not a matter of life and death.

Keywords
Network, chemical reactions, metabolism, planets, emergent properties
National Category
Other Physics Topics
Identifiers
urn:nbn:se:umu:diva-53319 (URN); 10.1088/1367-2630/14/3/033032 (DOI); 000302370400002 (ISI)
Available from: 2012-03-22. Created: 2012-03-20. Last updated: 2018-06-08. Bibliographically approved