Balanced unequal probability sampling with maximum entropy
Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics. ORCID iD: 0000-0003-1524-0851
(English) Manuscript (Other academic)
Abstract [en]

This paper investigates how to perform balanced unequal probability sampling with maximum entropy. The focus is on balancing conditions in the form of known marginal sums in a cross-stratification table. Since only the marginal sums are fixed, the sample sizes for one or more cells in the table are random. The probability distribution of those sample sizes can be expressed explicitly, but it is computationally difficult to evaluate except in very small cases. Markov Chain Monte Carlo methods are proposed for obtaining good approximations to the distribution, as well as for sample selection. Some large-sample Gaussian approximations are also considered. Iterative procedures for obtaining sampling probabilities that yield specified inclusion probabilities are discussed.
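As an illustration of MCMC sample selection, the following is a minimal sketch for the simplest balancing condition only, a fixed total sample size, and not the cross-stratified case treated in the paper. It is not taken from the manuscript. It assumes the fixed-size maximum entropy (conditional Poisson) design, under which a sample s has probability proportional to the product of the odds p_i / (1 - p_i) over the units in s, and it uses a plain Metropolis swap of one sampled unit for one unsampled unit. The function name `cp_mcmc_sample` and its parameters are hypothetical.

```python
import random


def cp_mcmc_sample(p, n, steps=10_000, seed=None):
    """Approximate draw from a fixed-size maximum entropy design.

    p     : working (Poisson) probabilities in (0, 1); these are sampling
            parameters, not the resulting inclusion probabilities
    n     : fixed sample size, 0 < n < len(p)
    steps : number of Metropolis swap proposals (an assumed tuning choice)
    """
    rng = random.Random(seed)
    N = len(p)
    if not 0 < n < N:
        raise ValueError("n must satisfy 0 < n < len(p)")
    if any(not 0.0 < pi < 1.0 for pi in p):
        raise ValueError("all p[i] must lie strictly between 0 and 1")

    # Target: P(s) proportional to the product of odds w[i] over i in s.
    w = [pi / (1.0 - pi) for pi in p]

    # Start from an arbitrary sample of the right size.
    units = list(range(N))
    rng.shuffle(units)
    inside, outside = units[:n], units[n:]

    for _ in range(steps):
        # Symmetric proposal: swap a random sampled unit i
        # with a random unsampled unit j.
        a = rng.randrange(n)
        b = rng.randrange(N - n)
        i, j = inside[a], outside[b]
        # Metropolis ratio P(s') / P(s) reduces to w[j] / w[i].
        if rng.random() < min(1.0, w[j] / w[i]):
            inside[a], outside[b] = j, i

    return sorted(inside)
```

Every proposed swap keeps the sample size at n, so the chain never leaves the set of balanced samples. With marginal sums in a cross-stratification table, the proposal would have to preserve every row and column total, which this sketch does not attempt.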

National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
URN: urn:nbn:se:umu:diva-22457
OAI: oai:DiVA.org:umu-22457
DiVA: diva2:216704
Available from: 2009-05-11 Created: 2009-05-11 Last updated: 2016-03-07 Bibliographically approved
In thesis
1. Contributions to the theory of unequal probability sampling
2009 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis consists of five papers related to the theory of unequal probability sampling from a finite population. Generally, it is assumed that we wish to make model-assisted inference, i.e. the inclusion probability for each unit in the population is prescribed before the sample is selected. The sample is then selected using some random mechanism, the sampling design. The thesis focuses mostly on three particular unequal probability sampling designs: the conditional Poisson (CP) design, the Sampford design, and the Pareto design. They have different advantages and drawbacks. The CP design is a maximum entropy design, but it is difficult to determine sampling parameters which yield prescribed inclusion probabilities; the Sampford design yields prescribed inclusion probabilities but may be hard to sample from; and the Pareto design makes sample selection very easy, but it is very difficult to determine sampling parameters which yield prescribed inclusion probabilities. These three designs are compared probabilistically and found to be close to each other under certain conditions. In particular, the Sampford and Pareto designs are probabilistically close to each other. Some effort is devoted to analytically adjusting the CP and Pareto designs so that they yield inclusion probabilities close to the prescribed ones. The results of the adjustments are in general very good. Some iterative procedures are suggested to improve the results even further. Further, balanced unequal probability sampling is considered. In this kind of sampling, samples are given a positive probability of selection only if they satisfy some balancing conditions. The balancing conditions are given by information from auxiliary variables. Most of the attention is devoted to a slightly less general but practically important case. In this case, too, the inclusion probabilities are prescribed in advance, making the choice of sampling parameters important.
A complication which arises in the context of choosing sampling parameters is that certain probability distributions need to be calculated, and exact calculation turns out to be practically impossible, except for very small cases. It is proposed that Markov Chain Monte Carlo (MCMC) methods be used for obtaining approximations to the relevant probability distributions, and also for sample selection. In general, MCMC methods for sample selection do not occur very frequently in the sampling literature today, making this a fairly novel idea.
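To illustrate why sample selection under the Pareto design is described as very easy, the following is a minimal sketch of the standard Pareto order-sampling scheme as it is usually presented in the sampling literature. It is not taken from the thesis. Each unit receives a uniform random number, a ranking variable is formed from the odds of that number divided by the odds of the unit's target inclusion probability, and the n units with the smallest ranking variables are selected. The function name `pareto_sample` and its parameters are hypothetical, and the target probabilities are assumed to sum to n.

```python
import random


def pareto_sample(target, n, seed=None):
    """Select a fixed-size sample by Pareto order sampling.

    target : target inclusion probabilities in (0, 1), assumed to sum to n
    n      : fixed sample size
    """
    rng = random.Random(seed)
    if any(not 0.0 < lam < 1.0 for lam in target):
        raise ValueError("target probabilities must lie strictly in (0, 1)")
    if not 0 < n <= len(target):
        raise ValueError("n must satisfy 0 < n <= len(target)")

    ranked = []
    for k, lam in enumerate(target):
        u = rng.random()  # uniform on [0, 1), so 1 - u is never zero
        # Ranking variable: odds of u divided by odds of the target.
        q = (u / (1.0 - u)) / (lam / (1.0 - lam))
        ranked.append((q, k))

    # Keep the n units with the smallest ranking variables.
    ranked.sort()
    return sorted(k for _, k in ranked[:n])
```

The resulting inclusion probabilities only approximate the targets, which is the difficulty the abstract refers to when it mentions determining sampling parameters for the Pareto design.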

Place, publisher, year, edition, pages
Umeå: Institutionen för Matematik och Matematisk Statistik, Umeå universitet, 2009. 26 p.
Keyword
balanced sampling, conditional Poisson sampling, inclusion probabilities, maximum entropy, Markov chain Monte Carlo, Pareto sampling, Sampford sampling, unequal probability sampling.
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:umu:diva-22459 (URN)
978-91-7264-760-2 (ISBN)
Public defence
2009-06-04, MA121, MIT-huset, Umeå Universitet, 90187 Umeå, Umeå, 13:15 (English)
Available from: 2009-05-13 Created: 2009-05-11 Last updated: 2016-03-07 Bibliographically approved
Author
Lundquist, Anders