Contributions to the theory of unequal probability sampling
Umeå University, Faculty of Science and Technology, Mathematics and Mathematical Statistics. ORCID iD: 0000-0003-1524-0851
2009 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis consists of five papers related to the theory of unequal probability sampling from a finite population. Generally, it is assumed that we wish to make model-assisted inference, i.e. the inclusion probability for each unit in the population is prescribed before the sample is selected. The sample is then selected using some random mechanism, the sampling design. The thesis focuses mainly on three particular unequal probability sampling designs: the conditional Poisson (CP) design, the Sampford design, and the Pareto design. They have different advantages and drawbacks. The CP design is a maximum entropy design, but it is difficult to determine sampling parameters which yield prescribed inclusion probabilities; the Sampford design yields prescribed inclusion probabilities but may be hard to sample from; and the Pareto design makes sample selection very easy, but it is very difficult to determine sampling parameters which yield prescribed inclusion probabilities. These three designs are compared probabilistically and found to be close to each other under certain conditions. In particular, the Sampford and Pareto designs are probabilistically close to each other. Some effort is devoted to analytically adjusting the CP and Pareto designs so that they yield inclusion probabilities close to the prescribed ones. The results of the adjustments are in general very good. Some iterative procedures are suggested to improve the results even further. Further, balanced unequal probability sampling is considered. In this kind of sampling, samples are given a positive probability of selection only if they satisfy some balancing conditions. The balancing conditions are given by information from auxiliary variables. Most of the attention is devoted to a slightly less general but practically important case. In this case, too, the inclusion probabilities are prescribed in advance, making the choice of sampling parameters important.
A complication which arises in the context of choosing sampling parameters is that certain probability distributions need to be calculated, and exact calculation turns out to be practically impossible, except for very small cases. It is proposed that Markov chain Monte Carlo (MCMC) methods be used for obtaining approximations to the relevant probability distributions, and also for sample selection. MCMC methods for sample selection do not occur very frequently in the sampling literature today, which makes this a fairly novel idea.

Place, publisher, year, edition, pages
Umeå: Institutionen för Matematik och Matematisk Statistik, Umeå universitet, 2009. 26 p.
Keyword [en]
balanced sampling, conditional Poisson sampling, inclusion probabilities, maximum entropy, Markov chain Monte Carlo, Pareto sampling, Sampford sampling, unequal probability sampling.
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
URN: urn:nbn:se:umu:diva-22459
ISBN: 978-91-7264-760-2 (print)
OAI: oai:DiVA.org:umu-22459
DiVA: diva2:216730
Public defence
2009-06-04, MA121, MIT-huset, Umeå Universitet, 90187 Umeå, Umeå, 13:15 (English)
Available from: 2009-05-13 Created: 2009-05-11 Last updated: 2016-03-07 Bibliographically approved
List of papers
1. Pareto sampling versus Sampford and Conditional Poisson sampling
2006 (English) In: Scandinavian Journal of Statistics, ISSN 0303-6898, E-ISSN 1467-9469, Vol. 33, no 4, 699-720 p. Article in journal (Refereed) Published
Abstract [en]

Pareto sampling was introduced by Rosén in the late 1990s. It is a simple method to get a fixed size πps sample though with inclusion probabilities only approximately as desired. Sampford sampling, introduced by Sampford in 1967, gives the desired inclusion probabilities but it may take time to generate a sample. Using probability functions and Laplace approximations, we show that from a probabilistic point of view these two designs are very close to each other and asymptotically identical. A Sampford sample can rapidly be generated in all situations by letting a Pareto sample pass an acceptance–rejection filter. A new very efficient method to generate conditional Poisson (CP) samples appears as a byproduct. Further, it is shown how the inclusion probabilities of all orders for the Pareto design can be calculated from those of the CP design. A new explicit very accurate approximation of the second-order inclusion probabilities, valid for several designs, is presented and applied to get single sum type variance estimates of the Horvitz–Thompson estimator.
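
As an illustration of why Pareto sampling is described as simple, the following is a minimal sketch of the standard selection rule: each unit gets a uniform random number, an odds-ratio ranking variable is formed, and the n units with the smallest ranking values are kept. The function name `pareto_sample` and its arguments are illustrative assumptions, not taken from the paper, and the target probabilities are assumed to lie strictly between 0 and 1 and to sum to n.

```python
import random

def pareto_sample(lam, n, rng=random):
    """Draw a fixed-size sample of n indices by Pareto sampling.

    lam: target inclusion probabilities, each strictly in (0, 1),
         assumed to sum to n (hypothetical input convention).
    """
    ranked = []
    for i, l in enumerate(lam):
        u = rng.random()  # uniform on [0, 1)
        # Ranking variable: odds of u divided by odds of the target probability.
        q = (u / (1.0 - u)) / (l / (1.0 - l))
        ranked.append((q, i))
    ranked.sort()
    # Keep the n units with the smallest ranking variables.
    return sorted(i for _, i in ranked[:n])

if __name__ == "__main__":
    random.seed(1)
    print(pareto_sample([0.2, 0.5, 0.8, 0.5], 2))
```

The achieved inclusion probabilities of this rule only approximate the targets in `lam`, which is the drawback the abstract refers to.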

Place, publisher, year, edition, pages
Wiley InterScience, 2006
Keyword
acceptance–rejection, conditional Poisson sampling, Horvitz–Thompson estimator, inclusion probabilities, Laplace approximation, Pareto sampling, πps sample, Sampford sampling, variance estimation
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:umu:diva-7839 (URN)
10.1111/j.1467-9469.2006.00497.x (DOI)
Available from: 2008-01-13 Created: 2008-01-13 Last updated: 2017-12-14 Bibliographically approved
2. On sampling with desired inclusion probabilities of first and second order
2005 (English) Report (Other academic)
Abstract [en]

We present a new simple approximation of target probabilities p_i for conditional Poisson sampling to obtain given inclusion probabilities. This approximation is based on the fact that the Sampford design gives inclusion probabilities as desired. Some alternative routines to calculate exact p_i-values are presented and compared numerically. Further, we derive two methods for achieving prescribed second-order inclusion probabilities. First we use a probability function belonging to the exponential family. The parameters of this probability function are determined by using an iterative proportional fitting algorithm. Then we modify the conditional Poisson probability function with an additional quadratic factor.

Place, publisher, year, edition, pages
Umeå: Umeå universitet, 2005. 22 p.
Series
Research report in mathematical statistics, ISSN 1653-0829 ; 2005:03
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:umu:diva-8385 (URN)
Distributor:
Institutionen för matematik och matematisk statistik, 90187, Umeå
Available from: 2008-01-20 Created: 2008-01-20 Last updated: 2016-03-07 Bibliographically approved
3. On the distance between some πps sampling designs
2007 (English) In: Acta Applicandae Mathematicae - An International Survey Journal on Applying Mathematics and Mathematical Applications, ISSN 0167-8019, E-ISSN 1572-9036, Vol. 97, no 1-3, 79-97 p. Article in journal (Refereed) Published
Abstract [en]

Asymptotic distances between probability distributions appearing in πps sampling theory are studied. The distributions are Poisson, Conditional Poisson (CP), Sampford, Pareto, Adjusted CP and Adjusted Pareto sampling. We start with the Kullback-Leibler divergence and the Hellinger distance and derive a simpler distance measure using a Taylor expansion of order two. This measure is evaluated first theoretically and then numerically, using small populations. The numerical examples are also illustrated using a multidimensional scaling technique called principal coordinate analysis (PCO). It turns out that Adjusted CP, Sampford, and Adjusted Pareto are quite close to each other. Pareto is a bit further away from these, then comes CP, and finally Poisson, which is rather far from all the others.
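
For readers unfamiliar with the two starting measures, here is a small sketch of the Kullback-Leibler divergence and the Hellinger distance for two discrete distributions on the same support, such as two designs over the same list of possible samples. The function names are illustrative, the Hellinger normalization (values in [0, 1]) is one common convention and may differ from the one used in the paper, and the simplified second-order measure derived in the paper is not reproduced here.

```python
from math import log, sqrt

def kl_divergence(p, q):
    """Kullback-Leibler divergence sum_i p_i * log(p_i / q_i).

    Terms with p_i == 0 contribute zero; q_i must be positive wherever p_i is.
    """
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hellinger_distance(p, q):
    """Hellinger distance sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))**2)."""
    return sqrt(0.5 * sum((sqrt(pi) - sqrt(qi)) ** 2 for pi, qi in zip(p, q)))

if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.25, 0.75]
    print(kl_divergence(p, q), hellinger_distance(p, q))
```

Note that the Kullback-Leibler divergence is not symmetric in its arguments, whereas the Hellinger distance is.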

Place, publisher, year, edition, pages
Dordrecht: Reidel, 2007
Keyword
Asymptotic distance, conditional Poisson sampling, Hellinger distance, inclusion probabilities, Kullback-Leibler divergence, Pareto sampling, principal coordinate analysis, Sampford sampling, target probabilities
Identifiers
urn:nbn:se:umu:diva-19696 (URN)
10.1007/s10440-007-9134-x (DOI)
Available from: 2009-03-10 Created: 2009-03-10 Last updated: 2017-12-13 Bibliographically approved
4. Balanced unequal probability sampling with maximum entropy
(English) Manuscript (Other academic)
Abstract [en]

This paper investigates how to perform balanced unequal probability sampling with maximum entropy. Focus is on balancing conditions having the form of known marginal sums in a cross-stratification table. Since only marginal sums are fixed, the sample sizes for one or more cells in the table are random. The probability distribution for those sample sizes can be expressed explicitly but there are computational difficulties except for very small cases. Markov Chain Monte Carlo methods are proposed for obtaining good distribution approximations, as well as sample selection. Some large-sample Gaussian approximations are also considered. Iterative procedures for obtaining sampling probabilities yielding specified inclusion probabilities are discussed.

National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:umu:diva-22457 (URN)
Available from: 2009-05-11 Created: 2009-05-11 Last updated: 2016-03-07 Bibliographically approved
5. A note on choosing sampling probabilities for conditional Poisson sampling
(English) Manuscript (Other academic)
Abstract [en]

For conditional Poisson sampling the sampling probabilities and the achieved inclusion probabilities are not identical, which is a problem. We present a general method for choosing the sampling probabilities, which uses only the desired inclusion probabilities and transformations of them. We compare the performance of this new method to other reasonable choices of sampling probabilities.
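
To make the mismatch concrete, the sketch below computes the achieved inclusion probabilities of a conditional Poisson design by brute-force enumeration, using the fact that a sample of size n has probability proportional to the product of the odds p_i / (1 - p_i) of its units. It also includes a simple rejection sampler that repeats Poisson sampling until the size equals n. The function names are illustrative, enumeration is only feasible for very small populations, and this is not the selection method proposed in the paper.

```python
import random
from itertools import combinations
from math import prod

def cp_inclusion_probabilities(p, n):
    """Exact first-order inclusion probabilities of conditional Poisson sampling.

    p: sampling probabilities, each strictly in (0, 1); n: fixed sample size.
    Enumerates all size-n subsets, so use only for small populations.
    """
    odds = [pi / (1.0 - pi) for pi in p]
    weights = {s: prod(odds[i] for i in s)
               for s in combinations(range(len(p)), n)}
    total = sum(weights.values())
    return [sum(w for s, w in weights.items() if i in s) / total
            for i in range(len(p))]

def cp_sample(p, n, rng=random):
    """Draw one conditional Poisson sample by rejection: repeat independent
    Bernoulli(p_i) draws until exactly n units are selected."""
    while True:
        s = [i for i, pi in enumerate(p) if rng.random() < pi]
        if len(s) == n:
            return s

if __name__ == "__main__":
    p = [0.2, 0.5, 0.8, 0.5]
    print(cp_inclusion_probabilities(p, 2))
```

With the sampling probabilities 0.2, 0.5, 0.8, 0.5 and n = 2, the achieved inclusion probability of the first unit is 1.5/10.5, roughly 0.143 rather than 0.2, which is the kind of discrepancy the abstract describes.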

National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:umu:diva-22458 (URN)
Available from: 2009-05-11 Created: 2009-05-11 Last updated: 2016-03-07 Bibliographically approved

By author/editor
Lundquist, Anders