Demonstrating the robustness of population surveillance data: implications of error rates on demographic and mortality estimates
2008 (English). In: BMC Medical Research Methodology, ISSN 1471-2288, Vol. 8, Article no. 13. Article in journal (Refereed), Published
BACKGROUND: As in any measurement process, a certain amount of error may be expected in routine population surveillance operations such as those in demographic surveillance sites (DSSs). Vital events are likely to be missed and errors made no matter what method of data capture is used or what quality control procedures are in place. The extent to which random errors in large, longitudinal datasets affect overall health and demographic profiles has important implications for the role of DSSs as platforms for public health research and clinical trials. Such knowledge is also of particular importance if the outputs of DSSs are to be extrapolated and aggregated with realistic margins of error and validity.
METHODS: This study uses the first 10-year dataset from the Butajira Rural Health Project (BRHP) DSS, Ethiopia, covering approximately 336,000 person-years of data. Simple programmes were written to introduce random errors and omissions into new versions of the definitive 10-year Butajira dataset. Key parameters of sex, age, death, literacy and roof material (an indicator of poverty) were selected for the introduction of errors based on their obvious importance in demographic and health surveillance and their established significant associations with mortality. Defining the original 10-year dataset as the 'gold standard' for the purposes of this investigation, population, age and sex compositions and Poisson regression models of mortality rate ratios were compared between each of the intentionally erroneous datasets and the original 'gold standard' 10-year data.
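The error-introduction step described above can be illustrated with a minimal Python sketch. It is not the authors' programme: the data are synthetic (not the Butajira dataset), the field names `literate`, `person_years` and `died` and the helpers `inject_errors` and `mortality_rate_ratio` are hypothetical, and a crude rate ratio stands in for the Poisson regression models used in the study.

```python
import random


def inject_errors(records, field, values, rate, rng):
    """Return a copy of `records` in which roughly a fraction `rate` of the
    entries have `field` replaced by a different value drawn from `values`."""
    corrupted = []
    for rec in records:
        rec = dict(rec)  # copy, so the 'gold standard' records stay intact
        if rng.random() < rate:
            alternatives = [v for v in values if v != rec[field]]
            rec[field] = rng.choice(alternatives)
        corrupted.append(rec)
    return corrupted


def mortality_rate_ratio(records, field, exposed, reference):
    """Crude ratio of deaths per person-year between two levels of `field`."""
    def rate(level):
        deaths = sum(r["died"] for r in records if r[field] == level)
        pyears = sum(r["person_years"] for r in records if r[field] == level)
        return deaths / pyears
    return rate(exposed) / rate(reference)


rng = random.Random(0)

# Synthetic 'gold standard' population; all proportions and death
# probabilities here are illustrative assumptions, not study values.
gold = []
for _ in range(20_000):
    literate = rng.random() < 0.3
    p_death = 0.01 if literate else 0.02
    gold.append({
        "literate": literate,
        "person_years": 1.0,
        "died": int(rng.random() < p_death),
    })

# Introduce errors into about 20% of the literacy values, then compare
# the same estimate on the original and the erroneous dataset.
noisy = inject_errors(gold, "literate", [True, False], 0.20, rng)
gold_rr = mortality_rate_ratio(gold, "literate", False, True)
noisy_rr = mortality_rate_ratio(noisy, "literate", False, True)
print(f"gold-standard rate ratio: {gold_rr:.2f}")
print(f"rate ratio with errors:   {noisy_rr:.2f}")
```

Repeating the comparison over a range of error rates and fields gives a feel for how sensitive an estimate is to random errors. The numbers produced by this synthetic example say nothing about the Butajira results themselves.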
RESULTS: The composition of the Butajira population was well represented despite introducing random errors, and differences between population pyramids based on the derived datasets were subtle. Regression analyses of well-established mortality risk factors were largely unaffected even by relatively high levels of random errors in the data.
CONCLUSION: The low sensitivity of parameter estimates and regression analyses to significant amounts of randomly introduced errors indicates a high level of robustness of the dataset. This apparent inertia of population parameter estimates to simulated errors is largely due to the size of the dataset. Tolerable margins of random error in DSS data may exceed 20%. While this is not an argument in favour of poor quality data, reducing the time and valuable resources spent on detecting and correcting random errors in routine DSS operations may be justifiable as the returns from such procedures diminish with increasing overall accuracy. The money and effort currently spent on endlessly correcting DSS datasets would perhaps be better spent on increasing the surveillance population size and geographic spread of DSSs and analysing and disseminating research findings.
Place, publisher, year, edition, pages
BioMed Central, 2008. Vol. 8, Article no. 13.
Age Distribution, Analysis of Variance, Demography, Epidemiologic Methods, Ethiopia/epidemiology, Female, Humans, Male, Mortality, Poisson Distribution, Population Surveillance, Regression Analysis, Sex Distribution, Software
Public Health, Global Health, Social Medicine and Epidemiology
Identifiers: URN: urn:nbn:se:umu:diva-10416; DOI: 10.1186/1471-2288-8-13; ISI: 000254661800001; PubMedID: 18366742; OAI: oai:DiVA.org:umu-10416; DiVA: diva2:150087