
Applied Statistics (2008) 57, Part 1, pp. 75–87.

Variance estimation in complex survey sampling for generalized linear models.

Natarajan S, Lipsitz SR, Fitzmaurice G, Moore CG, and Gonin R.

New York University School of Medicine and VA New York Harbor Healthcare System, New York, USA.

Significance:

Complex survey sampling draws a fraction of a larger population. Population-based parameter and variance estimates are then obtained by incorporating both the sampling design and each unit's probability of selection into the analysis.

This paper focuses on non-standard generalized linear models for complex survey data using a simple two-step method to obtain consistent regression parameter and variance estimates.

The method is significant because it can be implemented within any standard sample survey package and is applicable to complex sample surveys with any number of stages.

Summary:

Complex survey sampling is often used to sample a fraction of a large finite population.

In general, the survey is conducted so that each unit (e.g. subject) in the sample has a different probability of being selected into the sample. For generalizability of the sample to the population, both the design and the probability of being selected into the sample must be incorporated in the analysis.
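As a rough illustration of why selection probabilities matter, the sketch below weights each sampled unit by the inverse of its selection probability when estimating a population mean. The function name and the data are hypothetical, and the sketch covers only the weighting; it does not account for strata or clusters, which a full design-based analysis would also need for variance estimation.

```python
def weighted_mean(values, selection_probs):
    """Estimate a population mean from a sample with unequal selection probabilities.

    Each unit is weighted by 1 / (its selection probability), so a unit that
    was unlikely to be sampled stands in for more of the population.
    """
    if len(values) != len(selection_probs):
        raise ValueError("values and selection_probs must have the same length")
    weights = [1.0 / p for p in selection_probs]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


# Hypothetical sample: the third subject had only a 10% chance of selection.
expenditures = [100.0, 200.0, 1000.0]
probs = [0.5, 0.5, 0.1]

estimate = weighted_mean(expenditures, probs)
unweighted = sum(expenditures) / len(expenditures)
```

Here the weighted estimate is pulled toward the rarely sampled subject, whereas the unweighted mean treats all three subjects as equally representative.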

In this paper we focus on non-standard regression models for complex survey data. In our motivating example, which is based on data from the Medical Expenditure Panel Survey, the outcome variable is the subject’s ‘total health care expenditures in the year 2002’.

Previous analyses of medical cost data suggest that the variance is approximately equal to the mean raised to the power of 1.5, which is a non-standard variance function. Currently, the regression parameters for this model cannot be easily estimated in standard statistical software packages.
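To make the non-standard variance function concrete, the following minimal sketch fits a regression in which the variance is proportional to the mean raised to the power 1.5, solving the survey-weighted quasi-likelihood estimating equations by Fisher scoring. It is not the two-step method of the paper: the log link, the single covariate, the function name, and the data are all assumptions made for illustration, and the sketch returns point estimates only, with no design-based variance estimates.

```python
import math


def fit_power_variance_glm(x, y, weights, power=1.5, n_iter=25):
    """Fit E[y] = exp(b0 + b1 * x) with Var(y) proportional to mean ** power.

    Solves the weighted quasi-likelihood estimating equations by Fisher scoring.
    The log link is an assumption of this sketch.
    """
    # Start from an intercept-only fit: log of the weighted mean, slope zero.
    total_w = sum(weights)
    b0 = math.log(sum(w * yi for w, yi in zip(weights, y)) / total_w)
    b1 = 0.0
    for _ in range(n_iter):
        u0 = u1 = 0.0          # score vector
        i00 = i01 = i11 = 0.0  # expected information matrix (symmetric 2x2)
        for xi, yi, w in zip(x, y, weights):
            mu = math.exp(b0 + b1 * xi)
            # With a log link, d(mu)/d(eta) = mu, so each score term is
            # w * x * mu * (y - mu) / mu ** power.
            s = w * (yi - mu) * mu ** (1.0 - power)
            k = w * mu ** (2.0 - power)
            u0 += s
            u1 += s * xi
            i00 += k
            i01 += k * xi
            i11 += k * xi * xi
        det = i00 * i11 - i01 * i01
        # Fisher scoring update: beta <- beta + I^{-1} U (2x2 inverse written out).
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1


# Hypothetical noise-free data generated with b0 = 0.5 and b1 = 0.3,
# with made-up sampling weights.
x = [0.0, 1.0, 2.0, 3.0]
weights = [10.0, 5.0, 5.0, 20.0]
y = [math.exp(0.5 + 0.3 * xi) for xi in x]

b0, b1 = fit_power_variance_glm(x, y, weights)
```

Because the example data contain no noise, the fit should return values very close to the generating coefficients; with real expenditure data the weights would come from the survey design.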

We propose a simple two-step method to obtain consistent regression parameter and variance estimates; the method proposed can be implemented within any standard sample survey package.

The approach is applicable to complex sample surveys with any number of stages.