ESAIM: Probability and Statistics

Research Article

Using auxiliary information in statistical function estimation

Tarima, Sergey (a1) and Pavlov, Dmitri (a2)

a1 Division of Biostatistics, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin, 53226, USA

a2 Clinical Biostatistics, Pfizer Inc., 50 Pequot Avenue, New London, Connecticut, 06320, USA


In many practical situations sample sizes are not sufficiently large, and estimators based on such samples may have unsatisfactorily large variances. At the same time, auxiliary information about the parameters of interest is often available. This paper considers a method that uses auxiliary information to improve the properties of estimators based on the current sample only. In particular, the information is assumed to be available as a number of estimates based on samples obtained from other, mutually independent data sources. The method exploits the correlation between estimators based on the current sample and the auxiliary information from other sources. If the variance-covariance matrices of the vectors of estimators used in the estimating procedure are known, the method produces estimates that are more efficient, in terms of their variances, than estimates based on the current sample only. If these variance-covariance matrices are not known, consistent estimates of them can be used instead, and the large-sample properties of the method remain unchanged. This approach makes it possible to improve the statistical properties of many standard estimators, such as the empirical cumulative distribution function, the empirical characteristic function, and the Nelson-Aalen cumulative hazard estimator.
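The variance reduction described above can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes one common way of exploiting such a correlation, a linear correction of the form theta_hat - K12 K22^{-1} delta_hat, where delta_hat is the difference between a current-sample estimate and an external estimate of the same auxiliary quantity, and K12 and K22 are treated as known. The function names, the bivariate normal setup, and the values of n, m, and rho are all hypothetical choices made for illustration.

```python
import numpy as np


def combine_with_auxiliary(theta_hat, delta_hat, k12, k22):
    """Linear correction theta_hat - K12 K22^{-1} delta_hat (assumed form).

    theta_hat : current-sample estimate(s) of the parameter of interest
    delta_hat : current-sample estimate minus external estimate of the
                auxiliary quantity (mean zero when both target the same value)
    k12       : Cov(theta_hat, delta_hat), assumed known
    k22       : Var(delta_hat), assumed known
    """
    theta_hat = np.atleast_1d(theta_hat).astype(float)
    delta_hat = np.atleast_1d(delta_hat).astype(float)
    k12 = np.atleast_2d(k12).astype(float)
    k22 = np.atleast_2d(k22).astype(float)
    return theta_hat - k12 @ np.linalg.solve(k22, delta_hat)


def simulate(n=50, m=500, rho=0.8, reps=2000, seed=0):
    """Compare the variance of the plain and corrected estimators of E[X].

    Hypothetical setup: (X, Y) is bivariate normal with unit variances and
    correlation rho; the current sample has n pairs, and an independent
    external source supplies the mean of m further observations of Y.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    k12 = rho / n             # Cov(x_bar, y_bar - y_ext_bar)
    k22 = 1.0 / n + 1.0 / m   # Var(y_bar - y_ext_bar), independent sources
    plain = np.empty(reps)
    improved = np.empty(reps)
    for r in range(reps):
        xy = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        y_ext = rng.normal(0.0, 1.0, size=m)
        theta_hat = xy[:, 0].mean()
        delta_hat = xy[:, 1].mean() - y_ext.mean()
        plain[r] = theta_hat
        improved[r] = combine_with_auxiliary(theta_hat, delta_hat, k12, k22)[0]
    return plain.var(), improved.var()


if __name__ == "__main__":
    plain_var, improved_var = simulate()
    print("variance, current sample only:", plain_var)
    print("variance, with auxiliary data:", improved_var)
```

Under these assumptions the corrected estimator has theoretical variance Var(theta_hat) - K12 K22^{-1} K21, which cannot exceed the variance of the current-sample estimator. When the matrices are unknown, consistent plug-in estimates could replace k12 and k22, in line with the large-sample statement in the abstract.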

(Received February 28 2004)

(Accepted July 22 2005)

(Online publication December 16 2005)

Key Words:

  • Auxiliary information;
  • multiple data sources;
  • partially grouped samples;
  • convergence rates.

Mathematics Subject Classification:

  • 62G05;
  • 62G20