
GENERAL INEQUALITIES FOR GIBBS POSTERIOR WITH NONADDITIVE EMPIRICAL RISK

Published online by Cambridge University Press: 16 April 2014

Cheng Li*
Affiliation:
Northwestern University
Wenxin Jiang
Affiliation:
Northwestern University
Martin A. Tanner
Affiliation:
Northwestern University
*Address correspondence to Cheng Li, Department of Statistics, Northwestern University, 2006 Sheridan Road, Evanston, IL 60208, USA; e-mail: chengli2014@u.northwestern.edu.

Abstract

The Gibbs posterior is a useful tool for risk minimization: it adopts a Bayesian framework and can incorporate convenient computational algorithms such as Markov chain Monte Carlo. We derive risk bounds for the Gibbs posterior from general nonasymptotic inequalities, which can be used to obtain nearly optimal convergence rates and to select models that optimally balance approximation error against stochastic error. These inequalities are formulated in a general way that does not require the empirical risk to be the usual sample average over independent observations. We apply this framework to study the convergence rate of the generalized method of moments (GMM) risk and to derive an oracle inequality for the ranking risk, where models are selected based on the Gibbs posterior with a nonadditive empirical risk.
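
To make the abstract's central object concrete, the following is a minimal sketch rather than the authors' method. It assumes the commonly used form of the Gibbs posterior, with density proportional to exp(-lam * R_n(theta)) times a prior, and uses a pairwise ranking risk as the nonadditive empirical risk. The linear scoring rule, the Gaussian prior, the random-walk Metropolis sampler, and all names (`ranking_risk`, `log_gibbs_density`, `metropolis`, `lam`, `prior_sd`, `step`) are illustrative assumptions; the paper's exact scaling of the risk is not shown on this page.

```python
import math
import random


def ranking_risk(theta, xs, ys):
    """Fraction of pairs (i, j) that the linear score theta . x misorders.

    The sum runs over pairs of observations, so this risk is a U-statistic
    and not a plain sample average over independent observations.
    Score ties on pairs with different responses are counted as errors.
    """
    n = len(xs)
    scores = [sum(t * v for t, v in zip(theta, x)) for x in xs]
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            dy = ys[i] - ys[j]
            ds = scores[i] - scores[j]
            if dy != 0 and dy * ds <= 0:
                discordant += 1
    return discordant / (n * (n - 1) / 2)


def log_gibbs_density(theta, xs, ys, lam, prior_sd):
    """Unnormalized log density: -lam * R_n(theta) + log of a N(0, prior_sd^2 I) prior."""
    log_prior = -sum(t * t for t in theta) / (2.0 * prior_sd ** 2)
    return -lam * ranking_risk(theta, xs, ys) + log_prior


def metropolis(xs, ys, lam, prior_sd=1.0, step=0.3, n_iter=2000, seed=0):
    """Random-walk Metropolis draws targeting the assumed Gibbs posterior."""
    rng = random.Random(seed)
    theta = [0.0] * len(xs[0])
    current = log_gibbs_density(theta, xs, ys, lam, prior_sd)
    draws = []
    for _ in range(n_iter):
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        candidate = log_gibbs_density(proposal, xs, ys, lam, prior_sd)
        # Accept with probability min(1, exp(candidate - current)).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        draws.append(list(theta))
    return draws
```

Calling `metropolis(xs, ys, lam=50.0)` on a list of covariate tuples `xs` and responses `ys` returns a list of parameter draws. A larger `lam` concentrates the draws on parameters with small ranking risk, while a smaller `lam` keeps them closer to the prior.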

Type
ARTICLES
Copyright
Copyright © Cambridge University Press 2014 

