Nonparametric Conditional Density Estimation in a High-Dimensional Regression Setting.

Abstract

In some applications (e.g., in cosmology and economics), the regression E[Z|x] is not adequate to represent the association between a predictor x and a response Z because of multi-modality and asymmetry of f(z|x); using the full density instead of a single-point estimate can then lead to less bias in subsequent analysis. As of now, there are no effective ways of estimating f(z|x) when x represents high-dimensional, complex data. In this article, we propose a new nonparametric estimator of f(z|x) that adapts to sparse (low-dimensional) structure in x. By directly expanding f(z|x) in the eigenfunctions of a kernel-based operator, we avoid tensor products in high dimensions as well as ratios of estimated densities. Our basis functions are orthogonal with respect to the underlying data distribution, allowing fast implementation and tuning of parameters. We derive rates of convergence and show that the method adapts to the intrinsic dimension of the data. We also demonstrate the effectiveness of the series method on images, spectra, and an application to photometric redshift estimation of galaxies.

Publication
Journal of Computational and Graphical Statistics