Higher Order Spline Highly Adaptive Lasso Estimators of Functional Parameters: Pointwise Asymptotic Normality and Uniform Convergence Rates
We consider estimation of a functional of the data distribution based on i.i.d. observations. We assume the target function can be defined as the minimizer of the expectation of a loss function over a class of d-variate real-valued cadlag functions that have finite sectional variation norm. For all k=0,1,…, we define a k-th order smoothness class of functions as d-variate functions on the unit cube for which each of the sequentially defined k-th order Radon-Nikodym derivatives w.r.t. Lebesgue measure is cadlag and of bounded variation. For a target function in this k-th order smoothness class, we provide a representation of the target function as an infinite linear combination of tensor products of ≤ k-th order spline basis functions indexed by a knot-point, where the lower (than k) order spline basis functions are used to represent the function at the 0-edges. The L_1-norm of the coefficients represents the sum of the variation norms across all the k-th order derivatives, which is called the k-th order sectional variation norm of the target function. This generalizes the zero order spline representation of cadlag functions with bounded sectional variation norm to higher order smoothness classes. We use this k-th order spline representation of a function to define the k-th order spline sieve minimum loss estimator (MLE), the Highly Adaptive Lasso (HAL) MLE, and the Relax HAL-MLE. For first and higher order smoothness classes, we analyze these three classes of estimators and establish pointwise asymptotic normality and uniform convergence at the dimension-free rate n^{-k^*/(2k^*+1)}, up to a power of log n depending on the dimension, where k^*=k+1, assuming appropriate undersmoothing is used in selecting the L_1-norm. We also establish asymptotic linearity of plug-in estimators of pathwise differentiable features of the target function.
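As a rough one-dimensional illustration of the objects named in the abstract, the sketch below builds a k-th order spline basis indexed by knot-points, computes the L_1-norm of a coefficient vector, and evaluates the rate exponent k^*/(2k^*+1). It is not taken from the paper: the function names are hypothetical, and the truncated-power form (x - u)_+^k / k! (the k-fold integral of the indicator 1{x >= u}) is an assumption about how the k-th order spline is defined. The d-variate tensor products and the lower order terms at the 0-edges are omitted.

```python
import numpy as np
from math import factorial


def spline_basis(x, knots, k):
    """Hypothetical helper: 1-d k-th order spline design matrix.

    Column j is (x - u_j)_+^k / k! for knot u_j (assumed form).
    For k = 0 this reduces to the indicator 1{x >= u_j}.
    """
    x = np.asarray(x, dtype=float)[:, None]
    u = np.asarray(knots, dtype=float)[None, :]
    diff = x - u
    active = diff >= 0
    if k == 0:
        return active.astype(float)
    return np.where(active, diff, 0.0) ** k / factorial(k)


def l1_norm(coefficients):
    """L_1-norm of the spline coefficients (the quantity a lasso fit would bound)."""
    return float(np.abs(np.asarray(coefficients, dtype=float)).sum())


def rate_exponent(k):
    """Exponent k*/(2k*+1) with k* = k + 1, as stated in the abstract."""
    k_star = k + 1
    return k_star / (2 * k_star + 1)


if __name__ == "__main__":
    x = [0.0, 0.5, 1.0]
    knots = [0.0, 0.5]
    print(spline_basis(x, knots, 0))  # zero order: indicator columns
    print(spline_basis(x, knots, 1))  # first order: piecewise linear columns
    for k in range(3):
        print(k, rate_exponent(k))
```

Under these assumptions, k=0 gives the exponent 1/3 and k=1 gives 2/5, so the rate improves with the smoothness order and does not depend on the dimension d.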