Kde1d

class Kde1d(*args, **kwargs)

A class for univariate kernel density estimation.

The Kde1d class provides methods for univariate kernel density estimation using local polynomial fitting. It can handle data with bounded, unbounded, and discrete support.

The estimator uses a Gaussian kernel in all cases. A log-transform is used if there is only one boundary; a probit transform is used if there are two boundaries. Discrete variables are handled via jittering.
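The one-boundary case can be illustrated with a minimal NumPy sketch. It is not the library's implementation: it uses a plain (local constant) Gaussian kernel estimate rather than local polynomial fitting, and the helper names `gaussian_kde` and `log_transform_kde` as well as the rule-of-thumb bandwidth are assumptions made for illustration. The idea is to estimate the density of `log(x - xmin)` on the real line and map it back with a change of variables.

```python
import numpy as np

def gaussian_kde(points, data, bw):
    """Plain Gaussian kernel density estimate evaluated at `points`."""
    z = (points[:, None] - data[None, :]) / bw
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(data) * bw * np.sqrt(2 * np.pi))

def log_transform_kde(points, data, xmin=0.0):
    """Density estimate on (xmin, inf) via a log transform (hypothetical helper)."""
    y = np.log(data - xmin)                    # transformed data is unbounded
    bw = 1.06 * y.std() * len(y) ** (-1 / 5)   # rule-of-thumb bandwidth (assumption)
    shifted = points - xmin
    # Change of variables: f_X(x) = f_Y(log(x - xmin)) / (x - xmin)
    return gaussian_kde(np.log(shifted), y, bw) / shifted

rng = np.random.default_rng(0)
x = rng.gamma(2.0, size=500)
grid = np.linspace(0.01, 30.0, 3000)
dens = log_transform_kde(grid, x)

# Trapezoidal integral over the grid; should be close to 1
mass = np.sum((dens[1:] + dens[:-1]) / 2 * np.diff(grid))
```

Because the estimate is built on the log scale, it places no mass below `xmin`, which an untransformed Gaussian kernel estimate would do near the boundary.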

Zero-inflated densities are estimated with a hurdle model: a discrete point mass at 0, with the remaining mass estimated as for continuous data.
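A hurdle-type decomposition can be sketched in plain NumPy as follows. This is an illustration under stated assumptions, not the library's code: the helper `fit_hurdle`, the plain Gaussian kernel estimate, and the rule-of-thumb bandwidth are all hypothetical choices. The point mass at 0 is the share of exact zeros, and the continuous part is estimated from the non-zero observations and scaled by the remaining probability.

```python
import numpy as np

def gaussian_kde(points, data, bw):
    """Plain Gaussian kernel density estimate evaluated at `points`."""
    z = (points[:, None] - data[None, :]) / bw
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(data) * bw * np.sqrt(2 * np.pi))

def fit_hurdle(data):
    """Return (prob0, density_fn) for zero-inflated data (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    nonzero = data[data != 0]
    prob0 = 1.0 - len(nonzero) / len(data)                   # point mass at 0
    bw = 1.06 * nonzero.std() * len(nonzero) ** (-1 / 5)     # rule of thumb (assumption)

    def density(points):
        # Continuous part, scaled so that it carries mass 1 - prob0
        points = np.asarray(points, dtype=float)
        return (1.0 - prob0) * gaussian_kde(points, nonzero, bw)

    return prob0, density

rng = np.random.default_rng(1)
x = rng.normal(3.0, 1.0, 1000)
x[rng.random(1000) < 0.3] = 0.0          # inflate zeros with probability 0.3

prob0, density = fit_hurdle(x)
grid = np.linspace(-3.0, 9.0, 1201)
vals = density(grid)
cont_mass = np.sum((vals[1:] + vals[:-1]) / 2 * np.diff(grid))
```

Here `prob0 + cont_mass` should be close to 1, since the point mass and the continuous part together make up the full distribution.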

Examples

>>> import numpy as np
>>> import pyvinecopulib as pv
>>>
>>> # Unbounded data
>>> x = np.random.normal(0, 1, 500)
>>> fit = pv.Kde1d()
>>> fit.fit(x)
>>> pdf_vals = fit.pdf(np.array([0.0]))
>>> fit.plot(x)
>>>
>>> # Bounded data
>>> x = np.random.gamma(1, size=500)
>>> fit = pv.Kde1d(xmin=0.0, degree=1)
>>> fit.fit(x)
>>> fit.plot(x)
>>>
>>> # Discrete data
>>> x = np.random.binomial(5, 0.5, 500)
>>> fit = pv.Kde1d(xmin=0, xmax=5, type="discrete")
>>> fit.fit(x)
>>> fit.plot(x)

References

Geenens, G. (2014). Probit transformation for kernel density estimation on the unit interval. Journal of the American Statistical Association, 109(505), 346–358. [arXiv:1303.4121](https://arxiv.org/abs/1303.4121)

Geenens, G., & Wang, C. (2018). Local-likelihood transformation kernel density estimation for positive random variables. Journal of Computational and Graphical Statistics, 27(4), 822–835. [arXiv:1602.04862](https://arxiv.org/abs/1602.04862)

Loader, C. (2006). Local Regression and Likelihood. Springer Science & Business Media.

Nagler, T. (2018a). A generic approach to nonparametric function estimation with mixed data. Statistics & Probability Letters, 137, 326–330. [arXiv:1704.07457](https://arxiv.org/abs/1704.07457)

Nagler, T. (2018b). Asymptotic analysis of the jittering kernel density estimator. Mathematical Methods of Statistics, 27, 32–46. [arXiv:1705.05431](https://arxiv.org/abs/1705.05431)

Attributes

bandwidth

Bandwidth parameter.

degree

Degree of the local polynomial.

edf

Effective degrees of freedom.

grid_points

Grid points used for interpolation.

grid_size

Number of grid points for interpolation.

loglik

Log-likelihood of the fitted model.

multiplier

Bandwidth multiplier.

prob0

Point mass at 0 (for zero-inflated models).

type

Variable type, as a VarType enum value.

values

Density values at grid points.

xmax

Upper bound of the density support.

xmin

Lower bound of the density support.

Methods

__init__

Constructor for the Kde1d class.

cdf

Evaluate the cumulative distribution function.

fit

Fit the kernel density estimate to data.

from_grid

Create a Kde1d object from grid points and density values.

from_params

Create a Kde1d object from parameters.

pdf

Evaluate the probability density function.

plot

Generate a plot of the Kde1d object.

quantile

Evaluate the quantile function.

set_xmin_xmax

Set the boundary parameters.

simulate

Simulate data from the fitted density.