Vinecop.select

Vinecop.select(self: pyvinecopulib.Vinecop, data: numpy.ndarray[numpy.float64[m, n]], controls: pyvinecopulib.FitControlsVinecop = FitControlsVinecop()) → None

select() behaves differently depending on the object’s current truncation level and the truncation level specified in the controls, called trunc_lvl and controls.trunc_lvl respectively in what follows. Essentially, controls.trunc_lvl defines the object’s truncation level after calling select():

  • If controls.trunc_lvl <= trunc_lvl, the families and parameters are selected for all pairs in trees up to and including level controls.trunc_lvl, using the current structure.

  • If controls.trunc_lvl > trunc_lvl, select() behaves as above for all trees up to and including level trunc_lvl, and then selects the structure for the higher trees along with their families and parameters. This includes the case trunc_lvl = 0, where the structure is fully unspecified.
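The two cases above can be summarized in a small sketch. The helper selection_plan is hypothetical and not part of pyvinecopulib; it only mirrors the rule described here, and it assumes trees are numbered from 1 and that controls_trunc_lvl has already been capped at the largest possible tree level.

```python
def selection_plan(trunc_lvl, controls_trunc_lvl):
    """Split tree levels by what select() would do with them.

    Hypothetical helper, not part of pyvinecopulib. Returns a pair
    (fixed, new): `fixed` lists the trees whose current structure is kept
    (only families and parameters are selected), `new` lists the trees
    whose structure is selected as well.
    """
    if controls_trunc_lvl <= trunc_lvl:
        # Case 1: reuse the current structure up to controls_trunc_lvl.
        return list(range(1, controls_trunc_lvl + 1)), []
    # Case 2: keep the existing trees, then select structure for higher ones.
    fixed = list(range(1, trunc_lvl + 1))
    new = list(range(trunc_lvl + 1, controls_trunc_lvl + 1))
    return fixed, new


# Current model has 3 trees, controls ask for 2: nothing new is structured.
plan_shrink = selection_plan(3, 2)
# Current model has 2 trees, controls ask for 4: trees 3 and 4 are new.
plan_grow = selection_plan(2, 4)
# Fully unspecified structure (trunc_lvl = 0): every tree is new.
plan_empty = selection_plan(0, 3)
```

In every case the last tree listed is controls_trunc_lvl, which matches the statement that controls.trunc_lvl defines the truncation level after the call.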

Selection of the structure is performed using the algorithm of Dissmann, J. F., E. C. Brechmann, C. Czado, and D. Kurowicka (2013). Selecting and estimating regular vine copulae and application to financial returns. Computational Statistics & Data Analysis, 59 (1), 52-69. The dependence measure used to select trees (default: Kendall’s tau) is corrected for ties (see the wdm library).

If the controls object has been instantiated with select_families = False, then the method simply updates the parameters of the pair-copulas without selecting the families or the structure. This is equivalent to calling fit() for each pair-copula, albeit potentially in parallel if num_threads > 1.

When at least one variable is discrete, two types of “observations” are required: the first \(n \times d\) block contains realizations of \(F_Y(Y), F_X(X), \dots\); the second \(n \times d\) block contains realizations of \(F_Y(Y^-), F_X(X^-), \dots\). The minus indicates a left-sided limit of the cdf. For continuous variables, the left limit and the cdf itself coincide. For an integer-valued variable, for example, \(F_Y(Y^-) = F_Y(Y - 1)\). Columns of the second block that correspond to continuous variables can be omitted.
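As an illustration of this layout, the following sketch builds both accepted forms of the data for \(d = 2\) variables, one discrete and one continuous (so \(k = 1\)). The marginal distributions and the helper names cdf_y and cdf_x are assumptions chosen for the example: \(Y\) is taken to be uniform on {0, 1, 2, 3, 4} and \(X\) uniform on (0, 1). Plain Python lists are used to show the column layout only; select() itself expects a NumPy array.

```python
def cdf_y(y):
    """cdf of a discrete uniform variable on {0, 1, 2, 3, 4} (assumed margin)."""
    if y < 0:
        return 0.0
    return min(y + 1, 5) / 5


def cdf_x(x):
    """cdf of a Uniform(0, 1) variable (assumed margin)."""
    return min(max(x, 0.0), 1.0)


ys = [0, 3, 4, 1]             # integer-valued observations of Y
xs = [0.25, 0.5, 0.9, 0.1]    # continuous observations of X

# n x 2d layout: [F_Y(y), F_X(x), F_Y(y^-), F_X(x^-)].
# For the integer-valued Y, F_Y(y^-) = F_Y(y - 1); for the continuous X,
# the left limit equals the cdf itself.
full = [[cdf_y(y), cdf_x(x), cdf_y(y - 1), cdf_x(x)] for y, x in zip(ys, xs)]

# n x (d + k) layout: the redundant continuous column of the second block
# is dropped, leaving [F_Y(y), F_X(x), F_Y(y^-)].
compact = [row[:3] for row in full]
```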

If there are missing data (i.e., NaN entries), incomplete observations are discarded before fitting a pair-copula. This is done on a pair-by-pair basis so that the maximal available information is used.
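The difference between pair-by-pair deletion and dropping every incomplete row can be seen in a minimal sketch. The helper complete_rows is hypothetical and is not the library's implementation; it only covers pairs of original variables, whereas in higher trees the inputs are derived from earlier trees.

```python
import math


def complete_rows(data, i, j):
    """Keep (u_i, u_j) for the rows where neither column i nor j is NaN.

    Hypothetical helper illustrating pair-by-pair deletion.
    """
    return [
        (row[i], row[j])
        for row in data
        if not (math.isnan(row[i]) or math.isnan(row[j]))
    ]


nan = float("nan")
data = [
    [0.1, 0.4, 0.7],
    [0.2, nan, 0.5],
    [nan, 0.6, 0.3],
    [0.9, 0.8, 0.2],
]

pair_01 = complete_rows(data, 0, 1)  # uses rows 0 and 3
pair_02 = complete_rows(data, 0, 2)  # uses rows 0, 1 and 3

# Dropping every row with any NaN would leave only rows 0 and 3 for all pairs.
listwise = [row for row in data if not any(math.isnan(v) for v in row)]
```

Here the pair (0, 2) keeps three observations, while discarding incomplete rows up front would leave only two for every pair.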

Parameters:
  • data – An \(n \times (d + k)\) or \(n \times 2d\) matrix of observations, where \(k\) is the number of discrete variables.

  • controls – The controls to the algorithm (see FitControlsVinecop()).