The weighted dependence measures (`wdm`) API

This notebook demonstrates how to compute dependence measures with the `wdm` function in pyvinecopulib, both with and without observation weights.

Available Dependence Measures

The wdm function supports several dependence measures:

- Pearson correlation ("pearson", "prho", "cor")
- Spearman’s ρ ("spearman", "srho", "rho")
- Kendall’s τ ("kendall", "ktau", "tau")
- Blomqvist’s β ("blomqvist", "bbeta", "beta")
- Hoeffding’s D ("hoeffding", "hoeffd", "d")
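To make two of the less familiar measures concrete, here is a minimal pure-Python sketch of the textbook definitions of Kendall’s τ (the tau-a variant, assuming no ties) and Blomqvist’s β. The helper names `kendall_tau_a` and `blomqvist_beta` are hypothetical and are not part of pyvinecopulib; the library's own implementation is not shown here and may treat ties and weights differently.

```python
from itertools import combinations
from statistics import median


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total pairs; assumes no ties."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def blomqvist_beta(x, y):
    """Share of points in concordant quadrants around the medians,
    minus the share in discordant quadrants."""
    mx, my = median(x), median(y)
    signs = [(xi - mx) * (yi - my) for xi, yi in zip(x, y)]
    same = sum(s > 0 for s in signs)
    opposite = sum(s < 0 for s in signs)
    return (same - opposite) / len(x)


# Small illustrative sample: mostly increasing, with neighbours swapped.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

tau = kendall_tau_a(x, y)    # 12 concordant, 3 discordant, 15 pairs
beta = blomqvist_beta(x, y)  # 4 points in concordant quadrants, 2 in discordant
print(f"tau = {tau:.4f}, beta = {beta:.4f}")
```

Both measures depend only on the ordering of the observations, not on their magnitudes, which is why they are less sensitive to outliers than Pearson correlation.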

[1]:

import numpy as np
import matplotlib.pyplot as plt
import pyvinecopulib as pv

# Set random seed for reproducibility
np.random.seed(42)

1. Basic Usage: Unweighted Dependence Measures

Let’s start with some example data and compute various dependence measures.

[2]:

# Generate correlated data
n = 200
x = np.random.normal(0, 1, n)
y = 0.7 * x + np.random.normal(0, 0.5, n)

# Compute various dependence measures
measures = {
  "Pearson": pv.wdm(x, y, "pearson"),
  "Spearman": pv.wdm(x, y, "spearman"),
  "Kendall": pv.wdm(x, y, "kendall"),
  "Blomqvist": pv.wdm(x, y, "blomqvist"),
  "Hoeffding": pv.wdm(x, y, "hoeffding"),
}

print("Unweighted Dependence Measures:")
print("=" * 35)
for name, value in measures.items():
  print(f"{name:12}: {value:.4f}")

Unweighted Dependence Measures:
===================================
Pearson     : 0.8180
Spearman    : 0.8057
Kendall     : 0.6097
Blomqvist   : 0.6000
Hoeffding   : 0.2696

[3]:

# Visualize the data
plt.figure(figsize=(8, 6))
plt.scatter(x, y, alpha=0.6, s=30)
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Scatter Plot of Generated Data")
plt.grid(True, alpha=0.3)
plt.show()

[Figure: scatter plot of the generated data, X against Y.]

2. Weighted Dependence Measures

Now let’s explore how weights affect the dependence measures. We’ll use different weighting schemes to demonstrate the functionality.

[4]:

# Example 1: Uniform weights (should give the same result as unweighted)
weights_uniform = np.ones(n)

print("Uniform Weights vs Unweighted:")
print("=" * 35)
for method in ["pearson", "spearman", "kendall"]:
  unweighted = pv.wdm(x, y, method)
  weighted = pv.wdm(x, y, method, weights_uniform)
  print(f"{method.capitalize():12}: {unweighted:.6f} vs {weighted:.6f}")

Uniform Weights vs Unweighted:
===================================
Pearson     : 0.818019 vs 0.818019
Spearman    : 0.805694 vs 0.805694
Kendall     : 0.609749 vs 0.609749

[5]:

# Example 2: Linear weights (more weight on later observations)
weights_linear = np.linspace(0.1, 2.0, n)

print("\nLinear Weights (0.1 to 2.0):")
print("=" * 35)
for method in ["pearson", "spearman", "kendall"]:
  unweighted = pv.wdm(x, y, method)
  weighted = pv.wdm(x, y, method, weights_linear)
  print(f"{method.capitalize():12}: {unweighted:.4f} -> {weighted:.4f}")


Linear Weights (0.1 to 2.0):
===================================
Pearson     : 0.8180 -> 0.8138
Spearman    : 0.8057 -> 0.7929
Kendall     : 0.6097 -> 0.5965

[6]:

# Example 3: Half-zero weights (focus on second half of data)
weights_half_zero = np.zeros(n)
weights_half_zero[n // 2 :] = 1.0

print("\nHalf-Zero Weights (second half only):")
print("=" * 45)
for method in ["pearson", "spearman", "kendall"]:
  weighted_result = pv.wdm(x, y, method, weights_half_zero)
  second_half_result = pv.wdm(x[n // 2 :], y[n // 2 :], method)
  print(
    f"{method.capitalize():12}: {weighted_result:.6f} (weighted) vs {second_half_result:.6f} (second half)"
  )
  print(
    f"              Difference: {abs(weighted_result - second_half_result):.2e}"
  )


Half-Zero Weights (second half only):
=============================================
Pearson     : 0.830812 (weighted) vs 0.830812 (second half)
              Difference: 0.00e+00
Spearman    : 0.809241 (weighted) vs 0.809241 (second half)
              Difference: 0.00e+00
Kendall     : 0.615354 (weighted) vs 0.615354 (second half)
              Difference: 0.00e+00
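
The uniform-weight and half-zero results above follow directly from how weights enter the formulas. As an illustration for the Pearson case only, here is a minimal sketch of the textbook weighted Pearson correlation in pure Python. The function `weighted_pearson` is a hypothetical helper, not pyvinecopulib's implementation, and the rank-based measures need additional weighted-rank logic that is not shown.

```python
import math


def weighted_pearson(x, y, w=None):
    """Textbook weighted Pearson correlation (illustrative only)."""
    if w is None:
        w = [1.0] * len(x)  # no weights: every observation counts equally
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)


x = [0.5, -1.2, 0.3, 2.1, -0.7, 1.4]
y = [0.9, -0.8, -0.1, 1.6, -0.2, 0.7]

plain = weighted_pearson(x, y)
uniform = weighted_pearson(x, y, [1.0] * 6)   # constant weights
scaled = weighted_pearson(x, y, [3.0] * 6)    # any constant gives the same value
second_half = weighted_pearson(x[3:], y[3:])  # drop the first three points
half_zero = weighted_pearson(x, y, [0, 0, 0, 1, 1, 1])  # zero them out instead

print(f"{plain:.6f} {uniform:.6f} {scaled:.6f}")
print(f"{second_half:.6f} {half_zero:.6f}")
```

Because the weights appear in both the numerator and the denominator, only their relative sizes matter: rescaling all weights by a constant leaves the result unchanged, and a zero weight removes an observation entirely.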

3. Comparison of Dependence Measures

Let’s compare how different dependence measures behave with various types of relationships.

[7]:

# Generate different types of relationships
n_comp = 150
x_base = np.random.uniform(-2, 2, n_comp)

relationships = {
  "Linear": 2 * x_base + np.random.normal(0, 0.5, n_comp),
  "Quadratic": x_base**2 + np.random.normal(0, 0.5, n_comp),
  "Exponential": np.exp(x_base / 2) + np.random.normal(0, 0.5, n_comp),
  "Sine": np.sin(2 * x_base) + np.random.normal(0, 0.3, n_comp),
}

print("Dependence Measures for Different Relationships:")
print("=" * 55)
print(
  f"{'Relationship':<12} {'Pearson':<8} {'Spearman':<8} {'Kendall':<8} {'Hoeffding':<10}"
)
print("-" * 55)

for name, y_rel in relationships.items():
  pearson = pv.wdm(x_base, y_rel, "pearson")
  spearman = pv.wdm(x_base, y_rel, "spearman")
  kendall = pv.wdm(x_base, y_rel, "kendall")
  hoeffding = pv.wdm(x_base, y_rel, "hoeffding")

  print(
    f"{name:<12} {pearson:<8.3f} {spearman:<8.3f} {kendall:<8.3f} {hoeffding:<10.6f}"
  )

Dependence Measures for Different Relationships:
=======================================================
Relationship Pearson  Spearman Kendall  Hoeffding
-------------------------------------------------------
Linear       0.975    0.973    0.860    0.687903
Quadratic    -0.046   -0.149   -0.118   0.125984
Exponential  0.780    0.743    0.558    0.232443
Sine         0.268    0.263    0.157    0.085145
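
The Quadratic and Exponential rows are easier to interpret with noise-free versions of the same relationships. The sketch below uses pure Python and hypothetical helpers (`pearson`, `ranks`, `spearman`) rather than `pv.wdm`, and assumes there are no ties. It shows that a symmetric quadratic has essentially zero Pearson correlation despite perfect functional dependence, and that a monotone transformation leaves a rank-based measure at 1 while Pearson drops below 1.

```python
import math


def pearson(x, y):
    """Plain (unweighted) Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def ranks(values):
    """Ranks 1..n of the values; assumes no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Noise-free grid on [-2, 2], symmetric around zero.
x = [i / 10 for i in range(-20, 21)]
y_quad = [xi**2 for xi in x]
y_exp = [math.exp(xi / 2) for xi in x]

r_quad = pearson(x, y_quad)    # close to 0: the two halves cancel
r_exp = pearson(x, y_exp)      # below 1: the relationship is not linear
rho_exp = spearman(x, y_exp)   # 1: the relationship is monotone
print(f"{r_quad:.4f} {r_exp:.4f} {rho_exp:.4f}")
```

This is consistent with the table above, where only Hoeffding’s D stays clearly away from zero for the quadratic relationship.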

[8]:

# Visualize the different relationships
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
axes = axes.ravel()

for i, (name, y_rel) in enumerate(relationships.items()):
  axes[i].scatter(x_base, y_rel, alpha=0.6, s=30)
  axes[i].set_xlabel("X")
  axes[i].set_ylabel("Y")
  axes[i].set_title(f"{name} Relationship")
  axes[i].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

[Figure: 2×2 grid of scatter plots (X against Y) for the Linear, Quadratic, Exponential, and Sine relationships.]