GGMP#

GGMP (Gaussian GP for Gaussian Mixture data) models distributional observations as Gaussian mixture models, fitting one GP per mixture component.

class fvgp.ggmp.GGMP(x_data, y_data, *, hps_obj, gp_kernel_functions=None, gp_mean_functions=None, likelihood_terms=5, gp_init_kwargs=None, gp_device_ids=None, gp_eval_parallel=False)[source]#

Gaussian GP for Gaussian Mixture data (GGMP).

GGMP models distributional data: each of the N input locations has an associated probability density (given as a (domain, density) pair) instead of a scalar observation. Internally it represents each density as a K-component Gaussian Mixture Model (GMM) and places one independent GP per component. Component k’s GP is trained on the vector of component-k means across all N stations, using the corresponding component-k variances as noise.

CAUTION: Beta release. The current implementation of GGMP is a proof-of-concept and may contain unknown bugs. It is recommended to test on synthetic data before applying to real datasets.

Typical workflow:

ggmp = GGMP(x_data, y_data, hps_obj=hps, likelihood_terms=K)
ggmp.initLikelihoods()
ggmp.initGPs()
ggmp.train(method="local", max_iter=200)
mean = ggmp.posterior_mean(x_pred)
var  = ggmp.posterior_variance(x_pred)
Parameters:
  • x_data (np.ndarray, shape (N, D)) – Spatial or feature-space locations of the N stations.

  • y_data (list of (domain, density) tuples, length N) – Each entry is a pair of 1-D arrays describing the empirical PDF at one station: domain is the evaluation grid and density is the (unnormalized) density values on that grid.

  • hps_obj (hyperparameters) – A hyperparameters instance holding K sets of GP hyperparameters (one per component) together with their optimization bounds.

  • gp_kernel_functions (list of callables or None, optional) – Per-component kernel functions passed to GP. Defaults to the fvGP built-in anisotropic Matérn kernel for each component.

  • gp_mean_functions (list of callables or None, optional) – Per-component prior mean functions. Defaults to constant_mean (a trainable constant) for each component.

  • likelihood_terms (int, optional) – Number of GMM components K. Must match len(hps_obj.hps). Default is 5.

  • gp_init_kwargs (dict, optional) – Extra keyword arguments forwarded to GP at construction time (e.g. ram_economy, compute_device).

  • gp_device_ids (list of int, str, or None, optional) – GPU device IDs for multi-GPU evaluation. Pass "auto" to use all detected GPUs. Ignored when no GPU backend is available.

  • gp_eval_parallel (bool, optional) – If True, evaluate the K component GPs in parallel using a thread pool. Default is False.

build_pairwise_data_generating_normals(idx_a, idx_b)[source]#

Construct the K 2D Gaussians for a pair of datapoints by pairing the component mean/variance entries at each index.

initLikelihoods(init_mean=None, init_std=None, weights=None)[source]#

Initialize the K NormalLikelihood objects (one per GMM component).

Computes per-station empirical means and variances from self.y_data and uses them to seed the component parameters. Must be called before initGPs().

Parameters:
  • init_mean (list of np.ndarray or None, optional) – Initial component mean vectors, each of length N (number of stations). If None, component means are spread around the per-station empirical mean with small offsets so components start distinguishable.

  • init_std (list of np.ndarray or None, optional) – Initial component standard-deviation vectors, each of length N. If None, defaults to the per-station empirical standard deviation for every component.

  • weights (np.ndarray of shape (K,) or None, optional) – Initial mixture weights. If None, uniform weights (1/K) are used.

Returns:

The initialized likelihood objects, also stored as self.likelihoods.

Return type:

list of NormalLikelihood

initGPs()[source]#

Construct one GP per GMM component and sync hyperparameters.

Each GP is trained on the vector of component-k means across all stations, with the component-k variances used as noise. The last hyperparameter is initialized to the empirical mean of the training data so each component starts with a sensible prior mean when using constant_mean.

Requires initLikelihoods() to have been called first.

After construction the internal hps_obj is updated to reflect any corrections applied to the initial hyperparameters (e.g. mean re-centering).

train(hyperparameter_bounds=None, init_hyperparameters=None, method='local', max_iter=120, train_weights=True, weight_method='density', weight_max_iter=200, weight_tol=1e-10, weight_floor=1e-09, y_samples=None, **kwargs)[source]#

Train the GGMP in two phases: GP hyperparameters, then mixture weights.

Phase 1 trains each component GP independently by maximizing its marginal log-likelihood.

Phase 2 (when train_weights=True) re-optimizes the K mixture weights using an EM algorithm on the distributional objective. Two methods are available:

  • "density" (default) — uses the stored y_data (domain/density pairs). Each component’s predictive density at training locations is evaluated and the EM objective maximizes the weighted log-likelihood of the observed PDFs.

  • "samples" — uses raw sample arrays supplied via y_samples. Each component’s predictive distribution is treated as a 1-D Gaussian and the EM objective maximizes the mixture log-likelihood over the samples.

Parameters:
  • hyperparameter_bounds (list of np.ndarray or None, optional) – List of K bound arrays, each of shape (d_k, 2). Defaults to hps_obj.hps_bounds.

  • init_hyperparameters (list of np.ndarray or None, optional) – List of K initial hyperparameter vectors. Defaults to each GP’s current hyperparameters.

  • method (str, optional) – Optimization method for Phase 1. Common choices are "local" (L-BFGS-B), "global", "hgdl", "mcmc". Default "local".

  • max_iter (int, optional) – Maximum iterations for Phase 1. Default 120.

  • train_weights (bool, optional) – If True, run Phase 2 weight optimization after GP training. Default True.

  • weight_method ({"density", "samples"}, optional) – EM objective for weight optimization. "density" uses self.y_data; "samples" uses y_samples. Default "density".

  • weight_max_iter (int, optional) – Maximum EM iterations for weight optimization. Default 200.

  • weight_tol (float, optional) – L1 convergence tolerance for weight EM. Default 1e-10.

  • weight_floor (float, optional) – Minimum weight per component to avoid collapse. Default 1e-9.

  • y_samples (list of np.ndarray or None, optional) – Required when weight_method="samples". List of N sample arrays, each of shape (T_n,) or (T_n, 1).

  • **kwargs – Additional keyword arguments forwarded to GP.train().

Returns:

Optimized hyperparameter vectors for each component, also stored in self.hps_obj.

Return type:

list of np.ndarray

posterior_mean(x_pred)[source]#

Compute the posterior mean of the GGMP mixture at prediction points.

The mixture posterior mean is the weight-averaged mean over all K component GPs:

\[\mu(x^*) = \sum_{k=1}^{K} w_k \, \mu_k(x^*)\]

where \(\mu_k\) is the posterior mean of component k’s GP and \(w_k\) is its mixture weight (from likelihoods).

Parameters:

x_pred (np.ndarray, shape (M, D)) – Prediction locations.

Returns:

Posterior mean of the GMM-GP mixture at each prediction point.

Return type:

np.ndarray, shape (M,)

posterior_variance(x_pred)[source]#

Compute the posterior variance of the GGMP mixture at prediction points.

Uses the law of total variance to combine the K component GPs.

Each component k has a predictive variance (paper Eq. 22)

\[v_k(x^*) = \nu_k(x^*) + \bar{s}^2_k\]

where \(\nu_k(x^*)\) is the GP posterior variance (uncertainty about the latent component mean) and \(\bar{s}^2_k = \tfrac{1}{N}\sum_n s^2_{nk}\) is the mean training within-component variance, used as the noise floor at unseen locations. The mixture variance is then

\[\text{Var}(x^*) = \underbrace{\sum_k w_k \, v_k(x^*)}_{\text{expected variance}} + \underbrace{\sum_k w_k \bigl(\mu_k(x^*) - \mu(x^*)\bigr)^2}_{\text{variance of means}}\]

where \(\mu_k\) is the GP posterior mean and \(w_k\) is the mixture weight for component k.

Parameters:

x_pred (np.ndarray, shape (M, D)) – Prediction locations.

Returns:

Posterior variance of the GMM-GP mixture at each prediction point.

Return type:

np.ndarray, shape (M,)

hyperparameters#

class fvgp.ggmp.hyperparameters(weights, weights_bounds, hps, hps_bounds)[source]#

Container for GGMP hyperparameters: mixture weights plus per-component GP parameters.

Stores K sets of GP hyperparameters (one per GMM component) together with K mixture weights, and provides helpers to flatten/unflatten them into a single vector suitable for optimizers.

Parameters:
  • weights (np.ndarray, shape (K,)) – Initial mixture weights.

  • weights_bounds (np.ndarray, shape (K, 2)) – Lower and upper bounds for each weight.

  • hps (list of np.ndarray) – List of K hyperparameter vectors, one per GMM component.

  • hps_bounds (list of np.ndarray) – List of K bound arrays (shape (d_k, 2)) corresponding to each hps[k].

set(weights, hps)[source]#

Update weights and per-component hyperparameters and refresh the flat vector.

vectorize_hps(weights, hps)[source]#

Flatten weights and per-component hyperparameters into a single 1-D array.

devectorize_hps(v)[source]#

Split a flat parameter vector back into (weights, list-of-hps-arrays).

vectorize_bounds(weights_bounds, hps_bounds)[source]#

Flatten weight bounds and per-component hyperparameter bounds into a single array.

devectorize_bounds(b)[source]#

Split a flat bounds array back into (weights_bounds, list-of-hps-bounds).

NormalLikelihood#

class fvgp.ggmp.NormalLikelihood(mean, variance, weight)[source]#

Diagonal Gaussian likelihood for one GMM component.

Stores the component mean vector and variance vector (both of length N, one entry per station), along with the component’s mixture weight. Used as the training signal for each per-component GP inside GGMP.

Parameters:
  • mean (np.ndarray, shape (N,)) – Component mean at each station.

  • variance (np.ndarray, shape (N,)) – Component variance at each station (used as GP observation noise).

  • weight (float) – Mixture weight for this component; should satisfy 0 < weight <= 1.

set_moments(mean, variance)[source]#

Update the component mean and variance vectors.

set_weight(weight)[source]#

Update the mixture weight for this component.

unravel()[source]#

Return mean and variance concatenated into a single 1-D array.

ravel(vec)[source]#

Split a concatenated array back into (mean, variance) of length self.dim each.