API

GraphEM

class GraphEM.solver.GraphEM(tol=0.005)

GraphEM object

Implementation of the GraphEM algorithm

Reference: D. Guillot, B. Rajaratnam, J. Emile-Geay. “Statistical paleoclimate reconstructions via Markov random fields.” Ann. Appl. Stat. 9 (1) 324-352, March 2015.

tol: convergence tolerance of the EM algorithm (default: 5e-3)

temp_r: reconstructed temperature field

proxy_r: reconstructed proxy field

calib: indices of the calibration period

Sigma: covariance matrix of the multivariate normal model

mu: mean of the multivariate normal model

fit: estimates the GraphEM model

EM: implementation of the EM algorithm with a Gaussian graphical model

fit_Sigma: estimates the covariance matrix of the Gaussian graphical model

bootstrap: performs multiple reconstructions via block bootstrap

EM(X, graph, C0=[], M0=[], maxit=200, use_iridge=False)

Expectation-Maximization (EM) algorithm with Gaussian Graphical Model

X: matrix (n x p), time x space

Temperature + proxy matrix with missing values (np.nan)

graph: matrix (p x p)

Adjacency matrix of the graph

C0: matrix (p x p)

Initial covariance matrix of the field (Default = [], uses sample covariance matrix)

M0: vector (p)

Initial mean of the field (Default = [], uses sample mean)

maxit: int

Maximum number of iterations of the algorithm (Default = 200)

use_iridge: boolean

If True, uses Ridge regularization to perform regression (Default = False)

X: matrix (n x p)

Field with reconstructed values

C: matrix (p x p)

Estimated covariance matrix of the field

M: vector (p)

Estimated mean vector of the field
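To make the inputs and outputs of EM concrete, the following is a minimal sketch of EM imputation for a multivariate normal model with missing values. It is not the GraphEM implementation: it ignores the graph argument entirely (no sparsity constraint on the inverse covariance), and the function name em_gaussian_impute, the small ridge term, and the relative-change stopping rule are all assumptions made for illustration. It only mirrors the documented contract: an (n x p) matrix with np.nan entries goes in, and a filled matrix, a (p x p) covariance, and a length-p mean come out.

```python
import numpy as np

def em_gaussian_impute(X, maxit=200, tol=5e-3):
    """Toy EM for a multivariate normal with missing values.

    Hypothetical helper: no graph constraint is applied, unlike GraphEM.
    """
    X = np.array(X, dtype=float)
    n, p = X.shape
    missing = np.isnan(X)

    # Initialise with column means and the covariance of the mean-filled data.
    M = np.nanmean(X, axis=0)
    Xf = np.where(missing, M, X)
    ridge = 1e-6 * np.eye(p)  # assumption: tiny ridge for numerical stability
    C = np.cov(Xf, rowvar=False, bias=True) + ridge

    for _ in range(maxit):
        X_old = Xf.copy()
        B = np.zeros((p, p))  # conditional covariance of the imputed entries

        # E-step: regress the missing entries of each row on the observed ones.
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                Xf[i] = M
                B += C
                continue
            Coo = C[np.ix_(o, o)]
            Cmo = C[np.ix_(m, o)]
            W = np.linalg.solve(Coo, Cmo.T).T  # equals Cmo @ inv(Coo)
            Xf[i, m] = M[m] + W @ (Xf[i, o] - M[o])
            B[np.ix_(m, m)] += C[np.ix_(m, m)] - W @ Cmo.T

        # M-step: update the mean and covariance from the completed data.
        M = Xf.mean(axis=0)
        D = Xf - M
        C = (D.T @ D + B) / n + ridge

        # Assumed stopping rule: relative change of the filled matrix.
        change = np.linalg.norm(Xf - X_old) / max(np.linalg.norm(X_old), 1e-12)
        if change < tol:
            break

    return Xf, C, M

# Usage: hide part of one column, then reconstruct it.
rng = np.random.default_rng(0)
X_true = rng.multivariate_normal(
    [0.0, 1.0, 2.0],
    [[1.0, 0.8, 0.5], [0.8, 1.0, 0.6], [0.5, 0.6, 1.0]],
    size=200,
)
X_obs = X_true.copy()
X_obs[:50, 0] = np.nan
X_rec, C_hat, M_hat = em_gaussian_impute(X_obs)
```

Observed entries are never modified; only the np.nan positions are filled, using the conditional mean given the observed entries of the same row.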

bootstrap(N_boot=20, blocksize=2, n_proc=4, save_graphs=False)

Block bootstrap method for the GraphEM algorithm

N_boot: int

Number of bootstrap samples to use (Default = 20)

blocksize: int

Size of the blocks used when constructing bootstrap samples (Default = 2)

n_proc: int

Number of processors available for distributed computing (not currently implemented).

save_graphs: boolean

Indicates whether or not to save all the graphs estimated via graph_greedy_search (Default = False)

self.temp_r_all: matrix (sample x time x space)

Matrix containing the reconstructed temperature field for each bootstrap sample
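As an illustration of what N_boot and blocksize control, here is a small sketch of block resampling along the time axis. The helper block_bootstrap_indices is hypothetical, and the sampling scheme (overlapping blocks with uniformly random start positions) is an assumption; this reference does not state how GraphEM draws its blocks or which time steps it resamples.

```python
import numpy as np

def block_bootstrap_indices(n, blocksize=2, rng=None):
    """Return n time indices made of randomly chosen contiguous blocks.

    Hypothetical helper; requires n >= blocksize.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_blocks = int(np.ceil(n / blocksize))
    # Draw a start position for each block so every block fits in [0, n).
    starts = rng.integers(0, n - blocksize + 1, size=n_blocks)
    idx = (starts[:, None] + np.arange(blocksize)[None, :]).ravel()
    return idx[:n]  # trim the last block if n is not a multiple of blocksize

# Usage: resample a (time x space) matrix N_boot times and stack the results.
rng = np.random.default_rng(42)
X = rng.normal(size=(10, 3))
N_boot = 20
samples = np.stack(
    [X[block_bootstrap_indices(len(X), blocksize=2, rng=rng)] for _ in range(N_boot)]
)
# samples has shape (sample x time x space), the same layout as temp_r_all
```

Resampling whole blocks instead of single time steps preserves short-range temporal dependence within each block.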

fit(temp, proxy, calib, graph=[], lonlat=[], sp_TT=3.0, sp_TP=3.0, sp_PP=3.0, N_graph=30, C0=[], M0=[], maxit=200, bootstrap=False, N_boot=20, distance=1000, graph_method='neighborhood', estimate_graph=True, save_graphs=False)

Estimates the parameters of the GraphEM model and reconstructs the missing values of the temperature and proxy fields.

temp: matrix (time x space)

Temperature field. Missing values stored as “np.nan”.

proxy: matrix (time x space)

Proxy data. Missing values stored as “np.nan”. The time dimension should be the same as the temperature field.

calib: vector

Vector of indices representing the calibration period.

graph: matrix

Adjacency matrix of the temperature+proxy field. (Default = [], estimated via graph_greedy_search)

lonlat: matrix ((number of temperature locations + number of proxies) x 2)

Matrix containing the (longitude, latitude) of the temperature and proxy locations. Only used if graph_method = ‘neighborhood’. (Default = [])

sp_TT: float

Target sparsity of the temperature/temperature part of the inverse covariance matrix. Only used when the graph is estimated by glasso. (Default = 3.0%)

sp_TP: float

Target sparsity of the temperature/proxy part of the inverse covariance matrix. Only used when the graph is estimated by glasso. (Default = 3.0%)

sp_PP: float

Target sparsity of the proxy/proxy part of the inverse covariance matrix. Only used when the graph is estimated by glasso. (Default = 3.0%)

N_graph: int

Number of graphs to consider in the graph_greedy_search method. Only used if the graph is estimated using glasso. (Default = 30)

C0: matrix

Initial estimate of the covariance matrix of the temperature+proxy field. (Default = []).

M0: vector

Initial estimate of the mean vector of the temperature+proxy field. (Default = []).

bootstrap: boolean

Indicates whether or not to produce multiple estimates of the reconstructed values via bootstrapping. (Default = False)

N_boot: int

Number of bootstrap samples to use if using the bootstrap method. (Default = 20)

distance: float

Radius of the neighborhood graph. Only used if graph_method = ‘neighborhood’. (Default = 1000 km).

graph_method: “neighborhood” or “glasso”

Method to use to estimate the graph. Used only if graph = [] or if estimate_graph = True.

save_graphs: boolean

Indicates whether or not to save all the graphs returned by graph_greedy_search (Default = False).
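The lonlat and distance parameters describe a neighborhood graph: two locations are connected when they lie within a given radius of each other. The sketch below shows one way such an adjacency matrix could be built from (longitude, latitude) pairs using the haversine great-circle distance. The helper neighborhood_adjacency, the Earth radius of 6371 km, and the choice to set the diagonal to 1 are assumptions; GraphEM's own construction may differ.

```python
import numpy as np

def neighborhood_adjacency(lonlat, distance=1000.0, earth_radius_km=6371.0):
    """Adjacency matrix linking locations within `distance` km of each other.

    Hypothetical helper; lonlat has one (longitude, latitude) row per location,
    in degrees.
    """
    lonlat = np.asarray(lonlat, dtype=float)
    lon = np.radians(lonlat[:, 0])
    lat = np.radians(lonlat[:, 1])

    # Pairwise haversine distance between all locations.
    dlon = lon[:, None] - lon[None, :]
    dlat = lat[:, None] - lat[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    d = 2 * earth_radius_km * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

    adj = (d <= distance).astype(int)
    np.fill_diagonal(adj, 1)  # assumption: each location is linked to itself
    return adj

# Usage: three points on the equator at longitudes 0, 5 and 90 degrees.
lonlat = np.array([[0.0, 0.0], [5.0, 0.0], [90.0, 0.0]])
adj = neighborhood_adjacency(lonlat, distance=1000.0)
```

With a 1000 km radius, the first two points (roughly 556 km apart) are connected, while the third is isolated from both.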

fit_Sigma(S, graph)

Estimates the covariance matrix of the field using the provided graph

S: matrix

Sample covariance matrix

graph: matrix

Adjacency matrix of the graph

C: matrix

Estimated covariance matrix
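The sp_TT, sp_TP and sp_PP descriptions refer to the sparsity of the inverse covariance matrix, which reflects the standard Gaussian graphical model property that a missing edge in the graph corresponds to a zero in the inverse covariance. The sketch below illustrates that property on a three-node chain; it does not reproduce the estimation algorithm used by fit_Sigma, which this reference does not describe. The helper precision_pattern and the numerical threshold are assumptions.

```python
import numpy as np

def precision_pattern(C, atol=1e-10):
    """Return the 0/1 pattern of non-zero entries of inv(C).

    Hypothetical helper for inspecting which graph a covariance is consistent with.
    """
    Omega = np.linalg.inv(C)
    return (np.abs(Omega) > atol).astype(int)

# Chain graph 0 - 1 - 2: nodes 0 and 2 are not directly connected.
graph = np.array([[1, 1, 0],
                  [1, 1, 1],
                  [0, 1, 1]])

# A positive definite inverse covariance with zeros exactly at the missing edges.
Omega = np.array([[ 2.0, -0.8,  0.0],
                  [-0.8,  2.0, -0.8],
                  [ 0.0, -0.8,  2.0]])
C = np.linalg.inv(Omega)

# The covariance itself is dense: nodes 0 and 2 are still correlated through node 1.
c02 = C[0, 2]

# The graph reappears only in the zero pattern of the inverse covariance.
pattern = precision_pattern(C)
```

A covariance estimate that respects the graph is therefore generally dense, while its inverse carries the zeros implied by the adjacency matrix.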