API
GraphEM
- class GraphEM.solver.GraphEM(tol=0.005)
GraphEM object
Implementation of the GraphEM algorithm
- Reference: D. Guillot, B. Rajaratnam, J. Emile-Geay. “Statistical paleoclimate reconstructions via Markov random fields.” Ann. Appl. Stat. 9 (1), 324-352, March 2015.
Attributes:
- tol: convergence tolerance of the EM algorithm (default: 5e-3)
- temp_r: reconstructed temperature field
- proxy_r: reconstructed proxy field
- calib: index of the calibration period
- Sigma: covariance matrix of the multivariate normal model
- mu: mean of the multivariate normal model
Methods:
- fit: estimates the GraphEM model
- EM: implementation of the EM algorithm with a Gaussian graphical model
- fit_Sigma: estimates the covariance matrix of the Gaussian graphical model
- bootstrap: performs multiple reconstructions via block bootstrap
- EM(X, graph, C0=[], M0=[], maxit=200, use_iridge=False)
Expectation-Maximization (EM) algorithm with Gaussian Graphical Model
- X: matrix (n x p), time x space
Temperature + proxy matrix with missing values (np.nan)
- graph: matrix (p x p)
Adjacency matrix of the graph
- C0: matrix (p x p)
Initial covariance matrix of the field (Default = [], uses sample covariance matrix)
- M0: vector (p)
Initial mean of the field (Default = [], uses sample mean)
- maxit: int
Maximum number of iterations of the algorithm (Default = 200)
- use_iridge: boolean
If True, uses Ridge regularization to perform regression (Default = False)
- X: matrix (n x p)
Returned field, with the missing values reconstructed
- C: matrix (p x p)
Returned estimate of the covariance matrix of the field
- M: vector (p)
Returned estimate of the mean vector of the field
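The E-step/M-step cycle behind EM can be illustrated with a plain multivariate normal model. The sketch below is not the GraphEM implementation: it ignores the graph constraint and the use_iridge option, and the function name em_gaussian_impute, the small diagonal jitter, and the relative-change stopping rule are assumptions made for illustration. It only mirrors the documented inputs and outputs (X with np.nan in, reconstructed X, C, and M out).

```python
import numpy as np

def em_gaussian_impute(X, maxit=200, tol=5e-3):
    """Plain Gaussian EM imputation (hypothetical helper, no graph constraint)."""
    X = np.array(X, dtype=float)
    n, p = X.shape
    miss = np.isnan(X)
    # Initialize with column means and the sample covariance of the filled matrix.
    M = np.nanmean(X, axis=0)
    Xf = np.where(miss, M, X)
    C = np.cov(Xf, rowvar=False, bias=True) + 1e-6 * np.eye(p)  # jitter: assumption
    for _ in range(maxit):
        X_old = Xf.copy()
        corr = np.zeros((p, p))  # accumulated conditional covariances
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed in this row: fall back to the current mean.
                Xf[i] = M
                corr += C
                continue
            Coo = C[np.ix_(o, o)]
            Cmo = C[np.ix_(m, o)]
            B = np.linalg.solve(Coo, Cmo.T).T  # regression coefficients Cmo Coo^{-1}
            # E-step: conditional mean of the missing entries given the observed ones.
            Xf[i, m] = M[m] + B @ (Xf[i, o] - M[o])
            corr[np.ix_(m, m)] += C[np.ix_(m, m)] - B @ Cmo.T
        # M-step: update the mean and covariance, including the correction term.
        M = Xf.mean(axis=0)
        D = Xf - M
        C = (D.T @ D + corr) / n
        # Stopping rule (assumption): relative change of the filled matrix.
        change = np.linalg.norm(Xf - X_old) / max(np.linalg.norm(X_old), 1e-12)
        if change < tol:
            break
    return Xf, C, M

# Small demonstration on synthetic data with about 10% missing entries.
rng = np.random.default_rng(0)
truth = rng.standard_normal((60, 4))
X_in = truth.copy()
X_in[rng.random(truth.shape) < 0.1] = np.nan
X_rec, C_hat, M_hat = em_gaussian_impute(X_in)
```

In the documented method, the covariance update is additionally constrained by the graph argument, so the numbers produced by this unconstrained sketch should not be expected to match GraphEM's output.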
- bootstrap(N_boot=20, blocksize=2, n_proc=4, save_graphs=False)
Block bootstrap method for the GraphEM algorithm
- N_boot: int
Number of bootstrap samples to use (Default = 20)
- blocksize: int
Size of the blocks used when constructing bootstrap samples (Default = 2)
- n_proc: int
Number of processors available for distributed computing (Default = 4; not currently implemented).
- save_graphs: boolean
Indicates whether or not to save all the graphs estimated via graph_greedy_search (Default = False)
- self.temp_r_all: matrix (sample x time x space)
Matrix containing the reconstructed temperature field for each bootstrap sample
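As an illustration of what blocksize controls, the sketch below draws one moving-block bootstrap sample of time indices. It is a generic version of the technique, not GraphEM's code: the helper name block_bootstrap_indices, the rng argument, and the choice to resample a plain range of time steps are assumptions.

```python
import numpy as np

def block_bootstrap_indices(n, blocksize=2, rng=None):
    """Return n time indices made of randomly chosen contiguous blocks.

    Hypothetical helper; assumes n >= blocksize.
    """
    rng = np.random.default_rng(rng)
    n_blocks = int(np.ceil(n / blocksize))
    # Each block starts at a random position that leaves room for a full block.
    starts = rng.integers(0, n - blocksize + 1, size=n_blocks)
    idx = (starts[:, None] + np.arange(blocksize)[None, :]).ravel()
    return idx[:n]  # trim the last block if n is not a multiple of blocksize

# Resample a hypothetical 10-step calibration period in blocks of 2.
calib_example = np.arange(10)
sample = calib_example[block_bootstrap_indices(len(calib_example), blocksize=2, rng=0)]
```

Resampling in blocks rather than single time steps keeps short-range temporal dependence intact within each block, which is the usual motivation for a block bootstrap.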
- fit(temp, proxy, calib, graph=[], lonlat=[], sp_TT=3.0, sp_TP=3.0, sp_PP=3.0, N_graph=30, C0=[], M0=[], maxit=200, bootstrap=False, N_boot=20, distance=1000, graph_method='neighborhood', estimate_graph=True, save_graphs=False)
Estimates the parameters of the GraphEM model and reconstructs the missing values of the temperature and proxy fields.
- temp: matrix (time x space)
Temperature field. Missing values stored as np.nan.
- proxy: matrix (time x space)
Proxy data. Missing values stored as np.nan. The time dimension should be the same as that of the temperature field.
- calib: vector
Vector of indices representing the calibration period.
- graph: matrix
Adjacency matrix of the temperature+proxy field. (Default = [], estimated via graph_greedy_search)
- lonlat: matrix ((number of temperature locations + number of proxies) x 2)
Matrix containing the (longitude, latitude) of the temperature and proxy locations. Only used if graph_method = ‘neighborhood’. (Default = [])
- sp_TT: float
Target sparsity of the temperature/temperature part of the inverse covariance matrix. Only used when the graph is estimated by glasso. (Default = 3.0%)
- sp_TP: float
Target sparsity of the temperature/proxy part of the inverse covariance matrix. Only used when the graph is estimated by glasso. (Default = 3.0%)
- sp_PP: float
Target sparsity of the proxy/proxy part of the inverse covariance matrix. Only used when the graph is estimated by glasso. (Default = 3.0%)
- N_graph: int
Number of graphs to consider in the graph_greedy_search method. Only used if the graph is estimated using glasso. (Default = 30)
- C0: matrix
Initial estimate of the covariance matrix of the temperature+proxy field. (Default = []).
- M0: vector
Initial estimate of the mean vector of the temperature+proxy field. (Default = []).
- bootstrap: boolean
Indicates whether or not to produce multiple estimates of the reconstructed values via bootstrapping. (Default = False)
- N_boot: int
Number of bootstrap samples to use if using the bootstrap method. (Default = 20)
- distance: float
Radius of the neighborhood graph. Only used if graph_method = ‘neighborhood’. (Default = 1000 km).
- graph_method: “neighborhood” or “glasso”
Method to use to estimate the graph. Used only if graph = [] or if estimate_graph = True.
- save_graphs: boolean
Indicates whether or not to save all the graphs returned by graph_greedy_search (Default = False).
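To make the roles of lonlat and distance concrete, the sketch below builds a neighborhood adjacency matrix by linking every pair of locations whose great-circle distance is within the given radius. This is one plausible reading of the 'neighborhood' method, not the library's code: the helper name neighborhood_graph, the haversine formula, the Earth radius of 6371 km, and the inclusion of the diagonal are all assumptions.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption for this sketch)

def neighborhood_graph(lonlat, distance=1000.0):
    """Adjacency matrix linking locations within `distance` km (hypothetical helper)."""
    coords = np.radians(np.asarray(lonlat, dtype=float))
    lon, lat = coords[:, 0], coords[:, 1]
    dlon = lon[:, None] - lon[None, :]
    dlat = lat[:, None] - lat[None, :]
    # Haversine formula for the great-circle distance between all pairs.
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    d = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))
    # Each location is at distance 0 from itself, so the diagonal is 1 here.
    return (d <= distance).astype(int)

# Three locations on the equator, given as (longitude, latitude) rows.
lonlat_example = np.array([[0.0, 0.0], [5.0, 0.0], [90.0, 0.0]])
graph_example = neighborhood_graph(lonlat_example, distance=1000.0)
```

The lonlat description above implies one row per temperature location and per proxy. The order in which rows must be stacked is not stated in this reference, so check it against the column order of the temperature and proxy matrices before passing a hand-built graph to fit.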
- fit_Sigma(S, graph)
Estimates the covariance matrix of the field using the provided graph
- S: matrix
Sample covariance matrix
- graph: matrix
Adjacency matrix of the graph
- C: matrix
Returned estimate of the covariance matrix
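One standard way to estimate a covariance matrix under a known graph is the modified-regression scheme for Gaussian graphical models with known structure described in The Elements of Statistical Learning (Hastie, Tibshirani, Friedman). The sketch below implements that textbook scheme; whether fit_Sigma uses the same algorithm is not stated in this reference, and the name fit_cov_known_graph and its tol and maxit arguments are assumptions.

```python
import numpy as np

def fit_cov_known_graph(S, graph, tol=1e-10, maxit=500):
    """Covariance estimate whose inverse is zero outside the graph (hypothetical helper)."""
    S = np.asarray(S, dtype=float)
    adj = np.asarray(graph, dtype=bool)
    p = S.shape[0]
    W = S.copy()
    for _ in range(maxit):
        W_old = W.copy()
        for j in range(p):
            rest = np.array([k for k in range(p) if k != j], dtype=int)
            nbrs = np.array([k for k in rest if adj[j, k]], dtype=int)
            # Regress variable j on its graph neighbors only; other coefficients stay 0.
            beta = np.zeros(p)
            if nbrs.size:
                beta[nbrs] = np.linalg.solve(W[np.ix_(nbrs, nbrs)], S[nbrs, j])
            # Update row and column j of the covariance estimate.
            w12 = W[np.ix_(rest, rest)] @ beta[rest]
            W[rest, j] = w12
            W[j, rest] = w12
        if np.max(np.abs(W - W_old)) < tol:
            break
    return W

# Chain graph on four variables: 0 - 1 - 2 - 3.
rng = np.random.default_rng(1)
S_example = np.cov(rng.standard_normal((200, 4)), rowvar=False)
graph_chain = np.array([[1, 1, 0, 0],
                        [1, 1, 1, 0],
                        [0, 1, 1, 1],
                        [0, 0, 1, 1]])
C_example = fit_cov_known_graph(S_example, graph_chain)
```

At convergence the estimate agrees with S on the diagonal and on every edge of the graph, while its inverse is zero for every pair of variables not joined by an edge.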