Gaussian Processes with Structured Latent Confounders
GPSLC.jl
is a Julia package for semi-parametric causal effect estimation with structured latent confounding. It provides interfaces for performing causal inference over the latent variables and Gaussian process parameters to produce accurate causal effect estimates.
The original GP-SLC paper can be found here: http://proceedings.mlr.press/v119/witty20a/witty20a.pdf.
Inference
GPSLC.gpslc
— Functiongpslc(filename * ".csv")
gpslc(filename * ".csv"; hyperparams=hyperparams, priorparams=priorparams))
gpslc(DataFrame(X1=...,X2=...,T=...,Y=...,obj=...))
gpslc(DataFrame(X1=...,X2=...,T=...,Y=...,obj=...); hyperparams=hyperparams, priorparams=priorparams)
Run posterior inference on the input data.
Datatypes of DataFrame or CSV must follow these standards:
T
(Boolean/Float64)Y
(Float64)X1...XN
(Float64...Float64)obj
(Any)
Optional parameters
hyperparams::
HyperParameters
=getHyperParameters
()
: Hyper parameters primarily define the high level amount of inference to perform.priorparams::
PriorParameters
=getPriorParameters
()
: Prior parameters define the high level priors to draw from when constructing kernel functions and latent confounder structure.
Returns a GPSLCObject
which stores the hyperparameters, prior parameters, data, and posterior samples.
The primary struct that we provide interfaces for is the GPSLCObject
, which most of the high-level functions like the treatment effect functions take as input in addition to their other arguments.
Treatment Effects
Individual Treatment Effect (ITE)
A contribution of the original GP-SLC paper is to produce accurate individual treatment effect estimates, conditioned on the observed data and using the inferred values of the latent confounders as determined by the provided structure.
GPSLC.sampleITE
— FunctionsampleITE(g, doT)
sampleITE(g, doT; samplesPerPosterior=10)
Estimate Individual Treatment Effect with GPSLC model
Params:
g::
GPSLCObject
: Contains data and hyperparametersdoT
: The requested intervention (e.g. set all treatments to 1.0)samplesPerPosterior
: How many ITE samples to draw per posterior sample ing
.
Returns:
ITEsamples
: n x m
matrix where n
is the number of individuals, and m
is the number of samples.
Sample Average Treatment Effect (SATE)
Another popular and useful treatment effect estimate is SATE, which averages individual treatment effects over the individuals in the sample.
GPSLC.sampleSATE
— FunctionsampleSATE(g, doT)
sampleSATE(g, doT; samplesPerPosterior=10)
Estimate Sample Average Treatment Effect with GPSLC model
Using sampleITE
, samples can be drawn for the sample average treatment effect
Params:
g::
GPSLCObject
: Contains data and hyperparametersdoT
: The requested intervention (e.g. set all treatments to 1.0)samplesPerPosterior
: How many samples to draw per posterior sample ing
.
Returns:
SATEsamples
: n x m
matrix where n
is the number of individuals, and m
is the number of samples.
Counterfactual Effects
It can be helpful to calculate treatment effect estimates for the whole domain of treatment values in the data, or some subset, as in the example below. For this we can use predictCounterfactualEffects
, which also tracks the values of the doT
intervention values for comparison.
GPSLC.predictCounterfactualEffects
— FunctionpredictCounterfactualOutcomes(g, nSamplesPerMixture)
predictCounterfactualOutcomes(g, nSamplesPerMixture; fidelity=100)
predictCounterfactualOutcomes(g, nSamplesPerMixture; fidelity=100, minDoT=0, maxDoT=5)
Params
g::
GPSLCObject
: TheGPSLCObject
that inference has already been computed for.nSamplesPerMixture::Int64
: The number of outcome samples to
draw from each set of inferred posterior parameters.
fidelity::Int64
: How many intervention values to use to cover the domain of treatment values. Higher means more samples.minDoT::Float64=min(g.T...)
: The lowest interventional treatment to use.Defaults to the datag.T
's lowest treatment value.maxDoT::Float64=max(g.T...)
: The highest interventional treatment to use. Defaults to the datag.T
's highest treatment value.
julia> ite, doT = predictCounterfactualEffects(g, 30; fidelity=100)
Returns
ite::Matrix{Float64}
: An array of size[d, n, numPosteriorSamples * nSamplesPerMixture]
where d is the number of interventional values defined byfidelity
and the range of treatments ing.T
-doTrange::Vector{Float64}
: The list values of doT used, in order that matches the rows ofite
.
Summarizing
It can be helpful to summarize the inferred individual treatment effects and sample average treatment effects into mean and credible intervals. A use case for this is demonstrated in the examples section.
GPSLC.summarizeEstimates
— FunctionsummarizeEstimates(samples)
summarizeEstimates(samples; savetofile="ite_samples.csv")
Summarize Predicted Estimates (Counterfactual Outcomes or Individual Treatment Effects)
Create dataframe of mean, lower and upper quantiles of the samples from sampleITE
or predictCounterfactualEffects
.
Params:
samples
: Then x m
array of samples from sampleSATE or sampleITEsavetofile::String
: Optionally save the resultant DataFrame as CSV to the filename passedcredible_interval::Float64
: A real in [0,1] where 0.90 is the default for a 90% credible interval
Returns:
df
: Dataframe of Individual, Mean, LowerBound, and UpperBound values for the credible intervals around the sample.
Examples
The examples below illustrate use cases for
- setting hyperparameters,
- performing inference,
- saving inference results,
- predicting counterfactual effects,
- calculating sample average treatment effect (SATE),
- computing credible intervals for SATE,
- and plotting those intervals relative to the original data
New England Energy Consumption
This example creates an example plot of the NEEC treatment vs outcome data. Plots the original and the intervened data together. The example below is similar to Figure 3 in the original GP-SLC paper.
import Random # hide
Random.seed!(1234) # hide
using GPSLC # hide
using Plots # hide
using Statistics # hide
# set hyperparameters
hyperparams = getHyperParameters()
hyperparams.nOuter = 25
hyperparams.nU = 2
hyperparams.nMHInner = 3
hyperparams.nESInner = 5
# run inference
dataFile = "../example_data/NEEC_sampled.csv"
g = gpslc(dataFile; hyperparams=hyperparams)
saveGPSLCObject(g, "exampleGPSLCObject")
# collect counterfactual outcomes
maIdx = vec(g.obj .== "MA")
nSamples = 100
ite, doT = predictCounterfactualEffects(g, nSamples)
maITE = ite[:, maIdx, :]
# get credible interval on counterfactual outcomes
sate = mean(maITE, dims=2)[:, 1, :]
interval = summarizeEstimates(sate)
meanOutcome = mean(g.Y[maIdx])
lowerSATE = interval[!, "LowerBound"]
meanSATE = interval[!, "Mean"]
upperSATE = interval[!, "UpperBound"]
# plot outcomes and credible interval
treatmentScale = 100
outcomeScale = 10
# observed data
plot(legend=:outertopright, size=(750, 400), margin=0.5Plots.cm, dpi=600)
obsT = g.T[maIdx] .* treatmentScale
obsY = g.Y[maIdx] .* outcomeScale
scatter!(obsT, obsY, label="MA obs", markershape=:circle)
# counterfactual
Tcf = doT .* treatmentScale
Ycf = (meanOutcome .+ meanSATE) .* outcomeScale
upper = (upperSATE .- meanSATE) .* outcomeScale
lower = (meanSATE .- lowerSATE) .* outcomeScale
plot!(Tcf, Ycf, label="MA cf", color=:green,
ribbon=(lower, upper))
xlabel!("Temperature °F")
ylabel!("Energy Consumption (GWh)")
title!("Energy Consumption for Massachusetts")
Above we can see the Gaussian Process using individual treatment effect estimates to predict the energy consumption (outcome) from the temperature (treatment) for Massachusetts. The shaded region is a 90% credible interval from the samples taken by predictCounterfactualEffects
and processed by summarizeEstimates
which calculates the credible intervals by computing the 5th and 95th percentiles of the samples.
The data used in this example can be found here.
Types
External Types
External types are those relevant for using GPSLC.jl
in a high-level way, where inference and prediction are automatically performed.
HyperParameters
GPSLC.HyperParameters
— TypeHyperParameters
Define the high-level attributes of the inference procedure. More information on each of the attributes can be found in getHyperParameters
.
The default values for HyperParameters
are provided by
GPSLC.getHyperParameters
— FunctiongetHyperParameters()
Returns default values for hyperparameters
nU = 1
: Number of latent confounding variables assumed to be influencing all the instances that belong to one object. Inference will be performed over these values.nOuter = 20
: Number of posterior samples to draw.nMHInner = 5
: Number of internal Metropolis-Hastings updates to make per posterior sample.nESInner = 5
: Number of elliptical-slice sampling updates to make per posterior for latent confounders and binary treatment.nBurnIn = 5
: Number of posterior samples to discard when making predictions and estimates.stepSize = 1
: How frequently to use posterior samples (1 being every one after burnIn, higher being everystepSize
th).predictionCovarianceNoise=1e-10
: Predicting with Gaussian processes requires use of covariance matrices that are Symmetric Positive Definite, and this covariance noise on the diagonal ensures these operations can be performed in a stable and consistent way.
PriorParameters
GPSLC.PriorParameters
— TypePriorParameters
A dictionary of shapes and scales for various Inverse Gamma distributions used as priors for kernel parameters and other parameters. More information on each of the attributes can be found in getPriorParameters
.
The default values for PriorParameters
are provided by
GPSLC.getPriorParameters
— FunctiongetPriorParameters()
These are standard values for scale and shape of Inverse Gamma priors over kernel parameters, confounder structure covariance noise, and confounder Gaussian prior covariance.
uNoiseShape::Float64=4.0
: shape of the InvGamma prior over the noise of UuNoiseScale::Float64=4.0
: scale of the InvGamma prior over the noise of UxNoiseShape::Float64=4.0
: shape of the InvGamma prior over the noise of XxNoiseScale::Float64=4.0
: scale of the InvGamma prior over the noise of XtNoiseShape::Float64=4.0
: shape of the InvGamma prior over the noise of TtNoiseScale::Float64=4.0
: scale of the InvGamma prior over the noise of TyNoiseShape::Float64=4.0
: shape of the InvGamma prior over the noise of YyNoiseScale::Float64=4.0
: scale of the InvGamma prior over the noise of YxScaleShape::Float64=4.0
: shape of the InvGamma prior over kernel scale of XxScaleScale::Float64=4.0
: scale of the InvGamma prior over kernel scale of XtScaleShape::Float64=4.0
: shape of the InvGamma prior over kernel scale of TtScaleScale::Float64=4.0
: scale of the InvGamma prior over kernel scale of TyScaleShape::Float64=4.0
: shape of the InvGamma prior over kernel scale of YyScaleScale::Float64=4.0
: scale of the InvGamma prior over kernel scale of YuxLSShape::Float64=4.0
: shape of the InvGamma prior over kernel lengthscale of U and XuxLSScale::Float64=4.0
: scale of the InvGamma prior over kernel lengthscale of U and XutLSShape::Float64=4.0
: shape of the InvGamma prior over kernel lengthscale of U and TutLSScale::Float64=4.0
: scale of the InvGamma prior over kernel lengthscale of U and TxtLSShape::Float64=4.0
: shape of the InvGamma prior over kernel lengthscale of X and TxtLSScale::Float64=4.0
: scale of the InvGamma prior over kernel lengthscale of X and TuyLSShape::Float64=4.0
: shape of the InvGamma prior over kernel lengthscale of U and YuyLSScale::Float64=4.0
: scale of the InvGamma prior over kernel lengthscale of U and YxyLSShape::Float64=4.0
: shape of the InvGamma prior over kernel lengthscale of X and YxyLSScale::Float64=4.0
: scale of the InvGamma prior over kernel lengthscale of X and YtyLSShape::Float64=4.0
: shape of the InvGamma prior over kernel lengthscale of T and YtyLSScale::Float64=4.0
: scale of the InvGamma prior over kernel lengthscale of T and YsigmaUNoise::Float64=1.0e-13
: noise added to matrix to make covariance stable and invertiblesigmaUCov::Float64=1.0
: assumed covariance over structured confoundersdrift::Float64=0.5
: as in the paper, Metropolis Hastings Gaussian Drift
GPSLCObject
The GPSLCObject
is the high-level Julia struct that most of the externally facing interfaces rely on to perform their operations. Since it contains inference samples, the hyperparameters, and the observed data, it is at the center of all post-inference interfaces that manipulate the posterior samples according to the observed data.
This also means that the GPSLCObject
contains the result of a large portion of compute time, as well as contains all the relevant data for a given workflow. For this reason, all the fields and functions that utilize it are externally available, and described below, to provide users with a simple way to extend the functionality of GPSLC.jl
and estimate other quantities of interest.
GPSLC.GPSLCObject
— TypeGPSLCObject
This is the struct in GPSLC.jl that contains the data, hyperparamters, prior parameters, and posterior samples. It provides the primary interfaces to abstract the internals of GPSLC away from the higher-order functions like sampleITE
, sampleSATE
, and predictCounterfactualEffects
.
Returned by gpslc
Retrieving meta-values from a GPSLCObject
GPSLC.getN
— FunctiongetN(g)
Number of individuals in dataset.
GPSLC.getNX
— FunctiongetNX(g)
Number of covariates (and observed confounders) in dataset.
GPSLC.getNU
— FunctiongetNU(g)
Number of latent confounders to perform inference over (hyperparameter).
GPSLC.getNumPosteriorSamples
— FunctiongetNumPosteriorSamples(g)
Number of posterior samples that will be used based on hyperparameters.
(total posterior samples - burn in) / step size = nBurnIn:stepSize:nOuter
Saving and loading GPSLCObject
s
GPSLCObject
s contain all the posterior samples, which can be intensive to calculate and can be reused for various estimations, we provide a pair of interfaces to save and load the GPSLCObjects
from the filesystem.
GPSLC.saveGPSLCObject
— FunctionsaveGPSLCObject(g, filename)
saveGPSLCObject(g, "path/to/filename")
saveGPSLCObject(g, "path/to/filename.gpslc")
This function will save the GPSLCObject
g
to the file <filename>.gpslc
. This GPSLCObject
, including the posterior samples contained within it can be retrieved with the loadGPSLCObject
function.
Note: The extension .gpslc
is optional and will be added if it is not included.
GPSLC.loadGPSLCObject
— FunctionloadGPSLCObject(filename)
loadGPSLCObject("path/to/filename")
loadGPSLCObject("path/to/filename.gpslc")
This function will load and return the GPSLCObject
contained in <filename>.gpslc
.
Note: the extension .gpslc
is optional and will be added if it is not included.
Internal Types
These types are are used in internal inference procedures, so if users need to fine tune or modify the inference procedure, or access the model directly, these will be relevant.
Confounders
GPSLC.Confounders
— TypeConfounders (U)
Latent confounders that GPSLC performs inference over.
Either 1D or 2D matrices of Float64
values.
Covariates
GPSLC.Covariates
— TypeCovariates (X) Observed confounders and covariates.
Matrix{Float64}
is the only valid structure for covariates
Treatment
GPSLC.Treatment
— TypeTreatment (T)
Is made up of BinaryTreatment
which is an alias for Vector{Bool}
and ContinuousTreatment
which is an alias for Vector{Float64}
. These types support other vector types to afford compatibility with internal libraries.
Outcome
GPSLC.Outcome
— TypeOutcome (Y)
The outcome for the series of Gaussian Process predictions is a Vector{Float64}
. Currently only continuous values are supported as outcomes for input data.