Gaussian Processes with Structured Latent Confounders

GPSLC.jl is a Julia package for semi-parametric causal effect estimation with structured latent confounding. It provides interfaces for performing causal inference over the latent variables and Gaussian process parameters to produce accurate causal effect estimates.

The original GP-SLC paper can be found here: http://proceedings.mlr.press/v119/witty20a/witty20a.pdf.

Inference

GPSLC.gpslcFunction
gpslc(filename * ".csv")
gpslc(filename * ".csv"; hyperparams=hyperparams, priorparams=priorparams))

gpslc(DataFrame(X1=...,X2=...,T=...,Y=...,obj=...))
gpslc(DataFrame(X1=...,X2=...,T=...,Y=...,obj=...); hyperparams=hyperparams, priorparams=priorparams)

Run posterior inference on the input data.

Datatypes of DataFrame or CSV must follow these standards:

  • T (Boolean/Float64)
  • Y (Float64)
  • X1...XN (Float64...Float64)
  • obj (Any)

Optional parameters

Returns a GPSLCObject which stores the hyperparameters, prior parameters, data, and posterior samples.

source

The primary struct that we provide interfaces for is the GPSLCObject, which most of the high-level functions like the treatment effect functions take as input in addition to their other arguments.

Treatment Effects

Individual Treatment Effect (ITE)

A contribution of the original GP-SLC paper is to produce accurate individual treatment effect estimates, conditioned on the observed data and using the inferred values of the latent confounders as determined by the provided structure.

GPSLC.sampleITEFunction
sampleITE(g, doT)
sampleITE(g, doT; samplesPerPosterior=10)

Estimate Individual Treatment Effect with GPSLC model

Params:

  • g::GPSLCObject: Contains data and hyperparameters
  • doT: The requested intervention (e.g. set all treatments to 1.0)
  • samplesPerPosterior: How many ITE samples to draw per posterior sample in g.

Returns:

ITEsamples: n x m matrix where n is the number of individuals, and m is the number of samples.

source

Sample Average Treatment Effect (SATE)

Another popular and useful treatment effect estimate is SATE, which averages individual treatment effects over the individuals in the sample.

GPSLC.sampleSATEFunction
sampleSATE(g, doT)
sampleSATE(g, doT; samplesPerPosterior=10)

Estimate Sample Average Treatment Effect with GPSLC model

Using sampleITE, samples can be drawn for the sample average treatment effect

Params:

  • g::GPSLCObject: Contains data and hyperparameters
  • doT: The requested intervention (e.g. set all treatments to 1.0)
  • samplesPerPosterior: How many samples to draw per posterior sample in g.

Returns:

SATEsamples: n x m matrix where n is the number of individuals, and m is the number of samples.

source

Counterfactual Effects

It can be helpful to calculate treatment effect estimates for the whole domain of treatment values in the data, or some subset, as in the example below. For this we can use predictCounterfactualEffects, which also tracks the values of the doT intervention values for comparison.

GPSLC.predictCounterfactualEffectsFunction
predictCounterfactualOutcomes(g, nSamplesPerMixture)
predictCounterfactualOutcomes(g, nSamplesPerMixture; fidelity=100)
predictCounterfactualOutcomes(g, nSamplesPerMixture; fidelity=100, minDoT=0, maxDoT=5)

Params

  • g::GPSLCObject: The GPSLCObject that inference has already been computed for.
  • nSamplesPerMixture::Int64: The number of outcome samples to

draw from each set of inferred posterior parameters.

  • fidelity::Int64: How many intervention values to use to cover the domain of treatment values. Higher means more samples.
  • minDoT::Float64=min(g.T...): The lowest interventional treatment to use.Defaults to the data g.T's lowest treatment value.
  • maxDoT::Float64=max(g.T...): The highest interventional treatment to use. Defaults to the data g.T's highest treatment value.
julia> ite, doT = predictCounterfactualEffects(g, 30; fidelity=100)

Returns

  • ite::Matrix{Float64}: An array of size [d, n, numPosteriorSamples * nSamplesPerMixture] where d is the number of interventional values defined by fidelity and the range of treatments in g.T - doTrange::Vector{Float64}: The list values of doT used, in order that matches the rows of ite.
source

Summarizing

It can be helpful to summarize the inferred individual treatment effects and sample average treatment effects into mean and credible intervals. A use case for this is demonstrated in the examples section.

GPSLC.summarizeEstimatesFunction
summarizeEstimates(samples)
summarizeEstimates(samples; savetofile="ite_samples.csv")

Summarize Predicted Estimates (Counterfactual Outcomes or Individual Treatment Effects)

Create dataframe of mean, lower and upper quantiles of the samples from sampleITE or predictCounterfactualEffects.

Params:

  • samples: The n x m array of samples from sampleSATE or sampleITE
  • savetofile::String: Optionally save the resultant DataFrame as CSV to the filename passed
  • credible_interval::Float64: A real in [0,1] where 0.90 is the default for a 90% credible interval

Returns:

  • df: Dataframe of Individual, Mean, LowerBound, and UpperBound values for the credible intervals around the sample.
source

Examples

The examples below illustrate use cases for

  • setting hyperparameters,
  • performing inference,
  • saving inference results,
  • predicting counterfactual effects,
  • calculating sample average treatment effect (SATE),
  • computing credible intervals for SATE,
  • and plotting those intervals relative to the original data

New England Energy Consumption

This example creates an example plot of the NEEC treatment vs outcome data. Plots the original and the intervened data together. The example below is similar to Figure 3 in the original GP-SLC paper.

import Random # hide
Random.seed!(1234) # hide
using GPSLC # hide
using Plots # hide
using Statistics # hide

# set hyperparameters
hyperparams = getHyperParameters()
hyperparams.nOuter = 25
hyperparams.nU = 2
hyperparams.nMHInner = 3
hyperparams.nESInner = 5

# run inference
dataFile = "../example_data/NEEC_sampled.csv"
g = gpslc(dataFile; hyperparams=hyperparams)
saveGPSLCObject(g, "exampleGPSLCObject")

# collect counterfactual outcomes
maIdx = vec(g.obj .== "MA")
nSamples = 100
ite, doT = predictCounterfactualEffects(g, nSamples)
maITE = ite[:, maIdx, :]

# get credible interval on counterfactual outcomes
sate = mean(maITE, dims=2)[:, 1, :]
interval = summarizeEstimates(sate)

meanOutcome = mean(g.Y[maIdx])
lowerSATE = interval[!, "LowerBound"]
meanSATE = interval[!, "Mean"]
upperSATE = interval[!, "UpperBound"]

# plot outcomes and credible interval
treatmentScale = 100
outcomeScale = 10

# observed data
plot(legend=:outertopright, size=(750, 400), margin=0.5Plots.cm, dpi=600)
obsT = g.T[maIdx] .* treatmentScale
obsY = g.Y[maIdx] .* outcomeScale
scatter!(obsT, obsY, label="MA obs", markershape=:circle)

# counterfactual
Tcf = doT .* treatmentScale
Ycf = (meanOutcome .+ meanSATE) .* outcomeScale
upper = (upperSATE .- meanSATE) .* outcomeScale
lower = (meanSATE .- lowerSATE) .* outcomeScale

plot!(Tcf, Ycf, label="MA cf", color=:green,
    ribbon=(lower, upper))

xlabel!("Temperature °F")
ylabel!("Energy Consumption (GWh)")
title!("Energy Consumption for Massachusetts")

Above we can see the Gaussian Process using individual treatment effect estimates to predict the energy consumption (outcome) from the temperature (treatment) for Massachusetts. The shaded region is a 90% credible interval from the samples taken by predictCounterfactualEffects and processed by summarizeEstimates which calculates the credible intervals by computing the 5th and 95th percentiles of the samples.

The data used in this example can be found here.

Types

External Types

External types are those relevant for using GPSLC.jl in a high-level way, where inference and prediction are automatically performed.

HyperParameters

The default values for HyperParameters are provided by

GPSLC.getHyperParametersFunction
getHyperParameters()

Returns default values for hyperparameters

  • nU = 1: Number of latent confounding variables assumed to be influencing all the instances that belong to one object. Inference will be performed over these values.
  • nOuter = 20: Number of posterior samples to draw.
  • nMHInner = 5: Number of internal Metropolis-Hastings updates to make per posterior sample.
  • nESInner = 5: Number of elliptical-slice sampling updates to make per posterior for latent confounders and binary treatment.
  • nBurnIn = 5: Number of posterior samples to discard when making predictions and estimates.
  • stepSize = 1: How frequently to use posterior samples (1 being every one after burnIn, higher being every stepSizeth).
  • predictionCovarianceNoise=1e-10: Predicting with Gaussian processes requires use of covariance matrices that are Symmetric Positive Definite, and this covariance noise on the diagonal ensures these operations can be performed in a stable and consistent way.
source

PriorParameters

GPSLC.PriorParametersType
PriorParameters

A dictionary of shapes and scales for various Inverse Gamma distributions used as priors for kernel parameters and other parameters. More information on each of the attributes can be found in getPriorParameters.

source

The default values for PriorParameters are provided by

GPSLC.getPriorParametersFunction
getPriorParameters()

These are standard values for scale and shape of Inverse Gamma priors over kernel parameters, confounder structure covariance noise, and confounder Gaussian prior covariance.

  • uNoiseShape::Float64=4.0: shape of the InvGamma prior over the noise of U
  • uNoiseScale::Float64=4.0: scale of the InvGamma prior over the noise of U
  • xNoiseShape::Float64=4.0: shape of the InvGamma prior over the noise of X
  • xNoiseScale::Float64=4.0: scale of the InvGamma prior over the noise of X
  • tNoiseShape::Float64=4.0: shape of the InvGamma prior over the noise of T
  • tNoiseScale::Float64=4.0: scale of the InvGamma prior over the noise of T
  • yNoiseShape::Float64=4.0: shape of the InvGamma prior over the noise of Y
  • yNoiseScale::Float64=4.0: scale of the InvGamma prior over the noise of Y
  • xScaleShape::Float64=4.0: shape of the InvGamma prior over kernel scale of X
  • xScaleScale::Float64=4.0: scale of the InvGamma prior over kernel scale of X
  • tScaleShape::Float64=4.0: shape of the InvGamma prior over kernel scale of T
  • tScaleScale::Float64=4.0: scale of the InvGamma prior over kernel scale of T
  • yScaleShape::Float64=4.0: shape of the InvGamma prior over kernel scale of Y
  • yScaleScale::Float64=4.0: scale of the InvGamma prior over kernel scale of Y
  • uxLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of U and X
  • uxLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of U and X
  • utLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of U and T
  • utLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of U and T
  • xtLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of X and T
  • xtLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of X and T
  • uyLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of U and Y
  • uyLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of U and Y
  • xyLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of X and Y
  • xyLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of X and Y
  • tyLSShape::Float64=4.0: shape of the InvGamma prior over kernel lengthscale of T and Y
  • tyLSScale::Float64=4.0: scale of the InvGamma prior over kernel lengthscale of T and Y
  • sigmaUNoise::Float64=1.0e-13: noise added to matrix to make covariance stable and invertible
  • sigmaUCov::Float64=1.0: assumed covariance over structured confounders
  • drift::Float64=0.5: as in the paper, Metropolis Hastings Gaussian Drift
source

GPSLCObject

The GPSLCObject is the high-level Julia struct that most of the externally facing interfaces rely on to perform their operations. Since it contains inference samples, the hyperparameters, and the observed data, it is at the center of all post-inference interfaces that manipulate the posterior samples according to the observed data.

This also means that the GPSLCObject contains the result of a large portion of compute time, as well as contains all the relevant data for a given workflow. For this reason, all the fields and functions that utilize it are externally available, and described below, to provide users with a simple way to extend the functionality of GPSLC.jl and estimate other quantities of interest.

Retrieving meta-values from a GPSLCObject

GPSLC.getNXFunction
getNX(g)

Number of covariates (and observed confounders) in dataset.

source
GPSLC.getNUFunction
getNU(g)

Number of latent confounders to perform inference over (hyperparameter).

source
GPSLC.getNumPosteriorSamplesFunction
getNumPosteriorSamples(g)

Number of posterior samples that will be used based on hyperparameters.

(total posterior samples - burn in) / step size = nBurnIn:stepSize:nOuter

source

Saving and loading GPSLCObjects

GPSLCObjects contain all the posterior samples, which can be intensive to calculate and can be reused for various estimations, we provide a pair of interfaces to save and load the GPSLCObjects from the filesystem.

GPSLC.saveGPSLCObjectFunction
saveGPSLCObject(g, filename)
saveGPSLCObject(g, "path/to/filename")
saveGPSLCObject(g, "path/to/filename.gpslc")

This function will save the GPSLCObject g to the file <filename>.gpslc. This GPSLCObject, including the posterior samples contained within it can be retrieved with the loadGPSLCObject function.

Note: The extension .gpslc is optional and will be added if it is not included.

source
GPSLC.loadGPSLCObjectFunction
loadGPSLCObject(filename)
loadGPSLCObject("path/to/filename")
loadGPSLCObject("path/to/filename.gpslc")

This function will load and return the GPSLCObject contained in <filename>.gpslc.

Note: the extension .gpslc is optional and will be added if it is not included.

source

Internal Types

These types are are used in internal inference procedures, so if users need to fine tune or modify the inference procedure, or access the model directly, these will be relevant.

Confounders

GPSLC.ConfoundersType
Confounders (U)

Latent confounders that GPSLC performs inference over.

Either 1D or 2D matrices of Float64 values.

source

Covariates

GPSLC.CovariatesType

Covariates (X) Observed confounders and covariates.

Matrix{Float64} is the only valid structure for covariates

source

Treatment

GPSLC.TreatmentType
Treatment (T)

Is made up of BinaryTreatment which is an alias for Vector{Bool} and ContinuousTreatment which is an alias for Vector{Float64}. These types support other vector types to afford compatibility with internal libraries.

source

Outcome

GPSLC.OutcomeType
Outcome (Y)

The outcome for the series of Gaussian Process predictions is a Vector{Float64}. Currently only continuous values are supported as outcomes for input data.

source