This notebook follows the example from "The Book Of Why" (Pearl, 2018) chapter 8 page 251.
from IPython.display import display, Math, Latex,HTML
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb
import pyAgrum.causal as csl
import pyAgrum.causal.notebook as cslnb
import os
import math
import numpy as np
import scipy.stats
In this example we are interested in the effect of experience and education on the salary of an employee, we are in possession of the following data:
Employé | EX(u) | ED(u) | $S_{0}(u)$ | $S_{1}(u)$ | $S_{2}(u)$ |
---|---|---|---|---|---|
Alice | 8 | 0 | 86,000 | ? | ? |
Bert | 9 | 1 | ? | 92,500 | ? |
Caroline | 9 | 2 | ? | ? | 97,000 |
David | 8 | 1 | ? | 91,000 | ? |
Ernest | 12 | 1 | ? | 100,000 | ? |
Frances | 13 | 0 | 97,000 | ? | ? |
etc |
We are left with the previous data and we want to answer the counterfactual question What would Alice's salary be if she attended college ? (i.e. $S_{1}(Alice)$)
In this model it is assumed that an employee's salary is determined by his level of education and his experience. Years of experience are also affected by the level of education. Having a higher level of education means spending more time studying hence less experience.
edex = gum.fastBN("Ux[-2,10]->experience[0,20]<-education{low|medium|high}->salary[65,150];"
"experience->salary<-Us[0,25]")
edex
However counterfactual queries are specific to one datapoint (in our case Alice), we need to add additional variables to our model to allow for individual variations:
# no prior information about the individual (datapoint)
edex.cpt("Us").fillWith(1).normalize()
edex.cpt("Ux").fillWith(1).normalize()
# education level(supposed)
edex.cpt("education")[:] = [0.4, 0.4, 0.2]
# To have probabilistic results, we add a perturbation. (Gaussian around the exact values)
# we calculate a gaussian distribution
x_min = 0.0
x_max = 4.0
mean = 2.0
std = 0.65
x = np.linspace(x_min, x_max, 5)
y = scipy.stats.norm.pdf(x,mean,std)
print("We'll use the following distribution \n",y)
We'll use the following distribution [0.00539715 0.18794845 0.61375735 0.18794845 0.00539715]
Experience listens to Education and Ux : $$Ex = 10 -4 \times Ed + Ux$$
edex.cpt("experience").fillWithFunction("10-4*education+Ux",noise=list(y))
edex.cpt("experience")
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | |
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1889 | 0.6168 | 0.1889 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0067 | 0.2329 | 0.7604 | ||
| 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | |
0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | ||
| 0.7604 | 0.2329 | 0.0067 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | |
0.1889 | 0.6168 | 0.1889 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | ||
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0054 | 0.1879 | 0.6135 | 0.1879 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
Salary listens to Education, Experience and Us : $$S = 65 + 2.5 \times Ex + 5 \times Ed + Us$$
edex.cpt("salary").fillWithFunction("round(65+2.51*experience+5*education+Us)",noise=list(y))
gnb.showInference(edex)
To answer this counterfactual question we will follow the three steps algorithm from "The Book Of Why" (Pearl 2018) chapter 8 page 253 :
Use the data to retrieve all the information that characterizes Alice
From the data we can retrieve Alice's profile :
We will use Alice's profile to get $U_s$ and $U_x$, which tell Alice apart from the rest of the data.
ie=gum.LazyPropagation(edex)
ie.setEvidence({'experience':8, 'education': 'low', 'salary' : "86"})
ie.makeInference()
newUs = ie.posterior("Us")
newUs
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.1889 | 0.6168 | 0.1889 | 0.0054 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
ie=gum.LazyPropagation(edex)
ie.setEvidence({'experience':8, 'education': 'low', 'salary' : "86"})
ie.makeInference()
newUx = ie.posterior("Ux")
newUx
|
|
|
|
|
|
|
|
|
|
|
|
|
---|---|---|---|---|---|---|---|---|---|---|---|---|
0.7604 | 0.2329 | 0.0067 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
gnb.showInference(edex,evs={'experience':8, 'education': "low", 'salary' : "86"},targets={'Ux','Us'})
Change the model to match the hypothesis implied by the query (if she had attended university) and then use the data that characterizes Alice to calculate her salary.
We create a counterfactual world with Alice's idiosyncratic factors, and we operate the intervention:
# the counterfactual world
edexCounterfactual = gum.BayesNet(edex)
# we replace the prior probabilities of idiosynatric factors with potentials calculated earlier
edexCounterfactual.cpt("Ux").fillWith(newUx)
edexCounterfactual.cpt("Us").fillWith(newUs)
gnb.showInference(edexCounterfactual,size="10")
print("counterfactual world created")
counterfactual world created
# We operate the intervention
edexModele = csl.CausalModel(edexCounterfactual)
cslnb.showCausalImpact(edexModele,"salary",doing="education",values={"education":"medium"})
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0002 | 0.0010 | 0.0020 | 0.0066 | 0.0342 | 0.0846 | 0.1525 | 0.2357 |