Authors: Aymen Merrouche and Pierre-Henri Wuillemin.

**Counterfactual in a nutshell**

In [1]:
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb
import pyAgrum.causal as csl
import pyAgrum.causal.notebook as cslnb


The next three cells show the fastest way to compute counterfactuals in pyAgrum:

• build the model
• fill the CPTs
• compute counterfactuals
In [2]:
# building the model
edex = gum.fastBN("Ux[-2,10]->experience[0,20]<-education{low|medium|high}->salary[65,150]<-Us[0,25];experience->salary")
edex

In [8]:
# Filling the CPTs

# priors
edex.cpt("Us").fillWith(1).normalize()
edex.cpt("Ux").fillWith(1).normalize()
edex.cpt("education")[:] = [0.4, 0.4, 0.2]

# two equations
edex.cpt("experience").fillWithFunction("10-4*education+Ux")
edex.cpt("salary").fillWithFunction("round(65+2.5*experience+5*education+Us)");
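
The two equations above can also be read as ordinary functions of the parents and of the exogenous noise. The following is a minimal plain-Python sketch, independent of pyAgrum, that assumes the education labels low, medium and high are encoded as 0, 1 and 2; the names `EDUCATION`, `experience_eq` and `salary_eq` are hypothetical helpers introduced for illustration only.

```python
# Assumed numeric encoding of the education labels (index in the label list).
EDUCATION = {"low": 0, "medium": 1, "high": 2}


def experience_eq(education, ux):
    """Structural equation for experience: 10 - 4*education + Ux."""
    return 10 - 4 * EDUCATION[education] + ux


def salary_eq(experience, education, us):
    """Structural equation for salary, rounded to the nearest integer."""
    return round(65 + 2.5 * experience + 5 * EDUCATION[education] + us)


# One individual with low education, Ux = -2 and Us = 1.
alice_experience = experience_eq("low", ux=-2)
alice_salary = salary_eq(alice_experience, "low", us=1)
print(alice_experience, alice_salary)
```

Because the equations are deterministic, all the randomness of the model sits in the uniform priors on `Ux` and `Us`.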

In [9]:
# Counterfactual
pot = csl.counterfactual(cm=csl.CausalModel(edex),
                         profile={'experience': '8', 'education': 'low', 'salary': '86'},
                         whatif={"education"},
                         on={"salary"},
                         values={"education": 'medium'})
gnb.showProba(pot)
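
Conceptually, a counterfactual query follows three steps: abduction (infer the exogenous variables from the observed profile), action (force `education` to its new value) and prediction (recompute `salary`). The sketch below illustrates these steps in plain Python by brute-force enumeration. It is not pyAgrum's implementation: the helper names are hypothetical, the education labels are assumed to be encoded as 0, 1 and 2, and the domain bounds on `experience` and `salary` are ignored for simplicity.

```python
from itertools import product

EDUCATION = {"low": 0, "medium": 1, "high": 2}  # assumed label encoding
UX_RANGE = range(-2, 11)  # Ux[-2,10]
US_RANGE = range(0, 26)   # Us[0,25]


def experience_eq(education, ux):
    return 10 - 4 * EDUCATION[education] + ux


def salary_eq(experience, education, us):
    return round(65 + 2.5 * experience + 5 * EDUCATION[education] + us)


def counterfactual_salary(ex, ed, sa, new_ed):
    """Expected salary had education been new_ed, or None if the profile is impossible."""
    # 1. Abduction: keep the (Ux, Us) pairs that reproduce the observed profile.
    #    The priors are uniform, so every consistent pair is equally likely.
    consistent = [(ux, us) for ux, us in product(UX_RANGE, US_RANGE)
                  if experience_eq(ed, ux) == ex and salary_eq(ex, ed, us) == sa]
    if not consistent:
        return None
    # 2. Action and 3. Prediction: set education to new_ed, keep the noise fixed,
    #    and recompute experience and then salary.
    salaries = [salary_eq(experience_eq(new_ed, ux), new_ed, us)
                for ux, us in consistent]
    return sum(salaries) / len(salaries)


alice = counterfactual_salary(8, "low", 86, "medium")
print(alice)
```

Note that changing `education` also changes `experience`, since `experience` is a child of `education` in the model; only the exogenous terms `Ux` and `Us` are held fixed.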


Counterfactuals as a function

We can now fill (most of) the holes in the following table:

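| Employee | EX(u) | ED(u) | $S_{0}(u)$ | $S_{1}(u)$ | $S_{2}(u)$ |
|----------|-------|-------|------------|------------|------------|
| Alice    | 8     | 0     | 86,000     | ?          | ?          |
| Bert     | 9     | 1     | ?          | 92,500     | ?          |
| Caroline | 9     | 2     | ?          | ?          | 97,000     |
| David    | 8     | 1     | ?          | 91,000     | ?          |
| Ernest   | 12    | 1     | ?          | 100,000    | ?          |
| Frances  | 13    | 0     | 97,000     | ?          | ?          |
| etc.     |       |       |            |            |            |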
In [10]:
def mean(p):
    # expected value of a one-dimensional potential over a numerical variable
    return sum([p.variable(0).numerical(i)*p[i] for i in range(p.variable(0).domainSize())])

def affCounterfactualForStudent(model,name,ex,ed,sa,value):
    # print the expected counterfactual salary, or "--" if the query fails
    try:
        s0=csl.counterfactual(cm = model,
                              profile = {'experience':str(ex), 'education': ed, 'salary' : str(sa)},
                              whatif={"education"},
                              on={"salary"},
                              values = {"education" : value})
        print("{:5.1f}| ".format(mean(s0)),end="")
    except Exception:
        print(" --  | ",end="")

def forStudent(model,name,ex,ed,sa):
    # print one table row: the observed profile, then one counterfactual per education level
    print("| {:20}| {:2.0f}| {:7}|  {:5.1f}|| ".format(name,ex,ed,sa),end="")
    for value in ['low','medium','high']:
        affCounterfactualForStudent(model,name,ex,ed,sa,value)
    print()

print("| Name                | Ex| Ed     | S     || s0   | s1   | s2   |")
print("------------------------------------------------------------------")
d=csl.CausalModel(edex)
forStudent(d,"Alice",8,"low",86)
forStudent(d,"Bert",9,"medium",92)
forStudent(d,"Caroline",9,"high",97)
forStudent(d,"Caroline",9,"high",98)
forStudent(d,"David",8,"medium",91)
forStudent(d,"Ernest",12,"medium",100)
forStudent(d,"Frances",13,"low",97)
forStudent(d,"Frances",13,"low",98)

| Name                | Ex| Ed     | S     || s0   | s1   | s2   |
------------------------------------------------------------------
| Alice               |  8| low    |   86.0||  86.0|  81.0|  76.0|
| Bert                |  9| medium |   92.0||  98.0|  92.0|  88.0|
| Caroline            |  9| high   |   97.0||  --  |  --  |  --  |
| Caroline            |  9| high   |   98.0|| 108.0| 103.0|  98.0|
| David               |  8| medium |   91.0||  96.0|  91.0|  86.0|
| Ernest              | 12| medium |  100.0|| 105.0| 100.0|  95.0|
| Frances             | 13| low    |   97.0||  --  |  --  |  --  |
| Frances             | 13| low    |   98.0||  98.0|  93.0|  88.0|


We cannot answer for either Caroline or Frances when salary=97, because those profiles are impossible in our model: given their experience and education, no value of Us yields a salary of 97.

In [11]:
gnb.showPosterior(edex,target="salary",evs={'experience':"9", 'education': "high"}) # 97 is not possible for Caroline

In [12]:
gnb.showPosterior(edex,target="salary",evs={'experience':"13", 'education': "low"}) # 97 is not possible for Frances
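
The same conclusion can be checked by hand by enumerating every value of `Us` in the salary equation. This is a minimal plain-Python sketch that does not use pyAgrum; it assumes the education labels are encoded as 0, 1 and 2, and `reachable_salaries` is a hypothetical helper introduced for illustration.

```python
EDUCATION = {"low": 0, "medium": 1, "high": 2}  # assumed label encoding


def reachable_salaries(experience, education):
    """Salaries obtainable for a given experience and education, over all Us in [0, 25]."""
    base = 65 + 2.5 * experience + 5 * EDUCATION[education]
    return sorted({round(base + us) for us in range(0, 26)})


caroline = reachable_salaries(9, "high")
frances = reachable_salaries(13, "low")

# In both cases the deterministic part of the equation is 97.5 and Us >= 0,
# so the rounded salary can never be 97.
print(min(caroline), 97 in caroline)
print(min(frances), 97 in frances)
```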
