Creative Commons License
This pyAgrum notebook is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors: Aymen Merrouche and Pierre-Henri Wuillemin.

**Counterfactual in a nutshell**

In [1]:
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb
import pyAgrum.causal as csl
import pyAgrum.causal.notebook as cslnb

The next three cells show the fastest way to compute counterfactuals in pyAgrum:

  • build the model
  • fill the CPTs
  • compute counterfactuals
In [2]:
# building the model
edex = gum.fastBN("Ux[-2,10]->experience[0,20]<-education{low|medium|high}->salary[65,150]<-Us[0,25];experience->salary")
edex
Out[2]:
[Figure: causal graph with arcs Ux->experience, education->experience, education->salary, experience->salary, Us->salary]
In [8]:
# Filling the CPTs

# priors
edex.cpt("Us").fillWith(1).normalize()
edex.cpt("Ux").fillWith(1).normalize()
edex.cpt("education")[:] = [0.4, 0.4, 0.2]

# two equations
edex.cpt("experience").fillWithFunction("10-4*education+Ux")
edex.cpt("salary").fillWithFunction("round(65+2.5*experience+5*education+Us)");
In [9]:
# Counterfactual
pot=csl.counterfactual(cm = csl.CausalModel(edex), 
                       profile = {'experience':'8', 'education': 'low', 'salary' : '86'},
                       whatif={"education"},
                       on={"salary"}, 
                       values = {"education" : 'medium'})
gnb.showProba(pot)
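
To make explicit what the call above computes, here is a minimal pure-Python sketch of the usual three counterfactual steps (abduction, action, prediction) applied directly to the two structural equations of this notebook. It does not use pyAgrum and is not how `csl.counterfactual` is implemented. The helper names (`experience_eq`, `salary_eq`, `counterfactual_salaries`) are hypothetical, and the sketch assumes that `education` enters the equations as its label index (low=0, medium=1, high=2), as in the table of the next section. It does not check that the counterfactual experience stays inside `[0,20]`.

```python
# Assumption: education is encoded by its label index in the equations.
EDUCATION = {"low": 0, "medium": 1, "high": 2}
US_DOMAIN = range(0, 26)  # Us[0,25] in the fastBN specification


def experience_eq(education, ux):
    """experience = 10 - 4*education + Ux"""
    return 10 - 4 * education + ux


def salary_eq(experience, education, us):
    """salary = round(65 + 2.5*experience + 5*education + Us)"""
    return round(65 + 2.5 * experience + 5 * education + us)


def counterfactual_salaries(experience, education, salary, new_education):
    """Salaries the individual could have had with `new_education` (hypothetical helper).

    Returns a sorted list; it is empty when the observed profile is impossible.
    """
    ed = EDUCATION[education]
    new_ed = EDUCATION[new_education]

    # 1. Abduction: recover the exogenous terms compatible with the profile.
    ux = experience - 10 + 4 * ed
    us_candidates = [us for us in US_DOMAIN if salary_eq(experience, ed, us) == salary]

    # 2. Action: force education to its counterfactual value.
    # 3. Prediction: propagate through both equations with the same Ux and Us.
    new_experience = experience_eq(new_ed, ux)
    return sorted({salary_eq(new_experience, new_ed, us) for us in us_candidates})


# Same profile as the cell above (experience=8, education=low, salary=86).
print(counterfactual_salaries(8, "low", 86, "medium"))
```

Note that experience is recomputed in the counterfactual world: only the exogenous terms `Ux` and `Us` are carried over from the observed profile.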

Counterfactuals as a function

We can now fill (most of) the holes in the following table:

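| Employee | EX(u) | ED(u) | $S_{0}(u)$ | $S_{1}(u)$ | $S_{2}(u)$ |
|----------|-------|-------|------------|------------|------------|
| Alice    | 8     | 0     | 86,000     | ?          | ?          |
| Bert     | 9     | 1     | ?          | 92,500     | ?          |
| Caroline | 9     | 2     | ?          | ?          | 97,000     |
| David    | 8     | 1     | ?          | 91,000     | ?          |
| Ernest   | 12    | 1     | ?          | 100,000    | ?          |
| Frances  | 13    | 0     | 97,000     | ?          | ?          |
| etc.     |       |       |            |            |            |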
In [10]:
def mean(p):
    # expected value of a potential over a single numerical variable
    v = p.variable(0)
    return sum(v.numerical(i) * p[i] for i in range(v.domainSize()))

def affCounterfactualForStudent(model, name, ex, ed, sa, value):
    # print the expected salary had education been `value`, given the observed profile
    try:
        s0 = csl.counterfactual(cm=model,
                                profile={'experience': str(ex), 'education': ed, 'salary': str(sa)},
                                whatif={"education"},
                                on={"salary"},
                                values={"education": value})
        print("{:5.1f}| ".format(mean(s0)), end="")
    except Exception:
        # no counterfactual can be computed (e.g. the profile is impossible in the model)
        print(" --  | ", end="")

def forStudent(model, name, ex, ed, sa):
    # print one table row: the observed profile, then s0, s1, s2
    print("| {:20}| {:2.0f}| {:7}|  {:5.1f}|| ".format(name, ex, ed, sa), end="")
    for value in ['low', 'medium', 'high']:
        affCounterfactualForStudent(model, name, ex, ed, sa, value)
    print()

print("| Name                | Ex| Ed     | S     || s0   | s1   | s2   |")
print("------------------------------------------------------------------")
d=csl.CausalModel(edex)
forStudent(d,"Alice",8,"low",86)
forStudent(d,"Bert",9,"medium",92)
forStudent(d,"Caroline",9,"high",97)
forStudent(d,"Caroline",9,"high",98)
forStudent(d,"David",8,"medium",91)
forStudent(d,"Ernest",12,"medium",100)
forStudent(d,"Frances",13,"low",97)
forStudent(d,"Frances",13,"low",98)
| Name                | Ex| Ed     | S     || s0   | s1   | s2   |
------------------------------------------------------------------
| Alice               |  8| low    |   86.0||  86.0|  81.0|  76.0| 
| Bert                |  9| medium |   92.0||  98.0|  92.0|  88.0| 
| Caroline            |  9| high   |   97.0||  --  |  --  |  --  | 
| Caroline            |  9| high   |   98.0|| 108.0| 103.0|  98.0| 
| David               |  8| medium |   91.0||  96.0|  91.0|  86.0| 
| Ernest              | 12| medium |  100.0|| 105.0| 100.0|  95.0| 
| Frances             | 13| low    |   97.0||  --  |  --  |  --  | 
| Frances             | 13| low    |   98.0||  98.0|  93.0|  88.0| 

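We cannot answer for either Caroline or Frances when salary=97, because these profiles are impossible in our model: in both cases the salary equation reduces to round(97.5+Us) with Us ≥ 0, so the smallest possible salary is 98.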

In [11]:
gnb.showPosterior(edex,target="salary",evs={'experience':"9", 'education': "high"}) # 97 is not possible for Caroline
In [12]:
gnb.showPosterior(edex,target="salary",evs={'experience':"13", 'education': "low"}) # 97 is not possible for Frances
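
The same check can be done by hand, without pyAgrum, by enumerating every value of `Us` in the salary equation. This is only a sketch: `reachable_salaries` is a hypothetical helper, and it assumes that `education` is encoded by its label index (low=0, high=2) and that rounding follows Python's built-in `round`.

```python
def reachable_salaries(experience, education_index):
    """All salaries the equation can produce for Us in [0,25] (hypothetical helper)."""
    return sorted({round(65 + 2.5 * experience + 5 * education_index + us)
                   for us in range(0, 26)})


caroline = reachable_salaries(9, 2)   # experience=9, education=high
frances = reachable_salaries(13, 0)   # experience=13, education=low

# In both cases the smallest reachable salary is 98, so 97 cannot be observed.
print(min(caroline), 97 in caroline)
print(min(frances), 97 in frances)
```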