Learning essential graphs
In [1]:
from pylab import *
import matplotlib.pyplot as plt
import os
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb
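Compare learning algorithms
MIIC and 3off2 learn the essential graph (CPDAG) directly from data. Essential graphs are PDAGs (Partially Directed Acyclic Graphs): an essential graph represents a Markov equivalence class of DAGs, and an arc is directed only when every DAG in the class agrees on its orientation.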
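To make the notion of an equivalence class concrete, here is a minimal sketch in plain Python (no pyAgrum) of the classical criterion: two DAGs are Markov equivalent when they share the same skeleton and the same v-structures. The helper names `skeleton`, `v_structures` and `markov_equivalent` are hypothetical and exist only for this illustration. The sketch only tests equivalence; it does not build the full CPDAG, which also requires propagating the forced orientations.

```python
from itertools import combinations


def skeleton(arcs):
    """Undirected version of a DAG: each arc becomes an unordered pair."""
    return {frozenset(arc) for arc in arcs}


def v_structures(arcs):
    """Triples (a, c, b) such that a -> c <- b and a, b are not adjacent."""
    skel = skeleton(arcs)
    parents = {}
    for tail, head in arcs:
        parents.setdefault(head, set()).add(tail)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(arcs1, arcs2):
    """Same skeleton and same v-structures: same equivalence class."""
    return (skeleton(arcs1) == skeleton(arcs2)
            and v_structures(arcs1) == v_structures(arcs2))


# Toy DAGs given as sets of (tail, head) arcs.
chain = {("A", "B"), ("B", "C")}           # A -> B -> C
reversed_chain = {("C", "B"), ("B", "A")}  # A <- B <- C
collider = {("A", "B"), ("C", "B")}        # A -> B <- C

print(markov_equivalent(chain, reversed_chain))  # True
print(markov_equivalent(chain, collider))        # False
```

The two chains fall in the same class, so their essential graph leaves both edges undirected, whereas the collider is alone in its class and keeps both arcs directed.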
In [2]:
learner=gum.BNLearner("res/sample_asia.csv")
learner.use3off2()
learner.useNMLCorrection()
print(learner)
Filename : res/sample_asia.csv
Size : (50000,8)
Variables : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types : True
Missing values : False
Algorithm : 3off2
Score : BDeu
Correction : NML (Not used for score-based algorithms)
Prior : -
In [3]:
ge3off2=learner.learnEssentialGraph()
In [4]:
gnb.show(ge3off2)
In [5]:
learner=gum.BNLearner("res/sample_asia.csv")
learner.useMIIC()
learner.useNMLCorrection()
print(learner)
gemiic=learner.learnEssentialGraph()
gemiic
Filename : res/sample_asia.csv
Size : (50000,8)
Variables : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types : True
Missing values : False
Algorithm : MIIC
Score : BDeu
Correction : NML (Not used for score-based algorithms)
Prior : -
Out[5]:
For the other methods, the essential graph can be obtained from the learned BN.
In [6]:
learner=gum.BNLearner("res/sample_asia.csv")
learner.useGreedyHillClimbing()
bnHC=learner.learnBN()
print(learner)
geHC=gum.EssentialGraph(bnHC)
geHC
gnb.sideBySide(bnHC,geHC)
Filename : res/sample_asia.csv
Size : (50000,8)
Variables : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types : True
Missing values : False
Algorithm : Greedy Hill Climbing
Score : BDeu
Correction : MDL (Not used for score-based algorithms)
Prior : -
In [7]:
learner=gum.BNLearner("res/sample_asia.csv")
learner.useLocalSearchWithTabuList()
print(learner)
bnTL=learner.learnBN()
geTL=gum.EssentialGraph(bnTL)
geTL
gnb.sideBySide(bnTL,geTL)
Filename : res/sample_asia.csv
Size : (50000,8)
Variables : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types : True
Missing values : False
Algorithm : Local Search with Tabu List
Tabu list size : 2
Score : BDeu
Correction : MDL (Not used for score-based algorithms)
Prior : -
We can now compare the results of the four algorithms side by side.
In [8]:
(
    gnb.flow.clear()
    .add(ge3off2, "Essential graph from 3off2")
    .add(gemiic, "Essential graph from miic")
    .add(bnHC, "BayesNet from GHC")
    .add(geHC, "Essential graph from GHC")
    .add(bnTL, "BayesNet from TabuList")
    .add(geTL, "Essential graph from TabuList")
    .display()
)
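Beyond the visual comparison, two partially directed graphs can also be compared structurally. The following is a minimal sketch in plain Python, not part of pyAgrum: the helpers `adjacencies` and `compare_pdags` are hypothetical, and the inputs are toy sets of node-id pairs. It assumes that the arcs and edges of each essential graph are available as sets of pairs (for instance from the `arcs()` and `edges()` accessors of the graph objects, which is an assumption to check against the pyAgrum documentation).

```python
def adjacencies(arcs, edges):
    """All adjacent node pairs, ignoring direction."""
    return {frozenset(p) for p in arcs} | {frozenset(p) for p in edges}


def compare_pdags(arcs_a, edges_a, arcs_b, edges_b):
    """Report adjacency and orientation differences between two PDAGs."""
    adj_a = adjacencies(arcs_a, edges_a)
    adj_b = adjacencies(arcs_b, edges_b)

    def orientation(pair, arcs):
        # Return the directed arc for this pair, or None if it is undirected.
        x, y = tuple(pair)
        if (x, y) in arcs:
            return (x, y)
        if (y, x) in arcs:
            return (y, x)
        return None

    shared = adj_a & adj_b
    differs = {pair for pair in shared
               if orientation(pair, arcs_a) != orientation(pair, arcs_b)}
    return {
        "only_in_a": adj_a - adj_b,
        "only_in_b": adj_b - adj_a,
        "orientation_differs": differs,
    }


# Toy PDAGs (hypothetical data, not the graphs learned above).
arcs_a, edges_a = {(0, 2), (1, 2)}, {(2, 3)}
arcs_b, edges_b = {(0, 2)}, {(1, 2), (3, 4)}

report = compare_pdags(arcs_a, edges_a, arcs_b, edges_b)
for key, pairs in report.items():
    print(key, sorted(sorted(p) for p in pairs))
```

Separating adjacency differences from orientation differences helps tell apart two kinds of disagreement: algorithms that find different skeletons, and algorithms that agree on the skeleton but orient some links differently.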