Bayesian Networks: An Introduction

Ebook · 568 pages · 5 hours

About this ebook

Bayesian Networks: An Introduction provides a self-contained introduction to the theory and applications of Bayesian networks, a topic of interest and importance for statisticians, computer scientists and those involved in modelling complex data sets. The material has been extensively tested in classroom teaching and assumes a basic knowledge of probability, statistics and mathematics. All notions are carefully explained, and exercises are provided throughout.

Features include:

  • An introduction to the Dirichlet distribution, exponential families and their applications.
  • A detailed description of learning algorithms and conditional Gaussian distributions using junction tree methods.
  • A discussion of Pearl's intervention calculus, with an introduction to the notions of 'see' and 'do' conditioning.
  • All concepts are clearly defined and illustrated with examples and exercises. Solutions are provided online.

This book will prove a valuable resource for postgraduate students of statistics, computer engineering, mathematics, data mining, artificial intelligence, and biology.

Researchers and users of comparable modelling or statistical techniques such as neural networks will also find this book of interest.

Language: English
Publisher: Wiley
Release date: Aug 26, 2011
ISBN: 9781119964957

    Book preview

    Bayesian Networks - Timo Koski

    Preface

    This book evolved from courses developed at Linköping Institute of Technology (LiTH) and KTH, given by the authors, starting with a graduate course given in 2002 by Timo Koski, who was then Professor of Mathematical Statistics at LiTH; the material was subsequently developed by both authors. The book is aimed at senior undergraduate, masters and beginning Ph.D. students in computer engineering. The students are expected to have a first course in probability and statistics, a first course in discrete mathematics and a first course in algorithmics. The book provides an introduction to the theory of graphical models.

    A substantial list of references has been provided, which include the key works for the reader who wants to advance further in the topic.

    We have benefited over the years from discussions on Bayesian networks and Bayesian statistics with Elja Arjas, Stefan Arnborg, and Jukka Corander. We would like to thank colleagues from KTH, Jockum Aniansson, Gunnar Englund, Lars Holst and Bo Wahlberg for participating in (or suffering through) a series of lectures during the third academic quarter of 2007/2008 based on a preliminary version of the text and suggesting improvements as well as raising issues that needed clarification. We would also like to thank Mikael Skoglund for including the course in the ACCESS graduate school program at KTH. We thank doctoral and undergraduate students Luca Furrer, Maksym Girnyk, Ali Hamdi, Majid N. Khormuji, Mårten Marcus and Emil Rehnberg for pointing out several errors, misprints and bad formulations in the text and in the exercises. We thank Anna Talarczyk for invaluable help with the figures. All remaining errors and deficiencies are, of course, wholly our responsibility.

    1

    Graphical models and probabilistic reasoning

    1.1 Introduction

    This text considers the subject of graphical models, which is an interaction between probability theory and graph theory. The topic provides a natural tool for dealing with a large class of problems containing uncertainty and complexity. These features occur throughout applied mathematics and engineering and therefore the material has diverse applications in the engineering sciences. A complex model is built by combining simpler parts, an idea known as modularity. The uncertainty in the system is modelled using probability theory; the graph helps to indicate independence structures that enable the probability distribution to be decomposed into smaller pieces.

    Bayesian networks represent joint probability models among given variables. Each variable is represented by a node in a graph. The direct dependencies between the variables are represented by directed edges between the corresponding nodes and the conditional probabilities for each variable (that is the probabilities conditioned on the various possible combinations of values for the immediate predecessors in the network) are stored in potentials (or tables) attached to the dependent nodes. Information about the observed value of a variable is propagated through the network to update the probability distributions over other variables that are not observed directly. Using Bayes’ rule, these influences may also be identified in a ‘backwards’ direction, from dependent variables to their predecessors.
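    To make this concrete, the following minimal sketch (not from the book; the two-node smoking/cancer example and all numbers are invented for illustration) shows how the potentials attached to the nodes of a tiny network determine the joint distribution, and how Bayes' rule propagates an observation 'backwards':

```python
# A minimal two-node Bayesian network: Smoking -> Cancer.
# All names and numbers are illustrative assumptions, not values from the book.

p_smoking = {True: 0.3, False: 0.7}        # potential (table) for the root node
p_cancer_given_smoking = {                  # potential for the dependent node
    True:  {True: 0.05, False: 0.95},       # p(cancer | smoking)
    False: {True: 0.01, False: 0.99},       # p(cancer | no smoking)
}

def joint(s: bool, c: bool) -> float:
    """The joint probability p(S = s, C = c) factorizes along the graph."""
    return p_smoking[s] * p_cancer_given_smoking[s][c]

# Forward direction: p(C = True), summing out the unobserved parent S.
p_c = sum(joint(s, True) for s in (True, False))

# Backward direction, via Bayes' rule: p(S = True | C = True).
p_s_given_c = joint(True, True) / p_c

print(f"p(C = True) = {p_c:.4f}")                      # 0.3*0.05 + 0.7*0.01 = 0.0220
print(f"p(S = True | C = True) = {p_s_given_c:.4f}")   # 0.015/0.022 = 0.6818
```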

    The Bayesian approach to uncertainty ensures that the system as a whole remains consistent and provides a way to apply the model to data. Graph theory helps to illustrate and utilize independence structures within interacting sets of variables, hence facilitating the design of efficient algorithms.

    In many situations, the directed edges between variables in a Bayesian network can have a simple and natural interpretation as graphical representations of causal relationships. This occurs when a graph is used to model a situation where the values of the immediate predecessors of a variable in a network are to be interpreted as the immediate causes of the values taken by that variable. This representation of causal relationships is probabilistic; the relation between the value taken by a variable and the values taken by its predecessors is specified by a conditional probability distribution. When a graph structure is given and the modelling assumptions permit a causal interpretation, then the estimates of the conditional probability tables obtained from data may be used to infer a system of causation from a set of conditional probability distributions. A Bayesian network is essentially a directed acyclic graph, together with the associated conditional probability distributions. When a Bayesian network represents a causal structure between the variables, it may be used to assess the effects of an intervention, where the manipulation of a cause will influence the effect.

    The ability to infer causal relationships forms the basis for learning and acting in an intelligent manner in the external world. Statistical and probabilistic techniques may be used to assess direct associations between variables; some additional common sense and modelling assumptions, when these are appropriate, enable these direct associations to be understood as direct causal relations. It is the knowledge of causal relations, rather than simply statistical associations, that gives a sense of genuine understanding, together with a sense of potential control resulting from the ability to predict the consequences of actions that have not been performed, as J. Pearl writes in [1].

    K. Pearson, an early, pre-eminent statistician, argued that the only proper goal of scientific investigation was to provide descriptions of experience in a mathematical form (see [2] written by Pearson in 1892); for example, a coefficient of correlation. Any effort to advance beyond a description of associations and to deduce causal relations meant, according to this view, to evoke hidden or metaphysical ideas such as causes; he did not consider such modelling assumptions to be scientific.

    R.A. Fisher, possibly the most influential statistician, considered that causation could be inferred from experimental data only when controlled, or randomized experiments were employed. A majority of statistical studies follow the approach of Fisher and only infer ‘correlation’ or ‘association’ unless randomized experimental trials have been performed.

    It is not within the scope of this treatment of Bayesian networks to review the sophisticated attempts at characterizing causality and the ensuing controversies (starting with David Hume [3]) amongst scholars; the reader is referred to the enlightening treatment by J. Williamson [4]. This text attempts to take what seems to be a 'common sense' point of view. The human mind is capable of detecting and attributing causes of events in an intuitive manner. For example, the nineteenth century physician Ignaz Semmelweis in Vienna investigated, without knowing about germs, the causes of childbed fever. He instituted a policy that doctors should use a solution of chlorinated lime to wash their hands between autopsy work and the examination of patients, with the effect that the mortality rate in the maternity wards of hospitals dropped substantially. Such reasoning about causes and effects is not as straightforward for computers, and it is not clear that it is valid in terms of philosophical analysis.

    Causality statements are often expressed in terms of large and often complicated objects or populations: for example, 'smoking causes lung cancer'. Causal connections at finer levels of detail will have a different context: for example, the processes at the cell level that cause lung cancer. Causal connections are also contingent on many conditions and causal laws other than those explicitly under consideration. This introduces a level of uncertainty, and the causal connections are therefore probabilistic.

    Correlations or statistical associations between two variables often indicate causation somewhere in the system, even if there is no direct causal relation between the two variables in question; one need not be the cause of the other. A correlation may indicate the presence of hidden variables that are common causes of both the observed variables, so that the two observed variables are statistically associated.

    When there is a possibility that there may be such unknown hidden variables, it is necessary to separate the ‘cause’ from the extraneous factors that may influence it in order to conclude that there is a causal relationship and a randomized (or controlled) experiment achieves this. For a randomized experiment, the groups for each level of treatment, and a control group to which no treatment is applied, are chosen at random so that the allocation of members to treatment groups is not affected by any hidden variables. Unfortunately, there are situations where it may be unethical to carry out a randomized experiment. For example, to prove that smoking causes lung cancer, it is perhaps inappropriate to force a non-smoker to start smoking.

    In the example of smoking and lung cancer, the model is unknown and has to be inferred from data. The statistical analyst would like to establish whether smoking causes lung cancer, or whether there are additional hidden variables that are causes for both variables. Note that common sense plays a role here; the possibility that lung cancer may cause smoking is not considered. In terms of graphical models, where the direction of the pointed arrow indicates cause to effect, the analyst wants to determine which of the models in Figure 1.1 are appropriate.

    When the analyst carries out a controlled experiment, randomly assigning people to 'smokers' and 'non-smokers' respectively, the associations between the hidden variables and smoking are broken for this experiment and the causal diagram is therefore given by Figure 1.2.

    By carrying out a 'controlled experiment', an intervention is made whereby the causal path from the hidden variable is removed, thus ensuring that only the causal connection of interest is responsible for an observed association.

    In many situations, a controlled experiment seems the only satisfactory way to demonstrate conclusively that an association between variables is due to a causal link. Without a controlled experiment, there may remain some doubt. For example, levels of smoking dropped when the first announcements were made that smoking caused lung cancer and levels of lung cancer also dropped substantially. A controlled experiment would have demonstrated conclusively that the drop in lung cancer was not due to other environmental factors, such as a decline in heavy polluting industry that occurred at the same time.

    Figure 1.1 Smoking and lung cancer.

    Figure 1.2 Controlled experiment: the control has removed the hidden causes of smoking from consideration.

    Bayesian networks provide a straightforward mathematical language to express relations between variables in a clear way. In many engineering examples, the variables that should be present in the model are well defined. From an appropriate model that contains the hidden (or non-observable) variables and the observable variables, and where it is clear which variables may be intervened on, it will be possible to verify whether certain ‘identifiability’ conditions hold and hence to conclude whether or not there is a causal relation from the data, without a controlled experiment.

    Many of the classical probabilistic systems studied in fields such as systems engineering, information theory, pattern recognition and statistical mechanics are problems that may be expressed as graphical models. Hidden Markov models may be considered as graphical models. Engineers are also accustomed to using circuit diagrams, signal flow graphs, and trellises, which may be treated using the framework of graphical models.

    Examples of applied fields where Bayesian networks have recently been found to provide a natural mathematical framework are reliability engineering [5], software testing [6], cellular networks [7], and intrusion detection in computer systems [8]. In 1996, William H. Gates III, a co-founder of Microsoft, stated that expertise in Bayesian networks had enhanced the competitive advantage of Microsoft. Bayesian networks are used in software for electrical, financial and medical engineering, artificial intelligence and many other applications.

    1.2 Axioms of probability and basic notations

    The basic set notations that will be used will now be established.

    Definition 1.1 (Notations) The following notations will be used throughout:

    The universal set will be denoted by Ω. This is the context of the experiment. In Bayesian analysis, the unknown parameters are considered to be random. Therefore, the context Ω consists of the set of all possible outcomes of an experiment, together with all possible values of the unknown parameters.

    The notation $\mathcal{X}$ will be used to denote the sample space; the set of all possible outcomes of the experiment.

    The notation $\tilde{\Theta}$ will be used to denote the parameter space, so that $\Omega = \mathcal{X} \times \tilde{\Theta}$.

    The following notations will be used when considering sets:

    If $A$ and $B$ are two sets, then $A \cup B$ denotes their union. If $A_1, \ldots, A_n$ are a finite collection of sets, then $\bigcup_{j=1}^{n} A_j$ denotes their union. Also, $A_1 \cup \cdots \cup A_n$ may be used to denote their union.

    If $A$ and $B$ are two sets, then $A \cap B$ or $AB$ may be used to denote their intersection. If $A_1, \ldots, A_n$ are a finite collection of sets, then $\bigcap_{j=1}^{n} A_j$ and $A_1 \cdots A_n$ all denote their intersection.

    $A \subset B$ denotes that $A$ is a strict subset of $B$. $A \subseteq B$ denotes that $A$ is a subset of $B$, possibly equal to $B$.

    The empty set will be denoted by ϕ.

    $A^c$ denotes the complement of $A$; namely, $\Omega \setminus A$, where $\Omega$ denotes the universal set. The symbol $\setminus$ denotes exclusion.

    Together with the universal set $\Omega$, an event space $\mathcal{A}$ is required. This is a collection of subsets of $\Omega$. The event space $\mathcal{A}$ is an algebra of constructable sets. That is, $\mathcal{A}$ satisfies the following:

    1. $\Omega \in \mathcal{A}$ and $\phi \in \mathcal{A}$.

    2. Each $A \in \mathcal{A}$ may be constructed. That is, for each $A \in \mathcal{A}$, let $\mathbf{1}_A : \Omega \to \{0, 1\}$ denote the mapping such that $\mathbf{1}_A(\omega) = 1$ if $\omega \in A$ and $\mathbf{1}_A(\omega) = 0$ otherwise. Then there is a procedure to determine the value of $\mathbf{1}_A(\omega)$ for each $\omega \in \Omega$.¹

    3. If $A_1, \ldots, A_n$ is a finite collection of events such that each $A_j$ satisfies $A_j \in \mathcal{A}$, then $\bigcup_{j=1}^{n} A_j \in \mathcal{A}$.

    4. If $A \in \mathcal{A}$, then $A^c \in \mathcal{A}$, where $A^c = \Omega \setminus A$ denotes the complement of $A$.

    5. Each $A \in \mathcal{A}$ satisfies $A \subseteq \Omega$.

    For $\Omega$ and $\mathcal{A}$ satisfying the above conditions, a probability distribution $p$ over $(\Omega, \mathcal{A})$ (or, in short, a probability distribution over $\Omega$, when $\mathcal{A}$ is understood) is a function $p: \mathcal{A} \to [0, 1]$ satisfying the following version of the Kolmogorov axioms:

    1. $p(\phi) = 0$ and $p(\Omega) = 1$.

    2. If $A_1, \ldots, A_n$ is a finite collection such that each $A_j \in \mathcal{A}$ and the events satisfy $A_j \cap A_k = \phi$ for all $j \neq k$, then

    $$p\left(\bigcup_{j=1}^{n} A_j\right) = \sum_{j=1}^{n} p(A_j).$$

    3. $p(A) \geq 0$ for all $A \in \mathcal{A}$.

    This is a reduced version of the Kolmogorov axioms. The Kolmogorov axioms require countable additivity rather than finite additivity. They invoke axiomatic set theory and therefore do not require the constructive hypothesis. For the Kolmogorov axioms, the event space is taken to be a sigma-algebra and the second axiom requires countable additivity.

    Let

    $$\tilde{A} = A \times \tilde{\Theta}, \qquad A \subseteq \mathcal{X},$$

    and let $\tilde{\mathcal{A}} = \{\tilde{A} \mid A \subseteq \mathcal{X}\}$. Then $\mathcal{A}_{\mathcal{X}}$ will be used to denote the algebra $\{A \mid A \subseteq \mathcal{X}\}$ and $p(A) = p(\tilde{A})$.

    Definition 1.2 (Probability Distribution over $\mathcal{X}$) If $\mathcal{X}$ contains a finite number of elements, then $\mathcal{A}_{\mathcal{X}}$ contains a finite number of elements. In this setting, a probability distribution $p$ over $\mathcal{X}$ satisfies:

    $p(\{x\}) \geq 0$ for all $x \in \mathcal{X}$,

    For any $A \in \mathcal{A}_{\mathcal{X}}$, $p(A) = \sum_{x \in A} p(\{x\})$.

    In particular,

    $$\sum_{x \in \mathcal{X}} p(\{x\}) = 1.$$

    Definition 1.3 (Notation) Let $\mathcal{X}$ be a finite state space and let $\mathcal{A}_{\mathcal{X}}$ denote the set of all subsets of $\mathcal{X}$ (including $\phi$, the empty set, and $\mathcal{X}$). For the probability distribution $p: \mathcal{A}_{\mathcal{X}} \to [0, 1]$ defined above (Definition 1.2), $p$ will also be used to denote the function $p: \mathcal{X} \to [0, 1]$ such that $p(x) = p(\{x\})$. The meaning of $p$ will be clear from the context.
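    As an illustration of Definitions 1.2 and 1.3, the following sketch (with an invented four-element state space; not from the book) stores a distribution as point probabilities, extends it to events by summation, and checks the axioms listed above on the algebra of all subsets:

```python
from itertools import chain, combinations

# Point probabilities p({x}) on a small invented state space X.
X = ('a', 'b', 'c', 'd')
p_point = {'a': 0.1, 'b': 0.2, 'c': 0.3, 'd': 0.4}

def p(event):
    """p(A) = sum over x in A of p({x}), as in Definition 1.2."""
    return sum(p_point[x] for x in event)

# Here the algebra A_X is the set of all subsets of X (the power set).
events = list(chain.from_iterable(combinations(X, r) for r in range(len(X) + 1)))

assert p(()) == 0 and abs(p(X) - 1.0) < 1e-12   # p(empty) = 0, p(X) = 1
for A in events:
    assert p(A) >= 0.0                          # non-negativity
# Finite additivity for a pair of disjoint events.
assert abs(p(('a', 'b')) - (p(('a',)) + p(('b',)))) < 1e-12
print("axioms verified on the power set of X")
```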

    The definitions and notations will now be established for random variables.

    Definition 1.4 (Random Variables and Random Vectors) Discrete random variables, continuous random variables and random vectors satisfy the following properties:

    For this text, a discrete random variable $Y$ is a function $Y: \Omega \to C$, where $C$ is a countable space, that satisfies the following conditions: for each $x \in C$, $\{\omega \mid Y(\omega) = x\} \in \mathcal{A}$, and there is a function $p_Y: C \to \mathbb{R}_+$, known as the probability function of $Y$, such that for each $x \in C$,

    $$p_Y(x) = p(\{\omega \mid Y(\omega) = x\}).$$

    This is said to be the 'probability that $Y$ is instantiated at $x$'. Therefore, for any subset $\tilde{C} \subseteq C$, $p_Y$ satisfies

    $$p(Y \in \tilde{C}) = \sum_{x \in \tilde{C}} p_Y(x).$$

    In particular, taking $\tilde{C} = C$,

    $$\sum_{x \in C} p_Y(x) = 1.$$

    A continuous random variable $\Xi$ is defined as a function $\Xi: \Omega \to \mathbb{R}$ such that for any set $A \subseteq \mathbb{R}$ such that the function $\mathbf{1}_{A \cap [-N, N]}$ is Riemann integrable for all $N \in \mathbb{Z}_+$, the set $\{\omega \mid \Xi(\omega) \in A\} \in \mathcal{A}$, and for which there is a function $\pi_\Xi$, known as the probability density function, or simply density function, such that for any $A \subseteq \mathbb{R}$ with Riemann integrable indicator function,

    $$p(\Xi \in A) = \int_A \pi_\Xi(x)\,dx.$$

    In particular,

    $$\int_{-\infty}^{+\infty} \pi_\Xi(x)\,dx = 1.$$

    A random vector $Y$ is a vector of random variables. It will be taken as a row vector if it represents different characteristics of a single observation; it will be taken as a column vector if it represents a collection of independent identically distributed random variables.

    This convention is motivated by the way that data is presented in a data matrix. Each column of a data matrix represents a different attribute, each row represents a different observation.

    A random row vector $Y$ is a collection of random variables that satisfies the following requirements: suppose $Y = (Y_1, \ldots, Y_m)$ and for each $j = 1, \ldots, m$, $Y_j$ is a discrete random variable that takes values in a countable space $C_j$, and let $C = C_1 \times \cdots \times C_m$. Then $Y$ is a random vector if for each $(y_1, \ldots, y_m) \in C$, $\{\omega \mid Y_1(\omega) = y_1, \ldots, Y_m(\omega) = y_m\} \in \mathcal{A}$, and there is a joint probability function $p_{Y_1, \ldots, Y_m}: C \to [0, 1]$ such that for any set $\tilde{C} \subseteq C$,

    $$p(Y \in \tilde{C}) = \sum_{(y_1, \ldots, y_m) \in \tilde{C}} p_{Y_1, \ldots, Y_m}(y_1, \ldots, y_m).$$

    In particular,

    $$\sum_{(y_1, \ldots, y_m) \in C} p_{Y_1, \ldots, Y_m}(y_1, \ldots, y_m) = 1.$$

    If $\Xi = (\Xi_1, \ldots, \Xi_n)$ is a collection of $n$ random variables where for each $j = 1, \ldots, n$, $\Xi_j: \Omega \to \mathbb{R}$ is a continuous random variable, then $\Xi$ is a random vector if for each set $A \subseteq \mathbb{R}^n$ such that $\mathbf{1}_{A \cap [-N, N]^n}$ is Riemann integrable for each $N \in \mathbb{Z}_+$, $\{\omega \mid \Xi(\omega) \in A\} \in \mathcal{A}$,

    and there is a Riemann integrable function $\pi_{\Xi_1, \ldots, \Xi_n}: \mathbb{R}^n \to \mathbb{R}_+$, where $\mathbb{R}_+$ denotes the non-negative real numbers, such that for each set $A \subseteq \mathbb{R}^n$ such that $\mathbf{1}_{A \cap [-N, N]^n}$ is Riemann integrable for each $N \in \mathbb{Z}_+$,

    $$p(\Xi \in A) = \int_A \pi_{\Xi_1, \ldots, \Xi_n}(x_1, \ldots, x_n)\,dx_1 \cdots dx_n.$$

    In particular,

    $$\int_{\mathbb{R}^n} \pi_{\Xi_1, \ldots, \Xi_n}(x_1, \ldots, x_n)\,dx_1 \cdots dx_n = 1.$$

    A collection of random variables $Y$ of length $m + n$, containing $m$ discrete variables and $n$ continuous variables, is a random vector if it satisfies the following: there is an ordering $\sigma$ of $1, \ldots, m + n$ such that $(Y_{\sigma(1)}, \ldots, Y_{\sigma(m)})$ is a discrete random vector and $(Y_{\sigma(m+1)}, \ldots, Y_{\sigma(m+n)})$ is a continuous random vector. Furthermore, let $C$ denote the state space for $(Y_{\sigma(1)}, \ldots, Y_{\sigma(m)})$. Then for each $(y_1, \ldots, y_m) \in C$, there is a Riemann integrable function $\pi_{Y_{\sigma(m+1)}, \ldots, Y_{\sigma(m+n)} \mid y_1, \ldots, y_m}$ such that for any set $A \subseteq \mathbb{R}^n$ such that $\mathbf{1}_{A \cap [-N, N]^n}$ is Riemann integrable for each $N \in \mathbb{Z}_+$,

    $$p\left(Y_{\sigma(1)} = y_1, \ldots, Y_{\sigma(m)} = y_m, (Y_{\sigma(m+1)}, \ldots, Y_{\sigma(m+n)}) \in A\right) = \int_A \pi_{Y_{\sigma(m+1)}, \ldots, Y_{\sigma(m+n)} \mid y_1, \ldots, y_m}(x_1, \ldots, x_n)\,dx_1 \cdots dx_n,$$

    and such that

    $$\sum_{(y_1, \ldots, y_m) \in C} \int_{\mathbb{R}^n} \pi_{Y_{\sigma(m+1)}, \ldots, Y_{\sigma(m+n)} \mid y_1, \ldots, y_m}(x_1, \ldots, x_n)\,dx_1 \cdots dx_n = 1.$$

    Definition 1.5 (Marginal Distribution) Let $X = (X_1, \ldots, X_n)$ be a discrete random vector, with joint probability function $p_{X_1, \ldots, X_n}$. The probability distribution for $(X_{j_1}, \ldots, X_{j_m})$, where $m \leq n$ and $1 \leq j_1 < \cdots < j_m \leq n$, is known as the marginal distribution, and the marginal probability function is defined as

    $$p_{X_{j_1}, \ldots, X_{j_m}}(x_{j_1}, \ldots, x_{j_m}) = \sum_{\{x_k \,:\, k \notin \{j_1, \ldots, j_m\}\}} p_{X_1, \ldots, X_n}(x_1, \ldots, x_n).$$
    In particular, for two discrete variables $X$ and $Y$ taking values in the spaces $C_X$ and $C_Y$ respectively, with joint probability function $p_{X,Y}$, the marginal probability function for the random variable $X$ is defined as

    $$p_X(x) = \sum_{y \in C_Y} p_{X,Y}(x, y),$$

    and the marginal probability function for the random variable $Y$ is defined as

    $$p_Y(y) = \sum_{x \in C_X} p_{X,Y}(x, y).$$

    If $\Xi = (\Xi_1, \ldots, \Xi_n)$ is a continuous random vector, with joint probability density function $\pi_{\Xi_1, \ldots, \Xi_n}$, then the marginal density function for $(\Xi_{j_1}, \ldots, \Xi_{j_m})$, where $\{j_1, \ldots, j_m\} \subset \{1, \ldots, n\}$, is defined as

    $$\pi_{\Xi_{j_1}, \ldots, \Xi_{j_m}}(x_{j_1}, \ldots, x_{j_m}) = \int_{\mathbb{R}^{n-m}} \pi_{\Xi_1, \ldots, \Xi_n}(x_1, \ldots, x_n) \prod_{k \notin \{j_1, \ldots, j_m\}} dx_k.$$
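    A short sketch of these marginalization formulas for two discrete variables (the joint table below is an invented example, not from the book):

```python
# Invented joint probability table p_{X,Y} over C_X = {0, 1} and C_Y = {0, 1, 2}.
p_xy = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.05, (1, 2): 0.40,
}

C_X = {x for (x, _) in p_xy}
C_Y = {y for (_, y) in p_xy}

# Marginal probability functions, obtained by summing out the other variable.
p_x = {x: sum(p_xy[(x, y)] for y in C_Y) for x in C_X}
p_y = {y: sum(p_xy[(x, y)] for x in C_X) for y in C_Y}

print(p_x)  # {0: 0.4, 1: 0.6}
print(p_y)  # {0: 0.25, 1: 0.25, 2: 0.5}
```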
    Categorical random variables In this text, the sample space $\mathcal{X}$ will contain a finite, or countably infinite, number of outcomes, while usually the parameter space $\tilde{\Theta} \subseteq \mathbb{R}^n$, where $n$ is the number of parameters. Most of the discrete variables arising will be categorical, in the sense that the outcomes are classified in several categories. For example, suppose an urn contains 16 balls, of which four are white, six are green, five are blue and one is red. Pick a ball at random and let $C$ denote the colour of the ball; then $C$ is an example of a discrete random variable. The probability distribution of a categorical variable $C$ is denoted by $p_C$. Here, for example, $p_C(\text{white}) = 4/16$, $p_C(\text{green}) = 6/16$, $p_C(\text{blue}) = 5/16$ and $p_C(\text{red}) = 1/16$.

    The set-up described above is adequate for this text, where only two types of random variables are considered: continuous (which, by definition, have a Riemann integrable density function) and discrete variables. Statements of uncertainty made within this framework are consistent and coherent. Any probability function $p$ over $\mathcal{A}$ that is to provide a quantitative assessment of the probability of an event, and that is also to be mathematically coherent over a constructive algebra of events, must satisfy the axioms listed above. Any set function over an algebra of sets that satisfies these axioms will provide a mathematically coherent measure of uncertainty.

    1.3 The Bayes update of probability

    The prior probability is the probability distribution $p$ over $\mathcal{A}$ before any relevant data is obtained. The prior probability is related to background information, modelling assumptions, or is simply a function introduced for mathematical convenience, intended to express a degree of vagueness. It could be written $p(\cdot | K)$, where $K$ designates what could be called the context or a frame of knowledge [9]. The notation $p(\cdot | K)$ is often employed, although this is a little misleading since, formally, conditional probability is only defined in terms of an initial distribution. Here, the initial distribution is based on $K$. Since the notation is now standard, it will be employed in this text, although caution should be observed.

    Consider a prior distribution, denoted by $p(\cdot)$. Suppose the experimenter, or agent, has the information that an event $B \in \mathcal{A}$ holds and desires to update the probabilities based on this piece of information. This update involves introducing a new probability distribution $p^*$ on $\mathcal{A}$. Suppose also that $p(B) > 0$. Since it is now known that $B$ is a certainty, the update requires

    $$p^*(B) = 1,$$

    so that $p^*(B^c) = 0$, where $B^c = \Omega \setminus B$ is the complement of $B$ in $\Omega$. The updated probability is constructed so that the ratio of probabilities for any $B_1 \subseteq B$ and $B_2 \subseteq B$ does not change. That is, for $B_1, B_2 \subseteq B$ with $p(B_2) > 0$,

    $$\frac{p^*(B_1)}{p^*(B_2)} = \frac{p(B_1)}{p(B_2)}.$$

    For arbitrary $A \in \mathcal{A}$, the axioms yield

    $$p^*(A) = p^*(A \cap B) + p^*(A \cap B^c).$$

    But since $p^*(B^c) = 0$, it follows that $p^*(A \cap B^c) = 0$. In other words, it follows that for arbitrary $A \in \mathcal{A}$, since $p^*(B) = 1$,

    $$p^*(A) = p^*(A \cap B) = \frac{p(A \cap B)}{p(B)}.$$

    The transformation $p \to p^*$ is known as the Bayes update of probability. When the evidence obtained is precisely that an event within the algebra has happened, Bayes' rule may be used to update the probability distribution. It is customary to use the notation $p(A|B)$ to denote $p^*(A)$. Hence

    $$p(A|B) = \frac{p(A \cap B)}{p(B)}. \qquad (1.1)$$

    This is called the conditional probability of A given B. This characterization of the conditional probability p(A|B) follows [10]. Further discussion may be found in [11]. This is the update used when the evidence is precisely that an event has occurred. Different updates are required to incorporate knowledge that cannot be expressed in this way. This is discussed in [12].

    From the definition of $p(A|B)$, the trivial but important identity

    $$p(A \cap B) = p(A|B)\,p(B) \qquad (1.2)$$

    follows for any $A, B \in \mathcal{A}$.
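    A small numerical illustration of Equations (1.1) and (1.2) (the events and numbers below are invented for illustration):

```python
# Invented example: a fair six-sided die, with the uniform distribution.
# A = 'the outcome is even', B = 'the outcome is at least 4'.
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
B = {4, 5, 6}

def p(event):
    return len(event) / len(omega)   # uniform distribution on omega

# Equation (1.1): p(A | B) = p(A and B) / p(B).
p_A_given_B = p(A & B) / p(B)        # = (2/6) / (3/6) = 2/3

# Equation (1.2): p(A and B) = p(A | B) * p(B).
assert abs(p(A & B) - p_A_given_B * p(B)) < 1e-12
print(p_A_given_B)
```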

    The Bayes factor Bayes' rule simply states that for any two events $A$ and $C$,

    $$p(A|C) = \frac{p(C|A)\,p(A)}{p(C)}.$$

    If event $C$ represents new evidence, and $p^*(\cdot) = p(\cdot|C)$ represents the updated probability distribution, then for any two events $A$ and $B$, Bayes' rule yields:

    $$\frac{p^*(A)}{p^*(B)} = \frac{p(A|C)}{p(B|C)} = \frac{p(C|A)}{p(C|B)} \cdot \frac{p(A)}{p(B)}.$$

    The factor $p(C|A)/p(C|B)$ therefore updates the ratio $p(A)/p(B)$ to $p^*(A)/p^*(B)$. Note that

    $$\frac{p(C|A)}{p(C|B)} = \frac{p^*(A)/p^*(B)}{p(A)/p(B)}.$$

    This leads to the following definition, which will be used later.

    Definition 1.6 Let $p$ and $q$ denote two probability distributions over an algebra $\mathcal{A}$. For any two events $A, B \in \mathcal{A}$, the Bayes factor $F_{q,p}(A; B)$ is defined as

    $$F_{q,p}(A; B) = \frac{q(A)/q(B)}{p(A)/p(B)}. \qquad (1.3)$$

    Here p plays the role of a probability before updating and q plays the role of an updated probability. The Bayes factor indicates whether or not the new information has increased the odds of an event A relative to B.
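    Definition 1.6 translates directly into a computation; a minimal sketch (the distributions p and q below are invented):

```python
def bayes_factor(q, p, A, B):
    """F_{q,p}(A; B) = (q(A)/q(B)) / (p(A)/p(B)), as in Equation (1.3)."""
    return (q[A] / q[B]) / (p[A] / p[B])

# Invented before/after distributions over three exclusive events.
p = {'A': 0.2, 'B': 0.4, 'C': 0.4}   # before updating
q = {'A': 0.3, 'B': 0.3, 'C': 0.4}   # after updating

print(bayes_factor(q, p, 'A', 'B'))  # 2.0 > 1: the odds of A against B increased
```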

    Bayes' rule applied to random variables Let $X$ and $Y$ be two discrete random variables. Let $p_X$, $p_Y$, $p_{X|Y}$ and $p_{Y|X}$ denote the probability mass functions of $X$, $Y$, $X$ given $Y$ and $Y$ given $X$ respectively. It follows directly from Bayes' rule that for all $x, y$,

    $$p_{X|Y}(x|y) = \frac{p_{Y|X}(y|x)\,p_X(x)}{p_Y(y)}. \qquad (1.4)$$

    If $X$ and $Y$ are continuous random variables with density functions $\pi_X$ and $\pi_Y$, then the conditional probability density function of $X$, given $Y = y$, is

    $$\pi_{X|Y}(x|y) = \frac{\pi_{X,Y}(x, y)}{\pi_Y(y)} = \frac{\pi_{Y|X}(y|x)\,\pi_X(x)}{\pi_Y(y)}.$$

    If $X$ is a discrete random variable and $\Theta$ is a continuous random variable with state space $\tilde{\Theta}$, where $p_{X|\Theta}(\cdot|\theta)$ is the conditional probability function for $X$ given $\Theta = \theta$ and $\Theta$ has density function $\pi_\Theta$, then Bayes' rule in this context gives

    $$\pi_{\Theta|X}(\theta|x) = \frac{p_{X|\Theta}(x|\theta)\,\pi_\Theta(\theta)}{\int_{\tilde{\Theta}} p_{X|\Theta}(x|t)\,\pi_\Theta(t)\,dt}. \qquad (1.5)$$
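    Equation (1.5) can be evaluated numerically. The following sketch (a Bernoulli likelihood with a uniform prior on [0, 1], chosen purely for illustration) approximates the posterior density on a grid:

```python
import numpy as np

# X | Theta = theta is Bernoulli(theta); the prior pi_Theta is uniform on [0, 1].
theta = np.linspace(0.0, 1.0, 1001)           # grid over the parameter space
prior = np.ones_like(theta)                   # pi_Theta(theta) = 1

x = 1                                         # observed value of X
likelihood = theta**x * (1 - theta)**(1 - x)  # p_{X|Theta}(x | theta)

numerator = likelihood * prior
evidence = np.trapz(numerator, theta)         # the integral in the denominator
posterior = numerator / evidence              # pi_{Theta|X}(theta | x)

print(np.trapz(posterior, theta))             # approximately 1.0
print(theta[np.argmax(posterior)])            # posterior mode; here 1.0
```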

    1.4 Inductive learning

    The task of inductive learning is, roughly stated, to find a general law based on a finite number of particular examples. Without further information, a law established in this way cannot be a certainty, but necessarily has a level of uncertainty attached to it. The assessment uses the idea that future events similar to past events will cause similar outcomes, so that outcomes from past events may be used to predict outcomes from future events. There is a subjective component in the assessment of the uncertainty, which enters through the prior distribution. This is the assessment made before any particular examples are taken into consideration.

    In any real situation within the engineering sciences, a mathematical model is, at best, only a model and can never present a full description of the situation that is being modelled. In many situations, the information is not presented in terms of absolute certainties to which deductive logic may be applied, and 'inductive learning', as described above, is the only way to proceed.

    For machine learning of situations arising in the engineering sciences, the machine learns inductively from past examples, from which it makes predictions of future behaviour and takes appropriate decisions; for example, deciding whether a paper mill is running abnormally and, if it is, shutting it down and locating the error.

    In the following discussion, the second person singular pronoun, 'You', will be used to denote the person, or machine,² with well defined instructions, analysing some uncertain statements and making predictions based on this analysis.

    The Bayes update rule is now applied to learning from experience. The update is based on the definition of the conditional probability $p(E|A)$ of an event $E$ given an event $A$. The rule for calculating the conditional probability of $A$ given $E$ (or, given that $E$ 'occurs') is known as Bayes' rule and is given by

    $$p(A|E) = \frac{p(E|A)\,p(A)}{p(E)}. \qquad (1.6)$$

    Equation (1.6) follows immediately from the definitions introduced in Equations (1.1) and (1.2). Here $E$ is a mnemonic notation for evidence.

    The formula (1.6) was for a long time known more widely as 'inverse probability' rather than as 'Bayes' rule'. The quantity $p(A|E)$ does not always need to be computed using this formula; sometimes it is arrived at by other means. For example, Jeffrey's update rule, as will be seen later, uses a particular set of conditional probabilities as fundamental and derives the other probabilities from them.

    1.4.1 Bayes’ rule

    A more instructive form of Bayes' rule is obtained by considering a finite exhaustive set of mutually exclusive hypotheses $H_1, \ldots, H_n$. The law of total probability gives, for any $E$,

    $$p(E) = \sum_{j=1}^{n} p(E|H_j)\,p(H_j), \qquad (1.7)$$

    and Equation (1.6) yields, for any $H_i$,

    $$p(H_i|E) = \frac{p(E|H_i)\,p(H_i)}{\sum_{j=1}^{n} p(E|H_j)\,p(H_j)}. \qquad (1.8)$$

    Here $p(H_i)$ is the initial or prior probability for a hypothesis, before any evidence $E$ is obtained. The prior probability will have a subjective element, depending on background information and how You interpret this information. As discussed earlier, this background information is often denoted by the letter $K$. In computer science, this is often referred to as domain knowledge [14]. The prior is therefore often denoted $p(H_i|K)$. As discussed earlier, this is misleading, because no application of the formula given in Equation (1.1) has been made. The notation is simplified by dropping $K$.

    The quantity $p(H_i|E)$ in Equation (1.8) denotes how $p(H_i)$ is updated to a new probability, called the posterior probability, based on this new evidence. This update is the key concept of Bayesian learning from experience.
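    Equations (1.7) and (1.8) amount to normalizing the products of priors and likelihoods; a small sketch with invented numbers:

```python
# Invented: three mutually exclusive, exhaustive hypotheses with priors p(H_i),
priors = [0.5, 0.3, 0.2]
# and likelihoods p(E | H_i) of the observed evidence E under each hypothesis.
likelihoods = [0.1, 0.4, 0.7]

# Equation (1.7): p(E) by the law of total probability.
p_E = sum(l * pr for l, pr in zip(likelihoods, priors))

# Equation (1.8): posterior probabilities p(H_i | E).
posteriors = [l * pr / p_E for l, pr in zip(likelihoods, priors)]

print(p_E)          # 0.31
print(posteriors)   # [0.161..., 0.387..., 0.451...]
```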

    There is a basic question: why should probability calculus be used when performing the inductive logic, in the presence of uncertainty, described above? This is connected with the notion of coherence. The probability calculus outlined above ensures that the uncertainty statements 'fit together' or 'cohere', and it is the only way to ensure that mathematical statements of uncertainty are consistent. To ensure coherence, inductive logic should therefore be expressed through probability. Inferences about uncertain events $H_i$ from an observed event $E$ will be mathematically coherent if and only if they are made by computing $p(H_i|E)$ in the way described above. All calculations, in order to achieve coherence, must be made within this probability calculus. Hence, to make coherent statements about different events, Your calculations for learning by experience have to satisfy Bayes' rule.

    The introduction of an arbitrary prior distribution may appear, at first sight, to be ad hoc. The important point here is that all modelling contains some ad hoc element in the choice of the model. The strength of the Bayesian approach is that once the prior distribution is declared, this contains the modelling assumptions and the ad hoc element becomes transparent. The ad hoc element is always present and is stated clearly in the Bayesian approach.

    When an event $E$ occurs that belongs to the well defined algebra of events, over which there are well defined probabilities, the Bayes rule for updating the probability of an event $A$ from $p(A)$ to $p(A|E)$ is applied. There are situations where the evidence $E$ may be given in terms of observations that are less precise (that is, $E$ is not an event that clearly belongs to the algebra), but nevertheless an update of the probability function $p(\cdot)$ is required. Jeffrey's rule and Pearl's method of virtual evidence can be useful in these situations.

    1.4.2 Jeffrey’s rule

    Suppose that the new evidence implies that You form an exhaustive set of $r$ mutually exclusive hypotheses $H_1, \ldots, H_r$ which, following the soft evidence, have probabilities $q_1, \ldots, q_r$. The question is how to update the probability of any event $A \in \mathcal{A}$. Note that You cannot use Bayes' rule (Equation (1.6)), since the evidence has not been expressed in terms of a well defined event $E$.
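    Jeffrey's rule, in the form commonly stated, keeps the conditional probabilities $p(A|H_i)$ fixed and replaces the hypothesis probabilities by the soft-evidence values $q_i$, giving $p^*(A) = \sum_{i=1}^{r} q_i\,p(A|H_i)$. A minimal sketch under that assumption (all numbers invented):

```python
# Soft evidence: new probabilities q_i for mutually exclusive, exhaustive H_1..H_3.
q = [0.7, 0.2, 0.1]
# Conditional probabilities p(A | H_i), assumed unchanged by the soft evidence.
p_A_given_H = [0.9, 0.5, 0.2]

# Jeffrey's update: p*(A) = sum_i q_i * p(A | H_i).
p_star_A = sum(qi * pa for qi, pa in zip(q, p_A_given_H))
print(p_star_A)  # 0.7*0.9 + 0.2*0.5 + 0.1*0.2 = 0.75
```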
