Combining Pattern Classifiers: Methods and Algorithms

Ebook, 721 pages, 6 hours

Name: Combining Pattern Classifiers: Methods and Algorithms
Author: Ludmila I. Kuncheva
ISBN: 9781118914540

About this ebook

A unified, coherent treatment of current classifier ensemble methods, from fundamentals of pattern recognition to ensemble feature selection, now in its second edition

The art and science of combining pattern classifiers has flourished into a prolific discipline since the first edition of Combining Pattern Classifiers was published in 2004. Dr. Kuncheva has plucked from the rich landscape of recent classifier ensemble literature the topics, methods, and algorithms that will guide the reader toward a deeper understanding of the fundamentals, design, and applications of classifier ensemble methods.

Thoroughly updated, with MATLAB® code and practice data sets throughout, Combining Pattern Classifiers includes:

Coverage of Bayes decision theory and experimental comparison of classifiers
Essential ensemble methods such as Bagging, Random forest, AdaBoost, Random subspace, Rotation forest, Random oracle, and Error Correcting Output Code, among others
Chapters on classifier selection, diversity, and ensemble feature selection

With firm grounding in the fundamentals of pattern recognition, and featuring more than 140 illustrations, Combining Pattern Classifiers, Second Edition is a valuable reference for postgraduate students, researchers, and practitioners in computing and engineering.

Language: English
Publisher: Wiley
Release date: Aug 13, 2014
ISBN: 9781118914540



Combining Pattern Classifiers - Ludmila I. Kuncheva

PREFACE

Pattern recognition is everywhere. It is the technology behind automatically identifying fraudulent bank transactions, giving verbal instructions to your mobile phone, predicting oil deposit odds, or segmenting a brain tumour within a magnetic resonance image.

A decade has passed since the first edition of this book. Combining classifiers, also known as classifier ensembles, has flourished into a prolific discipline. Viewed from the top, classifier ensembles reside at the intersection of engineering, computing, and mathematics. Zoomed in, classifier ensembles are fuelled by advances in pattern recognition, machine learning and data mining, among others. An ensemble aggregates the opinions of several pattern classifiers in the hope that the new opinion will be better than the individual ones. Vox populi, vox Dei.

The interest in classifier ensembles received a welcome boost due to the high-profile Netflix contest. The world’s research creativeness was challenged using a difficult task and a substantial reward. The problem was to predict whether a person will enjoy a movie based on their past movie preferences. A Grand Prize of $1,000,000 was to be awarded to the team who first achieved a 10% improvement on the classification accuracy of the existing system Cinematch. The contest was launched in October 2006, and the prize was awarded in September 2009. The winning solution was nothing else but a rather fancy classifier ensemble.

What is wrong with the good old single classifiers? Jokingly, I often put up a slide in presentations, with a multiple-choice question. The question is "Why classifier ensembles?" and the three possible answers are:

(a) because we like to complicate entities beyond necessity (anti-Occam’s razor);

(b) because we are lazy and stupid and cannot be bothered to design and train one single sophisticated classifier; and

(c) because democracy is so important to our society, it must be important to classification.

Funnily enough, the real answer hinges on choice (b). Of course, it is not a matter of laziness or stupidity, but the realization that a complex problem can be elegantly solved using simple and manageable tools. Recall the invention of the error backpropagation algorithm followed by the dramatic resurfacing of neural networks in the 1980s. Neural networks were proved to be universal approximators with unlimited flexibility. They could approximate any classification boundary in any number of dimensions. This capability, however, comes at a price. Large structures with a vast number of parameters have to be trained. The initial excitement cooled down as it transpired that massive structures cannot be easily trained with sufficient guarantees of good generalization performance. Until recently, a typical neural network classifier contained one hidden layer with a dozen neurons, sacrificing the so acclaimed flexibility but gaining credibility. Enter classifier ensembles! Ensembles of simple neural networks are among the most versatile and successful ensemble methods.

But the story does not end here. Recent studies have rekindled the excitement of using massive neural networks drawing upon hardware advances such as parallel computations using graphics processing units (GPU) [75]. The giant data sets necessary for training such structures are generated by small distortions of the available set. These conceptually different rival approaches to machine learning can be regarded as divide-and-conquer and brute force, respectively. It seems that the jury is still out about their relative merits. In this book we adopt the divide-and-conquer approach.

THE PLAYING FIELD

Writing the first edition of the book felt like the overwhelming task of bringing structure and organization to a hoarder’s attic. The scenery has changed markedly since then. The series of workshops on Multiple Classifier Systems (MCS), run since 2000 by Fabio Roli and Josef Kittler [338], served as a beacon, inspiration, and guidance for experienced and new researchers alike. Excellent surveys shaped the field, among which are the works by Polikar [311], Brown [53], and Valentini and Re [397]. Better still, four recent texts together present accessible, in-depth, comprehensive, and exquisite coverage of the classifier ensemble area: Rokach [335], Zhou [439], Schapire and Freund [351], and Seni and Elder [355]. This gives me the comfort and luxury to be able to skim over topics which are discussed at length and in-depth elsewhere, and pick ones which I believe deserve more exposure or which I just find curious.

As in the first edition, I have no ambition to present an accurate snapshot of the state of the art. Instead, I have chosen to explain and illustrate some methods and algorithms, giving sufficient detail so that the reader can reproduce them in code. Although I venture an opinion based on general consensus and examples in the text, this should not be regarded as a guide for preferring one method to another.

SOFTWARE

A rich set of classifier ensemble methods is implemented in WEKA¹ [167], a collection of machine learning algorithms for data-mining tasks. PRTools² is a MATLAB toolbox for pattern recognition developed by the Pattern Recognition Research Group of the TU Delft, The Netherlands, led by Professor R. P. W. (Bob) Duin. An industry-oriented spin-off toolbox, called perClass,³ was designed later. Classifier ensembles feature prominently in both packages.

PRTools and perClass are instruments for advanced MATLAB programmers and can also be used by practitioners after a short training. The recent edition of MATLAB Statistics toolbox (2013b) includes a classifier ensemble suite as well.

Snippets of MATLAB DIY (do-it-yourself) code for illustrating methodologies and concepts are given in the chapter appendices. MATLAB was seen as a suitable language for such illustrations because it often looks like executable pseudo-code. A programming language is like a living creature: it grows, develops, changes, and breeds. The code in the book is written according to today’s versions, styles, and conventions. It does not, by any means, measure up to the richness, elegance, and sophistication of PRTools and perClass. Aimed at simplicity, the code is neither fool-proof nor optimized for time or other efficiency criteria. Its sole purpose is to enable the reader to grasp the ideas and run their own small-scale experiments.

STRUCTURE AND WHAT IS NEW IN THE SECOND EDITION

The book is organized as follows.

Chapter 1, Fundamentals, gives an introduction to the main concepts in pattern recognition, Bayes decision theory, and experimental comparison of classifiers. A new treatment of the classifier comparison issue is offered (after Demšar [89]). The bias and variance decomposition of the error, which was discussed in greater detail in Chapter 7 of the first edition (bagging and boosting), is now briefly introduced and illustrated in Chapter 1.

Chapter 2, Base Classifiers, contains methods and algorithms for designing the individual classifiers. In this edition, a special emphasis is put on the stability of the classifier models. To aid the discussions and illustrations throughout the book, a toy two-dimensional data set was created called the fish data. The Naïve Bayes classifier and the support vector machine classifier (SVM) are brought to the fore as they are often used in classifier ensembles. In the final section of this chapter, I introduce the triangle diagram that can enrich the analyses of pattern recognition methods.

Chapter 3, Multiple Classifier Systems, discusses some general questions in combining classifiers. It has undergone a major makeover. The new final section, "Quo Vadis?", asks questions such as "Are we reinventing the wheel?" and "Has the progress thus far been illusory?" It also contains a bibliometric snapshot of the area of classifier ensembles as of January 4, 2013 using Thomson Reuters’ Web of Knowledge (WoK).

Chapter 4, Combining Label Outputs, introduces a new theoretical framework which defines the optimality conditions of several fusion rules by progressively relaxing an assumption. The Behavior Knowledge Space method is trimmed down and illustrated better in this edition. The combination method based on singular value decomposition (SVD) has been dropped.

Chapter 5, Combining Continuous-Valued Outputs, summarizes classifier fusion methods such as simple and weighted average, decision templates and a classifier used as a combiner. The division of methods into class-conscious and class-independent in the first edition was regarded as surplus and was therefore abandoned.

Chapter 6, Ensemble Methods, grew out of the former Bagging and Boosting chapter. It now accommodates on an equal footing the reigning classics in classifier ensembles: bagging, random forest, AdaBoost and random subspace, as well as a couple of newcomers: rotation forest and random oracle. The Error Correcting Output Code (ECOC) ensemble method is included here, having been cast as Miscellanea in the first edition of the book. Based on the interest in this method, as well as its success, ECOC’s rightful place is together with the classics.

Chapter 7, Classifier Selection, explains why this approach works and how classifier competence regions are estimated. The chapter contains new examples and illustrations.

Chapter 8, Diversity, gives a modern view on ensemble diversity, raising at the same time some old questions, which are still puzzling the researchers in spite of the remarkable progress made in the area. There is a frighteningly large number of possible new diversity measures, lurking as binary similarity and distance measures (take for example Choi et al.’s study [74] with 76, s-e-v-e-n-t-y s-i-x, such measures). And we have not even touched the continuous-valued outputs and the possible diversity measured from those. The message in this chapter is stronger now: we hardly need any more diversity measures; we need to pick a few and learn how to use them. In view of this, I have included a theoretical bound on the kappa-error diagram [243] which shows how much space is still there for new ensemble methods with engineered diversity.

Chapter 9, Ensemble Feature Selection, considers feature selection by the ensemble and for the ensemble. It was born from a section in the former Chapter 8, Miscellanea. The expansion was deemed necessary because of the surge of interest in ensemble feature selection from a variety of application areas, notably bioinformatics [346]. I have included a stability index between feature subsets or between feature rankings [236].

I picked a figure from each chapter to create a small graphical guide to the contents of the book as illustrated in Figure 1.

FIGURE 1 The book chapters at a glance.

The former Theory chapter (Chapter 9) was dissolved; parts of it are now blended with the rest of the content of the book. Lengthier proofs are relegated to the respective chapter appendices. Some of the proofs and derivations were dropped altogether, for example, the theory behind the magic of AdaBoost. Plenty of literature sources can be consulted for the proofs and derivations left out.

The differences between the two editions reflect the fact that classifier ensemble research has made a giant leap; some methods and techniques discussed in the first edition did not withstand the test of time, others were replaced with modern versions. The dramatic expansion of some sub-areas forced me, unfortunately, to drop topics such as cluster ensembles and stay away from topics such as classifier ensembles for adaptive (on-line) learning, learning in the presence of concept drift, semi-supervised learning, active learning, and handling imbalanced classes and missing values. Each of these sub-areas will likely see a bespoke monograph in the not so distant future. I look forward to that.

I am humbled by the enormous volume of literature on the subject, and the ingenious ideas and solutions within. My sincere apology to those authors, whose excellent research into classifier ensembles went without citation in this book because of lack of space or because of unawareness on my part.

WHO IS THIS BOOK FOR?

The book is suitable for postgraduate students and researchers in computing and engineering, as well as practitioners with some technical background. The assumed level of mathematics is minimal and includes a basic understanding of probabilities and simple linear algebra. Beginner’s MATLAB programming knowledge would be beneficial but is not essential.

Ludmila I. Kuncheva

Bangor, Gwynedd, UK

December 2013

NOTES

1. http://www.cs.waikato.ac.nz/ml/weka/

2. http://prtools.org/

3. http://perclass.com/index.php/html/

ACKNOWLEDGEMENTS

I am most sincerely indebted to Gavin Brown, Juan Rodríguez, and Kami Kountcheva for scrutinizing the manuscript and returning to me their invaluable comments, suggestions, and corrections. Many heartfelt thanks go to my family and friends for their constant support and encouragement. Last but not least, thank you, my reader, for picking up this book.

Ludmila I. Kuncheva

Bangor, Gwynedd, UK

December 2013

FUNDAMENTALS OF PATTERN RECOGNITION

1.1 BASIC CONCEPTS: CLASS, FEATURE, DATA SET

A wealth of literature in the 1960s and 1970s laid the grounds for modern pattern recognition [386; 106; 141; 340; 282; 305; 353; 290; 90; 140]. Faced with the formidable challenges of real-life problems, elegant theories still coexist with ad hoc ideas, intuition, and guessing.

Pattern recognition is about assigning labels to objects. Objects are described by features, also called attributes. A classic example is recognition of handwritten digits for the purpose of automatic mail sorting. Figure 1.1 shows a small data sample. Each 15×15 image is one object. Its class label is the digit it represents, and the features can be extracted from the binary matrix of pixels.

FIGURE 1.1 Example of images of handwritten digits.

1.1.1 Classes and Class Labels

Intuitively, a class contains similar objects, whereas objects from different classes are dissimilar. Some classes have a clear-cut meaning, and in the simplest case are mutually exclusive. For example, in signature verification, the signature is either genuine or forged. The true class is one of the two, regardless of what we might deduce from the observation of a particular signature. In other problems, classes might be difficult to define, for example, the classes of left-handed and right-handed people or ordered categories such as low risk, medium risk, and high risk.

We shall assume that there are c possible classes in the problem, labeled from ω1 to ωc, organized as a set of labels Ω = {ω1, …, ωc}, and that each object belongs to one and only one class.

1.1.2 Features

Throughout this book we shall consider numerical features. Such are, for example, systolic blood pressure, the speed of the wind, a company’s net profit in the past 12 months, the gray-level intensity of a pixel. Real-life problems are invariably more complex than that. Features can come in the forms of categories, structures, names, types of entities, hierarchies, and so on. Such nonnumerical features can be transformed into numerical ones. For example, a feature "country of origin" can be encoded as a binary vector with the number of elements equal to the number of possible countries, where each bit corresponds to a country. The vector will contain 1 for a specified country and zeros elsewhere. In this way one feature gives rise to a collection of related numerical features. Alternatively, we can keep just the one feature where the categories are represented by different values. Depending on the classifier model we choose, the ordering of the categories and the scaling of the values may have a positive, negative, or neutral effect on the relevance of the feature. Sometimes the methodologies for quantifying features are highly subjective and heuristic. For example, sitting an exam is a methodology to quantify a student’s learning progress. There are also unmeasurable features that we as humans can assess intuitively but can hardly explain. Examples of such features are sense of humor, intelligence, and beauty.
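The two encodings of a categorical feature described above can be sketched in a few lines. This is an illustrative Python sketch, not code from the book (whose listings are in MATLAB); the function name one_hot and the list of countries are hypothetical.

```python
def one_hot(value, categories):
    """Return a binary vector with a 1 at the position of `value`, zeros elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if c == value else 0 for c in categories]


# Hypothetical list of possible countries (illustration only).
countries = ["Bulgaria", "Netherlands", "UK"]

# Encoding 1: one feature becomes len(countries) related binary features.
encoded = one_hot("Netherlands", countries)
print(encoded)  # [0, 1, 0]

# Encoding 2: keep a single feature whose value is the category index.
# This imposes an arbitrary ordering on the categories, which may help,
# harm, or not matter, depending on the classifier model.
index_coded = countries.index("Netherlands")
print(index_coded)  # 1
```

The binary encoding avoids an artificial ordering at the price of more features; the single-index encoding is compact but makes "UK" numerically closer to "Netherlands" than to "Bulgaria" for no substantive reason.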

Once in a numerical format, the feature values for a given object are arranged as an n-dimensional vector x = [x1, …, xn]T ∈ ℝⁿ. The real space ℝⁿ is called the feature space, each axis corresponding to a feature.

Sometimes an object can be represented by multiple, disjoint subsets of features. For example, in identity verification, three different sensing modalities can be used [207]: frontal face, face profile, and voice. Specific feature subsets are measured for each modality and then the feature vector is composed of three sub-vectors, x = [x(1), x(2), x(3)]T. We call this distinct pattern representation after Kittler et al. [207]. As we shall see later, an ensemble of classifiers can be built using distinct pattern representation, with one classifier on each feature subset.

1.1.3 Data Set

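The information needed to design a classifier is usually in the form of a labeled data set Z = {z1, …, zN}, zj ∈ ℝⁿ. The class label of zj is denoted by yj ∈ Ω, j = 1, …, N. A typical data set is organized as a matrix of N rows (objects, also called examples or instances) by n columns (features), with an extra column with the class labels:

$$
\text{Data set} =
\begin{bmatrix}
z_{1,1} & z_{1,2} & \cdots & z_{1,n} \\
z_{2,1} & z_{2,2} & \cdots & z_{2,n} \\
\vdots  & \vdots  & \ddots & \vdots  \\
z_{N,1} & z_{N,2} & \cdots & z_{N,n}
\end{bmatrix},
\qquad
\text{Labels} =
\begin{bmatrix}
y_1 \\ y_2 \\ \vdots \\ y_N
\end{bmatrix}
$$

Entry zj,i is the value of the i-th feature for the j-th object.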

Example 1.1 A shape–color synthetic data set

Consider a data set with two classes, both containing a collection of the following six objects: white triangle, white square, white circle, black triangle, black square, and black circle. Figure 1.2 shows an example of such a data set. The collections of objects for the two classes are plotted next to one another. Class ω1 is shaded. The features are only the shape and the color (black or white); the positioning of the objects within the two dimensions is not relevant. The data set contains 256 objects. Each object is labeled in its true class. We can code the color as 0 for white and 1 for black, and the shapes as triangle = 1, square = 2, and circle = 3.

Based on the two features, the classes are not completely separable. It can be observed that there are mostly circles in ω1 and mostly squares in ω2. Also, the proportion of black objects in class ω2 is much larger. Thus, if we observe a color and a shape, we can make a decision about the class label. To evaluate the distribution of different objects in the two classes, we can count the number of appearances of each object. The distributions are as follows:

With the distributions obtained from the given data set, it makes sense to choose class ω1 if we have a circle (of any color) or a white triangle. For all other possible combinations of values, we should choose label ω2. Thus using only these two features for labeling, we will make 43 errors (16.8%).
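The labeling rule used here (for each observed combination of color and shape, choose the class that occurs most often) can be sketched as follows. The ten objects below are a hypothetical toy sample invented for illustration; they are not the 256-object data set of Figure 1.2, and the function names are assumptions. Python is used only for the sketch.

```python
from collections import Counter, defaultdict


def fit_majority_rule(objects, labels):
    """Map each observed feature combination to its most frequent class label."""
    counts = defaultdict(Counter)
    for obj, label in zip(objects, labels):
        counts[obj][label] += 1
    return {obj: c.most_common(1)[0][0] for obj, c in counts.items()}


def count_errors(rule, objects, labels):
    """Number of objects whose rule-assigned label differs from the true label."""
    return sum(rule[obj] != label for obj, label in zip(objects, labels))


# Hypothetical toy sample. Each object is (color, shape) with
# color: 0 = white, 1 = black; shape: 1 = triangle, 2 = square, 3 = circle.
objects = [(0, 3), (0, 3), (1, 3), (1, 2), (1, 2),
           (0, 2), (0, 1), (1, 1), (1, 1), (0, 3)]
labels = [1, 1, 1, 2, 2, 2, 1, 2, 2, 2]  # 1 stands for class ω1, 2 for ω2

rule = fit_majority_rule(objects, labels)
errors = count_errors(rule, objects, labels)
print(rule[(0, 3)])                         # 1  (white circle -> ω1)
print(errors, errors / len(objects))        # 1 0.1
```

Because objects with identical feature values can carry different labels, the error count on the sample itself cannot reach zero; the majority rule simply makes it as small as these two features allow.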

FIGURE 1.2 A shape–color data set example. Class ω1 is shaded.

A couple of questions spring to mind. First, if the objects are not discernible, how have they been labeled in the first place? Second, how far can we trust the estimated distributions to generalize over unseen data?

To answer the first question, we should be aware that the features supplied by the user are not expected to be perfect. Typically there is a way to determine the true class label, but the procedure may not be available, affordable, or possible at all. For example, certain medical conditions can be determined only post mortem. An early diagnosis inferred through pattern recognition may decide the outcome for the patient. As another example, consider classifying expensive objects on a production line as good or defective. Suppose that an object has to be destroyed in order to determine the true label. It is desirable that the labeling is done using measurable features that do not require breaking the object. Labeling may be too expensive, involving time and expertise which are not available. The problem then becomes a pattern recognition one, where we try to find the class label as correctly as possible from the available features.

Returning to the example in Figure 1.2, suppose that there is a third (unavailable) feature which could be, for example, the horizontal axis in the plot. This feature would have been used to label the data, but the quest is to find the best possible labeling method without it.

The second question, "How far can we trust the estimated distributions to generalize over unseen data?", has inspired decades of research and will be considered later in this text.

Example 1.2 The Iris data set

The Iris data set was collected by the American botanist Edgar Anderson and subsequently analyzed by the English geneticist and statistician Sir Ronald Aylmer Fisher in 1936 [127]. The Iris data set has become one of the iconic hallmarks of pattern recognition and has been used in thousands of publications over the years [348; 39]. This book would be incomplete without a mention of it.

The Iris data still serves as a prime example of a well-behaved data set. There are three balanced classes, each represented with a sample of 50 objects. The classes are species of the Iris flower (Figure 1.3): setosa, versicolor, and virginica. The four features describing an Iris flower are sepal length, sepal width, petal length, and petal width. The classes form neat elliptical clusters in the four-dimensional space. Scatter plots of the data in the spaces spanned by the six pairs of features are displayed in Figure 1.4. Class setosa is clearly distinguishable from the other two classes in all projections.

FIGURE 1.3 Iris flower specimen

FIGURE 1.4 Scatter plot of the Iris data in the two-dimensional spaces spanned by the six pairs of features.

1.1.4 Generate Your Own Data

Trivial as it might be, sometimes you need a piece of code to generate your own data set with specified characteristics in order to test your own classification method.

1.1.4.1 The Normal Distribution

The normal distribution (also called the Gaussian distribution) is widespread in nature and is one of the fundamental models in statistics. The one-dimensional normal distribution, denoted N(μ, σ²), is characterized by mean μ ∈ ℝ and variance σ² ∈ ℝ. In n dimensions, the normal distribution is characterized by an n-dimensional vector of the mean, μ ∈ ℝⁿ, and an n × n covariance matrix Σ. The notation for an n-dimensional normally distributed random variable is x ∼ N(μ, Σ). The normal distribution is the most natural assumption reflecting the following situation: there is an ideal prototype (μ) and all the data are distorted versions of it. Small distortions are more likely to occur than large distortions, causing more objects to be located in the close vicinity of the ideal prototype than far away from it. The scatter of the points around the prototype is associated with the covariance matrix Σ.

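The probability density function (pdf) of x ∼ N(μ, Σ) is

$$
p(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}\sqrt{|\Sigma|}}
\exp\left\{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{T}\,\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right\}
\tag{1.1}
$$

where |Σ| is the determinant of Σ. For the one-dimensional case, x and μ are scalars, and Σ reduces to the variance σ². Equation 1.1 simplifies to

$$
p(x) = \frac{1}{\sigma\sqrt{2\pi}}
\exp\left\{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right\}
\tag{1.2}
$$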

Example 1.3 Cloud shapes and the corresponding covariance matrices

Figure 1.5 shows four two-dimensional data sets generated from the normal distribution with different covariance matrices shown underneath.

Figures 1.5a and 1.5b are generated with independent (noninteracting) features. Therefore, the data cloud is either spherical (Figure 1.5a), or stretched along one or more coordinate axes (Figure 1.5b). Notice that for these cases the off-diagonal entries of the covariance matrix are zeros. Figures 1.5c and 1.5d represent cases where the features are dependent. The data for this example was generated using the function samplegaussian in Appendix 1.A.1.

FIGURE 1.5 Normally distributed data sets with mean [0, 0]T and different covariance matrices shown underneath.
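To experiment with how the covariance matrix shapes the data cloud, one possible approach is to transform independent standard normal values with the Cholesky factor of Σ. The sketch below is a Python illustration for the two-dimensional case only; it is not the samplegaussian function from Appendix 1.A.1, and the covariance matrix used is a hypothetical example rather than one of those in Figure 1.5.

```python
import math
import random


def sample_gaussian_2d(n_points, mean, cov, rng):
    """Draw n_points from a two-dimensional normal distribution N(mean, cov)."""
    (s11, s12), (s21, s22) = cov
    if abs(s12 - s21) > 1e-12:
        raise ValueError("covariance matrix must be symmetric")
    # Cholesky factor L (lower triangular) such that L * L^T = cov.
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 ** 2)  # raises ValueError if cov is not positive definite
    points = []
    for _ in range(n_points):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)  # independent N(0, 1) values
        points.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return points


def sample_covariance(points):
    """Unbiased estimate of the 2 x 2 covariance matrix of a list of points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]


rng = random.Random(0)                      # fixed seed for repeatability
cov = [[4.0, 3.0], [3.0, 4.0]]              # hypothetical: dependent features
points = sample_gaussian_2d(20000, (0.0, 0.0), cov, rng)
estimate = sample_covariance(points)
print(estimate)
```

With nonzero off-diagonal entries the cloud is stretched along a direction that is not a coordinate axis; setting them to zero gives the axis-aligned or spherical clouds of the independent case.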

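In the case of independent features we can decompose the n-dimensional pdf as a product of n one-dimensional pdfs. Let σ²k be the diagonal entry of the covariance matrix Σ for the k-th feature, and μk be the k-th component of μ. Then

$$
\begin{aligned}
p(\mathbf{x}) &= \frac{1}{(2\pi)^{n/2}\sqrt{|\Sigma|}}
\exp\left\{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{T}\,\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right\} \\
&= \prod_{k=1}^{n} \frac{1}{\sigma_k\sqrt{2\pi}}
\exp\left\{-\frac{1}{2}\left(\frac{x_k-\mu_k}{\sigma_k}\right)^{2}\right\}
\end{aligned}
\tag{1.3}
$$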
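The decomposition for independent features can be checked numerically. The Python sketch below assumes a diagonal covariance matrix, for which |Σ| is the product of the variances and the quadratic form reduces to a sum of squared standardized deviations; the function names and the test point are hypothetical.

```python
import math


def normal_pdf_1d(x, mu, sigma):
    """One-dimensional normal pdf with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_pdf_diag(x, mu, variances):
    """n-dimensional normal pdf for a diagonal covariance matrix."""
    n = len(x)
    det = math.prod(variances)  # determinant of a diagonal matrix
    quad = sum((xk - mk) ** 2 / vk for xk, mk, vk in zip(x, mu, variances))
    return math.exp(-0.5 * quad) / ((2.0 * math.pi) ** (n / 2) * math.sqrt(det))


# Hypothetical three-dimensional example.
x = [0.5, -1.0, 2.0]
mu = [0.0, 0.0, 1.0]
variances = [1.0, 4.0, 0.25]

joint = normal_pdf_diag(x, mu, variances)
product = math.prod(
    normal_pdf_1d(xk, mk, math.sqrt(vk)) for xk, mk, vk in zip(x, mu, variances)
)
print(math.isclose(joint, product))  # True
```

The equality holds only when the off-diagonal entries of Σ are zero; for dependent features the full quadratic form with the inverse of Σ is needed.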

The cumulative distribution function for a random variable with a normal distribution, Φ(z) = P(X ≤ z), is available in tabulated form from most statistical textbooks.¹
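As an alternative to tables, Φ can be computed from the error function through the identity Φ(z) = ½[1 + erf(z/√2)] for the standard normal distribution. A minimal Python sketch (the function name normal_cdf is an assumption) is:

```python
import math


def normal_cdf(z, mu=0.0, sigma=1.0):
    """P(X <= z) for X ~ N(mu, sigma^2), computed via the error function."""
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))


print(round(normal_cdf(0.0), 4))   # 0.5
print(round(normal_cdf(1.96), 4))  # 0.975
```

The second value is the familiar one behind two-sided 95% intervals: about 97.5% of the standard normal mass lies below 1.96.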

1.1.4.2 Noisy Geometric Figures

Sometimes it is useful to generate your own data set of a desired shape, prevalence of the classes, overlap, and so on. An example of a challenging classification problem with five Gaussian classes is shown in Figure 1.6 along with the MATLAB code that generates and plots the data.

FIGURE 1.6 An example of five Gaussian classes generated using the samplegaussian function from Appendix 1.A.1.

One possible way to generate data with specific geometric shapes is detailed below. Suppose that each of the c classes is described by a shape, governed by parameter t.

The noise-free data is calculated from t, and then noise is added. Let ti be the parameter for class ωi, and [ai, bi] be the interval for ti describing the shape of the class. Denote by pi the desired prevalence of class ωi. Knowing that p1 + ⋅⋅⋅ + pc = 1, we can calculate the approximate number of samples in a data set of N objects. Let Ni be the desired number of objects from class ωi. The first step is to sample uniformly Ni values for ti from the interval [ai, bi]. Subsequently, we find the coordinates x1, …, xn for each element of ti. Finally, noise is added to all values. (We can use the randn MATLAB function for this purpose.) The noise could be scaled by multiplying the values by different constants for the different features. Alternatively, the noise could be scaled with the feature values or the values of ti.
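The procedure above can be sketched as two small functions: one that turns the prevalences into class sizes, and one that samples the parameter, maps it to coordinates, and adds noise. This is a Python illustration under stated assumptions, not the book's MATLAB code: the names class_sizes and sample_shape, the largest-remainder rounding, and the straight-line shape in the usage example are all hypothetical choices, and random.gauss plays the role of randn.

```python
import random


def class_sizes(prevalences, n_total):
    """Approximate N_i = p_i * N, adjusted so that the sizes sum to N."""
    if abs(sum(prevalences) - 1.0) > 1e-9:
        raise ValueError("prevalences must sum to 1")
    sizes = [int(p * n_total) for p in prevalences]
    # Hand out the objects lost to truncation, largest remainder first.
    remainders = [p * n_total - s for p, s in zip(prevalences, sizes)]
    order = sorted(range(len(sizes)), key=lambda i: remainders[i], reverse=True)
    for i in order[: n_total - sum(sizes)]:
        sizes[i] += 1
    return sizes


def sample_shape(n_i, a_i, b_i, shape, noise_std, rng):
    """Sample t uniformly from [a_i, b_i], map it to coordinates, add Gaussian noise.

    `shape` maps t to a tuple of noise-free coordinates; `noise_std` holds one
    noise standard deviation per feature.
    """
    points = []
    for _ in range(n_i):
        t = rng.uniform(a_i, b_i)
        clean = shape(t)
        points.append(tuple(x + rng.gauss(0, s) for x, s in zip(clean, noise_std)))
    return points


rng = random.Random(1)
sizes = class_sizes([0.5, 0.3, 0.2], 101)
print(sizes)  # [51, 30, 20]

# Hypothetical class shape: a straight line segment, with different noise per feature.
segment = sample_shape(sizes[0], 0.0, 1.0, lambda t: (t, 2.0 * t), (0.05, 0.10), rng)
print(len(segment))  # 51
```

Scaling the noise with the feature values or with t, as mentioned in the text, would amount to replacing the constant entries of noise_std with functions of the clean coordinates or of t.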

Example 1.4 Ellipses data set

The code for producing this data set is given in Appendix 1.A.1. We used the parametric equations for two-dimensional ellipses:

$$x(t) = x_c + a \cos(t) \cos(\phi) - b \sin(t) \sin(\phi)$$

$$y(t) = y_c + a \cos(t) \sin(\phi) + b \sin(t) \cos(\phi)$$

where (xc, yc) is the center of the ellipse, a and b are respectively the major and the minor semi-axes of the ellipse, and ϕ is the angle between the x-axis and the major axis. To traverse the whole ellipse, parameter t varies from 0 to 2π.

Figure 1.7a shows a data set where the random noise is the same across both features and all values of t. The classes have equal proportions, with 300 points from each class. Using a single ellipse with 1000 points, Figure 1.7b demonstrates the effect of scaling the noise with the parameter t. The MATLAB code is given in Appendix 1.A.1.

FIGURE 1.7 (a) The three-ellipse data set; (b) one ellipse with noise variance proportional to the parameter t.
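A single noisy ellipse can be sketched in Python using the standard parametric form of a rotated ellipse. This is not the Appendix 1.A.1 code; the function names and parameter values are assumptions. Following the caption of Figure 1.7b, the option `scale_with_t` makes the noise variance proportional to t, so the standard deviation grows with the square root of t.

```python
import math
import random


def ellipse_point(t, xc, yc, a, b, phi):
    """Noise-free point on an ellipse centered at (xc, yc) with semi-axes a, b,
    whose major axis makes angle phi with the x-axis."""
    x = xc + a * math.cos(t) * math.cos(phi) - b * math.sin(t) * math.sin(phi)
    y = yc + a * math.cos(t) * math.sin(phi) + b * math.sin(t) * math.cos(phi)
    return x, y


def sample_ellipse(n, xc, yc, a, b, phi, noise_sd, scale_with_t=False, rng=None):
    """Sample n noisy points; t is uniform on [0, 2*pi]."""
    rng = rng or random.Random()
    points = []
    for _ in range(n):
        t = rng.uniform(0.0, 2.0 * math.pi)
        x, y = ellipse_point(t, xc, yc, a, b, phi)
        # constant noise, or noise whose variance is proportional to t
        sd = noise_sd * math.sqrt(t) if scale_with_t else noise_sd
        points.append((x + sd * rng.gauss(0.0, 1.0), y + sd * rng.gauss(0.0, 1.0)))
    return points


rng = random.Random(7)
same_noise = sample_ellipse(300, 0.0, 0.0, 2.0, 1.0, math.pi / 6, 0.1, rng=rng)
growing_noise = sample_ellipse(1000, 0.0, 0.0, 2.0, 1.0, 0.0, 0.1,
                               scale_with_t=True, rng=rng)
```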

1.1.4.3 Rotated Checker Board Data

This is a two-dimensional data set which spans the unit square [0, 1] × [0, 1]. The classes are placed as the light and the dark squares of a checker board and then the whole board is rotated at an angle α. A parameter a specifies the side of the individual square. For example, if a = 0.5, there will be four squares in total before the rotation. Figure 1.8 shows two data sets, each containing 5,000 points, generated with different input parameters. The MATLAB function samplecb(N,a,alpha) in Appendix 1.A.1 generates the data.

FIGURE 1.8 Rotated checker board data (100,000 points in each plot).
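One way to generate such data is sketched below in Python. It is not the book's samplecb function, only an illustration with the same three inputs. The assumptions are that the board is rotated about the origin, that a point's label is the parity of its square in the rotated frame, and that the assignment of label 1 or 2 to the light or dark squares is arbitrary.

```python
import math
import random


def checkerboard_label(x, y, a, alpha):
    """Class label (1 or 2) of point (x, y) on a board with squares of side a,
    rotated by angle alpha (radians) about the origin."""
    # Express the point in the coordinate frame of the rotated board
    u = x * math.cos(alpha) + y * math.sin(alpha)
    v = -x * math.sin(alpha) + y * math.cos(alpha)
    # Parity of the square index decides the class
    s = math.floor(u / a) + math.floor(v / a)
    return 1 + s % 2


def sample_checkerboard(N, a, alpha, rng=None):
    """N points drawn uniformly from the unit square, with checker board labels."""
    rng = rng or random.Random()
    data, labels = [], []
    for _ in range(N):
        x, y = rng.random(), rng.random()
        data.append((x, y))
        labels.append(checkerboard_label(x, y, a, alpha))
    return data, labels


data, labels = sample_checkerboard(5000, 0.5, math.pi / 3, rng=random.Random(0))
```

With a = 0.5 and alpha = 0 the unit square splits into four squares, two per class, which matches the description above.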

The properties which make this data set attractive for experimental purposes are:

The two classes are perfectly separable.

The classification regions for the same class are disjoint.

The boundaries are not parallel to the coordinate axes.

The classification performance will be highly dependent on the sample size.

1.2 CLASSIFIER, DISCRIMINANT FUNCTIONS, CLASSIFICATION REGIONS

A classifier is any function that will assign a class label to an object x:

(1.4) $$D : \mathbb{R}^n \to \Omega$$

In the canonical model of a classifier [106], c discriminant functions are calculated

(1.5) $$G = \{g_1(\mathbf{x}), \ldots, g_c(\mathbf{x})\}, \quad g_i : \mathbb{R}^n \to \mathbb{R}, \quad i = 1, \ldots, c$$

each one yielding a score for the respective class (Figure 1.9). The object is labeled to the class with the highest score. This labeling choice is called the maximum membership rule. Ties are broken randomly, meaning that x is assigned randomly to one of the tied classes.

FIGURE 1.9 Canonical model of a classifier. An n-dimensional feature vector is passed through c discriminant functions, and the largest function output determines the class label.
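The maximum membership rule with random tie-breaking can be sketched in a few lines of Python. The discriminant functions used here (negative squared distance to two hypothetical class means) are only an example chosen for illustration; any set of c real-valued functions would do.

```python
import random


def classify(x, discriminants, rng=random):
    """Maximum membership rule: return the label (1..c) of the discriminant
    function with the highest score; ties are broken at random."""
    scores = [g(x) for g in discriminants]
    best = max(scores)
    tied = [i for i, s in enumerate(scores) if s == best]
    return rng.choice(tied) + 1


def make_distance_discriminant(mean):
    """Hypothetical discriminant: the closer x is to the class mean, the higher the score."""
    def g(x):
        return -sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return g


G = [make_distance_discriminant((0.0, 0.0)),
     make_distance_discriminant((2.0, 0.0))]

print(classify((0.1, 0.0), G))  # 1
print(classify((1.9, 0.2), G))  # 2
print(classify((1.0, 0.0), G))  # 1 or 2: the point is equidistant from both means
```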

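The discriminant functions partition the feature space into c decision regions or classification regions denoted ℛ1, …, ℛc:

(1.6)

$$\mathcal{R}_i = \left\{ \mathbf{x} \;\middle|\; \mathbf{x} \in \mathbb{R}^n,\ g_i(\mathbf{x}) = \max_{k=1,\ldots,c} g_k(\mathbf{x}) \right\}, \quad i = 1, \ldots, c$$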

The decision region for class ωi is the set of points for which the i-th discriminant function has the highest score. According to the maximum membership rule, all points in decision region ℛi are assigned to class ωi. The decision regions are specified by the classifier D, or equivalently, by the discriminant functions G. The boundaries of the decision regions are called classification boundaries and contain the points for which the highest discriminant functions tie. A point on the boundary can be assigned to any of the bordering classes. If a decision region ℛi contains data points from the labeled set Z with true class label ωj, j ≠ i, classes ωi and ωj are called overlapping. If the classes in Z can be separated completely by a hyperplane (a point in ℝ, a line in ℝ², a plane in ℝ³), they are called linearly separable.

Note that classes which overlap in a given partition can be nonoverlapping if the space is partitioned in a different way. If there are no identical points with different class labels in the data set Z, we can always partition the feature space into pure classification regions. Generally, the smaller the overlap, the better the classifier. Figure 1.10 shows an example of a two-dimensional data set and two sets of classification regions. Figure 1.10a shows the regions produced by the nearest neighbor classifier, where every point is given the label of its nearest neighbor. According to these boundaries and the plotted data, the classes are nonoverlapping. However, Figure 1.10b shows the optimal classification boundary and the optimal classification regions which guarantee the minimum possible error for unseen data generated from the same distributions. According to the optimal boundary, the classes are overlapping. This example shows that by striving to build boundaries that give a perfect split we may over-fit the training data.

FIGURE 1.10 Classification regions obtained from two different classifiers: (a) the 1-nn boundary (nonoverlapping classes); (b) the optimal boundary (overlapping classes).
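The behavior of the nearest neighbor classifier on its own training data can be illustrated with a small Python sketch. The data set below is made up for the illustration: class 2 has one point, (0.5, 0.5), sitting in the middle of class 1. As long as no two identical points carry different labels, every training point is its own nearest neighbor, so the 1-nn regions are pure even around that point.

```python
def nearest_neighbor_label(x, Z, labels):
    """Label of the point in Z closest to x (squared Euclidean distance)."""
    def sq_dist(j):
        return sum((a - b) ** 2 for a, b in zip(x, Z[j]))
    nearest = min(range(len(Z)), key=sq_dist)
    return labels[nearest]


# Hypothetical labeled data set Z
Z = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.5, 0.5),
     (3.0, 3.0), (4.0, 3.0), (3.0, 4.0)]
labels = [1, 1, 1, 2, 2, 2, 2]

# Every training point is classified correctly (nonoverlapping classes for 1-nn)
resubstitution = [nearest_neighbor_label(z, Z, labels) for z in Z]
print(resubstitution == labels)  # True

# A new point next to the stray class-2 object inherits its label
print(nearest_neighbor_label((0.5, 0.4), Z, labels))  # 2
```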

Generally, any set of functions g1(x), …, gc(x) is a set of discriminant functions. It is another matter how successfully these discriminant functions separate the classes.

Let G* = {g*1(x), …, g*c(x)} be a set of optimal (in some sense) discriminant functions. We can obtain infinitely many sets of optimal discriminant functions from G* by applying a monotonic transformation f(g*i(x)) that preserves the order of the function values for every x ∈ ℝⁿ. For example, f(ζ) can be log(ζ) or a^ζ, for a > 1. Applying the same f to all discriminant functions in G*, we obtain an equivalent set of discriminant functions. Using the maximum membership rule, x will be labeled to the same class by any of the equivalent sets of discriminant functions.
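The equivalence is easy to verify numerically. In the Python sketch below the scores are made-up positive numbers (positivity is needed for the logarithm), and the labels obtained from the raw scores are compared with those obtained after applying log(ζ) and 2^ζ.

```python
import math


def argmax_label(scores):
    """Label (1..c) of the highest score; the first index wins a tie."""
    return max(range(len(scores)), key=lambda i: scores[i]) + 1


def labels_under_transform(score_rows, f):
    """Apply the same transformation f to every score, then label each row."""
    return [argmax_label([f(s) for s in row]) for row in score_rows]


# Hypothetical discriminant scores for four objects and three classes
score_rows = [
    [0.2, 0.5, 0.3],
    [0.7, 0.1, 0.2],
    [0.05, 0.15, 0.8],
    [0.4, 0.35, 0.25],
]

original = labels_under_transform(score_rows, lambda s: s)
logged = labels_under_transform(score_rows, math.log)
powered = labels_under_transform(score_rows, lambda s: 2.0 ** s)
print(original, logged, powered)  # three identical label lists
```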

1.3 CLASSIFICATION ERROR AND CLASSIFICATION ACCURACY

It is important to know how well our classifier performs. The performance of a classifier is a compound characteristic, whose most important component is the classification accuracy. If we were able to try the classifier on all possible input objects, we would know exactly how accurate it is. Unfortunately, this is hardly a possible scenario, so an estimate of the accuracy has to be used instead.

Classification error is a characteristic dual to the classification accuracy in that the two values sum up to 1:

$$\text{classification error} = 1 - \text{classification accuracy}$$

The quantity of interest is called the generalization error. This is the expected error of the trained classifier on unseen data drawn from the distribution of the problem.
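In practice the error is estimated from a finite labeled sample that was not used for training. A minimal Python sketch with made-up labels shows the two dual quantities; the resulting numbers are estimates for this particular sample, not the generalization error itself.

```python
def accuracy(true_labels, predicted_labels):
    """Proportion of objects whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


def error_rate(true_labels, predicted_labels):
    """Classification error, the dual of the accuracy."""
    return 1.0 - accuracy(true_labels, predicted_labels)


# Hypothetical test set of five objects
true_labels = [1, 2, 2, 1, 1]
predicted_labels = [1, 2, 1, 1, 2]

acc = accuracy(true_labels, predicted_labels)    # 3 of 5 correct
err = error_rate(true_labels, predicted_labels)  # 2 of 5 wrong
```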

1.3.1 Where Does the Error Come From? Bias and Variance

Why cannot we design the perfect classifier? Figure 1.11 shows a sketch of the possible sources of error. Suppose that we have chosen the classifier model. Even with a perfect training algorithm, our solution (marked as 1 in the figure) may be away from the best solution with this model (marked as 2). This approximation error comes from the fact that we have only a finite data set to train the classifier. Sometimes the training algorithm is not guaranteed to arrive at the optimal classifier with the given data. For example, the backpropagation training algorithm converges to a local minimum of the criterion function. If started from a different initialization point, the solution may be different. In addition to the approximation error, there may be a model error. Point 3 in the figure is the best possible solution in the given feature space. This point may not be achievable with the current classifier model. Finally, there is an irreducible part of the error, called the Bayes error. This error comes from insufficient representation. With the available features, two objects with the same feature values may have different class labels. Such a situation arose in Example 1.1.

