Computer Vision: Principles, Algorithms, Applications, Learning

Ebook · 1,788 pages · 25 hours

About this ebook

Computer Vision: Principles, Algorithms, Applications, Learning (previously entitled Computer and Machine Vision) clearly and systematically presents the basic methodology of computer vision, covering the essential elements of the theory while emphasizing algorithmic and practical design constraints. This fully revised fifth edition brings in more of the concepts and applications of computer vision, making it a comprehensive and up-to-date text suitable for undergraduate and graduate students, researchers, and R&D engineers working in this vibrant subject.

See an interview with the author explaining his approach to teaching and learning computer vision: http://scitechconnect.elsevier.com/computer-vision/

  • Three new chapters on machine learning emphasize the way the subject has been developing: two cover Basic Classification Concepts and Probabilistic Models, and the third covers the principles of Deep Learning Networks and shows their impact on computer vision, reflected in a new chapter on Face Detection and Recognition.
  • A new chapter on Object Segmentation and Shape Models reflects the methodology of machine learning and gives practical demonstrations of its application.
  • In-depth discussions have been included on geometric transformations, the EM algorithm, boosting, semantic segmentation, face frontalisation, RNNs and other key topics.
  • Examples and applications—including the location of biscuits, foreign bodies, faces, eyes, road lanes, surveillance, vehicles and pedestrians—give the ‘ins and outs’ of developing real-world vision systems, showing the realities of practical implementation.
  • Necessary mathematics and essential theory are made approachable by careful explanations and well-illustrated examples.
  • The ‘recent developments’ sections included in each chapter aim to bring students and practitioners up to date with this fast-moving subject.
  • Tailored programming examples are provided—code, methods, illustrations, tasks, hints, and solutions (mainly involving MATLAB and C++).
Language: English
Release date: Nov 15, 2017
ISBN: 9780128095751
Author

E. R. Davies

Roy Davies is Emeritus Professor of Machine Vision at Royal Holloway, University of London. He has worked on many aspects of vision, from feature detection to robust, real-time implementations of practical vision tasks. His interests include automated visual inspection, surveillance, vehicle guidance, crime detection and neural networks. He has published more than 200 papers and three books. Machine Vision: Theory, Algorithms, Practicalities (1990) has been widely used internationally for more than 25 years and is now out in this much enhanced fifth edition. Roy holds a DSc from the University of London, and he has been awarded the Distinguished Fellowship of the British Machine Vision Association and Fellowship of the International Association for Pattern Recognition.


    Computer Vision

    Principles, Algorithms, Applications, Learning

    Fifth Edition

    E.R. Davies

    Royal Holloway, University of London, United Kingdom

    Table of Contents

    Cover image

    Title page

    Copyright

    Dedication

    About the Author

    Foreword

    Preface to the Fifth Edition

    Preface to the First Edition

    Acknowledgments

    Topics Covered in Application Case Studies

    Influences Impinging Upon Integrated Vision System Design

    Glossary of Acronyms and Abbreviations

    Chapter 1. Vision, the challenge

    Abstract

    1.1 Introduction—Man and His Senses

    1.2 The Nature of Vision

    1.3 From Automated Visual Inspection to Surveillance

    1.4 What This Book Is About

    1.5 The Part Played by Machine Learning

    1.6 The Following Chapters

    1.7 Bibliographical Notes

    Part 1: Low-level vision

    Chapter 2. Images and imaging operations

    Abstract

    2.1 Introduction

    2.2 Image Processing Operations

    2.3 Convolutions and Point Spread Functions

    2.4 Sequential Versus Parallel Operations

    2.5 Concluding Remarks

    2.6 Bibliographical and Historical Notes

    2.7 Problems

    Chapter 3. Image filtering and morphology

    Abstract

    3.1 Introduction

    3.2 Noise Suppression by Gaussian Smoothing

    3.3 Median Filters

    3.4 Mode Filters

    3.5 Rank Order Filters

    3.6 Sharp–Unsharp Masking

    3.7 Shifts Introduced by Median Filters

    3.8 Shifts Introduced by Rank Order Filters

    3.9 The Role of Filters in Industrial Applications of Vision

    3.10 Color in Image Filtering

    3.11 Dilation and Erosion in Binary Images

    3.12 Mathematical Morphology

    3.13 Morphological Grouping

    3.14 Morphology in Grayscale Images

    3.15 Concluding Remarks

    3.16 Bibliographical and Historical Notes

    3.17 Problems

    Chapter 4. The role of thresholding

    Abstract

    4.1 Introduction

    4.2 Region-Growing Methods

    4.3 Thresholding

    4.4 Adaptive Thresholding

    4.5 More Thoroughgoing Approaches to Threshold Selection

    4.6 The Global Valley Approach to Thresholding

    4.7 Practical Results Obtained Using the Global Valley Method

    4.8 Histogram Concavity Analysis

    4.9 Concluding Remarks

    4.10 Bibliographical and Historical Notes

    4.11 Problems

    Chapter 5. Edge detection

    Abstract

    5.1 Introduction

    5.2 Basic Theory of Edge Detection

    5.3 The Template Matching Approach

    5.4 Theory of 3×3 Template Operators

    5.5 The Design of Differential Gradient Operators

    5.6 The Concept of a Circular Operator

    5.7 Detailed Implementation of Circular Operators

    5.8 The Systematic Design of Differential Edge Operators

    5.9 Problems With the Above Approach—Some Alternative Schemes

    5.10 Hysteresis Thresholding

    5.11 The Canny Operator

    5.12 The Laplacian Operator

    5.13 Concluding Remarks

    5.14 Bibliographical and Historical Notes

    5.15 Problems

    Chapter 6. Corner, interest point, and invariant feature detection

    Abstract

    6.1 Introduction

    6.2 Template Matching

    6.3 Second-Order Derivative Schemes

    6.4 A Median Filter–Based Corner Detector

    6.5 The Harris Interest Point Operator

    6.6 Corner Orientation

    6.7 Local Invariant Feature Detectors and Descriptors

    6.8 Concluding Remarks

    6.9 Bibliographical and Historical Notes

    6.10 Problems

    Chapter 7. Texture analysis

    Abstract

    7.1 Introduction

    7.2 Some Basic Approaches to Texture Analysis

    7.3 Graylevel Co-occurrence Matrices

    7.4 Laws’ Texture Energy Approach

    7.5 Ade’s Eigenfilter Approach

    7.6 Appraisal of the Laws and Ade Approaches

    7.7 Concluding Remarks

    7.8 Bibliographical and Historical Notes

    Part 2: Intermediate-level vision

    Chapter 8. Binary shape analysis

    Abstract

    8.1 Introduction

    8.2 Connectedness in Binary Images

    8.3 Object Labeling and Counting

    8.4 Size Filtering

    8.5 Distance Functions and Their Uses

    8.6 Skeletons and Thinning

    8.7 Other Measures for Shape Recognition

    8.8 Boundary Tracking Procedures

    8.9 Concluding Remarks

    8.10 Bibliographical and Historical Notes

    8.11 Problems

    Chapter 9. Boundary pattern analysis

    Abstract

    9.1 Introduction

    9.2 Boundary Tracking Procedures

    9.3 Centroidal Profiles

    9.4 Problems With the Centroidal Profile Approach

    9.5 The (s,ψ) Plot

    9.6 Tackling the Problems of Occlusion

    9.7 Accuracy of Boundary Length Measures

    9.8 Concluding Remarks

    9.9 Bibliographical and Historical Notes

    9.10 Problems

    Chapter 10. Line, circle, and ellipse detection

    Abstract

    10.1 Introduction

    10.2 Application of the Hough Transform to Line Detection

    10.3 The Foot-of-Normal Method

    10.4 Using RANSAC for Straight Line Detection

    10.5 Location of Laparoscopic Tools

    10.6 Hough-Based Schemes for Circular Object Detection

    10.7 The Problem of Unknown Circle Radius

    10.8 Overcoming the Speed Problem

    10.9 Ellipse Detection

    10.10 Human Iris Location

    10.11 Concluding Remarks

    10.12 Bibliographical and Historical Notes

    10.13 Problems

    Chapter 11. The generalized Hough transform

    Abstract

    11.1 Introduction

    11.2 The Generalized Hough Transform

    11.3 The Relevance of Spatial Matched Filtering

    11.4 Gradient Weighting Versus Uniform Weighting

    11.5 Use of the GHT for Ellipse Detection

    11.6 Comparing the Various Methods for Ellipse Detection

    11.7 A Graph-Theoretic Approach to Object Location

    11.8 Possibilities for Saving Computation

    11.9 Using the GHT for Feature Collation

    11.10 Generalizing the Maximal Clique and Other Approaches

    11.11 Search

    11.12 Concluding Remarks

    11.13 Bibliographical and Historical Notes

    11.14 Problems

    Chapter 12. Object segmentation and shape models

    Abstract

    12.1 Introduction

    12.2 Active Contours

    12.3 Practical Results Obtained Using Active Contours

    12.4 The Level-Set Approach to Object Segmentation

    12.5 Shape Models

    12.6 Concluding Remarks

    12.7 Bibliographical and Historical Notes

    Part 3: Machine learning and deep learning networks

    Chapter 13. Basic classification concepts

    Abstract

    13.1 Introduction

    13.2 The Nearest Neighbor Algorithm

    13.3 Bayes’ Decision Theory

    13.4 Relation of the Nearest Neighbor and Bayes’ Approaches

    13.5 The Optimum Number of Features

    13.6 Cost Functions and Error–Reject Tradeoff

    13.7 Supervised and Unsupervised Learning

    13.8 Cluster Analysis

    13.9 The Support Vector Machine

    13.10 Artificial Neural Networks

    13.11 The Back-Propagation Algorithm

    13.12 Multilayer Perceptron Architectures

    13.13 Overfitting to the Training Data

    13.14 Concluding Remarks

    13.15 Bibliographical and Historical Notes

    13.16 Problems

    Chapter 14. Machine learning: Probabilistic methods

    Abstract

    14.1 Introduction

    14.2 Mixtures of Gaussians and the EM Algorithm

    14.3 A More General View of the EM Algorithm

    14.4 Some Practical Examples

    14.5 Principal Components Analysis

    14.6 Multiple Classifiers

    14.7 The Boosting Approach

    14.8 Modeling AdaBoost

    14.9 Loss Functions for Boosting

    14.10 The LogitBoost Algorithm

    14.11 The Effectiveness of Boosting

    14.12 Boosting with Multiple Classes

    14.13 The Receiver Operating Characteristic

    14.14 Concluding Remarks

    14.15 Bibliographical and Historical Notes

    14.16 Problems

    Chapter 15. Deep-learning networks

    Abstract

    15.1 Introduction

    15.2 Convolutional Neural Networks

    15.3 Parameters for Defining CNN Architectures

    15.4 LeCun et al.’s LeNet Architecture

    15.5 Krizhevsky et al.’s AlexNet Architecture

    15.6 Zeiler and Fergus’s Work on CNN Architectures

    15.7 Zeiler and Fergus’s Visualization Experiments

    15.8 Simonyan and Zisserman’s VGGNet Architecture

    15.9 Noh et al.’s DeconvNet Architecture

    15.10 Badrinarayanan et al.’s SegNet Architecture

    15.11 Recurrent Neural Networks

    15.12 Concluding Remarks

    15.13 Bibliographical and Historical Notes

    Part 4: 3D vision and motion

    Chapter 16. The three-dimensional world

    Abstract

    16.1 Introduction

    16.2 Three-Dimensional Vision—The Variety of Methods

    16.3 Projection Schemes for Three-Dimensional Vision

    16.4 Shape from Shading

    16.5 Photometric Stereo

    16.6 The Assumption of Surface Smoothness

    16.7 Shape from Texture

    16.8 Use of Structured Lighting

    16.9 Three-Dimensional Object Recognition Schemes

    16.10 Horaud’s Junction Orientation Technique

    16.11 An Important Paradigm—Location of Industrial Parts

    16.12 Concluding Remarks

    16.13 Bibliographical and Historical Notes

    16.14 Problems

    Chapter 17. Tackling the perspective n-point problem

    Abstract

    17.1 Introduction

    17.2 The Phenomenon of Perspective Inversion

    17.3 Ambiguity of Pose Under Weak Perspective Projection

    17.4 Obtaining Unique Solutions to the Pose Problem

    17.5 Concluding Remarks

    17.6 Bibliographical and Historical Notes

    17.7 Problems

    Chapter 18. Invariants and perspective

    Abstract

    18.1 Introduction

    18.2 Cross Ratios: The Ratio of Ratios Concept

    18.3 Invariants for Noncollinear Points

    18.4 Invariants for Points on Conics

    18.5 Differential and Semidifferential Invariants

    18.6 Symmetric Cross-Ratio Functions

    18.7 Vanishing Point Detection

    18.8 More on Vanishing Points

    18.9 Apparent Centers of Circles and Ellipses

    18.10 Perspective Effects in Art and Photography

    18.11 Concluding Remarks

    18.12 Bibliographical and Historical Notes

    18.13 Problems

    Chapter 19. Image transformations and camera calibration

    Abstract

    19.1 Introduction

    19.2 Image Transformations

    19.3 Camera Calibration

    19.4 Intrinsic and Extrinsic Parameters

    19.5 Correcting for Radial Distortions

    19.6 Multiple View Vision

    19.7 Generalized Epipolar Geometry

    19.8 The Essential Matrix

    19.9 The Fundamental Matrix

    19.10 Properties of the Essential and Fundamental Matrices

    19.11 Estimating the Fundamental Matrix

    19.12 An Update on the Eight-Point Algorithm

    19.13 Image Rectification

    19.14 3-D Reconstruction

    19.15 Concluding Remarks

    19.16 Bibliographical and Historical Notes

    19.17 Problems

    Chapter 20. Motion

    Abstract

    20.1 Introduction

    20.2 Optical Flow

    20.3 Interpretation of Optical Flow Fields

    20.4 Using Focus of Expansion to Avoid Collision

    20.5 Time-to-Adjacency Analysis

    20.6 Basic Difficulties with the Optical Flow Model

    20.7 Stereo from Motion

    20.8 The Kalman Filter

    20.9 Wide Baseline Matching

    20.10 Concluding Remarks

    20.11 Bibliographical and Historical Notes

    20.12 Problem

    Part 5: Putting computer vision to work

    Chapter 21. Face detection and recognition: The impact of deep learning

    Abstract

    21.1 Introduction

    21.2 A Simple Approach to Face Detection

    21.3 Facial Feature Detection

    21.4 The Viola–Jones Approach to Rapid Face Detection

    21.5 The Eigenface Approach to Face Recognition

    21.6 More on the Difficulties of Face Recognition

    21.7 Frontalization

    21.8 The Sun et al. DeepID Face Representation System

    21.9 Fast Face Detection Revisited

    21.10 The Face as Part of a 3-D Object

    21.11 Concluding Remarks

    21.12 Bibliographical and Historical Notes

    Chapter 22. Surveillance

    Abstract

    22.1 Introduction

    22.2 Surveillance—The Basic Geometry

    22.3 Foreground–Background Separation

    22.4 Particle Filters

    22.5 Use of Color Histograms for Tracking

    22.6 Implementation of Particle Filters

    22.7 Chamfer Matching, Tracking, and Occlusion

    22.8 Combining Views from Multiple Cameras

    22.9 Applications to the Monitoring of Traffic Flow

    22.10 License Plate Location

    22.11 Occlusion Classification for Tracking

    22.12 Distinguishing Pedestrians by Their Gait

    22.13 Human Gait Analysis

    22.14 Model-Based Tracking of Animals

    22.15 Concluding Remarks

    22.16 Bibliographical and Historical Notes

    22.17 Problem

    Chapter 23. In-vehicle vision systems

    Abstract

    23.1 Introduction

    23.2 Locating the Roadway

    23.3 Location of Road Markings

    23.4 Location of Road Signs

    23.5 Location of Vehicles

    23.6 Information Obtained by Viewing License Plates and Other Structural Features

    23.7 Locating Pedestrians

    23.8 Guidance and Egomotion

    23.9 Vehicle Guidance in Agriculture

    23.10 Concluding Remarks

    23.11 More Detailed Developments and Bibliographies Relating to Advanced Driver Assistance Systems

    23.12 Problem

    Chapter 24. Epilogue—Perspectives in vision

    Abstract

    24.1 Introduction

    24.2 Parameters of Importance in Machine Vision

    24.3 Tradeoffs

    24.4 Moore’s Law in Action

    24.5 Hardware, Algorithms, and Processes

    24.6 The Importance of Choice of Representation

    24.7 Past, Present, and Future

    24.8 The Deep Learning Explosion

    24.9 Bibliographical and Historical Notes

    Appendix A. Robust statistics

    A.1 Introduction

    A.2 Preliminary Definitions and Analysis

    A.3 The M-Estimator (Influence Function) Approach

    A.4 The Least Median of Squares Approach to Regression

    A.5 Overview of the Robustness Problem

    A.6 The RANSAC Approach

    A.7 Concluding Remarks

    A.8 Bibliographical and Historical Notes

    A.9 Problems

    Appendix B. The sampling theorem

    B.1 The Sampling Theorem

    Appendix C. The representation of color

    C.1 Introduction

    C.2 Details of the HSI Color Representation

    C.3 A Typical Example of the Use of Color

    C.4 Bibliographical and Historical Notes

    Appendix D. Sampling from distributions

    D.1 Introduction

    D.2 The Box–Muller and Related Methods

    D.3 Bibliographical and Historical Notes

    References

    Index

    Copyright

    Academic Press is an imprint of Elsevier

    125 London Wall, London EC2Y 5AS, United Kingdom

    525 B Street, Suite 1800, San Diego, CA 92101-4495, United States

    50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

    The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom

    Copyright © 2018 Elsevier Inc. All rights reserved.

    No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

    This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

    Notices

    Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

    Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

    To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

    British Library Cataloguing-in-Publication Data

    A catalogue record for this book is available from the British Library

    Library of Congress Cataloging-in-Publication Data

    A catalog record for this book is available from the Library of Congress

    ISBN: 978-0-12-809284-2

    For Information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals

    Publisher: Mara Conner

    Acquisition Editor: Tim Pitts

    Editorial Project Manager: Charlotte Kent

    Production Project Manager: Sruthi Satheesh

    Cover Designer: Greg Harris

    Typeset by MPS Limited, Chennai, India

    Dedication

    This book is dedicated to my family.

    To my late mother, Mary Davies, to record her never-failing love and devotion.

    To my late father, Arthur Granville Davies, who passed on to me his appreciation of the beauties of mathematics and science.

    To my wife, Joan, for love, patience, support, and inspiration.

    To my children, Elizabeth, Sarah, and Marion, the music in my life.

    To my grandchildren, Jasper, Jerome, Eva, and Tara, for constantly reminding me of the carefree joys of youth!

    About the Author

    Roy Davies is Emeritus Professor of Machine Vision at Royal Holloway, University of London, United Kingdom. He has worked on many aspects of vision, from feature detection and noise suppression to robust pattern matching and real-time implementations of practical vision tasks. His interests include automated visual inspection, surveillance, vehicle guidance, and crime detection. He has published more than 200 papers and three books—Machine Vision: Theory, Algorithms, Practicalities (1990), Electronics, Noise and Signal Recovery (1993), and Image Processing for the Food Industry (2000); the first of these has been widely used internationally for more than 25 years, and is now out in this much enhanced fifth edition. Roy is a fellow of the IoP and the IET, and a senior member of the IEEE. He is on the Editorial Boards of Pattern Recognition Letters, Real-Time Image Processing, Imaging Science, and IET Image Processing. He holds a DSc from the University of London; he was awarded BMVA Distinguished Fellow in 2005 and Fellow of the International Association for Pattern Recognition in 2008.

    Foreword

    Mark S. Nixon, University of Southampton, Southampton, United Kingdom

    It is an honor to write a foreword for Roy Davies’ new edition of Computer and Machine Vision, now entitled Computer Vision: Principles, Algorithms, Applications, Learning. This is one of the major books in Computer Vision and not just for its longevity, having now reached its Fifth Edition. It is actually a splendid achievement to reach this status and it reflects not only on the tenacity and commitment of its author, but also on the achievements of the book itself.

    Computer Vision has shown awesome progress in its short history. This is in part due to technology: computers are much faster and memory is now much cheaper than they were in the early days when Roy started his research. There have been many achievements and many developments. All of this can affect the evolution of a textbook. There have been excellent textbooks in the past, which were neither continued nor maintained. That has been avoided here, as the textbook has continued to mature with the field and its many developments.

    We can look forward to a future where automated computer vision systems will make our lives easier while enriching them too. There are already many applications of Computer Vision in the food industry, and robotic cars will be with us very soon. Then there are continuing advancements in medical image analysis, where Computer Vision techniques can be used to aid in diagnosis and therapy by automated means. Even accessing a mobile phone is considerably more convenient when using a fingerprint, and access by face recognition continues to improve. These have all come about due to advancements in computers, Computer Vision, and applied artificial intelligence.

    Adherents of Computer Vision will know it to be an exciting field indeed. It manages to cover many aspects of technology from human vision to machine learning requiring electronic hardware, computer implementations, and a lot of computer software. Roy continues to cover these in excellent detail.

    I remember the First Edition when it was first published in 1990 with its unique and pragmatic blend of theory, implementation, and algorithms. I am pleased to see that the Fifth Edition maintains this unique approach, much appreciated by students in previous editions who wanted an accessible introduction to Computer Vision. It has certainly increased in size with age, and that is often the way with books. It is most certainly the way with Computer Vision since many of its researchers continue to improve, refine, and develop new techniques.

    A major change here is the inclusion of Deep Learning. Indeed, this has been a major change in the field of Computer Vision and Pattern Recognition. One implication of the increase in computing power and the reduction of memory cost is that techniques can become considerably more complex, and that complexity lends itself to application in the analysis of big data. One cannot ignore the performance of deep learning and convolutional neural networks: one only has to peruse the program of top international conferences to perceive their revolutionary effect on research direction. Naturally, it is early days but it is good to have guidance as we have here. The nature of performance is always in question in any system in artificial intelligence and part of the way to answer those questions is to consider more deeply the architectures and their basis. That again is the function of a textbook for it is the distillation of research and practice in a ratiocinated exposition. It is a brave move to include Deep Learning in this edition, but a necessary one.

    And what of Roy Davies himself? Following his DPhil in Solid State Physics at Oxford, he later developed a new sensitive method in Nuclear Resonance called Davies-ENDOR (Electron and Nuclear Double Resonance) which avoided the blind spots of its predecessor Mims-ENDOR. In 1970 he was appointed as a lecturer at Royal Holloway and a long series of publications in pattern recognition and its applications led to the award of his Personal Chair, his DSc and then the Distinguished Fellow of the British Machine Vision Association (BMVA), 2005. He has served the BMVA in many ways, latterly editing its Newsletter. Clearly the level of his work and his many contacts and papers have contributed much to the material that is found herein.

    I look forward to having this Fifth Edition sitting proudly on my shelf, replacing the Fourth, which will in turn pass to one of my students' shelves. It will not stop there for long, for it is one of the textbooks I often turn to for the information I need. Unlike the snapshots to be found on the Web, in a textbook I find information placed in context and in sequence, with extension to other material. That is the function of a textbook, and it will be well served by this Fifth Edition.

    July 2017

    Preface to the Fifth Edition

    Roy Davies, Royal Holloway, University of London, United Kingdom

    The first edition of this book came out in 1990, and was welcomed by many researchers and practitioners. However, in the subsequent two decades the subject moved on at a rapidly accelerating rate, and many topics that hardly deserved a mention in the first edition had to be solidly incorporated into subsequent editions. For example, it seemed particularly important to bring in significant amounts of new material on feature detection, mathematical morphology, texture analysis, inspection, artificial neural networks, 3D vision, invariance, motion analysis, object tracking, and robust statistics. And in the fourth edition, cognizance had to be taken of the widening range of applications of the subject: in particular, two chapters had to be added on surveillance and in-vehicle vision systems. Since then, the subject has not stood still. In fact, the past four or five years have seen the onset of an explosive growth in research on deep neural networks, and the practical achievements resulting from this have been little short of staggering. It soon became abundantly clear that the fifth edition would have to reflect this radical departure—both in fundamental explanation and in practical coverage. Indeed, it necessitated a new part in the book—Part 3, Machine Learning and Deep Learning Networks—a heading which affirms that the new content reflects not only Deep Learning (a huge enhancement over the older Artificial Neural Networks) but also an approach to pattern recognition that is based on rigorous probabilistic methodology.

    All this is not achieved without presentation problems: for probabilistic methodology can only be managed properly within a rather severe mathematical environment. Too little maths, and the subject could be so watered down as to be virtually content-free: too much maths, and many readers might not be able to follow the explanations. Clearly, one should not protect readers from the (mathematical) reality of the situation. Hence, Chapter 14 had to be written in such a way as to demonstrate in full what type of methodology is involved, while providing paths that would take readers past some of the mathematical complexities—at least, on first encounter. Once past the relatively taxing Chapter 14, Chapters 15 and 21 take the reader through two accounts consisting largely of case studies, the former through a crucial development period (2012–2015) for deep learning networks, and the latter through a similar period (2013–2016) during which deep learning was targeted strongly at face detection and recognition, enabling remarkable advances to be made. It should not go unnoticed that these additions have so influenced the content of the book that the title had to be modified to reflect them. Interestingly, the organization of the book was further modified by collecting three applications chapters into the new Part 5, Putting Computer Vision to Work.

    It is worth remarking that, at this point in time, computer vision has attained a level of maturity that has made it substantially more rigorous, reliable, generic, and—in the light of the improved hardware facilities now available for its implementation (in particular, extremely powerful GPUs)—capable of real-time performance. This means that workers are more than ever before using it in serious applications, and with fewer practical difficulties. It is intended that this edition of the book will reflect this radically new and exciting state of affairs at a fundamental level.

    A typical final-year undergraduate course on vision for Electronic Engineering and Computer Science students might include much of the work of Chapters 1–13 and Chapter 16, plus a selection of sections from other chapters, according to requirements. For MSc or PhD research students, a suitable lecture course might go on to cover Parts 3 or 4 in depth, and several of the chapters in Part 5, with many practical exercises being undertaken on image analysis systems. (The importance of the appendix on robust statistics should not be underestimated once one gets onto serious work, though this will probably be outside the restrictive environment of an undergraduate syllabus.) Here much will depend on the research programme being undertaken by each individual student. At this stage the text may have to be used more as a handbook for research, and indeed, one of the prime aims of the volume is to act as a handbook for the researcher and practitioner in this important area.

    As mentioned in the original Preface, this book leans heavily on experience I have gained from working with postgraduate students: in particular, I would like to express my gratitude to Mark Edmonds, Simon Barker, Daniel Celano, Darrel Greenhill, Derek Charles, Mark Sugrue, and Georgios Mastorakis, all of whom have in their own ways helped to shape my view of the subject. In addition, it is a pleasure to recall very many rewarding discussions with my colleagues Barry Cook, Zahid Hussain, Ian Hannah, Dev Patel, David Mason, Mark Bateman, Tieying Lu, Adrian Johnstone, and Piers Plummer, the last two of whom were particularly prolific in generating hardware systems for implementing my research group’s vision algorithms. Next, I would like to record my thanks to my British Machine Vision Association colleagues for many wide-ranging discussions on the nature of the subject: in particular, I am hugely grateful to Majid Mirmehdi, Adrian Clark, Neil Thacker, and Mark Nixon, who, over time, have strongly influenced the development of the book and left a permanent mark on it. Next, I would like to thank the anonymous reviewers for making insightful comments and what have turned out to be extremely valuable suggestions. Finally, I am indebted to Tim Pitts of Elsevier Science for his help and encouragement, without which this fifth edition might never have been completed.

    Supporting materials:

    Elsevier’s website for the book contains programming and other resources to help readers and students using this text. Please check the publisher’s website for further information: https://www.elsevier.com/books-and-journals/book-companion/9780128092842.

    Preface to the First Edition

    Over the past 30 years or so, machine vision has evolved into a mature subject embracing many topics and applications: these range from automatic (robot) assembly to automatic vehicle guidance, from automatic interpretation of documents to verification of signatures, and from analysis of remotely sensed images to checking of fingerprints and human blood cells; currently, automated visual inspection is undergoing very substantial growth, necessary improvements in quality, safety, and cost-effectiveness being the stimulating factors. With so much ongoing activity, it has become a difficult business for the professional to keep up with the subject and with relevant methodologies: in particular, it is difficult for them to distinguish accidental developments from genuine advances. It is the purpose of this book to provide background in this area.

    The book was shaped over a period of 10–12 years, through material I have given on undergraduate and postgraduate courses at London University, and contributions to various industrial courses and seminars. At the same time, my own investigations coupled with experience gained while supervising PhD and postdoctoral researchers helped to form the state of mind and knowledge that is now set out here. Certainly it is true to say that if I had had this book 8, 6, 4, or even 2 years ago, it would have been of inestimable value to myself for solving practical problems in machine vision. It is therefore my hope that it will now be of use to others in the same way. Of course, it has tended to follow an emphasis that is my own—and in particular one view of one path towards solving automated visual inspection and other problems associated with the application of vision in industry. At the same time, although there is a specialism here, great care has been taken to bring out general principles—including many applying throughout the field of image analysis. The reader will note the universality of topics such as noise suppression, edge detection, principles of illumination, feature recognition, Bayes’ theory, and (nowadays) Hough transforms. However, the generalities lie deeper than this. The book has aimed to make some general observations and messages about the limitations, constraints, and tradeoffs to which vision algorithms are subject. Thus there are themes about the effects of noise, occlusion, distortion, and the need for built-in forms of robustness (as distinct from less successful ad hoc varieties and those added on as an afterthought); there are also themes about accuracy, systematic design, and the matching of algorithms and architectures. Finally, there are the problems of setting up lighting schemes which must be addressed in complete systems, yet which receive scant attention in most books on image processing and analysis. 
These remarks will indicate that the text is intended to be read at various levels—a factor that should make it of more lasting value than might initially be supposed from a quick perusal of the contents.

    Of course, writing a text such as this presents a great difficulty in that it is necessary to be highly selective: space simply does not allow everything in a subject of this nature and maturity to be dealt with adequately between two covers. One solution might be to dash rapidly through the whole area mentioning everything that comes to mind, but leaving the reader unable to understand anything in detail or to achieve anything having read the book. However, in a practical subject of this nature this seemed to me a rather worthless extreme. It is just possible that the emphasis has now veered too much in the opposite direction, by coming down to practicalities (detailed algorithms, details of lighting schemes, and so on): individual readers will have to judge this for themselves. On the other hand, an author has to be true to himself and my view is that it is better for a reader or student to have mastered a coherent series of topics than to have a mishmash of information that he is later unable to recall with any accuracy. This, then, is my justification for presenting this particular material in this particular way and for reluctantly omitting from detailed discussion such important topics as texture analysis, relaxation methods, motion, and optical flow.

    As for the organization of the material, I have tried to make the early part of the book lead into the subject gently, giving enough detailed algorithms (especially in Chapter 2: Images and imaging operations and Chapter 6: Corner, interest point, and invariant feature detection) to provide a sound feel for the subject—including especially vital, and in their own way quite intricate, topics such as connectedness in binary images. Hence Part I provides the lead-in, although it is not always trivial material and indeed some of the latest research ideas have been brought in (e.g., on thresholding techniques and edge detection). Part II gives much of the meat of the book. Indeed, the (book) literature of the subject currently has a significant gap in the area of intermediate-level vision; while high-level vision (AI) topics have long caught the researcher’s imagination, intermediate-level vision has its own difficulties which are currently being solved with great success (note that the Hough transform, originally developed in 1962, and by many thought to be a very specialist topic of rather esoteric interest, is arguably only now coming into its own). Part II and the early chapters of Part III aim to make this clear, while Part IV gives reasons why this particular transform has become so useful. As a whole, Part III aims to demonstrate some of the practical applications of the basic work covered earlier in the book, and to discuss some of the principles underlying implementation: it is here that chapters on lighting and hardware systems will be found. As there is a limit to what can be covered in the space available, there is a corresponding emphasis on the theory underpinning practicalities. Probably this is a vital feature, since there are many applications of vision both in industry and elsewhere, yet listing them and their intricacies risks dwelling on interminable detail, which some might find insipid; furthermore, detail has a tendency to date rather rapidly. 
Although the book could not cover 3D vision in full (this topic would easily consume a whole volume in its own right), a careful overview of this complex mathematical and highly important subject seemed vital. It is therefore no accident that Chapter 16, The three-dimensional world, is the longest in the book. Finally, Part IV asks questions about the limitations and constraints of vision algorithms and answers them by drawing on information and experience from earlier chapters. It is tempting to call the last chapter the Conclusion. However, in such a dynamic subject area any such temptation has to be resisted, although it has still been possible to draw a good number of lessons on the nature and current state of the subject. Clearly, this chapter presents a personal view but I hope it is one that readers will find interesting and useful.

    Acknowledgments

    The author would like to credit the following sources for permission to reproduce tables, figures, and extracts of text from earlier publications:

    Elsevier

    For permission to reprint portions of the following papers from Image and Vision Computing as text in Chapter 5; as Tables 5.1–5.5; and as Figs. 3.31, 5.2:

    Davies (1984b, 1987b)

    For permission to reprint portions of the following paper from Pattern Recognition as text in Chapter 8; and as Fig. 8.11:

    Davies and Plummer (1981)

    For permission to reprint portions of the following papers from Pattern Recognition Letters as text in Chapters 3, 5, 10, 11, 13; as Tables 3.2; 10.4; 11.1; and as Figs. 3.6, 3.8, 3.10, 5.1, 5.3, 10.1, 10.10, 10.11, 10.12, 10.13, 11.1, 11.3, 11.4, 11.5, 11.6, 11.7, 11.8, 11.9, 11.10, 11.11:

    Davies (1986, 1987a,c,d, 1988b,c,e, 1989a)

    For permission to reprint portions of the following paper from Signal Processing as text in Chapter 3; and as Figs. 3.15, 3.17, 3.18, 3.19, 3.20:

    Davies (1989b)

    For permission to reprint portions of the following paper from Advances in Imaging and Electron Physics as text in Chapter 3:

    Davies (2003c)

    For permission to reprint portions of the following article from Encyclopedia of Physical Science and Technology as Figs. 8.9, 8.12, 9.1, 9.4:

    Davies, E.R., 1987. Visual inspection, automatic (robotics). In: Meyers, R.A. (Ed.) Encyclopedia of Physical Science and Technology, vol. 14. Academic Press, San Diego, pp. 360–377.

    IEEE

    For permission to reprint portions of the following paper as text in Chapter 3; and as Figs. 3.4, 3.5, 3.7, 3.11:

    Davies (1984a)

    IET

    For permission to reprint portions of the following papers from the IET Proceedings and Colloquium Digests as text in Chapters 3, 4, 6, 13, 21, 22, 23; as Tables 3.3, 4.2; and as Figs. 3.21, 3.28, 3.29, 4.6, 4.7, 4.8, 4.9, 4.10, 6.5, 6.6, 6.7, 6.8, 6.9, 6.12, 11.20, 14.16, 14.17, 22.16, 22.17, 22.18, 23.1, 23.3, 23.4:

    Davies (1988a, 1999c, 2000a, 2005, 2008)

    Sugrue and Davies (2007)

    Mastorakis and Davies (2011)

    Davies et al. (1998)

    Davies et al. (2003)

    IFS Publications Ltd

    For permission to reprint portions of the following paper as text in Chapters 12, 20; and as Figs. 10.7, 10.8:

    Davies (1984c)

    The Royal Photographic Society

    For permission to reprint portions of the following papers (see also the Maney website: www.maney.co.uk/journals/ims) as text in Chapter 3; and as Figs. 3.12, 3.13, 3.22, 3.23, 3.24:

    Davies (2000c)

    Charles and Davies (2004)

    Springer-Verlag

    For permission to reprint portions of the following papers as text in Chapter 6; and as Figs. 6.2, 6.4:

    Davies (1988d), Figs. 1–3

    World Scientific

    For permission to reprint portions of the following book as text in Chapters 7, 22, 23; and as Figs. 3.25, 3.26, 3.27, 5.4, 22.20, 23.15, 23.16:

    Davies, 2000. Image Processing for the Food Industry. World Scientific, Singapore.

    The Committee of the Alvey Vision Club

    To acknowledge that extracts of text in Chapter 11 and Figs. 11.12, 11.13, 11.17 were first published in the Proceedings of the 4th Alvey Vision Conference:

    Davies, E.R., 1988. An alternative to graph matching for locating objects from their salient features. In: Proceedings of 4th Alvey Vision Conference, Manchester, 31 August–2 September, pp. 281–286.

    F.H. Sumner

    For permission to reprint portions of the following article from State of the Art Report: Supercomputer Systems Technology as text in Chapter 8; and as Fig. 8.4:

    Davies, E.R., 1982. Image processing. In: Sumner, F.H. (Ed.), State of the Art Report: Supercomputer Systems Technology. Pergamon Infotech, Maidenhead, pp. 223–244.

    Royal Holloway, University of London

    For permission to reprint extracts from the following examination questions, originally written by E.R. Davies:

    EL385/97/2; EL333/98/2; EL333/99/2, 3, 5, 6; EL333/01/2, 4–6; PH5330/98/3, 5; PH5330/03/1–5; PH4760/04/1–5.

    University of London

    For permission to reprint extracts from the following examination questions, originally written by E.R. Davies:

    PH385/92/2, 3; PH385/93/1–3; PH385/94/1–4; PH385/95/4; PH385/96/3, 6; PH433/94/3, 5; PH433/96/2, 5.

    Collectors of publicly available image databases and utilities

    To acknowledge use of the following image databases and utilities for generating a number of images presented in Chapters 15 and 21:

    The Cambridge semantic segmentation online demo

    The images in Fig. 15.14 were processed using the online demo available from the University of Cambridge, UK (see Badrinarayanan et al., 2015) at

    http://mi.eng.cam.ac.uk/projects/segnet/ (website accessed 07.10.16).

    The CMU image dataset

    The newsradio image used to obtain Fig. 21.6 was taken from Test Set C—collected at CMU by Rowley, H.A., Baluja, S., and Kanade, T.—and is described in their paper:

    Rowley, H.A., Baluja, S., Kanade, T., 1998. Neural network-based face detection. IEEE Trans. Pattern Anal. Mach. Intell. 20(1), 23–38.

    It may be downloaded from the website:

    http://vasc.ri.cmu.edu/idb/html/face/frontal_images/ (website accessed 20.04.17).

    The Bush LFW dataset

    The images of George W. Bush used in Chapter 21 were taken from the set collected at the University of Massachusetts:

    Huang, G.B., Ramesh, M., Berg, T., Learned-Miller, E., 2007. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. University of Massachusetts, Amherst, Technical Report 07-49, October.

    The database may be downloaded from the website:

    http://vis-www.cs.umass.edu/lfw/ (website accessed 20.04.17).

    Topics Covered in Application Case Studies

    Influences Impinging Upon Integrated Vision System Design

    Glossary of Acronyms and Abbreviations

    1-D one dimension/one-dimensional

    2-D two dimensions/two-dimensional

    3-D three dimensions/three-dimensional

    AAM active appearance model

    ACM Association for Computing Machinery (USA)

    ADAS advanced driver assistance system

    AFW annotated faces in the wild

    AI artificial intelligence

    ANN artificial neural network

    AP average precision

    APF auxiliary particle filter

    ASCII American Standard Code for Information Interchange

    ASIC application specific integrated circuit

    ASM active shape model

    ATM automated teller machine

    AUC area under curve

    AVI audio video interleave

    BCVM between-class variance method

    BDRF bidirectional reflectance distribution function

    BetaSAC beta [distribution] sampling consensus

    BMVA British Machine Vision Association

    BPTT backpropagation through time

    CAD computer-aided design

    CAM computer-aided manufacture

    CCTV closed-circuit television

    CDF cumulative distribution function

    CLIP cellular logic image processor

    CNN convolutional neural network

    CPU central processor unit

    CRF conditional random field

    DCSM distinct class based splitting measure

    DET Beaudet determinant operator

    DG differential gradient

    DN Dreschler–Nagel corner detector

    DNN deconvolution network

    DoF degree of freedom

    DoG difference of Gaussians

    DPM deformable parts models

    EM expectation maximization

    EURASIP European Association for Signal Processing

    f.c. fully connected

    FAR frontalization for alignment and recognition

    FAST features from accelerated segment test

    FCN fully convolutional network

    FDDB face detection data set and benchmark

    FDR face detection and recognition

    FFT fast Fourier transform

    FN false negative

    fnr false negative rate

    FoE focus of expansion

    FoV field of view

    FP false positive

    FPGA field programmable gate array

    FPP full perspective projection

    fpr false positive rate

    GHT generalized Hough transform

    GLOH gradient location and orientation histogram

    GMM Gaussian mixture model

    GPS global positioning system

    GPU graphics processing unit

    GroupSAC group sampling consensus

    GVM global valley method

    HOG histogram of orientated gradients

    HSI hue, saturation, intensity

    HT Hough transform

    IBR intensity extrema-based region detector

    IDD integrated directional derivative

    IEE Institution of Electrical Engineers (UK)

    IEEE Institute of Electrical and Electronics Engineers (USA)

    IET Institution of Engineering and Technology (UK)

    ILSVRC ImageNet large-scale visual recognition object challenge

    ILW iterated likelihood weighting

    IMPSAC importance sampling consensus

    IoP Institute of Physics (UK)

    IRLFOD image-restricted, label-free outside data

    ISODATA iterative self-organizing data analysis

    JPEG/JPG Joint Photographic Experts Group

    k-NN k-nearest neighbor

    KL Kullback–Leibler

    KR Kitchen–Rosenfeld corner detector

    LED light emitting diode

    LFF local-feature-focus method

    LFPW labeled face parts in the wild

    LFW labeled faces in the wild

    LIDAR light detection and ranging

    LMedS least median of squares

    LoG Laplacian of Gaussian

    LRN local response normalization

    LS least squares

    LSTM long short-term memory

    LUT lookup table

    MAP maximum a posteriori

    MDL minimum description length

    ML machine learning

    MLP multi-layer perceptron

    MoG mixture of Gaussians

    MP microprocessor

    MSER maximally stable extremal region

    NAPSAC n adjacent points sample consensus

    NIR near infra-red

    NN nearest neighbor

    OCR optical character recognition

    OVR one versus the rest

    PASCAL Network of Excellence on pattern analysis, statistical modeling and computational learning

    PC personal computer

    PCA principal components analysis

    PE processing element

    PnP perspective n-point

    PPR probabilistic pattern recognition

    PR pattern recognition

    PROSAC progressive sample consensus

    PSF point spread function

    R-CNN regions with CNN features

    RAM random access memory

    RANSAC random sample consensus

    RBF radial basis function [classifier]

    RELU rectified linear unit

    RGB red, green, blue

    RHT randomized Hough transform

    RKHS reproducible kernel Hilbert space

    RMS root mean square

    RNN recurrent neural network

    ROC receiver–operator characteristic

    RoI region of interest

    RPS Royal Photographic Society (UK)

    s.d. standard deviation

    SFC Facebook social face classification

    SFOP scale-invariant feature operator

    SIFT scale invariant feature transform

    SIMD single instruction stream, multiple data stream

    SIR sampling importance resampling

    SIS sequential importance sampling

    SISD single instruction stream, single data stream

    SOC sorting optimization curve

    SOM self-organizing map

    SPIE Society of Photo-optical Instrumentation Engineers

    SPR statistical pattern recognition

    STA spatiotemporal attention [neural network]

    SURF speeded-up robust features

    SUSAN smallest univalue segment assimilating nucleus

    SVM support vector machine

    TM template matching

    TMF truncated median filter

    TN true negative

    tnr true negative rate

    TP true positive

    tpr true positive rate

    TV television

    USEF unit step edge function

    VGG Visual Geometry Group (Oxford)

    VJ Viola–Jones

    VLSI very large scale integration

    VMF vector median filter

    VOC visual object classes

    VP vanishing point

    WPP weak perspective projection

    YOLO you only look once

    YTF YouTube faces

    ZH Zuniga–Haralick corner detector

    Chapter 1

    Vision, the challenge

    Abstract

    This chapter introduces the subject of computer vision. It shows how recognition may be performed partly by image processing, although abstract pattern recognition methods are usually needed to complete the task. Important in this process is normalization of the image content to reduce variability so that statistical pattern recognizers such as the nearest neighbor algorithm can carry out their task with limited training requirements and low error rates. It extends the discussion by introducing machine learning and the recently prominent deep learning networks. This chapter also discusses the various applications of vision, contrasting automated visual inspection and surveillance.

    Keywords

    Computer vision; process of recognition; nearest neighbor algorithm; template matching; image preprocessing; need for normalization; machine learning; deep learning networks; automated visual inspection; surveillance

    1.1 Introduction—Man and His Senses

    Of the five senses—vision, hearing, smell, taste, and touch—vision is undoubtedly the one that man has come to depend upon above all others, and indeed the one that provides most of the data he receives. Not only do the input pathways from the eyes provide megabits of information at each glance but also the data rates for continuous viewing probably exceed 10 Mbps. However, much of this information is redundant and is compressed by the various layers of the visual cortex, so that the higher centers of the brain have to interpret abstractly only a small fraction of the data. Nonetheless, the amount of information the higher centers receive from the eyes must be at least two orders of magnitude greater than all the information they obtain from the other senses.

    Another feature of the human visual system is the ease with which interpretation is carried out. We see a scene as it is—trees in a landscape, books on a desk, widgets in a factory. No obvious deductions are needed and no overt effort is required to interpret each scene; in addition, answers are effectively immediate and are normally available within a tenth of a second. Just now and again some doubt arises—e.g., a wire cube might be seen correctly or inside out. This and a host of other optical illusions are well known, although for the most part we can regard them as curiosities—irrelevant freaks of nature. Somewhat surprisingly, illusions are quite important, since they reflect hidden assumptions that the brain is making in its struggle with the huge amounts of complex visual data it is receiving. We have to pass by this story here (although it resurfaces now and again in various parts of this book). However, the important point is that we are for the most part unaware of the complexities of vision. Seeing is not a simple process: it is just that vision has evolved over millions of years, and there was no particular advantage in evolution giving us any indication of the difficulties of the task (if anything, to have done so would have cluttered our minds with irrelevant information and slowed our reaction times).

    In the present day and age, man is trying to get machines to do much of his work for him. For simple mechanistic tasks this is not particularly difficult, but for more complex tasks the machine must be given the sense of vision. Efforts have been made to achieve this, sometimes in modest ways, for well over 40 years. At first, schemes were devised for reading, for interpreting chromosome images, and so on; but when such schemes were confronted with rigorous practical tests, the problems often turned out to be more difficult than had been anticipated. Generally, researchers react to finding that apparent trivia are getting in the way by intensifying their efforts and applying great ingenuity, and this was certainly so with early efforts at vision algorithm design. However, it soon became plain that the task really is a complex one, in which numerous fundamental problems confront the researcher, and the ease with which the eye can interpret scenes turned out to be highly deceptive.

    Of course, one of the ways in which the human visual system gains over the machine is that the brain possesses more than 10¹⁰ cells (or neurons), some of which have well over 10,000 contacts (or synapses) with other neurons. If each neuron acts as a type of microprocessor, then we have an immense computer in which all the processing elements can operate concurrently. Taking the largest single man-made computer to contain several hundred million rather modest processing elements, the majority of the visual and mental processing tasks that the eye–brain system can perform in a flash have no chance of being performed by present-day man-made systems. Added to these problems of scale, there is the problem of how to organize such a large processing system and also how to program it. Clearly, the eye–brain system is partly hard-wired by evolution but there is also an interesting capability to program it dynamically by training during active use. This need for a large parallel processing system with the attendant complex control problems shows that computer vision must indeed be one of the most difficult intellectual problems to tackle.

    So what are the problems involved in vision that make it apparently so easy for the eye, yet so difficult for the machine? In the next few sections an attempt is made to answer this question.

    1.2 The Nature of Vision

    1.2.1 The Process of Recognition

    This section illustrates the intrinsic difficulties of implementing computer vision, starting with an extremely simple example—that of character recognition. Consider the set of patterns shown in Fig. 1.1A. Each pattern can be considered as a set of 25 bits of information, together with an associated class indicating its interpretation. In each case imagine a computer learning the patterns and their classes by rote. Then any new pattern may be classified (or recognized) by comparing it with this previously learnt training set, and assigning it to the class of the nearest pattern in the training set. Clearly, test pattern (1) (Fig. 1.1B) will be allotted to class U on this basis. Chapter 13, Basic Classification Concepts, shows that this method is a simple form of the nearest neighbor approach to pattern recognition.

    Figure 1.1 Some simple 25-bit patterns and their recognition classes used to illustrate some of the basic problems of recognition: (A) training set patterns (for which the known classes are indicated); (B) test patterns.
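    The rote-learning scheme just described can be sketched in a few lines (an illustrative sketch, not code from the book; the miniature 5×5 patterns and class labels below are invented for the demonstration):

```python
# Illustrative sketch: rote-learning nearest neighbor classification of
# small binary patterns, using Hamming distance as the similarity measure.

def hamming(a, b):
    """Count the bits that differ between two flattened bit patterns."""
    return sum(x != y for x, y in zip(a, b))

def classify(test, training_set):
    """Assign the test pattern the class of the nearest training pattern."""
    pattern, label = min(training_set, key=lambda pc: hamming(test, pc[0]))
    return label

# Hypothetical training set: one 'U' and one 'C', flattened row by row.
U = [1,0,0,0,1,
     1,0,0,0,1,
     1,0,0,0,1,
     1,0,0,0,1,
     0,1,1,1,0]
C = [0,1,1,1,0,
     1,0,0,0,0,
     1,0,0,0,0,
     1,0,0,0,0,
     0,1,1,1,0]
training = [(U, 'U'), (C, 'C')]

# A test pattern with one noisy bit is still nearest to U,
# so it is classified correctly.
noisy_U = U.copy()
noisy_U[0] ^= 1
print(classify(noisy_U, training))  # → U
```

    As the text notes, the same scheme fails when a test pattern is displaced or rotated, since the bitwise comparison then no longer lines up with any training pattern.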

    The scheme outlined above seems straightforward and is indeed highly effective, even being able to cope with situations where distortions of the test patterns occur or where noise is present: this is illustrated by test patterns (2) and (3). However, this approach is not always foolproof. First, there are situations where distortions or noise is excessive, so errors of interpretation arise. Second, there are situations where patterns are not badly distorted or subject to obvious noise, yet are misinterpreted: this seems much more serious, since it indicates an unexpected limitation of the technique rather than a reasonable result of noise or distortion. In particular, these problems arise where the test pattern is displaced or misorientated relative to the appropriate training set pattern, as with test pattern (6).

    As will be seen in Chapter 13, Basic Classification Concepts, there is a powerful principle that indicates why the unlikely limitation given above can arise: it is simply that there are insufficient training set patterns, and that those that are present are insufficiently representative of what will arise in practical situations. Unfortunately, this presents a major difficulty, since providing enough training set patterns incurs a serious storage problem and an even more serious search problem when patterns are tested. Furthermore, it is easy to see that these problems are exacerbated as patterns become larger and more real (obviously, the examples of Fig. 1.1 are far from having enough resolution even to display normal type-fonts). In fact, a combinatorial explosion takes place: this is normally taken to mean that one or more parameters produce fast-varying (often exponential) effects, which explode as the parameters increase by modest amounts. Forgetting for the moment that the patterns of Fig. 1.1 have familiar shapes, let us temporarily regard them as random bit patterns. Now the number of bits in these N×N patterns is N², so the number of possible patterns is 2^(N²): even in a case where N=20, remembering all these patterns and their interpretations would be impossible on any practical machine, and searching systematically through them would take impracticably long (involving times of the order of the age of the universe). Thus it is not only impracticable to consider such brute force means of solving the recognition problem, but is also effectively impossible theoretically. These considerations show that other means are required to tackle the problem.
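    A quick calculation makes the explosion concrete: an N×N binary pattern contains N² bits, so there are 2^(N²) distinct patterns of that size.

```python
# The number of distinct N x N binary patterns is 2**(N*N); even N = 20
# gives a number with well over a hundred decimal digits.
N = 20
num_patterns = 2 ** (N * N)      # 2**400
print(len(str(num_patterns)))    # → 121, i.e., roughly 10**120 patterns
```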

    1.2.2 Tackling the Recognition Problem

    An obvious means of tackling the recognition problem is to standardize the images in some way. Clearly, normalizing the position and orientation of any 2D picture object would help considerably: indeed this would reduce the number of degrees of freedom by three. Methods for achieving this involve centralizing the objects—arranging that their centroids are at the center of the normalized image—and making their major axes (e.g., deduced by moment calculations) vertical or horizontal. Next, we can make use of the order that is known to be present in the image—and here it may be noted that very few patterns of real interest are indistinguishable from random dot patterns. This approach can be taken further: if patterns are to be nonrandom, isolated noise points may be eliminated. Ultimately, all these methods help by making the test pattern closer to a restricted set of training set patterns (although care must also be taken to process the training set patterns initially so that they are representative of the processed test patterns).
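    The centralization and orientation steps above can be sketched via image moments (a minimal illustration, assuming a binary image stored as a 2D list of 0/1 values; the function name and representation are invented for this sketch):

```python
import math

def centroid_and_orientation(image):
    """Return (xc, yc, theta): the centroid of the object pixels and the
    angle of the major axis, from second-order central moments."""
    pts = [(x, y) for y, row in enumerate(image)
                  for x, v in enumerate(row) if v]
    n = len(pts)
    xc = sum(x for x, _ in pts) / n
    yc = sum(y for _, y in pts) / n
    mu20 = sum((x - xc) ** 2 for x, _ in pts)
    mu02 = sum((y - yc) ** 2 for _, y in pts)
    mu11 = sum((x - xc) * (y - yc) for x, y in pts)
    # Standard moment-based orientation formula.
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return xc, yc, theta

# A horizontal bar: centroid at its middle, major axis at angle 0,
# so normalization would translate it to the image center unrotated.
bar = [[0, 0, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0, 0]]
xc, yc, theta = centroid_and_orientation(bar)
print(xc, yc, theta)  # → 3.0 1.0 0.0
```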

    It is useful to consider character recognition further. Here we can make additional use of what is known about the structure of characters—namely, that they consist of limbs of roughly constant width. In that case the width carries no useful information, so the patterns can be thinned to stick figures (called skeletons—see Chapter 8: Binary Shape Analysis); then, hopefully, there is an even greater chance that the test patterns will be similar to appropriate training set patterns (Fig. 1.2). This process can be regarded as another instance of reducing the number of degrees of freedom in the image, and hence of helping to minimize the combinatorial explosion—or, from a practical point of view, to minimize the size of the training set necessary for effective recognition.

    Figure 1.2 Use of thinning to regularize character shapes. Here character shapes of different limb widths—or even varying limb widths—are reduced to stick figures or skeletons. Thus irrelevant information is removed and at the same time recognition is facilitated.
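    Thinning itself is treated in depth in Chapter 8; purely as a foretaste, the sketch below uses the classic Zhang-Suen algorithm (one common choice of thinning method, assumed here rather than taken from the text):

```python
def thin(img):
    """Zhang-Suen thinning of a binary image (2D list of 0/1 with a zero
    border). Iteratively peels boundary pixels until a skeleton remains."""
    img = [row[:] for row in img]
    h, w = len(img), len(img[0])

    def nbrs(y, x):
        # Neighbors P2..P9, clockwise starting from the pixel above.
        return [img[y-1][x], img[y-1][x+1], img[y][x+1], img[y+1][x+1],
                img[y+1][x], img[y+1][x-1], img[y][x-1], img[y-1][x-1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if not img[y][x]:
                        continue
                    P = nbrs(y, x)
                    B = sum(P)                      # object neighbors
                    A = sum(P[i] == 0 and P[(i + 1) % 8] == 1
                            for i in range(8))      # 0 -> 1 transitions
                    if step == 0:
                        ok = P[0]*P[2]*P[4] == 0 and P[2]*P[4]*P[6] == 0
                    else:
                        ok = P[0]*P[2]*P[6] == 0 and P[0]*P[4]*P[6] == 0
                    if 2 <= B <= 6 and A == 1 and ok:
                        to_delete.append((y, x))
            for y, x in to_delete:
                img[y][x] = 0
            changed = changed or bool(to_delete)
    return img

# A 3-pixel-thick bar is reduced to a 1-pixel-wide remnant: the limb
# width is discarded, as in Fig. 1.2.
bar = [[0] * 9 for _ in range(7)]
for y in (2, 3, 4):
    for x in range(2, 7):
        bar[y][x] = 1
skeleton = thin(bar)
```

    Note that deletions within each subiteration are applied simultaneously, which is what stops the object from being eroded away entirely.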

    Next, consider a rather different way of looking at the problem. Recognition is necessarily a problem of discrimination—i.e., of discriminating between patterns of different classes. However, in practice, considering the natural variation of patterns, including the effects of noise and distortions (or even the effects of breakages or occlusions), there is also a problem of generalizing over patterns of the same class. In practical problems there is a tension between the need to discriminate and the need to generalize. Nor is this a fixed situation. Even for the character recognition task, some classes are so close to others (n’s and h’s will be similar) that less generalization is possible than in other cases. On the other hand, extreme forms of generalization arise when, for example, an A is to be recognized as an A whether it is a capital or small letter, or in italic, bold, suffix, or other form of font—even if it is handwritten. The variability is determined largely by the training set initially provided. What we emphasize here, however, is that generalization is as necessary a prerequisite to successful recognition as is discrimination.

    At this point it is worth considering more carefully the means whereby generalization was achieved in the examples cited above. First, objects were positioned and orientated appropriately; second, they were cleaned of noise spots; and third, they were thinned to skeleton figures (although the latter process is relevant only for certain tasks such as character recognition). In the last case, we are generalizing over characters drawn with all possible limb widths, width being an irrelevant degree of freedom for this type of recognition task. Note that we could have generalized the characters further by normalizing their size and saving another degree of freedom. The common feature of all these processes is that they aim to give the characters a high level of standardization against known types of variability before finally attempting to recognize them.

    The standardization (or generalization) processes outlined above are all realized by image processing, i.e., the conversion of one image into another by suitable means. The result is a two-stage recognition scheme: first, images are converted into more amenable forms containing much the same amount of data; and second, they are classified, with the result that their data content is reduced to very few bits (Fig. 1.3). In fact, recognition is a process of data abstraction, the final data being abstract and totally unlike the original data. Thus we must imagine a letter A starting as an array of perhaps 20×20 bits arranged in the form of an A, and then ending as the 7 bits in an ASCII representation of an A, namely 1000001 (which is essentially a random bit pattern bearing no resemblance to an A).
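The bit counts quoted above are easy to verify: ASCII 'A' has code 65, i.e., the 7-bit pattern 1000001, while the input array carried 400 bits.

```python
# Data abstraction in recognition: a 20x20 binary image of a letter
# occupies 400 bits; after classification the same letter is 7 bits of ASCII.
code = ord('A')              # 65
bits = format(code, '07b')   # '1000001'
reduction = 20 * 20 / 7      # ~57-fold reduction in data content
print(code, bits)            # prints: 65 1000001
```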

    Figure 1.3 The two-stage recognition paradigm: C, input from camera; G, grab image (digitize and store); P, preprocess; R, recognize (i, image data; a, abstract data). The classical paradigm for object recognition is that of (1) preprocessing (image processing) to suppress noise or other artefacts and to regularize the image data and (2) applying a process of abstract (often statistical) pattern recognition to extract the very few bits required to classify the object.

    The last paragraph reflects to a large extent the history of image analysis. Early on, a good proportion of the image analysis problems being tackled were envisaged as consisting of an image preprocessing task carried out by image processing techniques, followed by a recognition task undertaken by pure pattern recognition methods (see Chapter 13: Basic Classification Concepts). These two topics—image processing and pattern recognition—consumed much research effort and effectively dominated the subject of image analysis, while intermediate-level approaches such as the Hough transform were, for a time, slower to develop. One of the aims of this book is to ensure that such intermediate-level processing techniques are given due emphasis, and indeed that the best range of techniques is applied to any computer vision task.

    1.2.3 Object Location

    The problem that was tackled above—that of character recognition—is a highly constrained one. In a great many practical applications it is necessary to search pictures for objects of various types, rather than just interpreting a small area of a picture.

    Search is a task that can involve prodigious amounts of computation and is also subject to a combinatorial explosion. Imagine the task of searching for a letter E in a page of text. An obvious way of achieving this is to move a suitable template of size n×n over the whole image, of size N×N, and to find where a match occurs (Fig. 1.4). A match can be defined as a position where there is exact agreement between the template and the local portion of the image but, in keeping with the ideas of Section 1.2.1, it will evidently be more relevant to look for a best local match (i.e., a position where the match is locally better than in adjacent regions) and where the match is also good in some more absolute sense, indicating that an E is present.

    Figure 1.4 Template matching, the process of moving a suitable template over an image to determine the precise positions at which a match occurs, hence revealing the presence of objects of a particular type.

    One of the most natural ways of checking for a match is to measure the Hamming distance between the template and the local n×n region of the image, i.e., to sum the number of differences between corresponding bits. This is essentially the process described in Section 1.2.1. Then places with a low Hamming distance are places where the match is good. These template-matching ideas can be extended to cases where the corresponding bit positions in the template and the image do not just have binary values but may have intensity values over a range 0–255. In that case the sums obtained are no longer Hamming distances but may be generalized to the form:

    D = Σ |Iᵢ − Iₜ|     (1.1)

    Iₜ being the local template value, Iᵢ being the local image value, and the sum being taken over the area of the template. This makes template matching practicable in many situations: the possibilities are examined in more detail in subsequent chapters.
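A brute-force search with this distance measure can be sketched as below. The function name `match` and the restriction to square arrays are mine; real systems would normalize intensities and prune the search, as later chapters discuss. Note that on binary images the quantity computed reduces exactly to the Hamming distance.

```python
def match(image, template):
    """Slide an n x n template over an N x N image and return
    (distance, row, col) of the position minimizing D = sum |I_i - I_t|
    (Eq. 1.1). On binary data D is exactly the Hamming distance.
    A minimal sketch for square lists-of-lists of intensities."""
    N, n = len(image), len(template)
    best = None
    for r in range(N - n + 1):
        for c in range(N - n + 1):
            d = sum(abs(image[r + i][c + j] - template[i][j])
                    for i in range(n) for j in range(n))
            if best is None or d < best[0]:
                best = (d, r, c)
    return best
```

A position where `d` is locally (and absolutely) small is then taken as evidence that an instance of the object is present there, in line with the best-local-match idea of Section 1.2.1.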

    We referred above to a combinatorial explosion in this search problem too. The reason this arises is as follows. First, when a 5×5 template is moved over an N×N image in order to look for a match, the number of operations required is of the order of 5²N², totaling some 1.6 million operations for a 256×256 image. The problem is that when larger objects are being sought in an image, the number of operations increases as the square of the size of the object, the total number of operations being N²n² when an n×n template is used. For a 30×30 template and a 256×256 image, the number of operations required rises to ~60 million. Note that, in general, a template will be larger than the object it is used to search for, because some background will have to be included to help demarcate the object.
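These operation counts are simply the product of template area and image area, which the following check confirms (the helper name `ops` is mine):

```python
def ops(n, N):
    # One compare/accumulate per template pixel per image position,
    # ignoring the small border correction (N - n + 1 vs N).
    return n * n * N * N

print(ops(5, 256))   # prints: 1638400   (~1.6 million)
print(ops(30, 256))  # prints: 58982400  (~60 million)
```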

    Next, recall that in general, objects may appear in many orientations in an image (E’s on a printed page are exceptional). If we imagine a possible 360 orientations (i.e., one per degree of rotation), then a corresponding number of templates will in principle have to be applied in order to locate the object. This additional degree of freedom pushes the search effort and time to enormous levels, so far away from the possibility of real-time implementation that new approaches must be found for tackling the task. [Real-time is a commonly used phrase meaning that the information has to be processed as it becomes available, i.e., fast enough to keep up with events as they occur in the real world.]
