Feature Extraction and Image Processing for Computer Vision

About this ebook

Feature Extraction and Image Processing for Computer Vision is an essential guide to the implementation of image processing and computer vision techniques, with tutorial introductions and sample code in MATLAB and Python. Algorithms are presented and fully explained to enable complete understanding of the methods and techniques demonstrated. As one reviewer noted, "The main strength of the proposed book is the link between theory and exemplar code of the algorithms." Essential background theory is carefully explained.

This text gives students and researchers in image processing and computer vision a complete introduction to classic and state-of-the-art methods in feature extraction together with practical guidance on their implementation.

  • The only text to concentrate on feature extraction, with working implementations and worked-through mathematical derivations and algorithmic methods
  • A thorough overview of available feature extraction methods, including essential background theory, shape methods, texture and deep learning
  • Up-to-date coverage of interest point detection, feature extraction and description and image representation (including frequency domain and colour)
  • Good balance between providing a mathematical background and practical implementation
  • Detailed and explanatory presentation of algorithms in MATLAB and Python
Language: English
Release date: Nov 17, 2019
ISBN: 9780128149775
Author

Mark Nixon

Mark Nixon is Professor in Computer Vision at the University of Southampton, UK. His research interests are in image processing and computer vision. His team develops new techniques for static and moving shape extraction which have found application in biometrics and in medical image analysis. His team were early workers in automatic face recognition, later pioneered gait recognition and more recently joined the pioneers of ear biometrics. Their book with Tieniu Tan and Rama Chellappa, Human Identification Based on Gait, is part of the Springer Series on Biometrics and was published in 2005. He has chaired or program-chaired many conferences (BMVC 98, AVBPA 03, IEEE Face and Gesture FG06, ICPR 04, ICB 09, IEEE BTAS 2010) and given many invited talks. Dr. Nixon is a Fellow of the IET and a Fellow of the IAPR.



    Feature Extraction and Image Processing for Computer Vision

    Fourth Edition

    Mark S. Nixon

    Electronics and Computer Science, University of Southampton

    Alberto S. Aguado

    Foundry, London

    Table of Contents

    Cover image

    Title page

    Copyright

    Dedication

    Preface

    1. Introduction

    1.1. Overview

    1.2. Human and computer vision

    1.3. The human vision system

    1.4. Computer vision systems

    1.5. Processing images

    1.6. Associated literature

    1.7. Conclusions

    2. Images, sampling and frequency domain processing

    2.1. Overview

    2.2. Image formation

    2.3. The Fourier Transform

    2.4. The sampling criterion

    2.5. The discrete Fourier Transform

    2.6. Properties of the Fourier Transform

    2.7. Transforms other than Fourier

    2.8. Applications using frequency domain properties

    2.9. Further reading

    3. Image processing

    3.1. Overview

    3.2. Histograms

    3.3. Point operators

    3.4. Group operations

    3.5. Other image processing operators

    3.6. Mathematical morphology

    3.7. Further reading

    4. Low-level feature extraction (including edge detection)

    4.1. Overview

    4.2. Edge detection

    4.3. Phase congruency

    4.4. Localised feature extraction

    4.5. Describing image motion

    4.6. Further reading

    5. High-level feature extraction: fixed shape matching

    5.1. Overview

    5.2. Thresholding and subtraction

    5.3. Template matching

    5.4. Feature extraction by low-level features

    5.5. Hough transform

    5.6. Further reading

    6. High-level feature extraction: deformable shape analysis

    6.1. Overview

    6.2. Deformable shape analysis

    6.3. Active contours (snakes)

    6.4. Shape Skeletonisation

    6.5. Flexible shape models – active shape and active appearance

    6.6. Further reading

    7. Object description

    7.1. Overview and invariance requirements

    7.2. Boundary descriptions

    7.3. Region descriptors

    7.4. Further reading

    8. Region-based analysis

    8.1. Overview

    8.2. Region-based analysis

    8.3. Texture description and analysis

    8.4. Further reading

    9. Moving object detection and description

    9.1. Overview

    9.2. Moving object detection

    9.3. Tracking moving features

    9.4. Moving feature extraction and description

    9.5. Further reading

    10. Camera geometry fundamentals

    10.1. Overview

    10.2. Projective space

    10.3. The perspective camera

    10.4. Affine camera

    10.5. Weak perspective model

    10.6. Discussion

    10.7. Further reading

    11. Colour images

    11.1. Overview

    11.2. Colour image theory

    11.3. Perception-based colour models: CIE RGB and CIE XYZ

    11.4. Additive and subtractive colour models

    11.5. Luminance and chrominance colour models

    11.6. Additive perceptual colour models

    11.7. More colour models

    12. Distance, classification and learning

    12.1. Overview

    12.2. Basis of classification and learning

    12.3. Distance and classification

    12.4. Neural networks and Support Vector Machines

    12.5. Deep learning

    12.6. Further reading

    Index

    Copyright

    Academic Press is an imprint of Elsevier

    125 London Wall, London EC2Y 5AS, United Kingdom

    525 B Street, Suite 1650, San Diego, CA 92101, United States

    50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

    The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom

    Copyright © 2020 Elsevier Ltd. All rights reserved.

    No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

    This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

    Notices

    Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

    Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

    To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

    Library of Congress Cataloging-in-Publication Data

    A catalog record for this book is available from the Library of Congress

    British Library Cataloguing-in-Publication Data

    A catalogue record for this book is available from the British Library

    ISBN: 978-0-12-814976-8

    For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals

    Publisher: Mara Conner

    Acquisition Editor: Tim Pitts

    Editorial Project Manager: Joanna M. Collett

    Production Project Manager: Anitha Sivaraj

    Cover Designer: Alan Studholme

    Typeset by TNQ Technologies

    Dedication

    We would like to dedicate this book to our parents. To Gloria and to Joaquin Aguado, and to the late Brenda and Ian Nixon.

    Preface

    What is new in the fourth edition?

    Society makes increasing use of image processing and computer vision: manufacturing systems, medical image analysis, robotic cars, and biometrics are splendid examples of where society benefits from this technology. To achieve this there has been, and continues to be, much research and development. The research develops into books, and so the books need updating. We have always been interested to note that our book contains stock image processing and computer vision techniques which are yet to be found in other regular textbooks (OK, some are to be found in specialist books, though these rarely include much tutorial material). This was true of the previous editions and certainly occurs here.

    A big change in the Fourth Edition is the move to Python and Matlab, to replace the earlier use of Mathcad and Matlab. We have reordered much of the material and added new material where appropriate. There continue to be many new techniques for feature extraction and description. There has been quite a revolution in image processing and computer vision whilst the Fourth Edition was in process, namely the emergence of deep learning. This is noted throughout, and a new chapter is added on this topic. As well as deep learning, other additions include filtering techniques (non-local means and bilateral filtering), keypoint detectors, saliency operators, optical flow techniques, feature descriptions (Krawtchouk moments), region-based analysis (watershed, MSER and superpixels), space–time interest points and more distance measures (histogram intersection, Chi² (χ²) and the earth mover's distance). We do not include statistical pattern recognition approaches, and for that it is best to look elsewhere (this book would otherwise be enormous). Our interest here is in the implementation and usage of feature extraction. As such, this book—IOHO—remains the most up-to-date text in feature extraction and image processing in computer vision.

    As there are four editions now, it is appropriate to have a recap on the previous editions. Each edition corrected the previous production errors, some of which we must confess are our own, and included more tutorial material where appropriate. (If you find an error, there is a promise of free beer in the next section.) The completely new material in the Third Edition was on moving object detection, tracking and description. We also extended the book to use colour, and more modern techniques for object extraction and description, especially those capitalising on wavelets and on scale space. The Second Edition was updated and extended with new material on smoothing, geometric active contours, keypoint detection and moments. Some material has been filtered out at each stage to retain consistency. Our apologies if your favourite, or your own, technique has been omitted. Feature extraction and image processing is as large as it is enjoyable.

    Why did we write this book?

    We always expected to be asked: ‘why on earth write a new book on computer vision?’, and we have been. Fair question: there are already many good books on computer vision out in the bookshops, as you will find referenced later, so why add to them? Part of the answer is that any textbook is a snapshot of material that exists prior to it. Computer vision, the art of processing images stored within a computer, has seen a considerable amount of research by highly qualified people, and the volume of research would appear even to have increased in recent years. That means many new techniques have been developed, and many of the more recent approaches have yet to migrate to textbooks. It is not just the new research: part of the speedy advance in computer vision technique has left some areas covered only in scanty detail. By the nature of research, one cannot publish material on technique that is seen more to fill historical gaps than to advance knowledge. This is again where a new text can contribute.

    Finally, the technology itself continues to advance. This means that there is new hardware, new programming languages and new programming environments. In particular for computer vision, the advance of technology means that computing power and memory are now relatively cheap. It is certainly considerably cheaper than when computer vision was starting as a research field. One of the authors here notes that his phone has considerably more memory, is faster, has bigger disk space and better graphics than the computer that served the entire university of his student days. And he is not that old! One of the more advantageous recent changes brought by progress has been the development of mathematical programming systems. These allow us to concentrate on mathematical technique itself, rather than on implementation detail. There are several sophisticated flavours of which Matlab, one of the chosen vehicles here, is (arguably) the most popular. We have been using these techniques in research and in teaching, and they have been of considerable benefit there. In research, they help us to develop technique faster and to evaluate its final implementation. For teaching, the power of a modern laptop and a mathematical system combines to show students, in lectures and in study, not only how techniques are implemented but also how and why they work with an explicit relation to conventional teaching material.

    We wrote this book for these reasons. There is a host of material we could have included but chose to omit; the taxonomy and structure we use to expose the subject is of our own construction. By virtue of the enormous breadth of the subject of image processing and computer vision, we restricted the focus to feature extraction and image processing in computer vision, for this has not only been the focus of our research, but it is also where the attention of established textbooks, with some exceptions, can be rather sparse. It is, however, one of the prime targets of applied computer vision, so would benefit from better attention. We have aimed to clarify some of its origins and development, whilst also exposing implementation using mathematical systems. As such, we have written this text with our original aims in mind and maintained the approach through the later editions.

    The book and its support

    Each chapter of this book presents a package of information concerning feature extraction in image processing and computer vision. Each package is developed from its origins and later referenced to material that is more recent. Naturally, there is often theoretical development prior to implementation. We provide working implementations of most of the major techniques we describe, and have applied them to process a selection of imagery. Though the focus of our own work has been more in analysing medical imagery or in biometrics (the science of recognising people by behavioural or physiological characteristics, like face recognition), the techniques are general and can migrate to other application domains.

    You will find a host of further supporting information at the book's website: https://www.southampton.ac.uk/~msn/book/. First, you will find the Matlab and Python implementations that support the text so that you can study the techniques described herein. The website will be kept as up-to-date as possible, for it also contains links to other material such as websites devoted to techniques and to applications, as well as to available software and on-line literature. Finally, any errata will be reported there. It is our regret and our responsibility that these will exist, and our inducement for their reporting concerns a pint of beer. If you find an error that we do not know about (not typos like spelling, grammar and layout) then use the mailto on the website and we shall send you a pint of good English beer, free!

    There is a certain amount of mathematics in this book. The target audience is third or fourth year students in BSc/BEng/MEng/MSc in electrical or electronic engineering, software engineering and computer science, or in mathematics or physics, and this is the level of mathematical analysis here. Computer vision can be thought of as a branch of applied mathematics, though this does not really apply to some areas within its remit, and certainly applies to the material herein. The mathematics concerns mainly calculus and geometry, though some of it is rather more detailed than the constraints of a conventional lecture course might allow. Certainly, not all the material here is covered in detail in undergraduate courses at Southampton.

    The book starts with an overview of computer vision hardware, software and established material, with reference to the most sophisticated vision system yet ‘developed’: the human vision system. Though the precise details of the nature of processing that allows us to see have yet to be determined, there is a considerable range of hardware and software that allows us to give a computer system the capability to acquire, process and reason with imagery, the function of ‘sight’. The first chapter also provides a comprehensive bibliography of material you can find on the subject, including not only textbooks, but also available software and other material. As this will no doubt be subject to change, it might well be worth consulting the website for more up-to-date information. The preferences for journal references are those which are likely to be found in local university libraries or on the web, IEEE Transactions in particular. These are often subscribed to as they are relatively low cost and are often of very high quality.

    The next chapter concerns the basics of signal processing theory for use in computer vision. It introduces the Fourier transform that allows you to look at a signal in a new way, in terms of its frequency content. It also allows us to work out the minimum size of a picture to conserve information, to analyse the content in terms of frequency and even helps to speed up some of the later vision algorithms. It does involve a few equations, but it is a new way of looking at data and at signals and proves to be a rewarding topic of study in its own right. It extends to wavelets, which are a popular analysis tool in image processing.

    We then start to look at basic image processing techniques, where image points are mapped into a new value first by considering a single point in an original image and then by considering groups of points. Not only do we see common operations to make a picture's appearance better, especially for human vision, but we also see how to reduce the effects of different types of commonly encountered image noise. We shall see some of the modern ways to remove noise and thus clean images, and we shall look at techniques which process an image using notions of shape, rather than mapping processes.

    The following chapter concerns low-level features, which are the techniques that describe the content of an image at the level of a whole image rather than in distinct regions of it. One of the most important processes we shall meet is called edge detection. Essentially, this reduces an image to a form of a caricaturist's sketch, though without a caricaturist's exaggerations. The major techniques are presented in detail, together with descriptions of their implementation. Other image properties we can derive include measures of curvature, which developed into modern methods of feature extraction, and measures of movement. The newer techniques include keypoints, which localise image information, and feature point detection in particular. There are other image properties that can also be used for low-level feature extraction, such as phase congruency and saliency. Together, many techniques can be used to describe the content of an image.

    The edges, the keypoints, the curvature or the motion need to be grouped in some way so that we can find shapes in an image. Using basic thresholding rarely suffices for shape extraction. One of the approaches is to group low-level features to find an object—in a way this is object extraction without shape. Another approach to shape extraction concerns analysing the match of low-level information to a known template of a target shape. As this can be computationally very cumbersome, we then progress to a technique that improves computational performance, whilst maintaining an optimal performance. The technique is known as the Hough transform, and it has long been a popular target for researchers in computer vision who have sought to clarify its basis, improve its speed and increase its accuracy and robustness. Essentially, by the Hough transform we estimate the parameters that govern a shape's appearance, where the shapes range from lines to ellipses and even to unknown shapes.

    Some applications of shape extraction require determination of rather more than the parameters that control appearance, and require the shape to be able to deform or flex to match the image template. For this reason, the chapter on shape extraction by matching is followed by one on flexible shape analysis. This leads to interactive segmentation via snakes (active contours). The later material on the formulation by level-set methods brought new power to deformable shape extraction techniques. Further, we shall see how we can describe a shape by its skeleton, though with practical difficulty which can be alleviated by symmetry (though this can be slow to compute), and also how global constraints concerning the statistics of a shape's appearance can be used to guide final extraction.

    Up to this point, we have not considered techniques that can be used to describe the shape found in an image. We shall find that the two major approaches concern techniques that describe a shape's perimeter and those that describe its area. Some of the perimeter description techniques, the Fourier descriptors, are even couched using Fourier transform theory that allows analysis of their frequency content. One of the major approaches to area description, statistical moments, also has a form of access to frequency components, though it is of a very different nature to the Fourier analysis. We now include new formulations that are phrased in discrete terms, rather than as discrete approximations to continuous formulations. One advantage is that insight into descriptive ability can be achieved by reconstruction, which should get back to the original shape.

    We then move on to region-based analysis. This includes some classic computer vision approaches for segmentation and description, especially superpixels which are a grouping process reflecting structure and reduced resolution. Then we move to texture which describes patterns with no known analytical description and has been the target of considerable research in computer vision and image processing.

    Much computer vision, for computational reasons, concerns spatial images only, and here we describe spatiotemporal techniques for detecting and analysing moving objects from within sequences of images. Moving objects are detected by separating the foreground from the background, a process known as background subtraction. Having separated the moving components, one approach is then to follow or track the object as it moves within a sequence of image frames. The moving object can be described and recognised from the tracking information or by collecting together the sequence of frames to derive moving object descriptions.

    We include material that is germane to the text, such as camera models and co-ordinate geometry, and methods of colour description. These are aimed to be short introductions and are germane to much of the material throughout but not needed directly to cover it.

    We then describe how to learn and discriminate between objects and patterns. There is also introductory material on how to classify these patterns against known data, with a selection of the distance measures that can be used within that, and this is a window on a much larger area, to which appropriate pointers are given. This book is not about machine learning, and there are plenty of excellent texts that describe that. We have to address deep learning, since it is a combination of feature extraction and learning. Taking the challenge directly, we address deep learning and its particular relation with feature extraction and classification. This is a new way of processing images which has great power and can be very fast. We show the relationship between the new deep learning approaches and classic feature extraction techniques.

    An underlying premise throughout the text is that there is never a panacea in engineering; it is invariably about compromise. There is material not contained in the book, and some of this and other related material is referenced throughout the text, especially on-line material.

    In this way, the text covers all major areas of feature extraction and image processing in computer vision. There is considerably more material in the subject than is presented here: for example, there is an enormous volume of material in 3D computer vision and in 2D signal processing which is only alluded to here. Topics that are specifically not included are 3D processing, watermarking, image coding, statistical pattern recognition and machine learning. To include all that would lead to a monstrous book that no one could afford, or even pick up. So we admit we give a snapshot, and we hope rather that it is considered to open another window on a fascinating and rewarding subject.

    In gratitude

    We are immensely grateful to the input of our colleagues, in particular to Prof Steve Gunn, Dr John Carter, Dr Sasan Mahmoodi, Dr Kate Farrahi and to Dr Jon Hare. The family who put up with it are Maria Eugenia and Caz and the nippers. We are also very grateful to past and present researchers in computer vision at the Vision Learning and Control (VLC) research group under (or who have survived?) Mark's supervision at the Electronics and Computer Science, University of Southampton. As well as Alberto and Steve, these include Dr Hani Muammar, Prof Xiaoguang Jia, Prof Yan Qiu Chen, Dr Adrian Evans, Dr Colin Davies, Dr Mark Jones, Dr David Cunado, Dr Jason Nash, Dr Ping Huang, Dr Liang Ng, Dr David Benn, Dr Douglas Bradshaw, Dr David Hurley, Dr John Manslow, Dr Mike Grant, Bob Roddis, Prof Andrew Tatem, Dr Karl Sharman, Dr Jamie Shutler, Dr Jun Chen, Dr Andy Tatem, Dr Chew-Yean Yam, Dr James Hayfron-Acquah, Dr Yalin Zheng, Dr Jeff Foster, Dr Peter Myerscough, Dr David Wagg, Dr Ahmad Al-Mazeed, Dr Jang-Hee Yoo, Dr Nick Spencer, Dr Stuart Mowbray, Dr Stuart Prismall, Prof Peter Gething, Dr Mike Jewell, Dr David Wagg, Dr Alex Bazin, Hidayah Rahmalan, Dr Xin Liu, Dr Imed Bouchrika, Dr Banafshe Arbab-Zavar, Dr Dan Thorpe, Dr Cem Direkoglu, Dr Sina Samangooei, Dr John Bustard, D. Richard Seely, Dr Alastair Cummings, Dr Muayed Al-Huseiny, Dr Mina Ibrahim, Dr Darko Matovski, Dr Gunawan Ariyanto, Dr Sung-Uk Jung, Dr Richard Lowe, Dr Dan Reid, Dr George Cushen, Dr Ben Waller, Dr Nick Udell, Dr Anas Abuzaina, Dr Thamer Alathari, Dr Musab Sahrim, Dr Ah Reum Oh, Dr Tim Matthews, Dr Emad Jaha, Dr Peter Forrest, Dr Jaime Lomeli, Dr Dan Martinho-Corbishley, Dr Bingchen Guo, Dr Jung Sun, Dr Nawaf Almudhahka, Di Meng, Moneera Alamnakani, and John Evans (for the great hippo photo). There has been much input from Mark's postdocs too; omitting those already mentioned, these include Dr Hugh Lewis, Dr Richard Evans, Dr Lee Middleton, Dr Galina Veres, Dr Baofeng Guo, Dr Michaela Goffredo and Dr Wenshu Zhang. We are also very grateful to other past Southampton students of BEng and MEng Electronic Engineering, MEng Information Engineering, BEng and MEng Computer Engineering, MEng Software Engineering and BSc Computer Science who have pointed out our earlier mistakes (and enjoyed the beer), have noted areas for clarification and in some cases volunteered some of the material herein. Beyond Southampton, we remain grateful to the reviewers and to those who have written in and made many helpful suggestions, and to Prof Daniel Cremers, Dr Timor Kadir, Prof Tim Cootes, Prof Larry Davis, Dr Pedro Felzenszwalb, Prof Luc van Gool, Prof Aaron Bobick, Prof Phil Torr, Dr Long Tran-Thanh, Dr Tiago de Freitas, Dr Seth Nixon, for observations on and improvements to the text and/or for permission to use images. Naturally we are very grateful to the Elsevier editorial team who helped us reach this point, particularly Joanna Collett and Tim Pitts, and especially to Anitha Sivaraj for her help with the final text. To all of you, our very grateful thanks.

    Final message

    We ourselves have already benefited much by writing this book. As we already know, previous students have also benefited and contributed to it. It remains our hope that it does inspire people to join in this fascinating and rewarding subject that has proved to be such a source of pleasure and inspiration to its many workers.

    Mark S. Nixon

    Electronics and Computer Science, University of Southampton

    Alberto S. Aguado

    Foundry, London

    Nov 2019


    1

    Introduction

    Abstract

    This is where we start, by looking at the human visual system to investigate what is meant by vision, how a computer can be made to sense pictorial data and how we can process an image. In this book, the processing languages are Python and Matlab, and this chapter includes an introduction to both systems. The overview of this chapter is shown in Table 1.1; you will find a similar overview at the start of each chapter. References/citations are collected at the end of each chapter.

    Keywords

    CCD; CMOS; Cones; Framestore; Human eye; Human vision system; Illusions; Journals; Lateral Geniculate Nucleus; Matlab; Neural processing; Pixel sensors; Python; Rods; Textbooks; Web links

    1.1. Overview

    This is where we start, by looking at the human visual system to investigate what is meant by vision, how a computer can be made to sense pictorial data and how we can process an image. The overview of this chapter is shown in Table 1.1; you will find a similar overview at the start of each chapter. References/citations are collected at the end of each chapter.

    1.2. Human and computer vision

    A computer vision system processes images acquired from an electronic camera, which is like the human vision system where the brain processes images derived from the eye. Computer vision is a rich and rewarding topic for study and research for electronic engineers, computer scientists and many others. Now that cameras are cheap and widely available and computer power and memory are vast, computer vision is found in many places. There are now many vision systems in routine industrial use: cameras inspect mechanical parts to check size, food is inspected for quality and images used in astronomy benefit from computer vision techniques. Forensic studies and biometrics (ways to recognise people) using computer vision include automatic face recognition and recognising people by the ‘texture’ of their irises. These studies are paralleled by biologists and psychologists who continue to study how our human vision system works and how we see and recognise objects (and people).

    Table 1.1

    A selection of (computer) images is given in Fig. 1.1; these images comprise a set of points or picture elements (usually concatenated to pixels) stored as an array of numbers in a computer. To recognise faces, based on an image such as Fig. 1.1A, we need to be able to analyse constituent shapes, such as the shape of the nose, the eyes and the eyebrows, to make some measurements to describe and then recognise a face. Fig. 1.1B is an ultrasound image of the carotid artery (which is near the side of the neck and supplies blood to the brain and the face), taken as a cross-section through it. The top region of the image is near the skin; the bottom is inside the neck. The image arises from combinations of the reflections of the ultrasound radiation by tissue. This image comes from a study aimed to produce three-dimensional models of arteries, to aid vascular surgery. Note that the image is very noisy, and this obscures the shape of the (elliptical) artery. Remotely sensed images are often analysed by their texture content. The perceived texture is different between the road junction and the different types of foliage seen in Fig. 1.1C. Finally, Fig. 1.1D is a magnetic resonance image (MRI) of a cross section near the middle of a human body. The chest is at the top of the image, and the lungs and blood vessels are the dark areas; the internal organs and the fat appear grey. MRI images are in routine medical use nowadays, owing to their ability to provide high-quality images.
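    To make this concrete, here is a minimal Python sketch (our own illustration, not taken from the book's supporting code; the file name face.png is simply an assumption) that loads an image and inspects it as an array of numbers:

    # Minimal sketch: an image is just an array of pixel values.
    # Assumes an image file 'face.png' is available; any image will do.
    import matplotlib.pyplot as plt

    image = plt.imread('face.png')     # read the image into a NumPy array
    print(image.shape)                 # (rows, columns) or (rows, columns, channels)
    print(image[0, 0])                 # value(s) of the top-left pixel
    print(image[60:64, 60:64])         # a small block of pixel values

    plt.imshow(image, cmap='gray')     # display the array as a picture
    plt.show()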

    There are many different image sources. In medical studies, MRI is good for imaging soft tissue but does not reveal the bone structure (the spine cannot be seen in Fig. 1.1D); this can be achieved by using computerised tomography which is better at imaging bone, as opposed to soft tissue. Remotely sensed images can be derived from infrared (thermal) sensors or synthetic-aperture radar, rather than by cameras, as in Fig. 1.1C. Spatial information can be provided by two-dimensional arrays of sensors, including sonar arrays. There are perhaps more varieties of sources of spatial data in medical studies than in any other area. But computer vision techniques are used to analyse any form of data, not just the images from cameras.

    Figure 1.1 Real images from different sources.

    Synthesised images are good for evaluating techniques and finding out how they work, and some of the bounds on performance. Two synthetic images are shown in Fig. 1.2. Fig. 1.2A is an image of circles that were specified mathematically. The image is an ideal case: the circles are perfectly defined and the brightness levels have been specified to be constant. This type of synthetic image is good for evaluating techniques which find the borders of the shape (its edges), the shape itself and even for making a description of the shape. Fig. 1.2B is a synthetic image made up of sections of real image data. The borders between the regions of image data are exact, again specified by a program. The image data come from a well-known texture database, the Brodatz album of textures. This was scanned and stored as a computer image. This image can be used to analyse how well computer vision algorithms can identify regions of differing texture.
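    As an illustration of how such a test image might be produced, the following Python sketch (our own; the image size, circle centres, radii and brightness values are arbitrary choices made only for illustration) generates an image of mathematically specified circles, each of constant brightness:

    # Synthetic test image of circles specified mathematically, in the spirit of Fig. 1.2A.
    import numpy as np
    import matplotlib.pyplot as plt

    size = 256
    image = np.zeros((size, size))                 # dark background
    ys, xs = np.mgrid[0:size, 0:size]              # pixel coordinate grids

    circles = [((64, 64), 40, 0.5),                # (centre, radius, brightness)
               ((160, 180), 60, 1.0),
               ((200, 70), 30, 0.75)]

    for (cy, cx), radius, brightness in circles:
        inside = (ys - cy) ** 2 + (xs - cx) ** 2 <= radius ** 2
        image[inside] = brightness                 # constant brightness inside each circle

    plt.imshow(image, cmap='gray', vmin=0, vmax=1)
    plt.title('Synthetic circles')
    plt.show()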

    This chapter will show you how basic computer vision systems work, in the context of the human vision system. It covers the main elements of human vision showing you how your eyes work (and how they can be deceived!). For computer vision, this chapter covers the hardware and the software used for image analysis, giving an introduction to Python and Matlab®, the software and mathematical packages, respectively, used throughout this text to implement computer vision algorithms. Finally, a selection of pointers to other material is provided, especially those for more detail on the topics covered in this chapter.

    1.3. The human vision system

    Human vision is a sophisticated system that senses and acts on visual stimuli. It has evolved for millions of years, primarily for defence or survival. Intuitively, computer and human vision appear to have the same function. The purpose of both systems is to interpret spatial data, data that are indexed by more than one dimension. Even though computer and human vision are functionally similar, you cannot expect a computer vision system to exactly replicate the function of the human eye. This is partly because we do not understand fully how the vision system of the eye and brain works, as we shall see in this section. Accordingly, we cannot design a system to exactly replicate its function. In fact, some of the properties of the human eye are useful when developing computer vision techniques, whereas others are actually undesirable in a computer vision system. But we shall see computer vision techniques which can, to some extent, replicate – and in some cases even improve upon – the human vision system.

    Figure 1.2 Examples of synthesised images.

    You might ponder this, so put one of the fingers from each of your hands in front of your face and try to estimate the distance between them. This is difficult, and we are sure you would agree that your measurement would not be very accurate. Now put your fingers very close together. You can still tell that they are apart even when the distance between them is tiny. So human vision can distinguish relative distance well, but is poor for absolute distance. Computer vision is the other way around: it is good for estimating absolute difference, but with relatively poor resolution for relative difference. The number of pixels in the image imposes the accuracy of the computer vision system, but that does not come until the next chapter. Let us start at the beginning, by seeing how the human vision system works.

    In human vision, the sensing element is the eye from which images are transmitted via the optic nerve to the brain, for further processing. The optic nerve has insufficient bandwidth to carry all the information sensed by the eye. Accordingly, there must be some pre-processing before the image is transmitted down the optic nerve. The human vision system can be modelled in three parts:

    1. the eye – this is a physical model since much of its function can be determined by pathology;

    2. a processing system – this is an experimental model since the function can be modelled, but not determined precisely; and

    3. analysis by the brain – this is a psychological model since we cannot access or model such processing directly, but only determine behaviour by experiment and inference.

    1.3.1. The eye

    The function of the eye is to form an image; a cross-section of the eye is illustrated in Fig. 1.3. Vision requires an ability to selectively focus on objects of interest. This is achieved by the ciliary muscles that hold the lens. In old age, it is these muscles which become slack, and the eye loses its ability to focus at short distance. The iris, or pupil, is like an aperture on a camera and controls the amount of light entering the eye. It is a delicate system and needs protection; this is provided by the cornea (sclera). This is outside the choroid, which has blood vessels that supply nutrition and is opaque to cut down the amount of light. The retina is on the inside of the eye, which is where light falls to form an image. By this system muscles rotate the eye, and shape the lens, to form an image on the fovea (focal point) where the majority of sensors are situated. The blind spot is where the optic nerve starts; there are no sensors there.

    Figure 1.3 Human eye.

    Focussing involves shaping the lens, rather than positioning it as in a camera. The lens is shaped to refract close images greatly, and distant objects little, essentially by ‘stretching’ it. The distance of the focal centre of the lens varies from approximately 14 mm to around 17 mm depending on the lens shape. This implies that a world scene is translated into an area of about 2 mm². Good vision has high acuity (sharpness), which implies that there must be very many sensors in the area where the image is formed.

    There are actually nearly 100 million sensors dispersed around the retina. Light falls on these sensors to stimulate photochemical transmissions, which results in nerve impulses that are collected to form the signal transmitted by the eye. There are two types of sensor: firstly, the rods – these are used for black and white (scotopic) vision; and secondly, the cones – these are used for colour (photopic) vision. There are approximately 10 million cones and nearly all are found within 5 degrees of the fovea. The remaining 100 million rods are distributed around the retina, with the majority between 20 and 5 degrees of the fovea. Acuity is actually expressed in terms of spatial resolution (sharpness) and brightness/colour resolution and is greatest within 1 degree of the fovea.

    There is only one type of rod, but there are three types of cones. These types are the following:

    1. S – short wavelength: these sense light towards the blue end of the visual spectrum;

    2. M – medium wavelength: these sense light around green; and

    3. L – long wavelength: these sense light towards the red region of the spectrum.

    The total response of the cones arises from summing the response of these three types of cones; this gives a response covering the whole of the visual spectrum. The rods are sensitive to light within the entire visual spectrum, giving the monochrome capability of scotopic vision. When the light level is low, images are formed away from the fovea to use the superior sensitivity of the rods, but without the colour vision of the cones. Note that there are actually very few of the blueish cones, and there are many more of the others. But we can still see a lot of blue (especially given ubiquitous denim!). So, somehow, the human vision system compensates for the lack of blue sensors, to enable us to perceive it. The world would be a funny place with red water! The vision response is actually logarithmic and depends on brightness adaptation from dark conditions, where the image is formed on the rods, to brighter conditions, where images are formed on the cones. More on colour sensing is to be found in Chapter 11.

    One inherent property of the eye, known as Mach bands, affects the way we perceive images. These are illustrated in Fig. 1.4 and are the bands that appear to be where two stripes of constant shade join. By assigning values to the image brightness levels, the cross-section of plotted brightness is shown in Fig. 1.4A. This shows that the picture is formed from stripes of constant brightness. Human vision perceives an image for which the cross-section is as plotted in Fig. 1.4C. These Mach bands do not really exist, but are introduced by your eye. The bands arise from overshoot in the eyes' response at boundaries of regions of different intensity (this aids us to differentiate between objects in our field of view). The real cross-section is illustrated in Fig. 1.4B. Note also that a human eye can distinguish only relatively few grey levels. It actually has a capability to discriminate between 32 levels (equivalent to 5 bits), whereas the image of Fig. 1.4A could have many more brightness levels. This is why your perception finds it more difficult to discriminate between the low-intensity bands on the left of Fig. 1.4A. (Note that Mach bands cannot be seen in the earlier image of circles, Fig. 1.2A, due to the arrangement of grey levels.) This is the limit of our studies of the first level of human vision; for those who are interested, [Cornsweet70] provides many more details concerning visual perception.

    Figure 1.4 Illustrating Mach bands.
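    You can generate a Mach-band test pattern for yourself. The following Python sketch (our own; the number of stripes and the grey levels are arbitrary choices) builds an image from stripes of constant brightness and plots its true cross-section – any bright or dark bands you perceive at the stripe boundaries are added by your own vision:

    # Mach-band test image: stripes of constant brightness, as in Fig. 1.4A.
    # The plotted cross-section confirms the brightness is piecewise constant,
    # so any overshoot you see at the boundaries is introduced by the eye.
    import numpy as np
    import matplotlib.pyplot as plt

    levels = np.linspace(0.1, 0.9, 8)              # eight constant grey levels
    stripe_width = 32
    row = np.repeat(levels, stripe_width)          # one row: stripes of constant brightness
    image = np.tile(row, (128, 1))                 # stack identical rows to form the image

    fig, (ax_img, ax_plot) = plt.subplots(2, 1, figsize=(6, 5))
    ax_img.imshow(image, cmap='gray', vmin=0, vmax=1)
    ax_img.set_title('Stripes of constant brightness')
    ax_plot.plot(row)
    ax_plot.set_title('Actual cross-section (no overshoot)')
    plt.tight_layout()
    plt.show()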

    So we have already identified two properties associated with the eye that it would be difficult to include, and would often be unwanted, in a computer vision system: Mach bands and sensitivity to unsensed phenomena. These properties are integral to human vision. At present, human vision is far more sophisticated than we can hope to achieve with a computer vision system. Infrared-guided missile vision systems can actually have difficulty in distinguishing between a bird at 100 m and a plane at 10 km. Poor birds! (Lucky plane?). Human vision can handle this with ease.

    1.3.2. The neural system

    Neural signals provided by the eye are essentially the transformed response of the wavelength dependent receptors, the cones and the rods. One model is to combine these transformed signals by addition, as illustrated in Fig. 1.5. The response is transformed by a logarithmic function, mirroring the known response of the eye. This is then multiplied by a weighting factor that controls the contribution of a particular sensor. This can be arranged to allow combination of responses from a particular region. The weighting factors can be chosen to afford particular filtering properties. For example, in lateral inhibition, the weights for the centre sensors are much greater than the weights for those at the extreme. This allows the response of the centre sensors to dominate the combined response given by addition. If the weights in one half are chosen to be negative, whilst those in the other half are positive, then the output will show detection of contrast (change in brightness), given by the differencing action of the weighting functions.

    The signals from the cones can be combined in a manner that reflects chrominance (colour) and luminance (brightness). This can be achieved by subtraction of logarithmic functions, which is then equivalent to taking the logarithm of their ratio. This allows measures of chrominance to be obtained. In this manner, the signals derived from the sensors are combined prior to transmission through the optic nerve. This is an experimental model, since there are many ways possible to combine the different signals together.

    Figure 1.5 Neural processing.
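    A minimal numerical sketch of this kind of combination is given below (our own illustration; the sensor responses and weights are arbitrary values, not physiological data). The responses are transformed logarithmically, combined with centre-weighted, zero-sum weights so that the output responds to contrast, and a chrominance measure is formed as a difference of logarithms, equivalent to the logarithm of a ratio:

    # Illustrative combination of sensor responses: logarithmic transform,
    # weighted summation (lateral inhibition) and a log-ratio chrominance measure.
    import numpy as np

    sensors = np.array([0.2, 0.4, 0.9, 0.4, 0.2])       # neighbouring sensor responses
    log_response = np.log(sensors)                       # logarithmic transform

    # Centre weight dominates and the weights sum to zero, so a uniform region
    # gives no output and a change in brightness (contrast) gives a strong one.
    weights = np.array([-0.5, -1.0, 3.0, -1.0, -0.5])
    contrast = np.sum(weights * log_response)
    print('contrast response:', contrast)

    # Subtracting logarithmic responses of two cone types is equivalent to
    # taking the logarithm of their ratio - a simple chrominance measure.
    long_cone, medium_cone = 0.8, 0.5
    chrominance = np.log(long_cone) - np.log(medium_cone)   # == np.log(long_cone / medium_cone)
    print('chrominance measure:', chrominance)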

    Visual information is then sent back to arrive at the lateral geniculate nucleus (LGN) which is in the thalamus and is the primary processor of visual information. This is a layered structure containing different types of cells, with differing functions. The axons from the LGN pass information on to the visual cortex. The function of the LGN is largely unknown, though it has been shown to play a part in coding the signals that are transmitted. It is also considered to help the visual system focus its attention, such as on sources of sound. For further information on retinal neural networks, see [Ratliff65]; an alternative study of neural processing can be found in [Overington92].

    1.3.3. Processing

    The neural signals are then transmitted to two areas of the brain for further processing. These areas are the associative cortex, where links between objects are made, and the occipital cortex, where patterns are processed. It is naturally difficult to determine precisely what happens in this region of the brain. To date, there have been no volunteers for detailed study of their brain's function (though progress with new imaging modalities such as positron emission tomography or electrical impedance tomography will doubtless help). For this reason, there are only psychological models to suggest how this region of the brain operates.

    It is well known that one function of the human vision system is to use edges, or boundaries, of objects. We can easily read the word in Fig. 1.6A; this is achieved by filling in the missing boundaries in the knowledge that the pattern most likely represents a printed word. But we can infer more about this image; there is a suggestion of illumination, causing shadows to appear in unlit areas. If the light source is bright, then the image will be washed out, causing the disappearance of the boundaries which are interpolated by our eyes. So there is more than just physical response, there is also knowledge, including prior knowledge of solid geometry. This situation is illustrated in Fig. 1.6B, which could represent three ‘pacmen’ about to collide, or a white triangle placed on top of three black circles. Either situation is possible.

    Figure 1.6 How human vision uses edges.

    Figure 1.7 Static illusions.

    It is also possible to deceive human vision, primarily by imposing a scene that it has not been trained to handle. In the famous Zollner illusion, Fig. 1.7A, the bars appear to be slanted, whereas in reality they are vertical (check this by placing a pen between the lines): the small crossbars mislead your eye into perceiving the vertical bars as slanting. In the Ebbinghaus illusion, Fig. 1.7B, the inner circle appears to be larger when surrounded by small circles, than it is when surrounded by larger circles.

    There are dynamic illusions too: you can always impress children with the ‘see my wobbly pencil’ trick. Just hold the pencil loosely between your fingers then, to whoops of childish glee, when the pencil is shaken up and down, the solid pencil will appear to bend. Benham's disk, Fig. 1.8, shows how hard it is to model vision accurately. If you make up a version of this disk into a spinner (push a matchstick through the centre) and spin it anti-clockwise, you do not see three dark rings; you will see three coloured ones. The outside one will appear to be red, the middle one a sort of green, and the inner one will appear deep blue. (This can depend greatly on lighting – and contrast between the black and white on the disk. If the colours are not clear, try it in a different place, with different lighting.) You can appear to explain this when you notice that the red colours are associated with the long lines, and the blue with short lines. But that is from physics, not psychology. Now spin the disk clockwise. The order of the colours reverses: red is associated with the short lines (inside), and blue with the long lines (outside). So the argument from physics is clearly incorrect, since red is now associated with short lines not long ones, revealing the need for psychological explanation of the eyes' function. This is not colour perception; see [Armstrong91] for an interesting (and interactive!) study of colour theory and perception.

    Figure 1.8 Benham's disk.

    Naturally, there are many texts on human vision – one popular text on human visual perception (and its relationship with visual art) is by Livingstone [Livingstone14]; there is an online book: The Joy of Vision (http://www.yorku.ca/eye/thejoy.htm) – useful, despite its title! Marr's seminal text [Marr82] is a computational investigation into human vision and visual perception, investigating it from a computer vision viewpoint. For further details on pattern processing in human vision, see [Bruce90]; for more illusions see [Rosenfeld82] and an excellent – and dynamic – collection at https://michaelbach.de/ot. Many of the properties of human vision are hard to include in a computer vision system, but let us now look at the basic components that are used to make computers see.

    1.4. Computer vision systems

    Given the progress in computer technology and domestic photography, computer vision hardware is now relatively inexpensive; a basic computer vision system requires a camera, a camera interface and a computer. These days, many personal computers offer the capability for a basic vision system, by including a camera and its interface within the system. There are specialised systems for computer vision, offering high performance in more than one aspect. These can be expensive, as any specialist system is.

    1.4.1. Cameras

    A camera is the basic sensing element. In simple terms, most cameras rely on the property of light to cause hole–electron pairs (the charge carriers in electronics) in a conducting material. When a potential is applied (to attract the charge carriers), this charge can be sensed as current. By Ohm's law, the voltage across a resistance is proportional to the current through it, so the current can be turned into a voltage by passing it through a resistor. The number of hole–electron pairs is proportional to the amount of incident light. Accordingly, greater charge (and hence greater voltage and current) is caused by an increase in brightness. In this manner, cameras can provide, as output, a voltage which is proportional to the brightness of the points imaged by the camera.

    There are three main types of camera: vidicons, charge-coupled devices (CCDs) and, later, CMOS cameras (complementary metal oxide semiconductor – now the dominant technology for logic circuit implementation). Vidicons are the old (analogue) technology, which though cheap (mainly by virtue of longevity in production) have largely been replaced by the newer CCD and CMOS digital technologies. The digital technologies now dominate much of the camera market because they are lightweight and cheap (with other advantages) and are therefore used in the domestic video market.

    Vidicons operate in a manner akin to an old television in reverse. The image is formed on a screen, and then sensed by an electron beam that is scanned across the screen. This produces an output which is continuous; the output voltage is proportional to the brightness of points in the scanned line, and is a continuous signal, a voltage which varies continuously with time. On the other hand, CCDs and CMOS cameras use an array of sensors; these are regions where charge is collected, which is proportional to the light incident on that region. This is then available in discrete, or sampled, form as opposed to the continuous sensing of a vidicon. This is similar to human vision with its array of cones and rods, but digital cameras use a rectangular regularly spaced lattice, whereas human vision uses a hexagonal lattice with irregular spacing.

    Two main types of semiconductor pixel sensors are illustrated in Fig. 1.9. In the passive sensor, the charge generated by incident light is presented to a bus through a pass transistor. When the signal Tx is activated, the pass transistor is enabled and the sensor provides a capacitance to the bus, one that is proportional to the incident light. An active pixel includes an amplifier circuit that can compensate for limited fill factor of the photodiode. The select signal again controls presentation of the sensor's information to the bus. A further reset signal allows the charge site to be cleared when the image is rescanned.

    The basis of a CCD sensor is illustrated in Fig. 1.10. The number of charge sites gives the resolution of the CCD sensor; the contents of the charge sites (or buckets) need to be converted to an output (voltage) signal. In simple terms, the contents of the buckets are emptied into vertical transport registers which are shift registers moving information towards the horizontal transport registers. This is the column bus supplied by the pixel sensors. The horizontal transport registers empty the information row by row (point by point) into a signal conditioning unit which transforms the sensed charge into a voltage which is proportional to the charge in a bucket, and hence proportional to the brightness of the corresponding point in the scene imaged by the camera. CMOS cameras are like a form of memory: the charge incident on a particular site in a two-dimensional lattice is proportional to the brightness at a point. The charge is then read like computer memory. (In fact, a computer memory RAM chip can act as a rudimentary form of camera when the circuit – the one buried in the chip – is exposed to light.)

    Figure 1.9 Pixel sensors.

    Figure 1.10 Charge-coupled device sensing element.

    There are many more varieties of vidicon (Chalnicon, etc.) than there are of CCD technology (charge injection device, etc.), perhaps due to the greater age of basic vidicon technology. Vidicons are cheap but have a number of intrinsic performance problems. The scanning process essentially relies on ‘moving parts’. As such, the camera performance will change with time, as parts wear; this is known as ageing. Also, it is possible to burn an image into the scanned screen by using high incident light levels; vidicons can also suffer lag, which is a delay in response to moving objects in a scene. On the other hand, the digital technologies are dependent on the physical arrangement of charge sites and as such do not suffer from ageing, but can suffer from irregularity in the charge sites' (silicon) material. The underlying technology also makes CCD and CMOS cameras less sensitive to lag and burn, but the signals associated with the CCD transport registers can give rise to readout effects. CCDs actually only came to dominate camera technology when technological difficulty associated with quantum efficiency (the magnitude of response to incident light) for the shorter, blue, wavelengths was solved. One of the major problems in CCD cameras is blooming, where bright (incident) light causes a bright spot to grow and disperse in the image (this used to happen in the analogue technologies too). This happens much less in CMOS cameras because the charge sites can be much better defined and reading their data is equivalent to reading memory sites as opposed to shuffling charge between sites. Also, CMOS cameras have now overcome the problem of fixed pattern noise that plagued earlier MOS cameras. CMOS cameras are actually much more recent than CCDs. This begs a question as to which is best: CMOS or CCD? An early view was that CCD could provide higher-quality images, whereas CMOS is a cheaper technology and lends itself directly to intelligent cameras with on-board processing. The feature size of points (pixels) in a CCD sensor is limited to be about 4 μm so that enough light is collected. In contrast, the feature size in CMOS technology is considerably smaller. It is then possible to integrate signal processing within the camera chip, and thus it is perhaps possible that CMOS cameras will eventually replace CCD technologies for many applications. However, modern CCDs' process technology is more mature, so the debate will doubtless continue!

    Finally, there are specialist cameras, which include high-resolution devices (giving pictures with many points), low-light-level cameras which can operate in very dark conditions, and infrared cameras which sense heat to provide thermal images; hyperspectral cameras have more sensing bands. For more detail concerning modern camera practicalities and imaging systems, see [Nakamura05] and more recently [Kuroda14]. For more details on sensor development, particularly CMOS, [Fossum97] is still well worth a look. For more detail on images, see [Phillips18] with a particular focus on quality (hey – there is even mosquito noise!).

    A light field – or plenoptic – camera is one that can sense depth as well as brightness [Adelson05]. The light field is essentially a two-dimensional set of spatial images, thus giving a four-dimensional array of pixels. The light field can be captured in a number of ways, by moving a camera or by using multiple cameras. The aim is to capture the plenoptic function that describes the light as a function of position, angle, wavelength and time [Wu17]. These days, commercially available cameras use lenses to derive the light field. These can be used to render an image that is in focus at every depth plane (imagine an image of an object taken at close distance, where only the object is in focus, combined with an image where the background is in focus, to give an image where both the object and the background are in focus). A surveillance operation could, for example, refocus on what lies behind an object, detail that would not be recoverable from a normal camera image. This gives an alternative approach to 3D object analysis, by sensing the object in 3D. Wherever there are applications, industry will follow, and that has proved to be the case.
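
    To illustrate what a light field makes possible, the sketch below shows the shift-and-sum principle behind synthetic refocusing, assuming the light field is stored as a four-dimensional NumPy array indexed by angular position (u, v) and spatial position (y, x). The array layout, the focus parameter alpha and the use of scipy.ndimage.shift are assumptions made for illustration, not the algorithm of any particular commercial plenoptic camera.

        import numpy as np
        from scipy.ndimage import shift

        # Shift-and-sum refocusing of a 4D light field L[u, v, y, x]:
        # each sub-aperture image is translated in proportion to its angular
        # position (u, v) and a focus parameter alpha, then all are averaged.
        # Points at the chosen depth line up and appear sharp; others blur.
        def refocus(light_field, alpha):
            U, V, H, W = light_field.shape
            uc, vc = (U - 1) / 2.0, (V - 1) / 2.0
            result = np.zeros((H, W))
            for u in range(U):
                for v in range(V):
                    dy, dx = alpha * (u - uc), alpha * (v - vc)
                    result += shift(light_field[u, v], (dy, dx), order=1, mode='nearest')
            return result / (U * V)

        # Example with a random 3 x 3 grid of 64 x 64 sub-aperture images.
        L = np.random.rand(3, 3, 64, 64)
        focused_near, focused_far = refocus(L, alpha=1.0), refocus(L, alpha=-1.0)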

    There are new dynamic vision sensors which sense motion [Lichtsteiner08, Son17] and are much closer to the starting grid than the light field cameras. Clearly, their resolution and speed continue to improve, and applications that use these sensors are emerging. We shall find in Chapters 4 and 9 that it is possible to estimate motion from sequences of images; these sensors are different, since they specifically target motion. As the target application is security (much security video is dull stuff indeed, with little motion), they allow recording only of material that is of likely interest.
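
    By way of contrast with frame-based cameras, the minimal sketch below assumes a dynamic vision sensor delivers a stream of (x, y, timestamp, polarity) events and simply accumulates them into a signed image over a time window. The tuple layout and the function are illustrative assumptions, not the output format of any particular sensor.

        import numpy as np

        # Accumulate a stream of (x, y, timestamp, polarity) events into a
        # signed image over a time window. Static regions generate no events,
        # so the frame stays zero except where the brightness changed.
        def accumulate_events(events, height, width, t_start, t_end):
            frame = np.zeros((height, width), dtype=int)
            for x, y, t, polarity in events:
                if t_start <= t < t_end:
                    frame[y, x] += 1 if polarity > 0 else -1
            return frame

        # Two events fall inside the window below; the third is ignored.
        events = [(10, 5, 0.01, +1), (11, 5, 0.02, -1), (12, 5, 0.20, +1)]
        frame = accumulate_events(events, height=16, width=16, t_start=0.0, t_end=0.1)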

    1.4.2. Computer interfaces

    Though digital cameras continue to advance, there are still some legacies from the older analogue systems to be found in some digital systems, and there is also some older technology in deployed systems. As such, we shall cover the main points of the two approaches. Essentially, the image sensor converts light into a signal which is expressed either as a continuous signal or in sampled (digital) form. Some (older) systems expressed the camera signal as an analogue continuous signal, according to a standard, and this was converted at the computer (and still is in some cases, using a frame grabber). Modern digital systems convert the sensor information into digital information with on-chip circuitry and then provide the digital information according to a specified standard. The older systems, such as surveillance systems, supplied (or supply) video, whereas the newer systems are digital. Video implies delivering the moving image as a sequence of frames, of which one format is digital video (DV).

    An analogue continuous camera signal is transformed into digital (discrete) format using an analogue to digital (A/D) converter. Flash converters are usually used due to the high speed required for conversion (say 11 MHz, a rate that could not be met by other conversion technologies). Usually, 8-bit A/D converters are used; at 6 dB/bit, this gives 48 dB, which just satisfies the CCIR-stated requirement of approximately 45 dB. The outputs of the A/D converter are then stored. Note that there are aspects of the sampling process which are of considerable interest in computer vision; these are covered in Chapter 2.
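
    As a quick check of the arithmetic above, a few lines suffice to relate the number of bits to the dynamic range at roughly 6 dB per bit; the function name and the rule-of-thumb figure are for illustration only.

        # Dynamic range of an n-bit A/D converter at roughly 6 dB per bit
        # (a more precise rule of thumb is 6.02n + 1.76 dB for a sine wave).
        def dynamic_range_db(bits, db_per_bit=6.0):
            return bits * db_per_bit

        for bits in (8, 10, 12):
            print(bits, 'bits ->', dynamic_range_db(bits), 'dB')
        # 8 bits gives 48 dB, comfortably above the roughly 45 dB cited above.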

    Figure 1.11 Interlacing in television pictures.

    In digital camera systems, this processing is usually performed on the camera chip, and the camera eventually supplies digital information, often in coded form. Currently, Thunderbolt is the hardware interface that dominates the high end of the market, and USB is used at the lower end. There was a system called FireWire, but it has now faded. Images are constructed from a set of lines, the lines scanned by a camera. In the older analogue systems, in order to reduce the requirements on transmission (and for viewing), the 625 lines (in the PAL system; NTSC is of lower resolution) were transmitted in two interlaced fields, each of 312.5 lines, as illustrated in Fig. 1.11. These were the odd and the even fields. Modern televisions are progressive scan, which is like reading a book: the picture is constructed line by line. There is also an aspect ratio in picture transmission: pictures are arranged to be wider than they are high. These factors are chosen to make television images attractive to human vision. Nowadays, digital video cameras can provide digital output in progressive scan, delivering sequences of images that are readily processed. There are Gigabit Ethernet cameras which transmit high-speed video and control information over Ethernet networks. Or there are webcams, or just digital camera systems that deliver images straight to the computer. Life just gets easier!
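
    To make the interlacing of Fig. 1.11 concrete, here is a minimal sketch of ‘weaving’ an odd and an even field back into a full frame, assuming each field is a NumPy array holding every other scan line; the function name and array layout are illustrative, not a standard API.

        import numpy as np

        # 'Weave' an odd and an even field into a full frame. Each field holds
        # every other scan line; interleaving them restores the full vertical
        # resolution, at the cost of combing artefacts on moving objects since
        # the two fields are captured at slightly different times.
        def weave_fields(odd_field, even_field):
            h, w = odd_field.shape
            frame = np.zeros((2 * h, w), dtype=odd_field.dtype)
            frame[0::2, :] = odd_field     # odd (first) field fills lines 0, 2, 4, ...
            frame[1::2, :] = even_field    # even field fills lines 1, 3, 5, ...
            return frame

        # Two 3-line fields weave into a 6-line frame of alternating lines.
        odd, even = np.ones((3, 4)), np.zeros((3, 4))
        print(weave_fields(odd, even)[:, 0])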

    1.5. Processing images

    We shall be using software and packages to process
