Deep Learning for Medical Image Analysis
Ebook · 1,034 pages · 10 hours
About this ebook

Deep Learning for Medical Image Analysis, Second Edition is a great learning resource for academic and industry researchers and graduate students taking courses on machine learning and deep learning for computer vision and medical image computing and analysis. Deep learning provides exciting solutions for medical image analysis problems and is a key method for future applications. This book gives a clear understanding of the principles and methods of neural network and deep learning concepts, showing how the algorithms that integrate deep learning as a core component are applied to medical image detection, segmentation, registration, and computer-aided analysis.

· Covers common research problems in medical image analysis and their challenges

· Describes the latest deep learning methods and the theories behind approaches for medical image analysis

· Teaches how algorithms are applied to a broad range of application areas including cardiac, neural and functional, colonoscopy, OCTA applications and model assessment

· Includes a Foreword written by Nicholas Ayache

Language: English
Release date: Nov 23, 2023
ISBN: 9780323858885
    Book preview

    Deep Learning for Medical Image Analysis - S. Kevin Zhou

    Part 1: Deep learning theories and architectures

    Outline

    Chapter 1. An introduction to neural networks and deep learning

    Chapter 2. Deep reinforcement learning in medical imaging

    Chapter 3. CapsNet for medical image segmentation

    Chapter 4. Transformer for medical image analysis

    Chapter 1: An introduction to neural networks and deep learning

    Ahmad Wisnu Mulyadi (b); Jee Seok Yoon (b); Eunjin Jeon (b); Wonjun Ko (b); Heung-Il Suk (a,b)

    (a) Korea University, Department of Artificial Intelligence, Seongbuk-Gu, Seoul, Korea

    (b) Korea University, Department of Brain and Cognitive Engineering, Seongbuk-Gu, Seoul, Korea

    Abstract

    Artificial neural networks, conceptually and structurally inspired by neural systems, are of great interest along with deep learning, thanks to their great successes in various fields, including medical imaging analysis. In this chapter, we describe the fundamental concepts and ideas of (deep) neural networks and explain algorithmic advances to learn network parameters efficiently by avoiding overfitting. Specifically, this chapter focuses on introducing 1) feed-forward neural networks, 2) gradient descent-based parameter optimization algorithms, 3) different types of deep models, 4) technical tricks for fast and robust training of deep models and 5) open-source deep learning frameworks for quick practice.

    Keywords

    Artificial neural network; Deep learning; Feedforward neural network; Convolutional neural network; Recurrent neural network; Deep generative models

    1.1 Introduction

    The brain, or biological neural network, is considered the most well-organized system for processing information from the different senses, such as sight, hearing, touch, taste and smell, in an efficient and intelligent manner. One of the key mechanisms for information processing in the human brain is that complicated high-level information is processed by means of the collaboration, i.e., connections (called synapses), of a large number of structurally simple elements (called neurons). In machine learning, artificial neural networks are a family of models that mimic the structural elegance of the neural system and learn patterns inherent in observations.

    1.2 Feed-forward neural networks

    This section introduces neural networks that process information in a feed-forward manner. Throughout the chapter, matrices and vectors are denoted as boldface uppercase letters and boldface lowercase letters, respectively, and scalars are denoted as normal italic letters. For a transpose operator, a superscript ⊤ is used.

    1.2.1 Perceptron

    The simplest learnable artificial neural model, known as the perceptron [1], is structured with input visible units $\mathbf{x} = [x_1, \dots, x_D]^\top$, trainable connection weights $\mathbf{w}$ and a bias $b$, and an output unit $y$, as shown in Fig. 1.1(a). Since the perceptron model has a single layer of an output unit, not counting the input visible layer, it is also called a single-layer neural network. Given an observation¹ or datum $\mathbf{x}$, the value of the output unit $y$ is obtained from an activation function $f(\cdot)$ by taking the weighted sum of the inputs as follows:

    $y = f(z) = f\left(\sum_{d=1}^{D} w_d x_d + b\right) = f\big(\mathbf{w}^\top \mathbf{x} + b\big)$    (1.1)

    where $\Theta = \{\mathbf{w}, b\}$ denotes a parameter set, $\mathbf{w} = [w_1, \dots, w_D]^\top$ is a connection weight vector and $b$ is a bias. Let us introduce a pre-activation variable $z$ that is determined by the weighted sum of the inputs, i.e., $z = \mathbf{w}^\top\mathbf{x} + b$. As for the activation function $f(\cdot)$, a "logistic sigmoid" function, i.e., $f(z) = 1/(1 + \exp(-z))$, is commonly used for a binary classification task.

    Figure 1.1 An architecture of a single-layer neural network.
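    As a concrete illustration of Eq. (1.1), the following minimal NumPy sketch (with illustrative variable names and toy values of our own choosing) computes the perceptron output with a logistic sigmoid activation:

```python
# A minimal NumPy sketch of the perceptron forward pass in Eq. (1.1);
# the array names (x, w, b) mirror the notation above and are illustrative.
import numpy as np

def sigmoid(z):
    """Logistic sigmoid activation f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def perceptron_forward(x, w, b):
    """Weighted sum of the inputs followed by the activation function."""
    z = w @ x + b          # pre-activation z = w^T x + b
    return sigmoid(z)      # output unit y = f(z)

x = np.array([0.5, -1.2, 3.0])   # an observation with D = 3 visible units
w = np.array([0.1, 0.4, -0.2])   # trainable connection weights
b = 0.05                         # bias
y = perceptron_forward(x, w, b)  # probability-like output for binary classification
```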

    Regarding a multi-output task, e.g., multi-class classification or multi-output regression, it is straightforward to extend the perceptron model by adding multiple output units $y_1, \dots, y_K$ (Fig. 1.1(b)), one for each class, with their respective connection weights as follows:

    $y_k = f\left(\sum_{d=1}^{D} w_{kd} x_d + b_k\right), \quad k = 1, \dots, K$    (1.2)

    where $w_{kd}$ denotes a connection weight from $x_d$ to $y_k$. As for the activation function, it is common to use a "softmax" function for multi-class classification, where the output values can be interpreted as probabilities.

    1.2.2 Multi-layer perceptron

    One of the main limitations of the single-layer neural network is that it can represent only linear decision boundaries for a classification task, despite the use of a nonlinear activation function. This limitation can be circumvented by introducing a so-called "hidden" layer between the input layer and the output layer, as shown in Fig. 1.2. For a two-layer neural network, which is also known as a multi-layer perceptron (MLP), we can write its composition function as follows:

    $\mathbf{y} = f^{(2)}\!\left(\mathbf{W}^{(2)} f^{(1)}\!\left(\mathbf{W}^{(1)}\mathbf{x} + \mathbf{b}^{(1)}\right) + \mathbf{b}^{(2)}\right)$    (1.3)

    where the superscript $(l)$ denotes a layer index, $M$ denotes the number of hidden units and $\mathbf{W}^{(1)} \in \mathbb{R}^{M \times D}$. Hereafter, the bias term is omitted for simplicity. It is possible to add a number of hidden layers ($L \geq 2$) and the corresponding estimation function is defined as

    $\mathbf{y} = f^{(L)}\!\left(\mathbf{W}^{(L)} f^{(L-1)}\!\left(\cdots f^{(1)}\!\left(\mathbf{W}^{(1)}\mathbf{x}\right)\cdots\right)\right)$    (1.4)

    Although different types of activation functions can, in theory, be applied to different layers or even different units, it is common in the literature to apply the same type of activation function to all hidden layers. The activation should be a nonlinear function; otherwise, the whole network collapses to a single-layer neural network whose weight matrix equals the product of the hidden layers' weight matrices. Regarding the activation function, a sigmoidal function such as the logistic sigmoid or the hyperbolic tangent was commonly used in earlier models thanks to its nonlinear and differentiable characteristics. However, these two activation functions make it difficult to train a neural network when stacking layers deeply. In this respect, recent works [2–5] proposed other nonlinear functions, and their details are provided in Section 1.6.2.

    Figure 1.2 An architecture of a two-layer neural network.

    1.2.3 Learning in feed-forward neural networks

    In terms of network learning, there are two fundamental problems, namely, network architecture learning and network parameter learning. While network architecture learning still remains an open question,² there exists an efficient algorithm for network parameter learning, as described below.

    The problem of learning the parameters of an $L$-layer neural network can be formulated as error function minimization. Assume a training data set $\{(\mathbf{x}_n, \mathbf{t}_n)\}_{n=1}^{N}$, where $\mathbf{x}_n$ denotes an observation and $\mathbf{t}_n$ denotes a class indicator vector with one-of-K encoding, i.e., for a class $k$, only the $k$th element in the vector is 1 and all the other elements are 0. For a $K$-class classification, it is common to use a cross-entropy cost function defined as follows:

    $E(\mathbf{W}) = -\sum_{n=1}^{N}\sum_{k=1}^{K} t_{nk} \ln y_{nk}$    (1.5)

    where $t_{nk}$ denotes the $k$th element of the target vector $\mathbf{t}_n$, and $y_{nk}$ is the $k$th element of the prediction vector for $\mathbf{x}_n$, which is obtained by Eq. (1.4) with the parameter set $\mathbf{W} = \{\mathbf{W}^{(l)}\}_{l=1}^{L}$.

    The error function in Eq. (1.5) is highly nonlinear and nonconvex. Thus there is no analytic solution of the parameter set W that minimizes Eq. (1.5). Instead, we resort to a gradient descent algorithm by updating the parameters iteratively. Specifically, the parameters of L-layers, W, are updated as follows:

    $\mathbf{W}^{(\tau+1)} = \mathbf{W}^{(\tau)} - \eta \nabla E\big(\mathbf{W}^{(\tau)}\big)$    (1.6)

    where $\tau$ denotes an iteration index, $\eta$ is a learning rate, and $\nabla E(\mathbf{W}^{(\tau)})$ denotes the set of gradients with respect to $\mathbf{W}$, obtained by means of error backpropagation [6]. To compute the derivative of an error function $E$ with respect to the parameters of the $l$th layer, i.e., $\partial E / \partial \mathbf{W}^{(l)}$, we propagate errors from the output layer back to the input layer by the chain rule:

    $\dfrac{\partial E}{\partial \mathbf{W}^{(l)}} = \dfrac{\partial E}{\partial \mathbf{z}^{(L)}} \dfrac{\partial \mathbf{z}^{(L)}}{\partial \mathbf{a}^{(L-1)}} \dfrac{\partial \mathbf{a}^{(L-1)}}{\partial \mathbf{z}^{(L-1)}} \cdots \dfrac{\partial \mathbf{z}^{(l+1)}}{\partial \mathbf{a}^{(l)}} \dfrac{\partial \mathbf{a}^{(l)}}{\partial \mathbf{z}^{(l)}} \dfrac{\partial \mathbf{z}^{(l)}}{\partial \mathbf{W}^{(l)}}$    (1.7)

    where $\mathbf{z}^{(l)}$ and $\mathbf{a}^{(l)} = f(\mathbf{z}^{(l)})$ denote, respectively, the pre-activation vector and the activation vector of the layer $l$, and $\mathbf{a}^{(0)} = \mathbf{x}$. Note that $\partial E / \partial \mathbf{z}^{(L)}$, or equally $\boldsymbol{\delta}^{(L)}$, corresponds to the error computed at the output layer. For the estimation of the gradient of an error function $E$ with respect to the parameter $\mathbf{W}^{(l)}$, it utilizes the error propagated from the output layer through the chains in the form of $\partial \mathbf{z}^{(l'+1)} / \partial \mathbf{a}^{(l')}$, $\partial \mathbf{a}^{(l')} / \partial \mathbf{z}^{(l')}$, along with $\partial \mathbf{z}^{(l)} / \partial \mathbf{W}^{(l)}$. These fractions can also be computed in a similar way as follows:

    $\dfrac{\partial \mathbf{z}^{(l'+1)}}{\partial \mathbf{a}^{(l')}} = \mathbf{W}^{(l'+1)}$    (1.8)

    $\dfrac{\partial \mathbf{a}^{(l')}}{\partial \mathbf{z}^{(l')}} = f'\big(\mathbf{z}^{(l')}\big)$    (1.9)

    where $f'(\mathbf{z}^{(l')})$ denotes the gradient of the activation function with respect to the pre-activation vector $\mathbf{z}^{(l')}$.

    As for the parameter update in Eq. (1.6), there are two different approaches depending on the timing of parameter update, namely, batch gradient descent and stochastic gradient descent. The batch gradient descent updates the parameters based on the gradients ∇E evaluated over the whole training samples. Meanwhile, the stochastic gradient descent sequentially updates weight parameters by computing gradient on the basis of one sample at a time. When it comes to large-scale learning such as deep learning, it is advocated to apply stochastic gradient descent [7]. As a trade-off between batch gradient and stochastic gradient, a mini-batch gradient descent method, which computes and updates the parameters on the basis of a small set of samples, is commonly used in the literature [8].
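    The following hedged NumPy sketch ties Eqs. (1.3)–(1.9) together: a two-layer network trained with mini-batch gradient descent on synthetic data. The layer sizes, learning rate and data are illustrative assumptions, not values from the book:

```python
# Hedged sketch of mini-batch gradient descent for a two-layer network
# (forward pass, error backpropagation, and the update of Eq. (1.6)).
import numpy as np

rng = np.random.default_rng(0)
N, D, M, K = 256, 10, 32, 3                  # samples, inputs, hidden units, classes
X = rng.normal(size=(N, D))
T = np.eye(K)[rng.integers(0, K, size=N)]    # one-of-K target vectors

W1 = rng.normal(scale=0.1, size=(D, M))      # first-layer weights
W2 = rng.normal(scale=0.1, size=(M, K))      # second-layer weights

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

eta, batch = 0.1, 32
for epoch in range(20):
    for i in range(0, N, batch):
        x, t = X[i:i + batch], T[i:i + batch]
        # forward pass
        a1 = np.tanh(x @ W1)                 # hidden activations
        y = softmax(a1 @ W2)                 # predictions
        # backward pass (error backpropagation via the chain rule)
        d2 = (y - t) / len(x)                # error at the output layer
        d1 = (d2 @ W2.T) * (1 - a1 ** 2)     # error propagated to the hidden layer
        # parameter update, Eq. (1.6)
        W2 -= eta * (a1.T @ d2)
        W1 -= eta * (x.T @ d1)
```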

    1.3 Convolutional neural networks

    In conventional multi-layer neural networks, the inputs are always in vector form. However, for (medical) images, the structural or configural information among neighboring pixels or voxels is another source of information. Hence, vectorization inevitably destroys such structural and configural information in images. A convolutional neural network (CNN) that typically has convolutional layers interspersed with pooling (or sub-sampling) layers and then followed by fully connected layers as in a standard multi-layer neural network (Fig. 1.3) is designed to better utilize such spatial and configuration information by taking 2D or 3D images as input. Unlike the conventional multi-layer neural networks, a CNN exploits extensive weight-sharing to reduce the degrees of freedom of models. A pooling layer helps reduce computation time and gradually builds up spatial and configural invariance.

    Figure 1.3 An architecture of a convolutional neural network.

    1.3.1 Convolution and pooling layer

    The role of a convolution layer is to detect local features at different positions in the input feature maps with learnable kernels $\mathbf{k}_{ij}^{(l)}$, i.e., connection weights between the feature map $i$ at the layer $l-1$ and the feature map $j$ at the layer $l$. Specifically, the units of the convolution layer $l$ compute their activations $\mathbf{a}_j^{(l)}$ based only on a spatially contiguous subset of units in the feature maps $\mathbf{a}_i^{(l-1)}$ of the preceding layer by convolving the kernels as follows:

    $\mathbf{a}_j^{(l)} = f\left(\sum_{i=1}^{M^{(l-1)}} \mathbf{a}_i^{(l-1)} \ast \mathbf{k}_{ij}^{(l)} + b_j^{(l)}\right)$    (1.10)

    where $M^{(l-1)}$ denotes the number of feature maps in the layer $l-1$, $\ast$ denotes a convolution operator, $b_j^{(l)}$ is a bias parameter and $f(\cdot)$ is a nonlinear activation function. Due to the local connectivity and weight sharing, we can greatly reduce the number of parameters compared to a fully connected neural network, and it becomes possible to avoid overfitting. Further, when the input image is shifted, the activations of the units in the feature maps are shifted by the same amount, which allows a CNN to be equivariant to small shifts, as illustrated in Fig. 1.4. In the figure, when the pixel values in the input image are shifted by one pixel right and one pixel down, the outputs after convolution are also shifted by one pixel right and one pixel down.

    Figure 1.4 Illustration of translation invariance in convolution neural network. The bottom leftmost input is a translated version of the upper leftmost input image by one-pixel right and one-pixel down.
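    As an illustration of the convolution in Eq. (1.10), here is a hedged NumPy sketch for a single input and output feature map; it implements "valid" cross-correlation without padding, as is common in deep learning libraries, and the toy input and kernel are our own:

```python
# Illustrative NumPy sketch of the convolution in Eq. (1.10) for one
# input/output feature map ("valid" mode, no padding).
import numpy as np

def conv2d_valid(a_prev, kernel, bias=0.0):
    """Slide the kernel over the input map and take weighted sums."""
    H, W = a_prev.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for u in range(out.shape[0]):
        for v in range(out.shape[1]):
            patch = a_prev[u:u + kH, v:v + kW]
            out[u, v] = np.sum(patch * kernel) + bias
    return out

a = np.arange(25.0).reshape(5, 5)        # toy 5x5 input feature map
k = np.array([[1., 0.], [0., -1.]])      # a 2x2 kernel (fixed here for the demo)
z = conv2d_valid(a, k)                   # pre-activation map; apply f(.) afterwards
```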

    A pooling layer follows a convolution layer by downsampling the feature maps of the preceding convolution layer. Specifically, each feature map in a pooling layer is linked with a feature map in the convolution layer, and each unit in a feature map of the pooling layer is computed based on a subset of units within a receptive field. Similar to the convolution layer, the pooling operation, which takes the maximal value among the units in its receptive field, is slid over the convolution map, but with a stride equal to the size of the receptive field so that contiguous receptive fields do not overlap; a sketch follows below. The role of the pooling layer is to progressively reduce the spatial size of the feature maps, thereby reducing the number of parameters and the computation involved in the network. Another important function of the pooling layer is to provide translation invariance over small spatial shifts in the input. In Fig. 1.4, while the bottom leftmost image is a translated version of the top leftmost image by one pixel right and one pixel down, their outputs after the convolution and pooling operations are the same, especially for the units in green.
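    A matching NumPy sketch of non-overlapping max pooling, with an assumed 2 x 2 receptive field and a stride equal to the field size as described above:

```python
# Non-overlapping max pooling with stride equal to the receptive field size.
import numpy as np

def max_pool(feature_map, size=2):
    H, W = feature_map.shape
    H2, W2 = H // size, W // size
    out = np.zeros((H2, W2))
    for u in range(H2):
        for v in range(W2):
            out[u, v] = feature_map[u * size:(u + 1) * size,
                                    v * size:(v + 1) * size].max()
    return out

fm = np.arange(16.0).reshape(4, 4)
pooled = max_pool(fm)   # 2x2 output; small input shifts often leave it unchanged
```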

    1.3.2 Computing gradients

    Assume that a convolution layer is followed by a pooling layer. In such a case, units in a feature map of a convolution layer $l$ are connected to a single unit of the corresponding feature map in the pooling layer $l+1$. By up-sampling the error (sensitivity) map $\boldsymbol{\delta}_j^{(l+1)}$ of the pooling layer to recover the reduced size of the maps, all we need to do is to multiply it with the derivative of the activation function evaluated at the convolution layer's pre-activations as follows:

    $\boldsymbol{\delta}_j^{(l)} = f'\big(\mathbf{z}_j^{(l)}\big) \odot \operatorname{up}\big(\boldsymbol{\delta}_j^{(l+1)}\big)$    (1.11)

    where $\odot$ and $\operatorname{up}(\cdot)$ denote an element-wise multiplication and an up-sampling operation, respectively.

    For the case when a current layer, whether it is a pooling layer or a convolution layer, is followed by a convolution layer, we must figure out which patch in the current layer's feature map corresponds to a unit in the next layer's feature map. The weights multiplying the connections between the input patch and the output unit are exactly the weights of the convolutional kernel. The gradients for the kernel weights are computed by the chain rule, similar to backpropagation. However, since the same weights are now shared across many connections, we need to sum the gradients for a given weight over all the connections that use it as follows:

    $\dfrac{\partial E}{\partial \mathbf{k}_{ij}^{(l)}} = \sum_{u,v} \boldsymbol{\delta}_j^{(l)}(u,v)\, \mathbf{p}_i^{(l-1)}(u,v)$    (1.12)

    where $\mathbf{p}_i^{(l-1)}(u,v)$ denotes the patch in the $i$th feature map of the layer $l-1$, i.e., $\mathbf{a}_i^{(l-1)}$, which was multiplied element-wise by $\mathbf{k}_{ij}^{(l)}$ during convolution to compute the element at $(u,v)$ in the output feature map $\mathbf{a}_j^{(l)}$.

    1.3.3 Deep convolutional neural networks

    With the advances in computing hardware, recent works utilizing neural networks have grown in depth (i.e., convolution and pooling layers) and width (i.e., channel size). However, as CNNs get deeper and wider, the difficulties (e.g., computational cost, vanishing gradients and degradation) in training them also grow. Thus various methods for improving the computational efficiency of deep models are described in the following sections.

    1.3.3.1 Skip connection

    A skip connection, or shortcut connection, constructs an alternative path for gradients to flow from one layer to layers in the deeper part of the neural network. Specifically, a skip connection constructs a path that jumps over one or more layers via addition or concatenation. For example, the residual connection [9] constructs the skip connection via addition as follows:

    $\mathbf{a}^{(l+1)} = f\big(\mathbf{k}^{(l)} \ast \mathbf{a}^{(l)} + b^{(l)}\big) + \mathbf{a}^{(l)}$    (1.13)

    where $\mathbf{k}^{(l)}$ and $b^{(l)}$ are the kernel and bias parameters, respectively. Similarly, the dense connection [10] constructs the skip connection via channel-wise concatenation. This construction via concatenation allows a layer to receive the feature maps of all previously connected layers, introducing on the order of $l(l+1)/2$ connections by the $l$th skip connection instead of the $l$ connections in the residual connection.
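    As a sketch of the residual construction in Eq. (1.13), the following minimal PyTorch module adds the input back to the output of two convolution layers; the channel count and kernel size are illustrative assumptions:

```python
# A minimal residual block in the spirit of Eq. (1.13), sketched with PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        # padding=1 keeps the spatial size so the skip addition is well defined
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = self.conv2(out)
        return F.relu(out + x)   # skip connection via addition

x = torch.randn(1, 64, 32, 32)
y = ResidualBlock()(x)           # same shape as x
```

    A dense connection would instead concatenate the block input with its output along the channel dimension rather than adding them.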

    1.3.3.2 Inception module

    The inception module [11] was introduced to significantly reduce the computational cost of deep neural networks via sparse multi-scale processing. Specifically, it reduces the number of arithmetic operations of the convolution functions by reducing the filter size, i.e., introducing sparsity, via $1 \times 1$ convolution layers followed by convolution layers of different kernel sizes. In practice, the inception module achieved state-of-the-art performance in image classification tasks while reducing the number of arithmetic operations by 100 times compared to its counterparts. Several improvements to the inception module were made throughout the past decade. For example, Inception-v2 and -v3 [12] utilize even sparser convolution operations, i.e., smaller kernel sizes, to improve the computational efficiency. Inception-v4, or Inception-ResNet-v1 and -v2 [13], include the skip connection construction in addition to the multi-scale convolution operations.

    1.3.3.3 Attention

    The attention mechanism in deep learning allows neural networks not only to attend to salient information in noisy data but also to act as a memory function. Attention methods can be broadly categorized by the form of the attention function: soft vs. hard attention, global vs. local attention, and multi-head attention. Soft attention [14], also commonly known as global attention [15], places attention over all patches of an image, while hard attention [16] selects a single patch at a time. Soft attention is generally more favorable in terms of computational efficiency, because hard attention models are nondifferentiable and require special techniques such as reinforcement learning. Local attention [14] is a differentiable model that combines the advantages of soft and hard attention. Meanwhile, multi-head attention [17] attends to different information in a parallel manner.

    In practice, attention mechanisms for medical image analysis typically utilize channel-wise and spatial-wise attention [18] as well as global attention [19] to improve model performance. Note that such attention techniques have also been used as a tool for visual interpretation [20], where the most attended regions can localize the features that support the decision made by a neural network.
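    The sketch below illustrates the soft-attention idea in NumPy: every spatial position of a (flattened) feature map receives a softmax weight from a scoring vector, and the output is the attention-weighted sum. The shapes and the scoring vector are assumptions for illustration only:

```python
# Hedged NumPy sketch of soft (spatial) attention over a flattened feature map.
import numpy as np

def soft_spatial_attention(features, score_w):
    """features: (H*W, C) flattened feature map, score_w: (C,) scoring weights."""
    scores = features @ score_w                      # one scalar score per position
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()                      # soft attention weights sum to 1
    context = alpha @ features                       # weighted sum over all positions
    return context, alpha                            # alpha can be visualized for interpretation

feats = np.random.default_rng(0).normal(size=(64, 128))   # 8x8 map with 128 channels
w = np.random.default_rng(1).normal(size=128)
ctx, attn = soft_spatial_attention(feats, w)
```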

    1.4 Recurrent neural networks

    (Medical) images are commonly accompanied by corresponding attributes recorded at the time of measurement; in the medical domain, these could be the subject's clinical measurements (e.g., vital signs, lab results, clinical notes). When such data are acquired periodically, (multi-modal) sequential data emerge, which require dedicated deep models that can effectively incorporate the entire timespan of the data. We therefore devote this section to concisely covering recurrent neural networks (RNNs), which are well established for handling variable-length sequential data across diverse (clinical) downstream tasks.

    1.4.1 Recurrent cell

    RNNs process sequential data through the so-called recurrent cell, as illustrated in Fig. 1.5. Suppose that the $T$-length sequential data $\{\mathbf{x}_t\}_{t=1}^{T}$, with $\mathbf{x}_t \in \mathbb{R}^{D}$, comes with the corresponding labels $\{\mathbf{y}_t\}_{t=1}^{T}$. For each timestep, a typical recurrent cell integrates the input $\mathbf{x}_t$ with the previous hidden state $\mathbf{h}_{t-1}$ as

    $\mathbf{h}_t = \tanh\big(\mathbf{W}_{xh}\mathbf{x}_t + \mathbf{W}_{hh}\mathbf{h}_{t-1} + \mathbf{b}\big)$    (1.14)

    where $\mathbf{W}_{xh}$, $\mathbf{W}_{hh}$ and $\mathbf{b}$ denote the input transformation weights, the hidden-state transformation weights and the bias, respectively. Here, the hyperbolic tangent serves as the activation function, squashing the outcome into $(-1, 1)$. Meanwhile, the initial hidden state $\mathbf{h}_0$ could be either initialized with zeros or inferred from auxiliary networks. Thus, as $\mathbf{h}_t$ holds a summary of the underlying information in the sequence so far, the prediction could be inferred as

    $\mathbf{o}_t = \mathbf{W}_{hy}\mathbf{h}_t + \mathbf{c}$    (1.15)

    $\hat{\mathbf{y}}_t = \operatorname{softmax}(\mathbf{o}_t)$    (1.16)

    with $\mathbf{W}_{hy}$ and $\mathbf{c}$ denoting the weights and bias, correspondingly. Note that other settings may employ only the last hidden state $\mathbf{h}_T$ for predicting a single label $\mathbf{y}$. For now, let us assume that the sequence of data and its labels have equal length, so that we require the model to make a prediction at each $t$th timestep. Thus we train the RNN by devising the following loss function over the entire $T$-length sequence:

    $\mathcal{L} = \sum_{t=1}^{T} \ell\big(\mathbf{y}_t, \hat{\mathbf{y}}_t\big)$    (1.17)

    As RNNs perform forward propagation over the whole timespan of the sequence, the gradient should be evaluated via backpropagation through time [21]. Furthermore, as the weights of the recurrent cell are shared across timesteps, starting from the output we can calculate the gradients with respect to $\mathbf{W}_{hy}$ and $\mathbf{c}$, respectively, as follows:

    $\dfrac{\partial \mathcal{L}}{\partial \mathbf{W}_{hy}} = \sum_{t=1}^{T} \dfrac{\partial \ell_t}{\partial \mathbf{o}_t}\, \mathbf{h}_t^{\top}$    (1.18)

    $\dfrac{\partial \mathcal{L}}{\partial \mathbf{c}} = \sum_{t=1}^{T} \dfrac{\partial \ell_t}{\partial \mathbf{o}_t}$    (1.19)

    Subsequently, we could aggregate the gradients with respect to the weights $\mathbf{W}_{xh}$, $\mathbf{W}_{hh}$ and bias $\mathbf{b}$ across the entire timesteps as

    $\dfrac{\partial \mathcal{L}}{\partial \mathbf{W}_{xh}} = \sum_{t=1}^{T} \dfrac{\partial \mathcal{L}}{\partial \mathbf{h}_t}\, \dfrac{\partial \mathbf{h}_t}{\partial \mathbf{W}_{xh}}$    (1.20)

    $\dfrac{\partial \mathcal{L}}{\partial \mathbf{W}_{hh}} = \sum_{t=1}^{T} \dfrac{\partial \mathcal{L}}{\partial \mathbf{h}_t}\, \dfrac{\partial \mathbf{h}_t}{\partial \mathbf{W}_{hh}}$    (1.21)

    $\dfrac{\partial \mathcal{L}}{\partial \mathbf{b}} = \sum_{t=1}^{T} \dfrac{\partial \mathcal{L}}{\partial \mathbf{h}_t}\, \dfrac{\partial \mathbf{h}_t}{\partial \mathbf{b}}$    (1.22)

    Figure 1.5 Graphical illustration of RNNs with the (a) rolled and (b) unrolled computational graph over the timesteps.
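    To make the recurrence in Eqs. (1.14)–(1.17) concrete, here is a hedged NumPy sketch of the forward pass and the per-timestep cross-entropy loss; the dimensions, weight names and random data are illustrative assumptions:

```python
# Illustrative NumPy sketch of the vanilla recurrent cell and per-timestep prediction.
import numpy as np

rng = np.random.default_rng(0)
T, D, H, K = 5, 8, 16, 3                       # sequence length, input, hidden, classes
Wxh = rng.normal(scale=0.1, size=(H, D))
Whh = rng.normal(scale=0.1, size=(H, H))
b = np.zeros(H)
Why = rng.normal(scale=0.1, size=(K, H))
c = np.zeros(K)

def softmax(o):
    e = np.exp(o - o.max())
    return e / e.sum()

x_seq = rng.normal(size=(T, D))
y_seq = np.eye(K)[rng.integers(0, K, size=T)]  # one-hot label per timestep

h = np.zeros(H)                                # h_0 initialized with zeros
loss = 0.0
for t in range(T):
    h = np.tanh(Wxh @ x_seq[t] + Whh @ h + b)  # recurrent cell, Eq. (1.14)
    y_hat = softmax(Why @ h + c)               # prediction, Eqs. (1.15)-(1.16)
    loss += -np.sum(y_seq[t] * np.log(y_hat))  # cross-entropy term of Eq. (1.17)
```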

    1.4.2 Vanishing gradient problem

    As RNNs deal with $T$-length sequential data, the longer the sequence $T$, the higher the risk of vanishing or exploding gradients, due to the repeated matrix multiplications involved in computing the gradients across timesteps [22]. To address this issue, improvements upon the vanilla recurrent cell were proposed, pioneered by the long short-term memory (LSTM) [23], which incorporates a dedicated memory cell and gating mechanisms to adequately govern the information flow over the sequence, allowing long-term dependencies to be learned.

    An LSTM cell is composed of the following gating operations:

    $\mathbf{f}_t = \sigma\big(\mathbf{W}_f [\mathbf{h}_{t-1} \oplus \mathbf{x}_t] + \mathbf{b}_f\big)$    (1.23)

    $\mathbf{i}_t = \sigma\big(\mathbf{W}_i [\mathbf{h}_{t-1} \oplus \mathbf{x}_t] + \mathbf{b}_i\big)$    (1.24)

    $\tilde{\mathbf{c}}_t = \tanh\big(\mathbf{W}_c [\mathbf{h}_{t-1} \oplus \mathbf{x}_t] + \mathbf{b}_c\big)$    (1.25)

    $\mathbf{c}_t = \mathbf{f}_t \odot \mathbf{c}_{t-1} + \mathbf{i}_t \odot \tilde{\mathbf{c}}_t$    (1.26)

    $\mathbf{o}_t = \sigma\big(\mathbf{W}_o [\mathbf{h}_{t-1} \oplus \mathbf{x}_t] + \mathbf{b}_o\big)$    (1.27)

    $\mathbf{h}_t = \mathbf{o}_t \odot \tanh(\mathbf{c}_t)$    (1.28)

    with $\oplus$ denoting the concatenation operator and $\sigma(\cdot)$ the logistic sigmoid. In a nutshell, given the input $\mathbf{x}_t$ and $\mathbf{h}_{t-1}$, an LSTM cell introduces $\mathbf{f}_t$ and $\mathbf{i}_t$, which serve as the forget and input gate, respectively. These factors regulate which information shall be pruned or retained in the new cell state $\mathbf{c}_t$. The cell further incorporates the cell state to obtain the current hidden state $\mathbf{h}_t$ by considering the output gate $\mathbf{o}_t$. Owing to these gating mechanisms and the dedicated memory cell, meaningful information can be conveyed by the recurrent cell to (distant) future timesteps, mitigating the vanishing (or exploding) gradient issue and also promising improvements on downstream tasks. Finally, similar to vanilla RNNs, we could employ such a hidden state to obtain the predicted label for each timestep as follows:

    $\hat{\mathbf{y}}_t = \operatorname{softmax}\big(\mathbf{W}_{hy}\mathbf{h}_t + \mathbf{c}\big)$    (1.29)

    In addition to the LSTM, numerous alternatives to the vanilla recurrent cell have been proposed, with the peephole LSTM [24] and the gated recurrent unit (GRU) [25] being two of the most popular.
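    A hedged NumPy sketch of a single LSTM step following Eqs. (1.23)–(1.28); all weight shapes and names are illustrative assumptions:

```python
# Hedged NumPy sketch of one LSTM step (forget, input, candidate, cell, output, hidden).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, Wf, Wi, Wc, Wo, bf, bi, bc, bo):
    u = np.concatenate([h_prev, x_t])          # [h_{t-1} (+) x_t]
    f_t = sigmoid(Wf @ u + bf)                 # forget gate, Eq. (1.23)
    i_t = sigmoid(Wi @ u + bi)                 # input gate, Eq. (1.24)
    c_tilde = np.tanh(Wc @ u + bc)             # candidate cell state, Eq. (1.25)
    c_t = f_t * c_prev + i_t * c_tilde         # new cell state, Eq. (1.26)
    o_t = sigmoid(Wo @ u + bo)                 # output gate, Eq. (1.27)
    h_t = o_t * np.tanh(c_t)                   # new hidden state, Eq. (1.28)
    return h_t, c_t

rng = np.random.default_rng(0)
D, H = 8, 16
W = [rng.normal(scale=0.1, size=(H, H + D)) for _ in range(4)]
bvec = [np.zeros(H) for _ in range(4)]
h, c = np.zeros(H), np.zeros(H)
h, c = lstm_step(rng.normal(size=D), h, c, *W, *bvec)
```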

    1.5 Deep generative models

    1.5.1 Restricted Boltzmann machine

    A restricted Boltzmann machine (RBM) is a two-layer undirected graphical model consisting of a layer of visible units $\mathbf{v}$ and a layer of hidden units $\mathbf{h}$. Note that the visible units are related to observations, and the hidden units represent the structures or dependencies over the visible units. It assumes symmetric connectivity $\mathbf{W}$ between the visible layer and the hidden layer but no connections within a layer, and each layer has a bias term, $\mathbf{a}$ and $\mathbf{b}$, respectively. Due to the symmetry of the weight matrix $\mathbf{W}$, it is possible to reconstruct the input observations from the hidden representations. Hence, an RBM is naturally regarded as an autoencoder [26], and these favorable characteristics are used in RBM parameter learning [26]. In an RBM, the joint probability of $(\mathbf{v}, \mathbf{h})$ is given by

    $P(\mathbf{v}, \mathbf{h}; \Theta) = \dfrac{1}{Z(\Theta)} \exp\big(-E(\mathbf{v}, \mathbf{h}; \Theta)\big)$    (1.30)

    where $\Theta = \{\mathbf{W}, \mathbf{a}, \mathbf{b}\}$, $E(\mathbf{v}, \mathbf{h}; \Theta)$ is an energy function, and $Z(\Theta) = \sum_{\mathbf{v}} \sum_{\mathbf{h}} \exp\big(-E(\mathbf{v}, \mathbf{h}; \Theta)\big)$ is a partition function that can be obtained by summing over all possible pairs of $\mathbf{v}$ and $\mathbf{h}$. For the sake of simplicity, by assuming binary visible and hidden units, which is the commonly studied case, the energy function is defined as

    $E(\mathbf{v}, \mathbf{h}; \Theta) = -\mathbf{a}^{\top}\mathbf{v} - \mathbf{b}^{\top}\mathbf{h} - \mathbf{v}^{\top}\mathbf{W}\mathbf{h}$    (1.31)

    The conditional distribution of the hidden units given the visible units and also the conditional distribution of the visible units given the hidden units are respectively computed as

    $P(h_j = 1 \mid \mathbf{v}) = \sigma\Big(b_j + \sum_i w_{ij} v_i\Big)$    (1.32)

    $P(v_i = 1 \mid \mathbf{h}) = \sigma\Big(a_i + \sum_j w_{ij} h_j\Big)$    (1.33)

    where $\sigma(\cdot)$ is a logistic sigmoid function. Due to the unobservable hidden units, the objective function is defined as the marginal distribution of the visible units as

    $P(\mathbf{v}; \Theta) = \sum_{\mathbf{h}} P(\mathbf{v}, \mathbf{h}; \Theta)$    (1.34)

    The RBM parameters are usually trained using a contrastive divergence algorithm [27] that maximizes the log-likelihood of observations.
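    The following hedged NumPy sketch performs one contrastive-divergence (CD-1) update for a binary RBM, using the conditionals in Eqs. (1.32)–(1.33); the sizes and learning rate are illustrative assumptions:

```python
# Hedged NumPy sketch of a single CD-1 update for a binary RBM.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
D, M, eta = 6, 4, 0.05
W = rng.normal(scale=0.1, size=(D, M))               # symmetric visible-hidden weights
a, b = np.zeros(D), np.zeros(M)                      # visible and hidden biases

v0 = rng.integers(0, 2, size=D).astype(float)        # an observed binary sample
ph0 = sigmoid(b + v0 @ W)                            # P(h = 1 | v0), Eq. (1.32)
h0 = (rng.random(M) < ph0).astype(float)             # sample hidden states
pv1 = sigmoid(a + W @ h0)                            # reconstruction, Eq. (1.33)
v1 = (rng.random(D) < pv1).astype(float)
ph1 = sigmoid(b + v1 @ W)

# CD-1 gradient approximation: positive phase minus negative phase
W += eta * (np.outer(v0, ph0) - np.outer(v1, ph1))
a += eta * (v0 - v1)
b += eta * (ph0 - ph1)
```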

    1.5.2 Deep belief network

    Since an RBM is a kind of autoencoder, it is straightforward to stack multiple RBMs to construct a deep architecture, similar to the stacked autoencoder (SAE) covered later in Section 1.6.1, which results in a single probabilistic model called a deep belief network (DBN). That is, a DBN has one visible layer $\mathbf{v}$ and a series of hidden layers $\mathbf{h}^{(1)}, \dots, \mathbf{h}^{(L)}$. Between any two consecutive layers, let $\Theta^{(l)}$ denote the corresponding RBM parameters. Note that while the top two layers still form an undirected generative model, i.e., an RBM, the lower layers form directed generative models. Hence, the joint distribution of the observed units $\mathbf{v}$ and the $L$ hidden layers $\mathbf{h}^{(1)}, \dots, \mathbf{h}^{(L)}$ in a DBN is given as follows:

    $P\big(\mathbf{v}, \mathbf{h}^{(1)}, \dots, \mathbf{h}^{(L)}\big) = \left(\prod_{l=0}^{L-2} P\big(\mathbf{h}^{(l)} \mid \mathbf{h}^{(l+1)}\big)\right) P\big(\mathbf{h}^{(L-1)}, \mathbf{h}^{(L)}\big)$    (1.35)

    where $\mathbf{h}^{(0)} = \mathbf{v}$, $P\big(\mathbf{h}^{(l)} \mid \mathbf{h}^{(l+1)}\big)$ corresponds to a conditional distribution for the units of the layer $l$ given the units of the layer $l+1$, and $P\big(\mathbf{h}^{(L-1)}, \mathbf{h}^{(L)}\big)$ denotes the joint distribution of the units in the layers $L-1$ and $L$.

    As for the parameter learning, the pretraining scheme described in Section 1.6.1 can also be applied as follows:

    (i)  Train the first layer as an RBM with the raw input $\mathbf{v} = \mathbf{h}^{(0)}$ as its visible layer.

    (ii)  Use the first layer to obtain a representation of the input that will be used as observation for the second layer, i.e., either the mean activations $P\big(\mathbf{h}^{(1)} = 1 \mid \mathbf{h}^{(0)}\big)$ or samples drawn from $P\big(\mathbf{h}^{(1)} \mid \mathbf{h}^{(0)}\big)$.

    (iii)  Train the second layer as an RBM, taking the transformed data (samples or mean activations) as training examples (for the visible layer of the RBM).

    (iv)  Iterate (ii) and (iii) for the desired number of layers, each time propagating upward either samples or mean activations.

    This greedy layerwise training of the DBN can be justified as increasing a variational lower bound on the log-likelihood of the data [26]. After the greedy layerwise procedure is completed, it is possible to perform generative fine-tuning using the wake-sleep algorithm [28]; in practice, however, no further procedure is usually carried out to train the whole DBN jointly. To use a DBN for classification, the trained DBN can be directly used to initialize a deep neural network with the trained weights and biases. Then the deep neural network can be fine-tuned by means of backpropagation and (stochastic) gradient descent.

    1.5.3 Deep Boltzmann machine

    A deep Boltzmann machine (DBM) is also structured by stacking multiple RBMs in a hierarchical manner. However, unlike the DBN, all the layers in a DBM still form an undirected generative model after stacking the RBMs. For a classification task, a DBM replaces the RBM at the top hidden layer with a discriminative RBM [29]. That is, the top hidden layer is now connected to both the lower hidden layer and an additional label layer (the label of the input). In order to learn the parameters, including the connectivities among hidden layers and the additional connectivity between the top hidden layer and the label layer, we maximize the log-likelihood of the observed data (i.e., the visible data and a class label) with a gradient-based optimization strategy. In this way, a DBM can be trained to discover hierarchical and discriminative feature representations [29]. Similar to the DBN, a greedy layerwise pretraining strategy can be applied to provide a good initial configuration of the parameters, which helps the learning procedure converge much faster than random initialization. However, since the DBM integrates both bottom-up and top-down information, the first and last RBMs in the network need modification, using weights twice as big in one direction. Training then alternates iteratively between a variational mean-field approximation to estimate the posterior probabilities of the hidden units and stochastic approximation to update the model parameters.

    1.5.4 Variational autoencoder

    1.5.4.1 Autoencoder

    An autoencoder, also called an auto-associator, is a special type of two-layer neural network composed of an input layer, a hidden layer and an output layer. The input layer is fully connected to the hidden layer (i.e., an encoder), which is further fully connected to the output layer (i.e., a decoder), as illustrated in Fig. 1.6(a). Depending on the nature of the input data, the choice of the underlying network type is quite broad, ranging from a straightforward MLP to CNNs and graph neural networks (GNNs). In general, the aim of an autoencoder is to learn a latent or compressed representation of the input by minimizing the reconstruction error between the input and the values reconstructed from the learned representation.

    Figure 1.6 A graphical illustration of (a) an autoencoder, (b) variational autoencoder, and (c) stacked autoencoder. Note that the dashed red arrows indicate the reparameterization trick.

    Let $M$ and $D$ denote the number of hidden units and the number of input units in a neural network, respectively. An autoencoder maps an input $\mathbf{x} \in \mathbb{R}^{D}$ to a latent representation $\mathbf{z} \in \mathbb{R}^{M}$ through a linear mapping and then a nonlinear transformation with a nonlinear activation function $f$ as follows:

    $\mathbf{z} = f\big(\mathbf{W}\mathbf{x} + \mathbf{b}\big)$    (1.36)

    where $\mathbf{W} \in \mathbb{R}^{M \times D}$ is an encoding weight matrix and $\mathbf{b}$ is a bias vector. The representation $\mathbf{z}$ of the hidden layer is then mapped back to a vector $\hat{\mathbf{x}}$, which approximately reconstructs the input vector $\mathbf{x}$, by another mapping as follows:

    $\hat{\mathbf{x}} = g\big(\mathbf{W}'\mathbf{z} + \mathbf{b}'\big)$    (1.37)

    where $\mathbf{W}' \in \mathbb{R}^{D \times M}$ and $\mathbf{b}'$ are a decoding weight matrix and a bias vector, respectively, and $g$ is the decoder's activation function. Structurally, the number of input units and the number of output units are determined by the dimension of an input vector. Meanwhile, the number of hidden units can be determined based on the nature of the data. If the number of hidden units is less than the dimension of the input data, then the autoencoder can be used for dimensionality reduction. However, it is worth noting that, to capture complicated nonlinear relations among input features, it is possible to allow the number of hidden units to be even larger than the input dimension, in which case we can still find an interesting structure by imposing a sparsity constraint [30,31].

    From a learning perspective, the goal of an autoencoder is to minimize the reconstruction error between the input $\mathbf{x}$ and the output $\hat{\mathbf{x}}$ with respect to the parameters. Given a training set $\{\mathbf{x}_n\}_{n=1}^{N}$, let $\sum_{n=1}^{N} \|\mathbf{x}_n - \hat{\mathbf{x}}_n\|_2^2$ denote a reconstruction error over the training samples. To encourage sparseness of the hidden units, it is common to use the Kullback–Leibler (KL) divergence to measure the difference between the average activation $\hat{\rho}_j$ of the $j$th hidden unit over the training samples and the target average activation $\rho$, defined as [32]

    $\operatorname{KL}(\rho \,\|\, \hat{\rho}_j) = \rho \ln\dfrac{\rho}{\hat{\rho}_j} + (1-\rho)\ln\dfrac{1-\rho}{1-\hat{\rho}_j}$    (1.38)

    Then our objective function can be written as

    $\mathcal{J} = \sum_{n=1}^{N} \big\|\mathbf{x}_n - \hat{\mathbf{x}}_n\big\|_2^2 + \gamma \sum_{j=1}^{M} \operatorname{KL}(\rho \,\|\, \hat{\rho}_j)$    (1.39)

    where $\gamma$ denotes a sparsity control parameter. With the introduction of the KL divergence and a small target activation $\rho$, the error function penalizes large average activations of the hidden units over the training samples. This penalization drives the activations of many hidden units to be equal or close to zero, yielding sparse connections between layers.
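    A hedged NumPy sketch of the sparse autoencoder objective in Eqs. (1.36)–(1.39), combining the reconstruction error with the KL sparsity penalty; the values of rho and gamma and all shapes are assumptions:

```python
# Illustrative NumPy sketch of the sparse autoencoder objective, Eq. (1.39).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sparse_ae_loss(X, W, b, W_dec, b_dec, rho=0.05, gamma=0.1):
    Z = sigmoid(X @ W + b)                    # hidden representations, Eq. (1.36)
    X_hat = sigmoid(Z @ W_dec + b_dec)        # reconstructions, Eq. (1.37)
    recon = np.sum((X - X_hat) ** 2)          # reconstruction error over the set
    rho_hat = Z.mean(axis=0)                  # average activation of each hidden unit
    kl = np.sum(rho * np.log(rho / rho_hat) +
                (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))  # Eq. (1.38)
    return recon + gamma * kl                 # Eq. (1.39)

rng = np.random.default_rng(0)
N, D, M = 100, 20, 30                         # over-complete hidden layer (M > D)
X = rng.random((N, D))
loss = sparse_ae_loss(X, rng.normal(scale=0.1, size=(D, M)), np.zeros(M),
                      rng.normal(scale=0.1, size=(M, D)), np.zeros(D))
```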

    1.5.4.2 Variational autoencoder

    We can further extend the concept of the autoencoder into a deep generative model by means of the variational autoencoder (VAE), as illustrated in Fig. 1.6(b). In contrast with the vanilla autoencoder, a VAE takes into account a prior distribution $p(\mathbf{z})$ over the latent representation $\mathbf{z}$, in which we assume that such a latent vector governs the generation of the data $\mathbf{x}$ through a conditional distribution $p_\theta(\mathbf{x} \mid \mathbf{z})$. Furthermore, a typical VAE approximates the intractable true posterior $p_\theta(\mathbf{z} \mid \mathbf{x})$ by introducing an approximate posterior $q_\phi(\mathbf{z} \mid \mathbf{x})$ using a Gaussian distribution $\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\sigma}^2\mathbf{I})$. Such mean and variance are inferred from the respective encoding neural networks (i.e., the encoder $q_\phi$) as

    $[\boldsymbol{\mu}, \boldsymbol{\sigma}] = \operatorname{Encoder}_\phi(\mathbf{x})$    (1.40)

    with $\phi$ denoting the parameters of such networks. To obtain the latent representation $\mathbf{z}$, we draw $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ and further apply the reparameterization trick [33] such that

    $\mathbf{z} = \boldsymbol{\mu} + \boldsymbol{\sigma} \odot \boldsymbol{\epsilon}$    (1.41)

    Such a trick is necessary to enable the optimization of the network's parameters via gradient-based approaches. Furthermore, we can generate $\hat{\mathbf{x}}$ by passing the latent representation $\mathbf{z}$ through a decoding neural network with parameters $\theta$ as

    $\hat{\mathbf{x}} = \operatorname{Decoder}_\theta(\mathbf{z})$    (1.42)

    Finally, the VAE is trained to optimize the variational evidence lower bound (ELBO) through the objective function in Eq. (1.43), consisting of an expected reconstruction error as well as a KL divergence term that forces the approximate posterior $q_\phi(\mathbf{z} \mid \mathbf{x})$ to be as close as possible to the prior $p(\mathbf{z})$:

    $\mathcal{L}(\theta, \phi; \mathbf{x}) = \mathbb{E}_{q_\phi(\mathbf{z} \mid \mathbf{x})}\big[\ln p_\theta(\mathbf{x} \mid \mathbf{z})\big] - \operatorname{KL}\big(q_\phi(\mathbf{z} \mid \mathbf{x}) \,\|\, p(\mathbf{z})\big)$    (1.43)
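    The following NumPy sketch illustrates the reparameterization trick and a single-sample ELBO estimate for a Gaussian VAE (Eqs. (1.40)–(1.43)). The toy "encoder" and "decoder" are plain linear maps standing in for real neural networks, and a Gaussian likelihood is assumed for the reconstruction term:

```python
# Hedged NumPy sketch of the reparameterization trick and a Monte-Carlo ELBO estimate.
import numpy as np

rng = np.random.default_rng(0)
D, M = 10, 2
x = rng.normal(size=D)

We, Wd = rng.normal(scale=0.1, size=(2 * M, D)), rng.normal(scale=0.1, size=(D, M))
enc = We @ x
mu, log_var = enc[:M], enc[M:]                      # Eq. (1.40): encoder outputs

eps = rng.normal(size=M)                            # eps ~ N(0, I)
z = mu + np.exp(0.5 * log_var) * eps                # Eq. (1.41): z = mu + sigma * eps

x_hat = Wd @ z                                      # Eq. (1.42): decoder output
recon = -np.sum((x - x_hat) ** 2)                   # Gaussian log-likelihood up to a constant
kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)  # KL(q || N(0, I))
elbo = recon - kl                                   # Eq. (1.43), to be maximized
```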

    1.5.5 Generative adversarial network

    Recently, the generative adversarial network (GAN), a deep learning-based implicit density estimation model, has demonstrated a remarkable generation capability by learning deep representations of the data distribution without labels [34]. As conceptualized in Fig. 1.7, a GAN is composed of two neural networks: (i) a generator $G$, which tries to synthesize realistic samples $G(\mathbf{z})$ using a latent code vector $\mathbf{z}$; and (ii) a discriminator $D$, which learns to discriminate the real sample $\mathbf{x}$ from the generated one, i.e., $G(\mathbf{z})$, by estimating the probability that the input is real. To simultaneously optimize those two neural networks $G$ and $D$, a GAN uses a game-theoretic min-max objective function:

    $\min_G \max_D V(D, G) = \mathbb{E}_{\mathbf{x} \sim p_{\text{data}}(\mathbf{x})}\big[\log D(\mathbf{x})\big] + \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}\big[\log\big(1 - D(G(\mathbf{z}))\big)\big]$    (1.44)

    where $p_{\text{data}}(\mathbf{x})$ and $p_{\mathbf{z}}(\mathbf{z})$ denote the real data distribution and the latent code distribution, respectively. Mathematically, optimizing Eq. (1.44) amounts to minimizing the Jensen–Shannon distance (JSD) between the two distributions, i.e., the real and the generated data distributions. Note that $V(D,G)$ is minimized by the generator when $D(G(\mathbf{z}))$ comes close to 1, i.e., the generator makes realistic samples, and is maximized by the discriminator when $D(\mathbf{x})$ goes to 1 while $D(G(\mathbf{z}))$ reaches 0; the discriminator therefore tries to correctly distinguish real and fake samples. Although the GAN has shown promising generation performance, there is still room for improvement through modifications of the loss function [35,36]. In this regard, attempts to exploit other distances for the GAN loss function instead of the JSD have gained widespread attention from deep learning researchers.

    Figure 1.7 Illustration of a generative adversarial network.
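    A minimal PyTorch sketch of one adversarial update following Eq. (1.44). The tiny MLP generator and discriminator and all hyperparameters are illustrative; the generator step uses the common non-saturating variant (maximizing log D(G(z))) rather than the exact min-max form:

```python
# Hedged PyTorch sketch of one GAN update (discriminator step, then generator step).
import torch
import torch.nn as nn

D_in, Z_dim = 16, 8
G = nn.Sequential(nn.Linear(Z_dim, 32), nn.ReLU(), nn.Linear(32, D_in))
D = nn.Sequential(nn.Linear(D_in, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

x_real = torch.randn(64, D_in)              # stand-in for a batch of real samples
z = torch.randn(64, Z_dim)                  # latent codes

# discriminator step: push D(x) toward 1 and D(G(z)) toward 0
opt_d.zero_grad()
loss_d = bce(D(x_real), torch.ones(64, 1)) + bce(D(G(z).detach()), torch.zeros(64, 1))
loss_d.backward()
opt_d.step()

# generator step: non-saturating objective, push D(G(z)) toward 1
opt_g.zero_grad()
loss_g = bce(D(G(z)), torch.ones(64, 1))
loss_g.backward()
opt_g.step()
```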

    Mao et al. [35] slightly modified the GAN loss function and named their method least-squares GAN (LSGAN). More specifically, they minimized the Pearson $\chi^2$ divergence between the real and the synthesized data distributions. To do so, they modified the loss to

    $\min_D V_{\text{LSGAN}}(D) = \dfrac{1}{2}\mathbb{E}_{\mathbf{x} \sim p_{\text{data}}(\mathbf{x})}\big[(D(\mathbf{x}) - b)^2\big] + \dfrac{1}{2}\mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}\big[(D(G(\mathbf{z})) - a)^2\big]$    (1.45)

    $\min_G V_{\text{LSGAN}}(G) = \dfrac{1}{2}\mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}\big[(D(G(\mathbf{z})) - c)^2\big]$    (1.46)

    and set $b - c = 1$ while $b - a = 2$, under which minimizing Eqs. (1.45)–(1.46) corresponds to minimizing the Pearson $\chi^2$ divergence. This modified objective function gives a greater gradient value to fake samples that are farther from the decision boundary of real samples, thereby suppressing the gradient vanishing problem.

    Similar to LSGAN [35], Arjovsky et al. [36] also focused on replacing the JSD with another distance. They showed that the Wasserstein distance can be applied to the GAN objective function in a mathematically rigorous manner and proposed a modified loss function:

    $\min_G \max_{C} \; \mathbb{E}_{\mathbf{x} \sim p_{\text{data}}(\mathbf{x})}\big[C(\mathbf{x})\big] - \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}\big[C(G(\mathbf{z}))\big]$    (1.47)

    where the critic $C$ is a 1-Lipschitz function that is used instead of the discriminator. In this objective, the critic scores the realness or fakeness of the input, whereas the discriminator estimates the probability that the input is real. To make the critic satisfy the Lipschitz constraint, Arjovsky et al. used weight clipping on the critic, and this method is widely known as the Wasserstein GAN (WGAN). On the other hand, Gulrajani et al. [37] removed the weight clipping by adding a regularization term, the so-called gradient penalty (GP). The objective function of WGAN with GP is

    $V_{\text{WGAN-GP}} = \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}\big[C(G(\mathbf{z}))\big] - \mathbb{E}_{\mathbf{x} \sim p_{\text{data}}(\mathbf{x})}\big[C(\mathbf{x})\big] + \lambda\, \mathbb{E}_{\hat{\mathbf{x}}}\Big[\big(\|\nabla_{\hat{\mathbf{x}}} C(\hat{\mathbf{x}})\|_2 - 1\big)^2\Big]$    (1.48)

    where $\|\cdot\|_2$ is the $\ell_2$-norm and $\hat{\mathbf{x}}$ is defined as

    $\hat{\mathbf{x}} = \epsilon\,\mathbf{x} + (1 - \epsilon)\, G(\mathbf{z}), \quad \epsilon \sim U[0, 1]$    (1.49)

    Here, Gulrajani et al. penalize the norm of the critic network's gradients with respect to its input. By doing so, WGAN with GP can also satisfy the Lipschitz condition.
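    A hedged PyTorch sketch of the gradient penalty term in Eqs. (1.48)–(1.49); the toy critic and the choice of lambda = 10 are illustrative assumptions:

```python
# Hedged PyTorch sketch of the WGAN gradient penalty on interpolated samples.
import torch
import torch.nn as nn

D_in, lam = 16, 10.0
critic = nn.Sequential(nn.Linear(D_in, 32), nn.ReLU(), nn.Linear(32, 1))

x_real = torch.randn(64, D_in)
x_fake = torch.randn(64, D_in)              # stand-in for G(z)

eps = torch.rand(64, 1)                     # epsilon ~ U[0, 1] per sample
x_hat = (eps * x_real + (1 - eps) * x_fake).requires_grad_(True)   # Eq. (1.49)

c_out = critic(x_hat)
grads = torch.autograd.grad(outputs=c_out, inputs=x_hat,
                            grad_outputs=torch.ones_like(c_out),
                            create_graph=True)[0]
gp = ((grads.norm(2, dim=1) - 1.0) ** 2).mean()        # gradient penalty term

critic_loss = critic(x_fake).mean() - critic(x_real).mean() + lam * gp   # Eq. (1.48)
```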

    1.6 Tricks for better learning

    Earlier, LeCun et al. showed that by transforming the data to have an identity covariance and a zero mean, i.e., data whitening, network training converges faster [38,39]. Besides such a simple trick, recent studies have devised further tricks for better training of deep models.

    1.6.1 Parameter initialization in autoencoder

    Regarding the autoencoder, note that the outputs of the units in the hidden layer of the encoding network become the latent representation of the input vector. However, due to its simple and shallow structure, the representational power of a single-layer autoencoder is known to be very limited. When multiple autoencoders are stacked, by taking the activation values of the hidden units of one autoencoder as the input to the following upper autoencoder, building an SAE (Fig. 1.6(c)), it is possible to improve the representational power greatly [40]. Thanks to the hierarchical structure, one of the most important characteristics of the SAE is its ability to learn or discover highly nonlinear and complicated patterns such as the relations among input features. When an input vector is presented to an SAE, the different layers of the network represent different levels of information: the lower a layer in the network, the simpler the patterns that are learned; the higher the layer, the more complicated or abstract the patterns inherent in the input feature vector.

    With regard to training the weight matrices and biases of an SAE, a straightforward way is to apply backpropagation with a gradient-based optimization technique starting from random initialization, regarding the SAE as a conventional multi-layer neural network. Unfortunately, it is generally known that deep networks trained in this manner perform worse than networks with a shallow architecture, suffering from falling into a poor local optimum [31]. A greedy layerwise learning [26] can be used to circumvent this problem. The key idea of greedy layerwise learning is to train one layer at a time by maximizing the variational lower bound. That is, we first train the 1st hidden layer with the training data as input, then train the 2nd hidden layer with the outputs from the 1st hidden layer as input, and so on; the representation of the $l$th hidden layer is used as input for the $(l+1)$th hidden layer. This greedy layerwise learning is performed as "pretraining" (Figs. 1.8(a)–1.8(c)). An important feature of pretraining is that it is conducted in an unsupervised manner with a standard backpropagation algorithm [41]. When it comes to a classification problem, we stack another output layer on top of the SAE (Fig. 1.8(d)) with an appropriate activation function; this top output layer is used to represent the class label of an input sample. Then, by taking the pretrained connection weights as the initial parameters for the hidden units and randomly initializing the connection weights between the top hidden layer and the output layer, it is possible to train all the parameters jointly in a supervised manner by gradient descent with a backpropagation algorithm. Note that the initialization of the parameters via pretraining helps the supervised optimization, called "fine-tuning", reduce the risk of falling into poor local optima [26,31].

    Figure 1.8 Greedy layerwise pretraining (highlighted with the blue connections in (a–c)) and fine-tuning of the whole network. ($H_i$ denotes the $i$th hidden layer in the network.)
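    A hedged NumPy sketch of the greedy layerwise pretraining described above: each layer is trained as a single-layer autoencoder on the activations of the layer below, and the learned encoder weights would then initialize the network for supervised fine-tuning. Sizes, step counts and the learning rate are illustrative assumptions:

```python
# Hedged NumPy sketch of greedy layerwise pretraining of a stacked autoencoder.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_layer(X, M, steps=200, eta=0.1, seed=0):
    """Train one autoencoder layer to reconstruct X; return encoder weights and codes."""
    rng = np.random.default_rng(seed)
    D = X.shape[1]
    W, b = rng.normal(scale=0.1, size=(D, M)), np.zeros(M)
    W_dec, b_dec = rng.normal(scale=0.1, size=(M, D)), np.zeros(D)
    for _ in range(steps):
        Z = sigmoid(X @ W + b)
        X_hat = sigmoid(Z @ W_dec + b_dec)
        dX_hat = (X_hat - X) * X_hat * (1 - X_hat)       # squared-error gradient
        dZ = (dX_hat @ W_dec.T) * Z * (1 - Z)
        W_dec -= eta * Z.T @ dX_hat / len(X); b_dec -= eta * dX_hat.mean(0)
        W -= eta * X.T @ dZ / len(X); b -= eta * dZ.mean(0)
    return W, b, sigmoid(X @ W + b)

rng = np.random.default_rng(0)
X = rng.random((256, 30))                 # training data
pretrained = []
H = X
for M in (20, 10):                        # two hidden layers, trained one at a time
    W, b, H = pretrain_layer(H, M)        # outputs of layer l feed layer l+1
    pretrained.append((W, b))             # used later to initialize fine-tuning
```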

    1.6.2 Activation functions

    In a deep learning framework, the main purpose of the activation function is to introduce nonlinearity into deep neural networks. The nonlinearity means that the output of the neural network cannot be reproduced from affine transformations, i.e., the output must be different from a linear combination of the input values.

    There are classical nonlinear activations such as the logistic sigmoid and the hyperbolic tangent, but the gradients of these functions vanish as the value of the input increases or decreases, which is known as one of the sources of the vanishing gradient problem. In this regard, Nair and Hinton suggested using the rectified linear unit (ReLU) function [2]. The ReLU function passes only positive input values:

    $f(x) = \max(0, x)$    (1.50)

    thereby improving training time by alleviating the vanishing gradient problem. However, the ReLU has two mathematical problems: (i) it is nondifferentiable at $x = 0$, and thus not strictly valid for use with a gradient-based method; (ii) it is unbounded on the positive side, which can potentially cause overfitting. Nonetheless, as for the first problem, since it is highly unlikely that the input to any hidden unit will be exactly $0$ at any time, in practice the gradient of the ReLU at $x = 0$ is set to either 0 or 1. Regarding the unboundedness, applying a regularization technique helps limit the magnitude of the weights, thus circumventing the overfitting issue. The curve of the ReLU function is depicted in Fig. 1.9(a).

    Figure 1.9 Plots of (a) rectified linear unit (ReLU), (b) leaky ReLU, (c) exponential linear unit (ELU) and (d) Swish activation functions.
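    For completeness, a short NumPy definition of the ReLU in Eq. (1.50), together with one common subgradient convention at x = 0 (choosing 0 there), as discussed above:

```python
# Illustrative NumPy definition of the ReLU and its subgradient convention at 0.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # gradient is 1 for x > 0 and 0 for x < 0; at x == 0 we choose 0 by convention
    return (x > 0).astype(float)

x = np.linspace(-3, 3, 7)
print(relu(x), relu_grad(x))
```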

    Since the ReLU activation showed its power, many variants have been proposed for more robust and sound learning. The leaky ReLU (lReLU) is one of the improved versions of the ReLU function [3]. For the ReLU function, the gradient is 0 for $x < 0$, which deactivates the units in the negative region. The leaky ReLU function slightly activates negative inputs to address this problem and is defined
