Machine Learning - Advanced Concepts
Machine Learning - Advanced Concepts - Derrick Mwiti
Object Detection
Object detection is a computer vision technique whose aim is to detect objects such as cars, buildings, and human beings, just to mention a few. The objects can generally be identified from either pictures or video feeds. Object detection has been applied widely in video surveillance, self-driving cars, and object/people tracking. In this guide, we’ll look at the basics of object detection and review some of the most commonly-used algorithms and a few brand new approaches, as well.
How Object Detection Works
Object detection locates the presence of an object in an image and draws a bounding box around that object. This usually involves two processes: classifying the object's type, and then drawing a box around that object. We’ve covered image classification before, so let’s now review some of the common model architectures used for object detection.
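To make the two processes concrete, a detection is typically represented as a class label, a bounding box, and a confidence score, and the overlap between two boxes is measured with Intersection over Union (IoU) — the metric behind the mean average precision figures quoted throughout this guide. A minimal sketch (the corner-coordinate box convention and the dictionary layout are illustrative choices, not a fixed standard):

```python
# Minimal sketch: representing a detection and measuring box overlap (IoU).
# Boxes use the common (x1, y1, x2, y2) corner convention.

def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)  # zero if boxes don't overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# One detection: what the classification + localization steps produce.
detection = {"label": "car", "box": (10, 10, 50, 40), "score": 0.92}
print(iou((10, 10, 50, 40), (10, 10, 50, 40)))  # identical boxes -> 1.0
```

A predicted box is usually counted as correct when its IoU with a ground-truth box exceeds a threshold such as 0.5, which is how the PASCAL VOC benchmarks referenced below score detections.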
R-CNN Model
This technique combines two main approaches: applying high-capacity convolutional neural networks to bottom-up region proposals so as to localize and segment objects; and supervised pre-training for auxiliary tasks. This is followed by domain-specific fine-tuning that yields a high-performance boost. The authors of this paper named the algorithm R-CNN (Regions with CNN features) because it combines region proposals with convolutional neural networks.
This model takes an image and extracts about 2000 bottom-up region proposals. It then computes the features for each proposal using a large CNN. Thereafter, it classifies each region using class-specific linear Support Vector Machines (SVMs). This model achieves a mean average precision of 53.7% on PASCAL VOC 2010.
The object detection system in this model has three modules. The first one is responsible for generating category-independent region proposals that define the set of candidate detections available to the model’s detector. The second module is a large convolutional neural network responsible for extracting a fixed-length feature vector from each region. The third module consists of a set of class-specific linear support vector machines.
This model uses selective search to generate region proposals. Selective search groups regions that are similar based on color, texture, shape, and size. For feature extraction, the model extracts a 4096-dimensional feature vector from each region proposal using the Caffe CNN implementation. Forward propagating a 227 × 227 RGB image through five convolutional layers and two fully connected layers computes the features. The model explained in this paper achieves a 30% relative improvement over the previous results on PASCAL VOC 2012.
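The three-module pipeline above can be sketched end to end. In this hedged sketch, `propose_regions`, `cnn_features`, and `svm_scores` are toy stand-ins for selective search, the Caffe CNN, and the per-class linear SVMs — the real components are far heavier, but the data flow is the same:

```python
# Hedged sketch of the R-CNN pipeline: proposals -> CNN features -> SVM scores.
# All three helpers are illustrative placeholders, not the real implementations.

def propose_regions(image):
    # Placeholder for selective search: return (x1, y1, x2, y2) proposals.
    h, w = len(image), len(image[0])
    return [(0, 0, w // 2, h // 2), (w // 2, h // 2, w, h)]

def cnn_features(image, region):
    # Placeholder for the 4096-d CNN feature; here just the region's geometry.
    x1, y1, x2, y2 = region
    return [x2 - x1, y2 - y1]

def svm_scores(feature):
    # Placeholder class-specific linear SVMs: score = w . feature per class.
    weights = {"cat": [0.1, 0.2], "dog": [0.3, -0.1]}
    return {c: sum(wi * fi for wi, fi in zip(w, feature)) for c, w in weights.items()}

def rcnn_detect(image):
    detections = []
    for region in propose_regions(image):   # 1. bottom-up region proposals
        feat = cnn_features(image, region)  # 2. fixed-length CNN feature
        scores = svm_scores(feat)           # 3. class-specific SVM scoring
        label = max(scores, key=scores.get)
        detections.append((region, label, scores[label]))
    return detections

image = [[0] * 8 for _ in range(8)]
for region, label, score in rcnn_detect(image):
    print(region, label, round(score, 2))
```

Because every one of the ~2000 proposals passes through the CNN independently, this loop is exactly where the slowness criticized below comes from.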
Some of the drawbacks of R-CNN are:
● Training is a multi-stage pipeline: first fine-tuning a convolutional neural network on object proposals, then fitting SVMs to the ConvNet features, and finally learning bounding-box regressors.
● Training is expensive in space and time, because features extracted from each object proposal with deep networks such as VGG16 are written to disk, taking up huge amounts of storage.
● Object detection is slow because it performs a ConvNet forward pass for each object proposal.
Fast R-CNN
In this paper, a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection is proposed. It’s implemented in Python and in C++ using Caffe. This model achieves a mean average precision of 66% on PASCAL VOC 2012, versus 62% for R-CNN.
In comparison to R-CNN, Fast R-CNN has a higher mean average precision, single-stage training, training that updates all network layers, and no need for disk storage for feature caching.
In its architecture, Fast R-CNN takes an image as input as well as a set of object proposals. It then processes the image with convolutional and max-pooling layers to produce a convolutional feature map. A fixed-length feature vector is then extracted from the feature map by a region of interest pooling layer for each region proposal. The feature vectors are then fed to fully connected layers. These then branch into two output layers. One produces softmax probability estimates over the object classes, while the other produces four real-valued numbers for each of the object classes. These four numbers represent the position of the bounding box for each of the objects.
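The region of interest pooling layer is the key trick: it turns a region of any size on the feature map into a fixed-length vector. A simplified NumPy sketch (a 2×2 output grid with max-pooling; the real layer also handles the spatial scale between image and feature map, which is omitted here):

```python
import numpy as np

# Illustrative sketch of RoI pooling: each region of interest on the feature
# map is split into an output_size x output_size grid and max-pooled per cell,
# producing a fixed-length vector regardless of the region's size.

def roi_pool(feature_map, roi, output_size=2):
    """feature_map: 2-D array; roi: (x1, y1, x2, y2) in feature-map coords."""
    x1, y1, x2, y2 = roi
    region = feature_map[y1:y2, x1:x2]
    h, w = region.shape
    ys = np.linspace(0, h, output_size + 1).astype(int)  # grid row edges
    xs = np.linspace(0, w, output_size + 1).astype(int)  # grid column edges
    pooled = np.empty((output_size, output_size))
    for i in range(output_size):
        for j in range(output_size):
            pooled[i, j] = region[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max()
    return pooled.ravel()  # fixed-length vector fed to the FC layers

fmap = np.arange(36).reshape(6, 6)
print(roi_pool(fmap, (0, 0, 4, 4)))  # the four cell maxima: 7, 9, 19, 21
```

Whatever the proposal's size, the output length is always `output_size * output_size`, which is what lets differently-sized regions share the same fully connected layers.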
Faster R-CNN
This paper proposes a training mechanism that alternates fine-tuning for the region proposal task and fine-tuning for object detection.
The Faster R-CNN model consists of two modules: a deep convolutional network responsible for proposing the regions, and a Fast R-CNN detector that uses the regions. The Region Proposal Network takes an image as input and outputs a set of rectangular object proposals. Each of the rectangles has an objectness score.
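The rectangular proposals the network scores are built from reference boxes ("anchors") placed at every feature-map location, at several scales and aspect ratios. A sketch of the anchor geometry (the scale and ratio values here are illustrative, not the paper's exact settings):

```python
# Sketch of the anchors an RPN-style proposal network scores. At each
# feature-map location, k = len(scales) * len(ratios) reference rectangles
# are centred, and the network predicts an objectness score per anchor.

def make_anchors(center, scales=(64, 128), ratios=(0.5, 1.0, 2.0)):
    cx, cy = center
    anchors = []
    for s in scales:
        for r in ratios:
            w = s * (r ** 0.5)  # width grows with sqrt(ratio)...
            h = s / (r ** 0.5)  # ...height shrinks, keeping area ~ s * s
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

anchors = make_anchors((100, 100))
print(len(anchors))  # 2 scales * 3 ratios = 6 anchors per location
```

Each anchor's objectness score says how likely that rectangle is to contain any object at all; the Fast R-CNN head then classifies and refines the highest-scoring ones.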
Mask R-CNN
The model presented in this paper is an extension of the Faster R-CNN architecture described above. It also allows for the estimation of human poses.
In this model, objects are classified and localized using a bounding box, while a segmentation branch classifies each pixel within the region into a set of categories. This model extends Faster R-CNN by adding the prediction of segmentation masks on each Region of Interest, so for every candidate object Mask R-CNN produces three outputs: a class label, a bounding box, and an object mask.
SSD: Single Shot MultiBox Detector
This paper presents a model to predict objects in images using a single deep neural network. The network generates scores for the presence of each object category using small convolutional filters applied to feature maps.
This approach uses a feed-forward convolutional neural network that produces a collection of bounding boxes and scores for the presence of certain objects. Convolutional feature layers of progressively smaller resolution are added to allow for detection at multiple scales. In this model, each feature map cell is linked to a set of default bounding boxes. The figure below shows how SSD512 performs on animals, vehicles, and furniture.
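The default boxes attached to each feature map are spaced in scale across the network. The SSD paper uses a simple linear rule, s_k = s_min + (s_max − s_min)(k − 1)/(m − 1), so early high-resolution maps get small boxes and later maps get large ones; a sketch with the paper's default s_min and s_max:

```python
# Sketch of how SSD spaces default-box scales across its m feature maps,
# following the linear rule s_k = s_min + (s_max - s_min) * (k - 1) / (m - 1).
# Early (high-resolution) maps detect small objects, later maps large ones.

def default_box_scales(m=6, s_min=0.2, s_max=0.9):
    return [s_min + (s_max - s_min) * (k - 1) / (m - 1) for k in range(1, m + 1)]

# Scales as fractions of the input image size, smallest first.
print([round(s, 2) for s in default_box_scales()])
```

At each scale, several aspect ratios are applied per feature-map cell, which is how a single forward pass covers objects of many shapes and sizes.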
You Only Look Once (YOLO)
This paper proposes a single neural network to predict bounding boxes and class probabilities from an image in a single evaluation. The base YOLO model processes images in real time at 45 frames per second. YOLO views image detection as a regression problem, which makes its pipeline quite simple. It’s extremely fast because of this simple pipeline. It can process a streaming video in real time with a latency of less than 25 milliseconds. During the training process, YOLO sees the entire image and is, therefore, able to include the context in object detection.
In YOLO, each bounding box is predicted by features from the entire image. Each bounding box has 5 predictions: x, y, w, h, and confidence. (x, y) represents the center of the bounding box relative to the bounds of the grid cell. w and h are the predicted width and height relative to the whole image. This model is implemented as a convolutional neural network and evaluated on the PASCAL VOC detection dataset. The convolutional layers of the network are responsible for extracting the features, while the fully connected layers predict the coordinates and output probabilities.
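Because (x, y) are offsets inside a grid cell while w and h are fractions of the whole image, turning one prediction into pixel coordinates takes a small decoding step. A sketch of that arithmetic (the 7×7 grid and 448×448 image match the paper's setup; the example values are made up):

```python
# Sketch of decoding one YOLO prediction (x, y, w, h, confidence) into
# absolute pixel coordinates: (x, y) are offsets within a grid cell, while
# w and h are fractions of the whole image, as described above.

def decode_box(pred, cell, grid_size, image_size):
    x, y, w, h, conf = pred
    col, row = cell
    img_w, img_h = image_size
    cx = (col + x) / grid_size * img_w  # cell offset -> absolute center x
    cy = (row + y) / grid_size * img_h
    bw = w * img_w                      # width/height are image fractions
    bh = h * img_h
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2, conf)

# A box centred in cell (3, 2) of a 7x7 grid on a 448x448 image:
print(decode_box((0.5, 0.5, 0.25, 0.25, 0.9), (3, 2), 7, (448, 448)))
```

Running this decoder over every cell's predictions, then filtering by confidence, yields the final detections in image coordinates.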
The network architecture for this model is inspired by the GoogLeNet model for image classification. The network has 24 convolutional layers and 2 fully-connected layers. The main challenges of this model are that it can only predict one class per grid cell, and it doesn’t perform well on small objects that appear in groups, such as flocks of birds.
The fast variant of this model achieves a mean average precision of 52.7%, while the full model is able to reach 63.4%.
Objects as Points
This paper proposes modeling an object as a single point. It uses keypoint estimation to find center points and regresses to all other object properties. These properties include 3D location, pose orientation, and size. It uses CenterNet, a center point based approach that’s faster and more accurate compared to other bounding box