Building Computer Vision Applications Using Artificial Neural Networks: With Step-by-Step Examples in OpenCV and TensorFlow with Python
Ebook, 723 pages, 4 hours


About this ebook

Apply computer vision and machine learning concepts in developing business and industrial applications using a practical, step-by-step approach.

The book comprises four main sections starting with setting up your programming environment and configuring your computer with all the prerequisites to run the code examples. Section 1 covers the basics of image and video processing with code examples of how to manipulate and extract useful information from the images. You will mainly use OpenCV with Python to work with examples in this section. 

Section 2 describes machine learning and neural network concepts as applied to computer vision. You will learn different algorithms of the neural network, such as convolutional neural network (CNN), region-based convolutional neural network (R-CNN), and YOLO. In this section, you will also learn how to train, tune, and manage neural networks for computer vision. Section 3 provides step-by-step examples of developing business and industrial applications, such as facial recognition in video surveillance and surface defect detection in manufacturing. 

The final section is about training neural networks involving a large number of images on cloud infrastructure, such as Amazon AWS, Google Cloud Platform, and Microsoft Azure. It walks you through the process of training distributed neural networks for computer vision on GPU-based cloud infrastructure. By the time you finish reading Building Computer Vision Applications Using Artificial Neural Networks and working through the code examples, you will have developed some real-world use cases of computer vision with deep learning. 

What You Will Learn

·         Employ image processing, manipulation, and feature extraction techniques

·         Work with various deep learning algorithms for computer vision

·         Train, manage, and tune hyperparameters of CNNs and object detection models, such as R-CNN, SSD, and YOLO

·         Build neural network models using Keras and TensorFlow

·         Discover best practices when implementing computer vision applications in business and industry

·         Train distributed models on GPU-based cloud infrastructure 

Who This Book Is For 

Data scientists, analysts, and machine learning and software engineering professionals with Python programming knowledge.



Language: English
Publisher: Apress
Release date: Jul 15, 2020
ISBN: 9781484258873
    Book preview

    Building Computer Vision Applications Using Artificial Neural Networks - Shamshad Ansari

    © Shamshad Ansari 2020

    S. Ansari, Building Computer Vision Applications Using Artificial Neural Networks, https://doi.org/10.1007/978-1-4842-5887-3_1

    1. Prerequisites and Software Installation

    Shamshad Ansari, Centreville, VA, USA

    This is a hands-on book that describes how to develop computer vision applications in the Python programming language. In this book, you will learn how to work with OpenCV to manipulate images and build machine learning models using TensorFlow.

    OpenCV, originally developed by Intel and written in C++, is an open source computer vision and machine learning library consisting of more than 2,500 optimized algorithms for working with images and videos. TensorFlow is an open source framework for high-performance numerical computation and large-scale machine learning. It is written in C++ and provides native support for GPUs. Python is one of the most widely used programming languages for developing machine learning applications, and it interoperates well with libraries written in C and C++. Both TensorFlow and OpenCV provide Python interfaces to access their low-level functionality. Although TensorFlow and OpenCV provide interfaces in other programming languages, such as Java, C++, and MATLAB, we will use Python as the primary language because of its simplicity and its large community of support.

    The prerequisites for this book are practical knowledge of Python and familiarity with NumPy and Pandas. The book assumes that you are familiar with built-in data containers in Python, such as dictionaries, lists, sets, and tuples. Here are some resources that may be helpful to meet the prerequisites:

    Python: https://www.w3schools.com/python/

    Pandas: https://pandas.pydata.org/docs/getting_started/index.html

    NumPy: https://numpy.org/devdocs/user/quickstart.html

    Before we go any further, let’s prepare our working environment and set ourselves up for the exercises we will be doing as we move along. Here we will start by downloading and installing the required software libraries and packages.

    Python and PIP

    Python is our main programming language. PIP is a package installer for Python and a de facto standard for installing and managing Python packages. To set up our working environment, we will begin by installing Python and PIP on our working computer. The installation steps depend on the operating system (OS) you are using. Make sure you follow the instructions for your OS. If you already have Python and PIP installed, ensure that you are using Python version 3.6 or greater and PIP version 19 or greater. To check the version number of Python, execute the following command on your terminal:

    $ python3 --version

    The output of this command should be something like this: Python 3.6.5.

    To check the version number of PIP, execute the following command on your terminal:

    $ pip3 --version

    The output of this command should start with the PIP version number, for example, pip 19.1, followed by the installation path and the Python version that PIP is bound to.
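    The same version check can also be done from inside Python. The following is a minimal sketch, not part of the book's code; the helper name python_is_supported and the constant MIN_PYTHON are illustrative assumptions.

```python
import sys

# Minimum interpreter version this chapter asks for (Python 3.6 or greater)
MIN_PYTHON = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the given version is at least the minimum (major, minor)."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    status = "is supported" if python_is_supported() else "is too old"
    print("Python", sys.version.split()[0], status)
```

    Comparing tuples rather than strings avoids the trap of "3.10" sorting before "3.6" alphabetically.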

    Installing Python and PIP on Ubuntu

    Run the following commands in your Ubuntu terminal:

    sudo apt update

    sudo apt install python3-dev python3-pip

    Installing Python and PIP on macOS

    Run the following commands on macOS:

    brew update

    brew install python

    This will install both Python and PIP.

    Installing Python and PIP on CentOS 7

    Run the following commands on CentOS 7:

    sudo yum install rh-python36

    sudo yum groupinstall 'Development Tools'

    Installing Python and PIP on Windows

    Install the Microsoft Visual C++ 2015 Redistributable Update 3. This comes with Visual Studio 2015 but can be installed separately by following these steps:

    1. Go to the Visual Studio downloads at https://visualstudio.microsoft.com/vs/older-downloads/.

    2. Select Redistributables and Build Tools.

    3. Download and install the Microsoft Visual C++ 2015 Redistributable Update 3.

    Make sure long paths are enabled on Windows. Here are the instructions to do that: https://superuser.com/questions/1119883/windows-10-enable-ntfs-long-paths-policy-option-missing.

    Install the 64-bit Python 3 release for Windows from https://www.python.org/downloads/windows/ (select PIP as an optional feature).

    If these installation instructions do not work in your situation, refer to the official Python documentation at https://www.python.org/.

    virtualenv

    virtualenv is a tool to create isolated Python environments. virtualenv creates a directory containing all the necessary executables to use the packages that a Python project will need. virtualenv provides the following advantages:

    virtualenv allows you to have two versions of the same library so that both your programs continue to run. Say you have a program that requires version 1 of a Python library and another program needs version 2 of the same library; virtualenv will allow you to run both.

    virtualenv creates a useful stand-alone and self-contained environment for your development work that could be utilized for a production environment without needing to install dependencies.

    Next, we will install virtualenv and configure the environment with all the required software. For the remainder of the book, we will assume that our reference program dependencies will be contained in this virtualenv.

    Install virtualenv using the following PIP command (the command is the same on all OSs):

    $ sudo pip3 install -U virtualenv

    This will install virtualenv system-wide.

    Installing and Activating virtualenv

    First, create a directory where you want to set up virtualenv. I have named this directory cv (short for computer vision).

    $ mkdir cv

    Then create the virtualenv in this directory, cv:

    $ virtualenv --system-site-packages -p python3 ./cv

    The following is a sample output from running this command (on my MacBook):

    Running virtualenv with interpreter /anaconda3/bin/python3

    Already using interpreter /anaconda3/bin/python3

    Using base prefix '/anaconda3'

    New python executable in /Users/sansari/cv/bin/python3

    Also creating executable in /Users/sansari/cv/bin/python

    Installing setuptools, pip, wheel...

    done.

    Activate the virtual environment using a shell-specific command.

    $ source ./cv/bin/activate  # for sh, bash, ksh, or zsh

    When virtualenv is active, your shell prompt is prefixed with (cv). Here’s an example:

    (cv) Shamshads-MacBook-Air:~ sansari$

    Install packages within a virtual environment without affecting the host system setup. Start by upgrading PIP (make sure you do not run any command as root or sudo while in virtualenv).

    $ pip install --upgrade pip

    $ pip list  # show packages installed within the virtual environment

    When you are done and you want to exit from virtualenv, do the following:

    $ deactivate  # don't exit until you're done with your programming
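    If you are ever unsure whether a script is running inside the virtual environment, you can ask the interpreter itself. The following is a small sketch under the assumption that the environment was created with virtualenv or the standard venv module; the function name in_virtual_environment is illustrative.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix. The standard venv module
    # and newer virtualenv releases instead make sys.base_prefix differ
    # from sys.prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


if __name__ == "__main__":
    print("Interpreter prefix:", sys.prefix)
    print("Inside a virtual environment:", in_virtual_environment())
```

    When the cv environment is active, the printed prefix should point to your cv directory.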

    TensorFlow

    TensorFlow is an open source library for numerical computation and large-scale machine learning. You will learn more about TensorFlow in subsequent chapters. Let’s first install it and get it ready for our deep learning exercises.

    Installing TensorFlow

    We will install the latest version of TensorFlow from PyPI (https://pypi.org/project/tensorflow/). We will install TensorFlow for CPUs. Make sure you are in virtualenv and run the following command:

    (cv) $ pip install --upgrade tensorflow

    Test your TensorFlow installation by running this command:

    (cv) $ python -c "import tensorflow as tf"

    If TensorFlow is successfully installed, the output should not show any errors.

    PyCharm IDE

    You can use your favorite IDE for writing and managing Python code, but for the purpose of this book, we will use the community version of PyCharm, a Python IDE.

    Installing PyCharm

    Go to the official website of PyCharm at https://www.jetbrains.com/pycharm/download/#section=linux, select the appropriate operating system, and click Download (under Community Version). After the download is completed, click the downloaded package, and follow the on-screen instructions. Here are the direct links for different operating systems:

    Linux: https://www.jetbrains.com/pycharm/download/download-thanks.html?platform=linux&code=PCC

    Mac: https://www.jetbrains.com/pycharm/download/download-thanks.html?platform=mac&code=PCC

    Windows: https://www.jetbrains.com/pycharm/download/download-thanks.html?platform=windows&code=PCC

    Configuring PyCharm to Use virtualenv

    Follow these steps to use the virtualenv, cv, we created earlier:

    1. Launch the PyCharm IDE and select File ➤ Settings for Windows and Linux or select PyCharm ➤ Preferences for macOS.

    2. In the Settings/Preferences dialog, select Project ➤ Project Interpreter.

    3. Click the settings (gear) icon and click Add.

    4. In the left pane of the Add Python Interpreter dialog, select Existing Environment.

    5. Expand the Interpreter list and select any of the existing interpreters. Alternatively, click the browse (...) button and specify a path to the Python executable in your file system, for example, /Users/sansari/cv/bin/python3.6 (see Figure 1-1).

    6. Select the checkbox Make available to all projects, if you want.

    Figure 1-1. Selecting an interpreter

    OpenCV

    OpenCV is one of the most popular and widely used libraries for image processing. All code examples in this book are based on OpenCV 4. Therefore, our installation steps are for version 4 of OpenCV.

    Working with OpenCV

    OpenCV is written in C/C++, and because it’s platform dependent, the installation instructions vary from OS to OS. In other words, OpenCV needs to be built for your particular platform/OS to run smoothly. We will use Python bindings to call OpenCV for any image processing needs.

    Like any other library, OpenCV is evolving; therefore, if the following installation instructions do not work in your case, check the official website for the exact installation process.

    We will take an easy route to install OpenCV 4 and Python 3 bindings using PIP. We will install the opencv-contrib-python package from PyPI in the virtual environment that we created previously.

    So here we go!

    Installing OpenCV4 with Python Bindings

    Make sure you are in your virtual environment. If it is not already active, change to the directory that contains your virtualenv directory (the cv directory we created previously) and type the following command:

    $ source cv/bin/activate

    Install OpenCV in a snap using the following command:

    $ pip install opencv-contrib-python

    Additional Libraries

    There are some additional libraries that we will need as we work on some of the examples. Let’s install and keep them in our virtualenv.

    Installing SciPy

    Install SciPy with the following:

    $ pip install scipy

    Installing Matplotlib

    Install Matplotlib with the following:

    $ pip install matplotlib

    Please note that the libraries installed in this chapter are frequently updated. It is strongly advised to check the official websites for updates, new versions of these libraries, and the latest installation instructions.
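    After installing everything, it can be handy to confirm in one go that the libraries from this chapter are visible to the interpreter. The following is a sketch only: the mapping of import names to PIP package names reflects the packages installed in this chapter, and the helper name missing_packages is an assumption for illustration.

```python
import importlib.util

# Import name -> PIP package name, for the libraries installed in this chapter
REQUIRED = {
    "tensorflow": "tensorflow",
    "cv2": "opencv-contrib-python",
    "scipy": "scipy",
    "matplotlib": "matplotlib",
}


def missing_packages(modules=REQUIRED):
    """Return the PIP names of modules that cannot be found, without importing them."""
    return [
        pip_name
        for module_name, pip_name in modules.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("All required packages were found.")
```

    Using find_spec only locates the packages; it does not load them, so the check stays fast even for a large library such as TensorFlow.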

    S. Ansari, Building Computer Vision Applications Using Artificial Neural Networks, https://doi.org/10.1007/978-1-4842-5887-3_2

    2. Core Concepts of Image and Video Processing


    This chapter introduces the building blocks of an image and describes various methods to manipulate them. Our learning objectives in this chapter are as follows:

    To understand the smallest unit of an image (a pixel) and how colors are represented

    To learn how pixels are organized in an image and how to access and manipulate them

    To draw different shapes, such as lines, rectangles, and circles, on an image

    To write code in Python and use OpenCV to work with examples to access and manipulate images

    Image Processing

    Image processing is the technique of manipulating a digital image to either get an enhanced image or extract some useful information from it. In image processing, the input is an image, and the output may be an image or some characteristics or features associated with that image. A video is a series of images or frames. Therefore, the technique of image processing also applies to video processing. In this chapter, I will explain the core concepts of digital image processing. I will also show you how to work with images and write code to manipulate them.

    Image Basics

    A digital image is an electronic representation of an object/scene or scanned document. The digitalization of an image means converting it into a series of numbers and storing these numbers in a computer storage system. Understanding how these numbers are arranged and how to manipulate them is the primary objective of this chapter. In this chapter, I will explain what makes an image and how to manipulate it using OpenCV and Python.

    Pixels

    Imagine a series of dots arranged in rows and columns, and these dots have different colors. This is pretty much how an image is formed. The dots that form an image are called pixels. These pixels are represented by numbers, and the values of the numbers determine the color of a pixel. Think of an image as a grid of square cells with each cell consisting of one pixel of a particular color. For example, a 300×400-pixel image means that the image is organized into a grid of 300 rows and 400 columns. That means our image has 300×400 = 120,000 pixels.

    Pixel Color

    A pixel is represented in two ways: grayscale and color.

    Grayscale

    In a grayscale image, each pixel takes a value between 0 and 255. The value 0 represents black, and 255 represents white. The values in between are varying shades of gray. The values close to 0 are darker shades of gray, and values closer to 255 are brighter shades of gray.

    Color

    The RGB (which stands for Red, Green, and Blue) color model is one of the most popular color representations of a pixel. There are other color models, but we will stick to RGB in this book.

    In the RGB model, each pixel is represented as a tuple of three values, generally represented as follows: (value for red component, value for green component, value for blue component). Each of the three colors is represented by integers ranging from 0 to 255. Here are some examples:

    (0,0,0) is a black color.

    (255,0,0) is a pure red color.

    (0,255,0) is a pure green color.

    What color is represented by (0,0,255)?

    What color is represented by (255,255,255)?

    The W3Schools website (https://www.w3schools.com/colors/colors_rgb.asp) is a great place to play with different combinations of RGB tuples to explore more patterns.

    Explore what color is represented by each of the following tuples:

    (0,0,128)

    (128,0,128)

    (128,128,0)

    Let’s try to make yellow. Here is a clue: red and green make yellow. That means a pure red (255), a pure green (255), and no blue (0) will make yellow. So, our RGB tuple for yellow is (255,255,0).
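    To experiment with RGB tuples in code, you can convert them to the hexadecimal notation that most color pickers accept. This is a minimal sketch; the function name rgb_to_hex is an illustrative assumption and is not part of OpenCV.

```python
def rgb_to_hex(rgb):
    """Convert an (R, G, B) tuple of 0-255 integers to a '#rrggbb' string."""
    if len(rgb) != 3 or any(not 0 <= value <= 255 for value in rgb):
        raise ValueError("expected three integers between 0 and 255")
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Colors taken from the examples in the text
black = (0, 0, 0)
red = (255, 0, 0)
yellow = (255, 255, 0)  # full red + full green + no blue

for name, color in [("black", black), ("red", red), ("yellow", yellow)]:
    print(name, color, rgb_to_hex(color))
```

    You can paste the printed hexadecimal values into any color picker to see the resulting colors.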

    Now that we have a good understanding of pixels and their color, let’s understand how pixels are arranged in an image and how to access them. The following section will discuss the concept of coordinate systems in image processing.

    Coordinate Systems

    Pixels in an image are arranged in the form of a grid that is made of rows and columns. Imagine a square grid of eight rows and eight columns. This will form an 8×8 or 64-pixel image. This may be imagined as a 2D coordinate system in which (0,0) is the top-left corner. Figure 2-1 shows our example 8×8-pixel image.

    Figure 2-1. Pixel coordinate system

    The top-left corner is the start or origin of the image coordinate system. The pixel at the top-right corner is represented by (7,0), the bottom-left corner is (0,7), and the bottom-right pixel is (7,7). This may be generalized as (x,y), where x is the position of the cell from the left edge of the image and y is the vertical position down from the top edge of the image. In Figure 2-1, the red pixel is in the fifth position from the left and fourth from the top. Since the coordinate system begins at 0, the coordinate of the red pixel shown in Figure 2-1 is (4,3).

    To make it a little clearer, let’s imagine an image that is 8×8 pixels, with the letter H written on it (as shown in Figure 2-2). Also, for simplicity, assume this is a grayscale image with the letter H written in black and the rest of the area of the image in white.

    Figure 2-2. Pixel coordinate system example

    Remember, in the grayscale model, a black pixel is represented by 0, and a white one is represented by 255. Figure 2-3 shows the values of each pixel within the 8×8 grid.

    Figure 2-3. Pixel matrix and values

    So, what’s the value of the pixel at position (1,4)? And at position (2,2)?

    I hope you now have a clear picture of how images are represented by numbers arranged in a grid. These numbers are serialized and stored in the computer’s storage system and rendered as an image when displayed to the screen. By now you know how to access pixels using the coordinate system and how to assign colors to these pixels.
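    The grid idea can be tried out with nothing more than nested Python lists. The sketch below builds an 8×8 grayscale grid and draws a letter H on it. The exact placement of the H is an assumption made for illustration and may not match the pixel values in Figure 2-3; the helper names set_pixel and get_pixel are also illustrative.

```python
WIDTH, HEIGHT = 8, 8
WHITE, BLACK = 255, 0

# One inner list per row; every pixel starts out white
grid = [[WHITE] * WIDTH for _ in range(HEIGHT)]


def set_pixel(grid, x, y, value):
    """Set the pixel at (x, y); the row index (y) comes first in the nested list."""
    grid[y][x] = value


def get_pixel(grid, x, y):
    """Return the value of the pixel at (x, y)."""
    return grid[y][x]


# Hypothetical letter H: two vertical bars and one horizontal crossbar
for y in range(1, 7):
    set_pixel(grid, 2, y, BLACK)  # left bar
    set_pixel(grid, 5, y, BLACK)  # right bar
for x in range(2, 6):
    set_pixel(grid, x, 3, BLACK)  # crossbar

for row in grid:
    print(" ".join("{:3d}".format(value) for value in row))
```

    Note that although we speak of a pixel at (x,y), the nested list is indexed as grid[y][x], because the outer list holds the rows.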

    We have established a solid foundation and learned the basic concepts of image representation. Let’s get ourselves some hands-on practice with some Python and OpenCV coding. In the following section, I will show you, step-by-step, how to write code to load images from the computer’s disk, access pixels, manipulate them, and write them back to the disk. Without further ado, let’s dive in!

    Python and OpenCV Code to Manipulate Images

    OpenCV represents the pixel values of an image as a NumPy array. (Not familiar with NumPy? You can find a getting started tutorial at https://numpy.org/devdocs/user/quickstart.html). In other words, when you load an image, OpenCV creates a NumPy array. The pixel values can be obtained from the array by supplying the coordinates of the pixel. Note that NumPy indexes the row first and the column second, so the pixel at (x,y) is read as image[y, x].

    When you give the coordinates, NumPy will return the color values of the pixel at those coordinates as follows:

    For a grayscale image, the returned value from NumPy will be a single value between 0 and 255.

    For a color image, the returned value from NumPy will be an array of three values, one for each color component. Note that OpenCV keeps the color components in the reverse of the RGB order. Remember this important feature of OpenCV to avoid any confusion while working with OpenCV.

    In other words, OpenCV stores the colors in BGR sequence and not in RGB sequence.
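    The following sketch shows both points on a tiny synthetic array, so it runs without an image file or OpenCV. It assumes only NumPy; the array shape and the pixel position are made-up values for illustration.

```python
import numpy as np

# A tiny synthetic "image": 4 rows (height), 6 columns (width), 3 channels
image = np.zeros((4, 6, 3), dtype=np.uint8)

# The pixel at x=5 (column) and y=1 (row) is addressed as image[y, x]
x, y = 5, 1
image[y, x] = (255, 0, 0)  # in BGR order this is pure blue

b, g, r = image[y, x]
print("Blue, green, red:", int(b), int(g), int(r))

# Reversing the last axis turns BGR into RGB (a view of the same data, not a copy)
rgb_image = image[:, :, ::-1]
print("Same pixel in RGB order:", rgb_image[y, x].tolist())
```

    Reversing the channel axis is useful when you pass an image loaded by OpenCV to a library that expects RGB, such as Matplotlib.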

    Before we write any code, let’s make sure we always use our virtualenv, in the ~/cv directory, that we already set up with PyCharm.

    Launch your PyCharm IDE and make a project (I named my project cviz, short for computer vision). Refer to Figure 2-4 and ensure that you have selected Existing Interpreter and have our virtualenv Python 3.6(cv) selected.

    Figure 2-4. PyCharm IDE, showing the setup of the project with virtualenv

    Program: Loading, Exploring, and Showing an Image

    Listing 2-1 shows the Python code to load, explore, and display an image.

    Filename: Listing_2_1.py

    1    from __future__ import print_function
    2    import cv2
    3
    4    # image path
    5    image_path = "images/marsrover.png"
    6    # Read or load image from its path
    7    image = cv2.imread(image_path)
    8    # image is a NumPy array
    9    print("Dimensions of the image: ", image.ndim)
    10   print("Image height: ", format(image.shape[0]))
    11   print("Image width: ", format(image.shape[1]))
    12   print("Image channels: ", format(image.shape[2]))
    13   print("Size of the image array: ", image.size)
    14   # Display the image and wait until a key is pressed
    15   cv2.imshow("My Image", image)
    16   cv2.waitKey(0)

    Listing 2-1. Python Code to Load, Explore, and Display an Image

    The code in Listing 2-1 is explained here.

    In lines 1 and 2, we import Python’s print_function from the __future__ package and cv2 of OpenCV.

    Line 5 is simply the path of the image that we are going to load from a directory. If your image is in a different directory, you should give either the full or relative path to the image file.

    In line 7, using the cv2.imread() function of OpenCV, we are reading the image into a NumPy array and assigning it to a variable called image (you can name this variable anything you like).

    In lines 9 through 13, using NumPy features, we are displaying the number of dimensions of the image array, the height, the width, the number of channels, and the size of the array (which is the total number of values, that is, the number of pixels multiplied by the number of channels).

    Line 15 displays the image as is using OpenCV’s imshow() function.

    In line 16, the waitKey() function keeps the program from terminating immediately and makes it wait for the user to press a key. When you see the image window that is displayed by line 15, press any key to terminate the program; until then, the program will block.

    Figure 2-5 shows the output of Listing 2-1.

    Figure 2-5. Output and image display

    The image NumPy array consists of three dimensions: height × width × channel. The first element of the array's shape is the height, which tells us how many rows our pixel grid has. Similarly, the second element is the width, which represents the number of columns of the grid. The three channels represent the BGR (not RGB) color components. The size of the array is 400×640×3 = 768,000. This actually means that our image has 400×640 = 256,000 pixels, and each pixel has three color values.
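    You can verify this arithmetic without loading the actual image. The sketch below assumes only NumPy and creates a blank array with the same 400×640×3 shape as the image described in the text.

```python
import numpy as np

# A blank array with the same shape as the image described above
image = np.zeros((400, 640, 3), dtype=np.uint8)

height, width, channels = image.shape
pixels = height * width

print("Dimensions of the array:", image.ndim)
print("Height, width, channels:", height, width, channels)
print("Number of pixels:", pixels)
print("Size of the array:", image.size)
```

    The size reported by NumPy counts every stored value, so it equals the number of pixels multiplied by the number of channels.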

    Program: OpenCV Code to Access and Manipulate Pixels

    In the next program, we will see how to access and modify pixel values using the coordinate system that we learned about earlier. Listing 2-2 shows the code example with the line-by-line explanation after it.

    Filename: Listing_2_2.py

    1    from __future__ import print_function
    2    import cv2
    3
    4    # image path
    5    image_path = "images/marsrover.png"
    6    # Read or load image from its path
    7    image = cv2.imread(image_path)
    8
    9    # Access pixel at (0,0) location
    10   (b, g, r) = image[0, 0]
    11   print("Blue, Green and Red values at (0,0): ", format((b, g, r)))
    12
    13   # Manipulate pixels and show modified image
    14   image[0:100, 0:100] = (255, 255, 0)
    15   cv2.imshow("Modified Image", image)
    16   cv2.waitKey(0)

    Listing 2-2. Code Example to Access and Manipulate Image Pixels

    Listing 2-2 is explained here.

    Lines 1 through 7 import and read the image from a directory path (as explained when discussing Listing 2-1).

    In line 10, we are getting the BGR (and not RGB) values of the pixel at coordinates (0,0) and assigning them to the (b,g,r) tuple using the NumPy syntax.

    Line 11 displays the BGR values.

    In line 14, we are selecting rows 0 through 99 (along the y-axis) and columns 0 through 99 (along the x-axis) to form a 100×100 square and assigning the values (255,255,0), that is, full blue, full green, and no red, to all the pixels within this square.

    Line 15 displays the modified image.

    Line 16 waits for the user to press any key for the program to exit.

    Figure 2-6 shows some sample output of Listing 2-2.

    Figure 2-6. Output and modified image display

    As shown in Figure 2-6, the modified image has a 100×100-pixel square at the top-left corner in aqua, represented by (255,255,0) of the BGR scheme.
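    The region assignment used in Listing 2-2 can be tried on a synthetic array as well. This sketch assumes only NumPy; the 200×300 size of the blank image is a made-up value for illustration.

```python
import numpy as np

# A blank (black) image: 200 rows, 300 columns, 3 channels in BGR order
image = np.zeros((200, 300, 3), dtype=np.uint8)

# Rows 0-99 and columns 0-99; the end index of a slice is excluded
image[0:100, 0:100] = (255, 255, 0)  # full blue, full green, no red

print("Pixel inside the square:", image[99, 99].tolist())
print("Pixel just outside the square:", image[100, 100].tolist())
print("Pixels changed:", int((image[:, :, 0] == 255).sum()))
```

    Because the end index of a slice is excluded, the slice 0:100 covers exactly 100 rows or columns, which gives 100×100 = 10,000 modified pixels.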

    Drawing

    OpenCV provides convenient methods to draw shapes on an image. We will learn how to draw a line, rectangle, and circle on an image using the following methods:

    Line: cv2.line()

    Rectangle: cv2.rectangle()

    Circle: cv2.circle()

    Drawing a Line on an Image

    We will use a simple method of drawing a line on an image, shown here:

    1. Load the image into a NumPy array.

    2. Decide the coordinates of the starting position of the line.

    3. Decide the coordinates of the end position of the line.

    4. Set the color of the line.

    5. Optionally, set the thickness of the line.

    Listing 2-3 demonstrates how to draw a line on an image.

    Filename: Listing_2_3.py

    1    from __future__ import print_function
    2    import cv2
    3
    4    # image path
    5    image_path = "images/marsrover.png"
    6    # Read or load image from its path
    7    image = cv2.imread(image_path)
    8
    9    # set start and end coordinates
    10   start = (0, 0)
    11   end = (image.shape[1], image.shape[0])
    12   # set the color in BGR
    13   color = (255,0,0)
    14   # set thickness in pixel
    15   thickness = 4
    16   cv2.line(image, start, end, color, thickness)
    17
    18   #display the modified image
    19   cv2.imshow("Modified Image", image)
    20   cv2.waitKey(0)

    Listing 2-3. Drawing a Line on an Image

    Here is the line-by-line explanation of the code.

    Lines 1 and 2 are the usual imports. From now on, I will not repeat the imports unless we have a new one to mention.

    Line 5 is the image path.

    Line 7 actually loads the image into a NumPy array called image.

    Line 10 defines the starting coordinates of the point from where the line will be drawn. Recall that the location (0,0) is the top-left corner of the image.

    Line 11 specifies the coordinates of the endpoint of the line. You will notice that the expression (image.shape[1], image.shape[0]), that is, (width, height), represents the coordinates of the bottom-right corner of the image.

    You have probably guessed by now that we are drawing a diagonal line.

    Line 13 sets the color of the line we are going to draw, and line 15 sets its thickness.

    The actual line is drawn in line 16. The cv2.line() function takes the following arguments:

    Image NumPy array. This is the image where we are drawing the line.

    Start coordinates.

    End coordinates.

    Color.

    Thickness. (This is optional. If you do not pass this argument, our line will have a default thickness of 1.)

    Finally, the modified image is shown on line 19. Line 20 waits for the user to press any key to terminate the program. Figure 2-7 shows the sample output of the image we just drew a line on.

    Figure 2-7. Image with a diagonal line in blue

    Drawing a Rectangle on an Image

    Drawing a rectangle is easy with OpenCV. Let’s dive into the code directly (Listing 2-4). We will first load an image and draw a rectangle on it. We will then save the modified image to the disk.

    Filename: Listing_2_4.py

    1    from __future__ import print_function
    2    import cv2
    3
    4    # image path
    5    image_path = "images/marsrover.png"
    6    # Read or load image from its path
    7    image = cv2.imread(image_path)
    8    # set the start and end coordinates
    9    # of the top-left and bottom-right corners of the rectangle
    10   start = (100,70)
    11   end = (350,380)
    12   # Set the color and thickness of the outline
    13   color = (0,255,0)
    14   thickness = 5
    15   # Draw the rectangle
    16   cv2.rectangle(image, start, end, color, thickness)
    17   # Save the modified image with the rectangle drawn to it.
    18   cv2.imwrite("rectangle.jpg", image)
    19   # Display the modified image
    20   cv2.imshow("Rectangle", image)
    21   cv2.waitKey(0)

    Listing 2-4. Loading an Image, Drawing a Rectangle to It, Saving It, and Displaying the Modified Image

    Here is a line-by-line explanation of Listing 2-4.

    Lines 1 and 2 are our usual imports.

    Line 5 assigns the image path.

    Line 7 reads the image from its path.

    Line 10 sets the starting point of the rectangle we want to draw on the image. The starting point consists of the coordinates of the top-left corner of the rectangle.

    Line 11 sets the endpoint of the rectangle. This represents the coordinates of the bottom-right corner of the rectangle.

    Line 13 sets the color, and line 14 sets the thickness of the outline of the rectangle.

    Line 16 actually draws the rectangle. We are using OpenCV’s rectangle() function, which takes the following parameters:

    NumPy array that holds the pixel values of the image

    The start coordinates (top-left corner of the rectangle)

    The end coordinates (bottom-right of the rectangle)

    The color of the outline

    The thickness of the outline

    Notice that line 16 does not have any assignment operator. In other words, we did not assign the return value from the cv2.rectangle() function to any variable. The NumPy array, image, that is passed as an argument to the cv2.rectangle() function is modified.
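    This in-place behavior is a general property of NumPy arrays passed to functions, and it can be illustrated without OpenCV. In the sketch below, draw_outline is a hypothetical helper written only for this illustration (it is not an OpenCV function and draws a plain one-pixel outline on a grayscale array). It also shows how to keep an untouched copy of the original.

```python
import numpy as np


def draw_outline(image, start, end, value):
    """Hypothetical helper: draw a 1-pixel rectangle outline directly on `image`."""
    (x1, y1), (x2, y2) = start, end
    image[y1, x1:x2 + 1] = value  # top edge
    image[y2, x1:x2 + 1] = value  # bottom edge
    image[y1:y2 + 1, x1] = value  # left edge
    image[y1:y2 + 1, x2] = value  # right edge
    # Nothing is returned: the caller's array itself has been changed


original = np.zeros((50, 50), dtype=np.uint8)  # blank grayscale image
backup = original.copy()  # independent copy, taken before drawing

draw_outline(original, (10, 5), (40, 30), 255)

print("Corner pixel after drawing:", int(original[5, 10]))
print("Interior pixel after drawing:", int(original[15, 20]))
print("Backup still blank:", int(backup.sum()) == 0)
```

    If you need the unmodified image later, make the copy before calling a drawing function, as done with backup here.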

    Line 18 saves the modified image, with the rectangle drawn on it, to a file on the disk.

    Line 20 displays the modified image.

    Line 21 calls the waitKey() function to allow the image to remain displayed on the screen until a key is pressed. The function waitKey() waits for a key event infinitely or
