OpenVX Programming Guide
Ebook · 605 pages · 4 hours


About this ebook

OpenVX is the computer vision API adopted by many high-performance processor vendors. It is quickly becoming the preferred way to write fast and power-efficient code on embedded systems. The OpenVX Programming Guide presents definitive information on OpenVX 1.2 and 1.3, the Neural Network and other extensions, as well as the OpenVX Safety Critical standard.

This book gives a high-level overview of the OpenVX standard, its design principles, and overall structure. It covers computer vision functions and the graph API, providing examples of usage for the majority of the functions. It is intended both for the first-time user of OpenVX and as a reference for experienced OpenVX developers.

  • Get to grips with the OpenVX standard and gain insight into why various options were chosen
  • Start developing efficient OpenVX code instantly
  • Understand design principles and use them to create robust code
  • Develop consumer and industrial products that use computer vision to understand and interact with the real world
Language: English
Release date: May 22, 2020
ISBN: 9780128166192
Author

Frank Brill

Frank Brill manages OpenVX software development for Cadence's Tensilica Imaging and Vision DSP organization. Frank obtained his PhD in Computer Science from the University of Virginia and started his career doing computer vision research and development for video security and surveillance applications at Texas Instruments, where he obtained 5 patents related to this work. He then moved into silicon device program management, where he was responsible for several digital still camera and multimedia chips, including the first device in TI's DaVinci line of multimedia processors (the DM6446). Frank worked at NVIDIA from 2013 to 2014, where he managed the initial development of NVIDIA's OpenVX-based VisionWorks toolkit, and then at Samsung from 2014 to 2016, where he managed a computer vision R&D team in Samsung's Mobile Processor Innovation Lab.



    Chapter 1

    Introduction

    Abstract

    OpenVX is an Application Programming Interface (API) that was created to make computer vision programs run faster on mobile and embedded devices. It is different from other computer vision libraries because, from the very beginning, it was designed as a Hardware Abstraction Layer (HAL) that helps software run efficiently on a wide range of hardware platforms. OpenVX is developed by the OpenVX committee, which is part of the Khronos Group. Khronos is an open and nonprofit consortium; any company or individual can become a member and participate in the development of OpenVX and other standards, including OpenGL, Vulkan, and WebGL. OpenVX is royalty free, but at the same time, it is developed by the industry: some of the world's largest silicon vendors are members of the OpenVX committee.

    Keywords

    OpenVX; API; HAL; Vulkan; WebGL; Bit-exact tests; Tolerance-based comparison; Algorithmic tests; Neural networks; Khronos Group

    Chapter Outline

    1.1  What is OpenVX and why do we need it?

    1.2  Portability

    1.3  OpenVX data objects

    1.3.1  Opaque memory model

    1.3.2  Object attributes

    1.4  Graph API

    1.5  Virtual objects

    1.6  Deep neural networks

    1.7  Immediate mode API

    1.8  OpenVX vs. OpenCV and OpenCL

    1.9  OpenVX versions

    1.10  Prerequisites

    1.11  Code samples

    OpenVX¹ [1] is an Application Programming Interface (API) that was created to make computer vision programs run faster on mobile and embedded devices. It is different from other computer vision libraries because, from the very beginning, it was designed as a Hardware Abstraction Layer (HAL) that helps software run efficiently on a wide range of hardware platforms. OpenVX is developed by the OpenVX committee, which is part of the Khronos Group [2]. Khronos is an open and nonprofit consortium; any company or individual can become a member and participate in the development of OpenVX and other standards, including OpenGL, Vulkan, and WebGL. OpenVX is royalty free, but at the same time, it is developed by the industry: some of the world's largest silicon vendors are members of the OpenVX committee.

    If you are impatient to get started with OpenVX, you might want to skip to the next chapter. This introduction discusses the challenges of running computer vision algorithms in real time that OpenVX attempts to solve. It will help you get familiar with high-level OpenVX concepts, such as objects with an opaque memory model and the OpenVX Graph.

    1.1 What is OpenVX and why do we need it?

    Computer vision can be defined as extracting high-level information from images and video. Nowadays it is in the process of changing many industries, including automotive (Advanced Driver Assistance Systems, self-driving cars), agriculture and logistics (robotics), surveillance and banking (face recognition), and many more. A significant part of these scenarios requires a computer vision algorithm to run in real time on low-power embedded hardware. A pedestrian detection algorithm running in a car has to process each input frame; a delay makes the car less safe. For example, in a car moving at 65 miles per hour, each skipped frame adds about 3 feet to the braking distance. A robot that makes decisions slowly can become a bottleneck in the manufacturing process. AR and VR glasses that track their position slower than 30 frames per second cause motion sickness for many users.

    Solving a practical computer vision problem in real time is usually a challenging task. The algorithms are much more complicated than running a single linear filter over an image, and image resolution is high enough to cause a bottleneck in the memory bus. In many cases, significant speedups can be reached by choosing the right algorithm. For example, the Viola–Jones face detector algorithm [3] enables detection of human faces in real time on relatively low-power hardware. The FAST [4] feature detector and BRIEF [5] and ORB [6] descriptors allow quick generation of features for tracking. However, in most cases, it is not possible to achieve the required performance by algorithmic optimizations alone.

    The concept of optimizing a computer vision pipeline for specific hardware is not new to the community, to say the least. One of the first major efforts in this direction was made by Intel, which released the first version of the OpenCV library [7] in 2000 along with an optimized layer for Intel CPUs. Other hardware vendors provided their own libraries too (see, e.g., [8–10]), and some developed dedicated hardware for computer vision processing, such as the Mobileye solution for automotive [11]. NVIDIA GPUs, with their high level of parallelism and wide-bandwidth memory bus, enabled processing of deep learning algorithms that, at the time of writing this book, are the best-known methods of solving a range of computer vision tasks, from segmentation to object recognition.

    Computer vision applications on mobile/embedded platforms differ from server-based applications: real-time operation is critical. If a web image search algorithm keeps a user waiting, it is still useful, and we can throw in more servers to make it faster. This is not possible for a car, where many computer vision tasks (pedestrian detection, forward collision warning, lane departure warning, traffic sign recognition, driver monitoring) have to work in real time on relatively low-power hardware. In this context, computer vision optimization for mobile/embedded applications is much more important than for server applications. This is why OpenVX is focused on mobile and embedded platforms, although it can be efficiently implemented for the cloud too.

    Low-power mobile and embedded architectures are typically heterogeneous, consisting of a CPU and a set of accelerators dedicated to solving specific compute-intensive problems, such as 3D graphics rendering, video coding, digital signal processing, and so on. There are multiple challenges for computer vision developers who want to run their algorithms on such a system. A CPU is too slow to run compute- and data-intensive algorithms such as pedestrian detection in real time. One of the reasons is that the memory bus has a relatively low throughput, which limits multicore processing. Executing code on a GPU or DSP can help, but writing code that will employ such processing units efficiently across multiple vendors is next to impossible. GPUs and DSPs have no standard memory model: some of them share RAM with the CPU, and some have their own memory with substantial latency for data input/output. There are ways to write code that will execute on both CPU and GPU, such as [12]. However, developers want a higher-level abstraction, which may be implemented over a low-level API, such as OpenCL, or coded directly for the hardware. Chapter 13 discusses how OpenVX and OpenCL code can coexist and benefit from each other. Also, there are architectures that do not have full IEEE 754 floating point support (required in the current version of OpenCL); they implement fixed-point arithmetic instead (where a real number is represented with a fixed number of digits after the radix point). Finding a balance between precision and speed on such architectures is a serious challenge, and an algorithm has to be tuned for each such platform. Computer vision is too important to allow such an overhead, and there has to be a way to write an algorithm that will run efficiently on a wide variety of accelerators, implemented and tuned for a specific mobile or embedded platform. This is the problem solved by OpenVX.
The OpenVX library is usually developed, optimized, and shipped by silicon vendors, just like a 3D driver for a GPU. It has a graph API, which is convenient to use and efficient for executing on heterogeneous platforms. One of the most important requirements for this API is its portability across platforms.

    1.2 Portability

    OpenVX is the API that allows a computer vision developer to write a program once and have it efficiently executed on hardware, assuming that an efficient OpenVX implementation exists for that hardware. A reasonable expectation for this API is that it produces the same results across different platforms. But what does same results actually mean? How do we test if two implementations of the same function return the same results? There are several ways to answer this question:

    •  Bit-exact tests: the output of an implementation should not differ from the ground truth by a single bit. Such a test is useful to ensure that the results are going to be the same. However, there are many cases where bit-exact tests make no sense. For example, a simple function for correcting camera lens distortion [13] uses floating point calculations, and even if the method to compute an image transformation is well specified, a bit-exact requirement may be too strong. For instance, conformance to the IEEE 754 floating point standard may result in a very inefficient implementation on a fixed-point architecture. So, in many cases, it is beneficial to allow for limited variation in the results.

    •  Tolerance-based comparison: for example, we can require that the output image differ from the reference implementation output by no more than ϵ in each pixel and each channel. There is no exact science in choosing the value of the threshold ϵ. The higher the threshold, the more room there is for optimizing functions for a specific platform, and the higher the variability of the results across different platforms. Computer vision means extracting high-level information from images, and if greater variability in this high-level information prevents solving the problem, then the thresholds are too high. For instance, an image arithmetic operation involved in an image stitching algorithm that results in a high-quality image on one platform and produces a visible stitch line on another should not pass a tolerance-based test.
