OpenVX Programming Guide
About this ebook
OpenVX is the computer vision API adopted by many high-performance processor vendors. It is quickly becoming the preferred way to write fast and power-efficient code on embedded systems. The OpenVX Programming Guide presents definitive information on OpenVX 1.2 and 1.3, the Neural Network and other extensions, and the OpenVX Safety Critical standard.
This book gives a high-level overview of the OpenVX standard, its design principles, and overall structure. It covers computer vision functions and the graph API, providing usage examples for the majority of the functions. It is intended both as an introduction for first-time users of OpenVX and as a reference for experienced OpenVX developers.
- Get to grips with the OpenVX standard and gain insight into why various options were chosen
- Start developing efficient OpenVX code instantly
- Understand design principles and use them to create robust code
- Develop consumer and industrial products that use computer vision to understand and interact with the real world
Frank Brill
Frank Brill manages OpenVX software development for Cadence’s Tensilica Imaging and Vision DSP organization. Frank obtained his PhD in Computer Science from the University of Virginia and started his career doing computer vision research and development for video security and surveillance applications at Texas Instruments, where he obtained 5 patents related to this work. He then moved into silicon device program management, where he was responsible for several digital still camera and multimedia chips, including the first device in TI’s DaVinci line of multimedia processors (the DM6446). Frank worked at NVIDIA from 2013 to 2014, where he managed the initial development of NVIDIA’s OpenVX-based VisionWorks toolkit, and then worked at Samsung from 2014 to 2016, where he managed a computer vision R&D team in Samsung’s Mobile Processor Innovation Lab.
Chapter 1
Introduction
Abstract
OpenVX is an Application Programming Interface (API) that was created to make computer vision programs run faster on mobile and embedded devices. It is different from other computer vision libraries because, from the very beginning, it was designed as a Hardware Abstraction Layer (HAL) that helps software run efficiently on a wide range of hardware platforms. OpenVX is developed by the OpenVX committee, which is a part of the Khronos Group. Khronos is an open, nonprofit consortium; any company or individual can become a member and participate in the development of OpenVX and other standards, including OpenGL, Vulkan, and WebGL. OpenVX is royalty free, but at the same time it is developed by the industry: some of the world's largest silicon vendors are members of the OpenVX committee.
Keywords
OpenVX; API; HAL; Vulkan; WebGL; Bit-exact tests; Tolerance-based comparison; Algorithmic tests; Neural networks; Khronos Group
Chapter Outline
1.1 What is OpenVX and why do we need it?
1.2 Portability
1.3 OpenVX data objects
1.3.1 Opaque memory model
1.3.2 Object attributes
1.4 Graph API
1.5 Virtual objects
1.6 Deep neural networks
1.7 Immediate mode API
1.8 OpenVX vs. OpenCV and OpenCL
1.9 OpenVX versions
1.10 Prerequisites
1.11 Code samples
OpenVX¹ [1] is an Application Programming Interface (API) that was created to make computer vision programs run faster on mobile and embedded devices. It is different from other computer vision libraries because, from the very beginning, it was designed as a Hardware Abstraction Layer (HAL) that helps software run efficiently on a wide range of hardware platforms. OpenVX is developed by the OpenVX committee, which is a part of the Khronos Group [2]. Khronos is an open, nonprofit consortium; any company or individual can become a member and participate in the development of OpenVX and other standards, including OpenGL, Vulkan, and WebGL. OpenVX is royalty free, but at the same time it is developed by the industry: some of the world's largest silicon vendors are members of the OpenVX committee.
If you are impatient to get started with OpenVX, you may want to skip to the next chapter. This introduction discusses the challenges of running computer vision algorithms in real time that OpenVX attempts to solve. It will familiarize you with OpenVX's high-level concepts, such as objects with an opaque memory model and the OpenVX graph.
1.1 What is OpenVX and why do we need it?
Computer vision can be defined as extracting high-level information from images and video. It is now changing many industries, including automotive (Advanced Driver Assistance Systems, self-driving cars), agriculture and logistics (robotics), surveillance and banking (face recognition), and many more. A significant share of these scenarios requires a computer vision algorithm to run in real time on low-power embedded hardware. A pedestrian detection algorithm running in a car has to process every input frame; any delay makes the car less safe. For example, in a car moving at 65 miles per hour, each skipped frame adds about 3 feet to the braking distance. A robot that makes decisions slowly can become a bottleneck in a manufacturing process. AR and VR glasses that track their position at less than 30 frames per second cause motion sickness for many users.
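The 3-foot figure is easy to verify with back-of-the-envelope arithmetic. The sketch below assumes a 30 frames-per-second camera (the frame rate is our assumption; the chapter does not state one), and the function name is ours, invented for illustration:

```c
/* Distance (in feet) a car travels during one dropped camera frame.
   speed_mph: vehicle speed in miles per hour; fps: camera frame rate.
   This helper is illustrative only; it is not part of OpenVX. */
double feet_per_skipped_frame(double speed_mph, double fps) {
    double feet_per_second = speed_mph * 5280.0 / 3600.0; /* mph -> ft/s */
    return feet_per_second / fps;
}
```

At 65 mph a car covers about 95.3 ft/s, so a single 1/30 s frame corresponds to roughly 3.2 feet, matching the figure in the text.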
Solving a practical computer vision problem in real time is usually a challenging task. The algorithms are far more complicated than running a single linear filter over an image, and image resolutions are high enough to make the memory bus a bottleneck. In many cases, significant speedups can be achieved by choosing the right algorithm. For example, the Viola–Jones face detector [3] enables detection of human faces in real time on relatively low-power hardware. The FAST feature detector [4] and the BRIEF [5] and ORB [6] descriptors allow quick generation of features for tracking. In most cases, however, algorithmic optimizations alone cannot achieve the required performance.
The concept of optimizing a computer vision pipeline for specific hardware is not new to the community, to say the least. One of the first major efforts in this direction was made by Intel, which released the first version of the OpenCV library [7] in 2000, along with a layer optimized for Intel CPUs. Other hardware vendors provided their own libraries (see, e.g., [8–10]), and still others developed dedicated hardware for computer vision processing, such as the Mobileye solution for automotive [11]. NVIDIA GPUs, with their high degree of parallelism and wide memory bandwidth, enabled the processing of deep learning algorithms, which, at the time of writing, are the best-known methods for solving a range of computer vision tasks, from segmentation to object recognition.
Computer vision applications on mobile and embedded platforms differ from server-based applications in that real-time operation is critical. If a web image search algorithm keeps a user waiting, it is still useful, and we can throw more servers at it to make it faster. This is not possible in a car, where many computer vision tasks (pedestrian detection, forward collision warning, lane departure warning, traffic sign recognition, driver monitoring) have to work in real time on relatively low-power hardware. In this context, computer vision optimization matters much more for mobile and embedded applications than for server applications. This is why OpenVX focuses on mobile and embedded platforms, although it can be efficiently implemented for the cloud too.
Low-power mobile and embedded architectures are typically heterogeneous, consisting of a CPU and a set of accelerators, each dedicated to a specific compute-intensive problem, such as 3D graphics rendering, video coding, or digital signal processing. Developers who want to run computer vision algorithms on such a system face multiple challenges. A CPU is too slow to run compute- and data-intensive algorithms such as pedestrian detection in real time; one reason is that the memory bus has relatively low throughput, which limits multicore processing. Executing code on a GPU or DSP can help, but writing code that employs such processing units efficiently across multiple vendors is next to impossible. GPUs and DSPs have no standard memory model: some share RAM with the CPU, and some have their own memory with substantial latency for data input/output.

There are ways to write code that executes on both CPU and GPU, such as [12]. However, developers want a higher-level abstraction, which may be implemented over a low-level API, such as OpenCL, or coded directly for the hardware. Chapter 13 discusses how OpenVX and OpenCL code can coexist and benefit from each other. Also, some architectures do not fully support IEEE 754 floating-point calculations (required by the current version of OpenCL); they implement fixed-point arithmetic instead, where a real number is represented with a fixed number of digits after the radix point. Finding a balance between precision and speed on such architectures is a serious challenge, and an algorithm has to be tuned for each such platform. Computer vision is too important to tolerate this overhead: there has to be a way to write an algorithm once and have it run efficiently on a wide variety of accelerators, implemented and tuned for a specific mobile or embedded platform. This is the problem that OpenVX solves.
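To make the fixed-point idea concrete, here is a minimal sketch of Q15 arithmetic, a common fixed-point format on DSPs in which a real number in [-1, 1) is stored in 16 bits with 15 fractional bits. The helper names are ours, not part of OpenVX or any vendor library:

```c
#include <stdint.h>

/* Q15: a real number x in [-1, 1) is stored as round(x * 2^15)
   in a signed 16-bit integer; no floating-point unit is needed. */
typedef int16_t q15_t;

q15_t q15_from_double(double x) {
    /* Round to nearest; caller must keep x in [-1, 1). */
    return (q15_t)(x * 32768.0 + (x >= 0 ? 0.5 : -0.5));
}

double q15_to_double(q15_t x) {
    return x / 32768.0;
}

/* Multiply two Q15 values: the 32-bit product has 30 fractional
   bits, so shift right by 15 to return to the Q15 format. */
q15_t q15_mul(q15_t a, q15_t b) {
    return (q15_t)(((int32_t)a * (int32_t)b) >> 15);
}
```

The conversion round-trip loses at most one least-significant bit (about 3·10⁻⁵), which is exactly the precision-versus-speed trade-off the text describes.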
The OpenVX library is usually developed, optimized, and shipped by silicon vendors, just like a 3D driver for a GPU. It provides a graph API, which is convenient to use and efficient to execute on heterogeneous platforms. One of the most important requirements for this API is portability across platforms.
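The key idea behind a graph API is deferred execution: the application describes the whole pipeline up front, the implementation verifies it, and only then runs it, so a vendor runtime is free to fuse, reorder, or offload nodes. The toy sketch below illustrates that build/verify/process pattern in plain C; all names in it are invented for illustration (real OpenVX uses opaque objects and functions such as vxCreateGraph, vxVerifyGraph, and vxProcessGraph, covered later in the book):

```c
#include <stdint.h>
#include <string.h>

#define W 4       /* tiny "image": 4 pixels */
#define N_MAX 8   /* max nodes per graph */

/* A node transforms an input buffer into an output buffer. */
typedef void (*node_fn)(const uint8_t *in, uint8_t *out, int n);

typedef struct { node_fn nodes[N_MAX]; int count; int verified; } graph_t;

void add_node(graph_t *g, node_fn f) { g->nodes[g->count++] = f; }

/* Stand-in for graph verification: a real runtime would check
   parameter types and plan execution here. */
int verify(graph_t *g) { g->verified = (g->count > 0); return g->verified; }

/* Run the recorded nodes in order; refuse to run unverified graphs. */
int process(graph_t *g, const uint8_t *in, uint8_t *out, int n) {
    if (!g->verified) return -1;
    uint8_t a[W], b[W];
    memcpy(a, in, (size_t)n);
    for (int i = 0; i < g->count; i++) {
        g->nodes[i](a, b, n);
        memcpy(a, b, (size_t)n);
    }
    memcpy(out, a, (size_t)n);
    return 0;
}

/* Two trivial "kernels" standing in for vision functions. */
void brighten(const uint8_t *in, uint8_t *out, int n) {
    for (int i = 0; i < n; i++) out[i] = (uint8_t)(in[i] + 10);
}
void threshold(const uint8_t *in, uint8_t *out, int n) {
    for (int i = 0; i < n; i++) out[i] = in[i] > 100 ? 255 : 0;
}
```

Because nothing executes until `process` is called, the runtime sees the full pipeline before committing to a schedule; this is what lets a vendor implementation map the same graph onto a CPU, GPU, or DSP.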
1.2 Portability
OpenVX is an API that allows a computer vision developer to write a program once and have it executed efficiently on any hardware for which an efficient OpenVX implementation exists. A reasonable expectation for such an API is that it produces the same results across different platforms. But what does "same results" actually mean? How do we test whether two implementations of the same function return the same results? There are several ways to answer this question:
• Bit-exact tests: the output of an implementation must not differ from the ground truth by a single bit. Such a test guarantees identical results across platforms. However, there are many cases where bit-exact tests make no sense. For example, a simple function for correcting camera lens distortion [13] uses floating-point calculations, and even if the method of computing the image transformation is well specified, a bit-exact requirement may be too strong. For instance, conformance to the IEEE 754 floating-point standard may result in a very inefficient implementation on a fixed-point architecture. So, in many cases, it is beneficial to allow limited variation in the results.
• Tolerance-based comparison: for example, we can require that the output image differ from the reference implementation's output by no more than ϵ in each pixel and each channel. There is no exact science to choosing the threshold ϵ. The higher the threshold, the more room there is for optimizing functions for a specific platform, and the higher the variability of results across different platforms. Computer vision means extracting high-level information from images, and if variability in this high-level information prevents solving the problem, then the thresholds are too high. For instance, an image arithmetic operation involved in an image stitching algorithm that produces a high-quality image on one platform and a visible stitch line on another should not pass a tolerance-based