Accelerating MATLAB with GPU Computing: A Primer with Examples
By Jung W. Suh and Youngmin Kim
About this ebook
Beyond simulation and algorithm development, many developers increasingly use MATLAB even for product deployment in computationally heavy fields. This often demands that MATLAB codes run faster by leveraging the distributed parallelism of Graphics Processing Units (GPUs). While MATLAB successfully provides high-level functions as a simulation tool for rapid prototyping, the low-level details and knowledge needed to use GPUs make many MATLAB users hesitate to try them. Accelerating MATLAB with GPUs offers a primer on bridging this gap.
Starting with the basics, setting up MATLAB for CUDA (in Windows, Linux and Mac OS X) and profiling, it then guides users through advanced topics such as CUDA libraries. The authors share their experience developing algorithms using MATLAB, C++ and GPUs for huge datasets, modifying MATLAB codes to better utilize the computational power of GPUs, and integrating them into commercial software products. Throughout the book, they demonstrate many example codes that can be used as templates of C-MEX and CUDA codes for readers’ projects. Download example codes from the publisher's website: http://booksite.elsevier.com/9780124080805/
- Shows how to accelerate MATLAB codes through the GPU for parallel processing, with minimal hardware knowledge
- Explains the related background on hardware, architecture and programming for ease of use
- Provides simple worked examples of MATLAB and CUDA C codes as well as templates that can be reused in real-world projects
Jung W. Suh
Jung W. Suh is a senior algorithm engineer and research scientist at KLA-Tencor. Dr. Suh received his Ph.D. from Virginia Tech in 2007 for his 3D medical image processing work. He was involved in the development of MPEG-4 and Digital Mobile Broadcasting (DMB) systems in Samsung Electronics. He was a senior scientist at HeartFlow, Inc., prior to joining KLA-Tencor. His research interests are in the fields of biomedical image processing, pattern recognition, machine learning and image/video compression. He has more than 30 journal and conference papers and 6 patents.
Preface
MATLAB is a widely used simulation tool for rapid prototyping and algorithm development. Many laboratories and research institutions face growing demands to run their MATLAB codes faster for computationally heavy projects after simple simulations. Since MATLAB uses a vector/matrix representation of data, which is suitable for parallel processing, it can benefit a lot from GPU acceleration.
Target Readers and Contents
This book is aimed primarily at graduate students and researchers in engineering, science, and technology who need to process huge amounts of data without losing the many benefits of MATLAB. However, MATLAB users come from various backgrounds and do not necessarily have much programming experience. For those without a programming background, GPU acceleration for MATLAB may distract from algorithm development and introduce unnecessary hassles, even when setting up the environment. This book targets readers who have some or a lot of experience in MATLAB coding but not enough depth in either C coding or computer architecture for parallelization, so that they can focus on their research and work by avoiding the non-algorithmic hassles of using GPUs and CUDA in MATLAB.
As a primer, the book starts with the basics, walking through the process of setting up MATLAB for CUDA (in Windows and Mac OS X), creating c-mex files, and profiling m-files, then guides users through expert-level topics such as third-party CUDA libraries. It also provides many practical ways to modify users’ MATLAB codes to better utilize the immense computational power of graphics processors.
This book guides the reader in dramatically increasing MATLAB speed using NVIDIA’s Graphics Processing Units (GPUs). NVIDIA’s Compute Unified Device Architecture (CUDA) is a parallel computing architecture for graphics hardware that was originally designed for computer games, and it is gaining a reputation in general science and technology fields for its efficient, massive computation power. With this book, readers can take advantage of the parallel processing power of the GPU and the abundant CUDA scientific libraries to accelerate MATLAB code with little effort and time, and bring their research and work to a higher level.
Directions of this Book
GPU Utilization Using c-mex Versus Parallel Computing Toolbox
This book deals with Mathworks’s Parallel Computing Toolbox in Chapter 5. Although the Parallel Computing Toolbox is a useful tool for speeding up MATLAB, the current version still has limitations that keep it from being a general speed-up solution, in addition to the extra cost of purchasing the toolbox. In particular, since the toolbox targets distributed computing over multicore machines, multiple computers, and/or clusters as well as GPU processing, its GPU optimization is comparatively limited, both in the speedup it achieves and in the MATLAB functions it supports. Furthermore, if we limit ourselves to the Parallel Computing Toolbox only, it is difficult to find an efficient way to utilize the abundant CUDA libraries to their maximum. In Chapter 5, we address both the strengths and the limitations of the current Parallel Computing Toolbox. For general speeding up, GPU utilization through c-mex proves a better approach and provides more flexibility in the current situation.
Tutorial Approach Versus Case Study Approach
As the book’s title says, we take more of a tutorial approach. MATLAB users may come from many different backgrounds, and web resources are scattered over Mathworks, NVIDIA, and private blogs as fragmented information. The tutorial approach from setting the GPU environment to acquiring critical (but compressed) hardware knowledge for GPU would be beneficial to prospective readers over a wide spectrum. However, this book also has two chapters (Chapters 7 and 8) that include case examples with working codes.
CUDA Versus OpenCL
When we prepared the proposal for this book, we also considered OpenCL as a topic, because including OpenCL would attract a wider range of readers. However, while CUDA is more consistent and stable, because it is solely driven by NVIDIA, the current OpenCL has no unified development environment and is still unstable in some areas, because OpenCL is not governed by one company or institution. For this reason, installing, profiling, and debugging OpenCL are not yet standardized. For a primer, this could distract from the focus of the book. More importantly, for some reason Mathworks is very conservative in its support of OpenCL, unlike CUDA. Therefore, we decided not to include OpenCL in this edition of our book. However, we will reconsider including OpenCL in future editions if demand from the market increases or Mathworks’ direction changes.
After reading this book, readers will quickly see a significant performance boost in their MATLAB codes and be better equipped to use the open-source resources available for CUDA in their research. The features this book covers are available on Windows and Mac.
1
Accelerating MATLAB without GPU
This chapter deals with basic methods for accelerating MATLAB codes intrinsically, that is, through simple code optimization without using a GPU or C-MEX. It covers vectorization for parallel processing, preallocation for efficient memory management, tips to speed up your MATLAB codes, and step-by-step examples that show the code improvements.
Keywords
Vectorization; elementwise operation; vector/matrix operation; memory preallocation; sparse matrix form
1.1 Chapter Objectives
In this chapter, we deal with basic methods for accelerating MATLAB codes intrinsically, that is, through simple code optimization without using a GPU or C-MEX. You will learn about the following:
• Vectorization for parallel processing.
• Preallocation for efficient memory management.
• Other useful tips to speed up your MATLAB codes.
• Examples that show the code improvements step by step.
1.2 Vectorization
Since MATLAB uses a vector/matrix representation of its data, vectorization can help to make your MATLAB codes run faster. The key to vectorization is to minimize the use of for-loops.
Consider the following two m files, which are functionally the same:
nonVec1.m uses a for-loop to calculate the sum, while Vec1.m has no for-loop in the code.
>> nonVec1
Elapsed time is 0.944395 seconds.
y =
−1.3042e+48
>> Vec1
Elapsed time is 0.330786 seconds.
y =
−1.3042e+48
The results are the same, but the elapsed time for Vec1.m is almost three times shorter than that for nonVec1.m (about 0.33 seconds versus 0.94 seconds). For better vectorization, use elementwise operations and vector/matrix operations.
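The contrast between a loop-based sum and a vectorized sum can be sketched as follows. This is a minimal illustration, not the book's nonVec1.m and Vec1.m: the input vector x and the summed expression are assumptions chosen only to show the pattern, and the timings reported by tic/toc will vary with the MATLAB version and the machine.

```matlab
% Hypothetical stand-ins for a loop-based and a vectorized m-file
% (the input and the summed expression are assumed, not from the book).
x = 0:0.001:100;          % sample input vector

% Loop version: accumulate one element at a time.
tic
yLoop = 0;
for k = 1:numel(x)
    yLoop = yLoop + x(k)^2 * sin(x(k));
end
toc

% Vectorized version: elementwise operators plus sum, no explicit loop.
tic
yVec = sum(x.^2 .* sin(x));
toc

% The two results agree up to floating-point rounding.
relErr = abs(yLoop - yVec) / abs(yVec);
fprintf('relative difference = %g\n', relErr);
```

The vectorized form replaces the loop body with one expression over the whole vector, which is the pattern the rest of this section builds on.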
1.2.1 Elementwise Operation
The * symbol is defined as matrix multiplication when it is used on two matrices. But the .* symbol specifies an elementwise multiplication. For example, if x = [1 2 3] and v = [4 5 6],
>> k = x .* v
k =
4 10 18
Many other operations can be performed elementwise:
>> k = x .^2
k =
1 4 9
>> k = x ./ v
k =
0.2500 0.4000 0.5000
Many functions also support this elementwise operation:
>> k = sqrt(x)
k =
1.0000 1.4142 1.7321
>> k = sin(x)
k =
0.8415 0.9093 0.1411
>> k = log(x)
k =
0 0.6931 1.0986
>> k = abs(x)
k =
1 2 3
Even the relational operators can be used elementwise:
>> R = rand(2,3)
R =
0.8147 0.1270 0.6324
0.9058 0.9134 0.0975
>> (R > 0.2) & (R < 0.8)
ans =
0 0 1
0 0 0
>> x = 5
x =
5
>> x >= [1 2 3; 4 5 6; 7 8 9]
ans =
1 1 1
1 1 0
0 0 0
We can do even more complicated elementwise operations together:
>> A = 1:10;
>> B = 2:11;
>> C = 0.1:0.1:1;
>> D = 5:14;
>> M = B ./ (A .* D .* sin(C));
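To make the saving concrete, the single expression for M can be compared with the loop it replaces. This is a sketch only; the names Mloop and maxDiff are introduced here for illustration.

```matlab
A = 1:10;
B = 2:11;
C = 0.1:0.1:1;
D = 5:14;

% Vectorized form: one elementwise expression over all ten elements.
M = B ./ (A .* D .* sin(C));

% Equivalent loop form, with the output preallocated.
Mloop = zeros(size(A));
for k = 1:numel(A)
    Mloop(k) = B(k) / (A(k) * D(k) * sin(C(k)));
end

% Largest elementwise difference between the two results.
maxDiff = max(abs(M - Mloop));
```

Both forms compute the same ten values; the vectorized one simply leaves the iteration to MATLAB's built-in elementwise operators.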
1.2.2 Vector/Matrix Operation
Since MATLAB is based on a linear algebra software package, using the vector/matrix operations of linear algebra can effectively replace for-loops and speed up the code. The most common vector/matrix operation is matrix multiplication, which combines multiplication and addition over the elements.
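If we consider two column vectors, a and b, the resulting dot product is the 1×1 matrix, as follows:
$$a^{T} b = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$$
If the two vectors, a and b, are row vectors, the product is written as $a b^{T}$ to get the 1×1 matrix, resulting from the combination of multiplication and addition, as follows.
$$a b^{T} = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$$
Matrix multiplication can also be viewed in terms of dot products. For a matrix A with rows $r_1, \ldots, r_m$ and a column vector x, the product Ax is computed as the dot products of x with the rows of A:
$$A x = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_m \end{bmatrix} x = \begin{bmatrix} r_1 x \\ r_2 x \\ \vdots \\ r_m x \end{bmatrix}$$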
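The dot-product view of vector and matrix multiplication can be checked numerically. The sketch below uses small, arbitrarily chosen vectors and a 2×3 matrix (all values are assumptions for illustration) and compares explicit loops against the built-in * operator.

```matlab
% Dot product of two column vectors: loop versus a' * b.
a = [1; 2; 3];
b = [4; 5; 6];

dLoop = 0;
for k = 1:numel(a)
    dLoop = dLoop + a(k) * b(k);
end
dCol = a' * b;            % 1x1 result: 1*4 + 2*5 + 3*6 = 32

% For row vectors the transpose moves to the second operand.
r = [1 2 3];
s = [4 5 6];
dRow = r * s';            % also 32

% Matrix-vector product as dot products of x with the rows of A.
A = [1 2 3; 4 5 6];
x = [1; 0; -1];

yRows = zeros(size(A, 1), 1);
for i = 1:size(A, 1)
    yRows(i) = A(i, :) * x;   % dot product of row i with x
end
y = A * x;                % same result in one operation: [-2; -2]
```

Each loop here is replaced by a single multiplication, so the multiply-and-add work is handed to MATLAB's linear algebra routines instead of being interpreted step by step.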
1.2.3 Useful Tricks
In many applications, we need to set upper and lower bounds on each element. For that purpose, we often use if and elseif statements, which easily break vectorization. Instead of if and elseif statements for bounding elements, we may use min and max built-in functions:
>> ifExample
Elapsed time is 0.878781 seconds.
y =
5.8759e+47
>> nonifExample
Elapsed time is 0.309516 seconds.
y =
5.8759e+47
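The bounding trick can be sketched as follows. This is not the book's ifExample.m or nonifExample.m; the input x and the bounds lowerBound and upperBound are assumed values chosen to show the pattern.

```matlab
% Clamp every element of x to the interval [lowerBound, upperBound].
x = linspace(-2, 2, 9);   % sample data: -2, -1.5, ..., 2
lowerBound = -1;
upperBound = 1;

% Loop version with if/elseif, which processes one element at a time.
yIf = zeros(size(x));
for k = 1:numel(x)
    if x(k) < lowerBound
        yIf(k) = lowerBound;
    elseif x(k) > upperBound
        yIf(k) = upperBound;
    else
        yIf(k) = x(k);
    end
end

% Vectorized version: max enforces the lower bound, min the upper bound.
yVec = min(max(x, lowerBound), upperBound);
```

Because min and max accept a vector and a scalar, the whole if/elseif chain collapses into one expression with no branching.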
Similarly, if you need to find and replace certain element values, you can avoid if and elseif statements and keep the code vectorized by using the find function.
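A minimal sketch of the find-based replacement, assuming the task is to set every negative element to zero (the data and the replacement rule are illustrative choices):

```matlab
x = [3 -1 4 -1 5 -9 2 6];

% Loop version: test and replace one element at a time.
yIf = x;
for k = 1:numel(x)
    if x(k) < 0
        yIf(k) = 0;
    end
end

% Vectorized version: find returns the indices where the condition
% holds, and a single indexed assignment replaces all of them at once.
yFind = x;
idx = find(x < 0);        % indices of the negative elements: [2 4 6]
yFind(idx) = 0;
```

The logical mask can also be used as the index directly, as in yFind(x < 0) = 0, which gives the same result without the intermediate index vector.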