Deep Learning through Sparse and Low-Rank Modeling
By Zhangyang Wang, Yun Fu and Thomas S. Huang
About this ebook
Deep Learning through Sparse and Low-Rank Modeling bridges classical sparse and low-rank models—those that emphasize problem-specific interpretability—with recent deep network models that have enabled larger learning capacity and better utilization of big data. It shows how the toolkit of deep learning is closely tied to sparse/low-rank methods and algorithms, providing a rich variety of theoretical and analytic tools to guide the design and interpretation of deep learning models. The development of the theory and models is supported by a wide variety of applications in computer vision, machine learning, signal processing, and data mining.
This book will be highly useful for researchers, graduate students and practitioners working in the fields of computer vision, machine learning, signal processing, optimization and statistics.
- Combines classical sparse and low-rank models and algorithms with the latest advances in deep learning networks
- Shows how the structure and algorithms of sparse and low-rank methods improve the performance and interpretability of deep learning models
- Provides tactics on how to build and apply customized deep learning models for various applications
Zhangyang Wang
Dr. Zhangyang (Atlas) Wang has been an Assistant Professor of Computer Science and Engineering (CSE) at Texas A&M University (TAMU) since August 2017. During 2012–2016, he was a Ph.D. student in the Electrical and Computer Engineering (ECE) Department at the University of Illinois at Urbana-Champaign (UIUC). He was formerly a research intern with Microsoft Research (2015), Adobe Research (2014), and the US Army Research Lab (2013). Dr. Wang has published over 70 papers in top-tier venues in the broad fields of machine learning, computer vision, artificial intelligence, and interdisciplinary data science. He has published 2 books and 1 chapter, has been granted 3 patents, and has received over 20 research awards and scholarships. Dr. Wang regularly serves as a tutorial speaker, guest editor, area chair, session chair, TPC member, and workshop organizer at leading conferences and journals.
Book preview
Deep Learning through Sparse and Low-Rank Modeling - Zhangyang Wang
Chapter 1
Introduction
Zhangyang Wang⁎; Ding Liu†
⁎Department of Computer Science and Engineering, Texas A&M University, College Station, TX, United States
†Beckman Institute for Advanced Science and Technology, Urbana, IL, United States
Abstract
Deep learning has achieved prevailing success across a wide range of machine learning and computer vision tasks. On the other hand, sparsity and low-rankness have been popular regularizations in classical machine learning. This chapter is intended as a brief introduction to the basics of deep learning, and then focuses on its inherent connections to the concepts of sparsity and low-rankness.
Keywords
Sparsity; Low rank; Deep learning
Chapter Outline
1.1 Basics of Deep Learning
1.2 Basics of Sparsity and Low-Rankness
1.3 Connecting Deep Learning to Sparsity and Low-Rankness
1.4 Organization
References
1.1 Basics of Deep Learning
Machine learning makes computers learn from data without explicitly programming them. However, classical machine learning algorithms often find it challenging to extract semantic features directly from raw data, e.g., due to the well-known semantic gap
[1], which calls for assistance from domain experts to hand-craft many well-engineered feature representations, on which the machine learning models operate more effectively. In contrast, the recently popular deep learning relies on multilayer neural networks to derive semantically meaningful representations, by composing multiple simple features to represent a sophisticated concept. Deep learning requires fewer hand-engineered features and less expert knowledge. Taking image classification as an example [2], a deep learning-based image classification system represents an object by gradually extracting edges, textures, and structures from lower- to middle-level hidden layers, which become more and more associated with the target semantic concept as the model grows deeper. Driven by the emergence of big data and hardware acceleration, increasingly abstract and high-level representations can be extracted from raw inputs, giving deep learning the power to solve complicated, even traditionally intractable, problems. Deep learning has achieved tremendous success in visual object recognition [2–5], face recognition and verification [6,7], object detection [8–11], image restoration and enhancement [12–17], clustering [18], emotion recognition [19], aesthetics and style recognition [20–23], scene understanding [24,25], speech recognition [26], machine translation [27], image synthesis [28], and even playing Go [29] and poker [30].
A basic neural network is composed of a set of perceptrons (artificial neurons), each of which maps inputs to output values through a simple activation function. Among recent deep neural network architectures, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are the two main streams, differing in their connectivity patterns. CNNs deploy convolution operations on hidden layers for weight sharing and parameter reduction. CNNs can extract local information from grid-like input data, and have mainly shown successes in computer vision and image processing, with many popular instances such as LeNet [31], AlexNet [2], VGG [32], GoogLeNet [33], and ResNet [34]. RNNs are dedicated to processing sequential input data of variable length, producing an output at each time step; the hidden neurons at each time step are computed from the input data and the hidden neurons at the previous time step. To avoid vanishing/exploding gradients in RNNs when modeling long-term dependencies, long short-term memory (LSTM) [35] and gated recurrent units (GRU) [36] with controllable gates are widely used in practical applications. Interested readers are referred to a comprehensive deep learning textbook [37].
1.2 Basics of Sparsity and Low-Rankness
In signal processing, the classical way to represent a multidimensional signal is to express it as a linear combination of the components in a basis, which is either chosen in advance or learned from data. The goal of linearly transforming a signal with respect to a basis is to obtain a more predictable pattern in the resultant linear coefficients. With an appropriate basis, such coefficients often exhibit desired characteristics of the signals. One important observation is that, for most natural signals such as images and audio, most of the coefficients are zero or close to zero if the basis is properly selected: the technique is usually termed sparse coding, and the basis is called the dictionary [38,39]; sparsity is typically measured by the ℓ0-norm (the number of nonzero coefficients) or enforced through its convex relaxation, the ℓ1-norm. Beyond the element-wise sparsity model, more elaborate structured sparse models have also been developed [40,41]. The learning of the basis (i.e., dictionary learning) further boosts the power of sparse coding [42–44].
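As a small illustrative sketch (not from the book; dictionary and signal sizes are arbitrary choices), the following NumPy snippet builds a signal that is genuinely sparse in an orthonormal basis and recovers a sparse coefficient vector by soft shrinkage of the analysis coefficients, which is the closed-form ℓ1-regularized solution when the basis is orthonormal:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical orthonormal basis (dictionary): columns are atoms.
D, _ = np.linalg.qr(rng.standard_normal((64, 64)))

# A signal that is exactly sparse in D: only 5 active atoms, plus small noise.
a_true = np.zeros(64)
a_true[rng.choice(64, size=5, replace=False)] = rng.standard_normal(5) * 5.0
x = D @ a_true + 0.01 * rng.standard_normal(64)

# For an orthonormal basis, the l1-regularized coefficients have a closed
# form: element-wise soft shrinkage of the analysis coefficients D^T x.
lam = 0.1
u = D.T @ x
a_hat = np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

# Only the few coefficients belonging to true atoms survive the threshold.
print(np.count_nonzero(a_hat), np.linalg.norm(D @ a_hat - x))
```

The shrinkage wipes out the near-zero coefficients induced by noise while only slightly biasing the large ones, so the reconstruction stays close to the signal.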
More generally, the sparsity belongs to the well-received principle of parsimony, i.e., preferring a simple representation to a more complex one. The sparsity level (number of nonzero elements) is a natural measure of representation complexity of vector-valued features. In the case of matrix-valued features, the matrix rank provides another notion of parsimony, assuming high-dimensional data lies close to a low-dimensional subspace or manifold. Similarly to sparse optimization, a series of works have shown that rank minimization can be achieved through convex optimization [45] or efficient heuristics [46], paving the path to high-dimensional data analysis such as video processing [47–52].
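A minimal sketch of the low-rank counterpart, on synthetic data: by the Eckart–Young theorem, the best rank-k approximation of a matrix in Frobenius norm is obtained by truncating its SVD, so nearly low-rank data is summarized almost losslessly by a few components:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data matrix that is nearly low-rank: rank-3 plus small noise.
U = rng.standard_normal((100, 3))
V = rng.standard_normal((3, 50))
X = U @ V + 0.01 * rng.standard_normal((100, 50))

# Best rank-k approximation (Eckart-Young): keep the top-k singular triplets.
u, s, vt = np.linalg.svd(X, full_matrices=False)
k = 3
X_k = u[:, :k] * s[:k] @ vt[:k, :]

rel_err = np.linalg.norm(X - X_k) / np.linalg.norm(X)
print(rel_err)  # tiny, since the data is essentially rank-3
```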
1.3 Connecting Deep Learning to Sparsity and Low-Rankness
Beyond their proven success in conventional machine learning algorithms, sparse and low-rank structures are widely found to be effective for regularizing deep learning, improving model generalization, training behaviors, and data efficiency. For example, the popular weight decay regularization adds an ℓ2 (or ℓ1) decay term that limits the magnitudes of the neuron weights. Another popular tool to avoid overfitting, dropout [2], is a simple regularization approach that improves the generalization of deep networks by randomly setting hidden neurons to zero during training, which could be viewed as a stochastic form of enforcing sparsity. Besides, the inherent sparse properties of both deep network weights and activations have also been widely observed and utilized for compressing deep models [55] and improving their energy efficiency [56,57]. As for low-rankness, much research has also been devoted to learning low-rank convolutional filters [58] and network compression [59].
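The view of dropout as stochastic sparsity can be made concrete in a few lines. This sketch uses the common "inverted dropout" variant (rescaling at training time is an implementation choice, not something prescribed by the book):

```python
import numpy as np

rng = np.random.default_rng(2)

def dropout(h, p=0.5, training=True):
    """Inverted dropout: zero each activation with probability p,
    rescale survivors by 1/(1-p) so the expectation is unchanged."""
    if not training:
        return h
    mask = (rng.random(h.shape) >= p).astype(h.dtype)
    return h * mask / (1.0 - p)

h = rng.standard_normal(10000)
h_drop = dropout(h, p=0.5)

# Roughly half of the activations are forced to exact zero: each training
# pass effectively runs a randomly sparsified subnetwork.
print(np.mean(h_drop == 0.0))
```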
The focus of this book is to explore a deeper structural connection between sparse/low-rank models and deep models. While many examples will be detailed in the remainder of the book, we here briefly state the main idea. We start from the following regularized regression form, which represents a large family of feature learning models, such as ridge regression, sparse coding, and low-rank representation:

\min_{a} \frac{1}{2}\|x - Da\|_2^2 + \lambda \Omega(a), \quad (1.1)

where x is the input data, D is the basis for representation, a is the feature to learn, and the regularizer \Omega(a) further incorporates the problem-specific prior knowledge. Not surprisingly, many instances of Eq. (1.1) could be solved by a similar class of iterative algorithms,
a^{k+1} = \mathcal{N}\left(\mathcal{L}_1(x) + \mathcal{L}_2(a^k)\right), \quad (1.2)

where a^k denotes the intermediate output of the k-th iteration, \mathcal{L}_1 and \mathcal{L}_2 are linear operators, and \mathcal{N} is a simple nonlinear operator. Equation (1.2) could be expressed by a recursive system, whose fixed point is expected to be the solution a of Eq. (1.1). Furthermore, the recursive system could be unfolded and truncated to k iterations, to construct a (k+1)-layer feed-forward network. Without any further tuning, the resulting architecture will output a k-iteration approximation of the exact solution a by default. Taking sparse coding, where \Omega(a) = \|a\|_1, as an example, the concrete function forms are given as (u_i is the i-th element of u)

\mathcal{L}_1(x) = D^\top x, \quad \mathcal{L}_2(a) = (I - D^\top D)\,a, \quad [\mathcal{N}(u)]_i = \mathrm{sign}(u_i)\max(|u_i| - \lambda, 0), \quad (1.3)

where \mathcal{N} is an element-wise soft shrinkage function. The unfolded and truncated version of Eq. (1.3) was first proposed in [60], called the learned iterative shrinkage and thresholding algorithm (LISTA). Recent works [61,18,62–64] followed LISTA and developed various models, and many jointly optimized the unfolded model with discriminative tasks [65].
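The underlying iteration can be sketched in a few lines of NumPy (sizes, the test signal, and the step size 1/L are arbitrary illustrative choices). This is plain ISTA, the un-learned algorithm whose unfolded, trained variant is LISTA:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical overcomplete dictionary with unit-norm columns.
D = rng.standard_normal((32, 64))
D /= np.linalg.norm(D, axis=0)

# A signal with an exactly sparse code: 3 active atoms.
a_true = np.zeros(64)
a_true[[3, 17, 40]] = [2.0, -1.5, 1.0]
x = D @ a_true

def soft(u, lam):
    """Element-wise soft shrinkage, the nonlinear operator N."""
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

# ISTA: a <- soft(a + D^T (x - D a) / L, lam / L), i.e. the recursion
# a^{k+1} = N(L1(x) + L2(a^k)) with step size 1/L.
lam = 0.01
L = np.linalg.norm(D, 2) ** 2       # Lipschitz constant of the gradient
a = np.zeros(64)
for _ in range(2000):
    a = soft(a + (D.T @ (x - D @ a)) / L, lam / L)

print(np.linalg.norm(D @ a - x))    # small residual; a is sparse
```

LISTA replaces the fixed operators D^T and (I - D^T D / L) with learned weight matrices and truncates the loop to a handful of "layers".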
By enforcing a nonnegativity constraint a \ge 0, Eq. (1.2) could be adapted to solve the nonnegative sparse coding problem

\min_{a} \frac{1}{2}\|x - Da\|_2^2 + \lambda \|a\|_1, \quad \text{s.t. } a \ge 0. \quad (1.4)

A by-product of applying nonnegativity is that the original sparsity coefficient \lambda, as in Eq. (1.3), now enters the nonlinearity as a bias rather than a shrinkage threshold. Initializing a^0 = 0 and truncating Eq. (1.2) to a single iteration, we have

a = \max(D^\top x - \lambda, 0) = \mathrm{ReLU}(D^\top x - \lambda). \quad (1.5)

Since the threshold \lambda is assumed constant, it could be absorbed into the bias term -\lambda. Equation (1.5) is exactly a fully-connected layer (with weights D^\top and bias -\lambda) followed by ReLU neurons, one of the most standard building blocks in existing deep models. Convolutional layers could be derived similarly by looking at a convolutional sparse coding model [66] rather than a linear one. Such a hidden structural resemblance reveals the potential to bridge many sparse and low-rank models with current successful deep models, potentially enhancing the generalization, compactness and interpretability of the latter.
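The stated equivalence is easy to check numerically: one nonnegative shrinkage step from a zero initialization coincides, element by element, with a fully-connected layer (weights D^⊤, bias −λ) followed by ReLU (a toy sketch with arbitrary sizes):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical dictionary and input.
D = rng.standard_normal((32, 64))
D /= np.linalg.norm(D, axis=0)
x = rng.standard_normal(32)
lam = 0.1

def relu(u):
    return np.maximum(u, 0.0)

# One iteration of nonnegative shrinkage starting from a^0 = 0 ...
a_ista = relu(D.T @ x - lam)

# ... equals a fully-connected layer with weights W = D^T and bias b = -lam,
# followed by ReLU.
W, b = D.T, -lam * np.ones(64)
a_fc = relu(W @ x + b)

print(np.allclose(a_ista, a_fc))  # True
```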
1.4 Organization
In the remainder of this book, Chapter 2 will first introduce the bi-level sparse coding model, using the example of hyperspectral image classification. Chapters 3, 4 and 5 will then present three concrete examples (classification, superresolution, and clustering), to show how (bi-level) sparse coding models could be naturally converted to and trained as deep networks. From Chapter 6 to Chapter 9, we will delve into the extensive applications of deep learning aided by sparsity and low-rankness, in signal processing, dimensionality reduction, action recognition, style recognition and kinship understanding, respectively.
References
[1] R. Zhao, W.I. Grosky, Narrowing the semantic gap-improved text-based web document retrieval using visual features, IEEE Transactions on Multimedia 2002;4(2):189–200.
[2] A. Krizhevsky, I. Sutskever, G.E. Hinton, Imagenet classification with deep convolutional neural networks, NIPS. 2012.
[3] Z. Wang, S. Chang, Y. Yang, D. Liu, T.S. Huang, Studying very low resolution recognition using deep networks, Proceedings of the IEEE conference on computer vision and pattern recognition. 2016:4792–4800.
[4] D. Liu, B. Cheng, Z. Wang, H. Zhang, T.S. Huang, Enhance visual recognition under adverse conditions via deep networks, arXiv preprint arXiv:1712.07732; 2017.
[5] Z. Wu, Z. Wang, Z. Wang, H. Jin, Towards privacy-preserving visual recognition via adversarial training: a pilot study, arXiv preprint arXiv:1807.08379; 2018.
[6] N. Bodla, J. Zheng, H. Xu, J. Chen, C.D. Castillo, R. Chellappa, Deep heterogeneous feature fusion for template-based face recognition, 2017 IEEE winter conference on applications of computer vision, WACV 2017. Santa Rosa, CA, USA, March 24–31, 2017. 2017:586–595.
[7] R. Ranjan, A. Bansal, H. Xu, S. Sankaranarayanan, J. Chen, C.D. Castillo, et al., Crystal loss and quality pooling for unconstrained face verification and recognition, CoRR 2018. arXiv:1804.01159 [abs].
[8] S. Ren, K. He, R. Girshick, J. Sun, Faster R-CNN: towards real-time object detection with region proposal networks, Advances in neural information processing systems. 2015:91–99.
[9] J. Yu, Y. Jiang, Z. Wang, Z. Cao, T. Huang, Unitbox: an advanced object detection network, Proceedings of the 2016 ACM on multimedia conference. ACM; 2016:516–520.
[10] J. Gao, Q. Wang, Y. Yuan, Embedding structured contour and location prior in siamesed fully convolutional networks for road detection, Robotics and automation (ICRA), 2017 IEEE international conference on. IEEE; 2017:219–224.
[11] H. Xu, X. Lv, X. Wang, Z. Ren, N. Bodla, R. Chellappa, Deep regionlets for object detection, The European conference on computer vision (ECCV). 2018.
[12] R. Timofte, E. Agustsson, L. Van Gool, M.H. Yang, L. Zhang, B. Lim, et al., NTIRE 2017 challenge on single image super-resolution: methods and results, Computer vision and pattern recognition workshops (CVPRW), 2017 IEEE conference on. IEEE; 2017:1110–1121.
[13] B. Li, X. Peng, Z. Wang, J. Xu, D. Feng, AOD-Net: all-in-one dehazing network, Proceedings of the IEEE international conference on computer vision. 2017:4770–4778.
[14] B. Li, X. Peng, Z. Wang, J. Xu, D. Feng, An all-in-one network for dehazing and beyond, arXiv preprint arXiv:1707.06543; 2017.
[15] B. Li, X. Peng, Z. Wang, J. Xu, D. Feng, End-to-end united video dehazing and detection, arXiv preprint arXiv:1709.03919; 2017.
[16] D. Liu, B. Wen, J. Jiao, X. Liu, Z. Wang, T.S. Huang, Connecting image denoising and high-level vision tasks via deep learning, arXiv preprint arXiv:1809.01826; 2018.
[17] R. Prabhu, X. Yu, Z. Wang, D. Liu, A. Jiang, U-finger: multi-scale dilated convolutional network for fingerprint image denoising and inpainting, arXiv preprint arXiv:1807.10993; 2018.
[18] Z. Wang, S. Chang, J. Zhou, M. Wang, T.S. Huang, Learning a task-specific deep architecture for clustering, SDM 2016.
[19] B. Cheng, Z. Wang, Z. Zhang, Z. Li, D. Liu, J. Yang, et al., Robust emotion recognition from low quality and low bit rate video: a deep learning approach, arXiv preprint arXiv:1709.03126; 2017.
[20] Z. Wang, J. Yang, H. Jin, E. Shechtman, A. Agarwala, J. Brandt, et al., DeepFont: identify your font from an image, Proceedings of the 23rd ACM international conference on multimedia. ACM; 2015:451–459.
[21] Z. Wang, J. Yang, H. Jin, E. Shechtman, A. Agarwala, J. Brandt, et al., Real-world font recognition using deep network and domain adaptation, arXiv preprint arXiv:1504.00028; 2015.
[22] Z. Wang, S. Chang, F. Dolcos, D. Beck, D. Liu, T.S. Huang, Brain-inspired deep networks for image aesthetics assessment, arXiv preprint arXiv:1601.04155; 2016.
[23] T.S. Huang, J. Brandt, A. Agarwala, E. Shechtman, Z. Wang, H. Jin, et al., Deep learning for font recognition and retrieval, Applied cloud deep semantic recognition. Auerbach Publications; 2018:109–130.
[24] C. Farabet, C. Couprie, L. Najman, Y. LeCun, Learning hierarchical features for scene labeling, IEEE Transactions on Pattern Analysis and Machine Intelligence 2013;35(8):1915–1929.
[25] Q. Wang, J. Gao, Y. Yuan, A joint convolutional neural networks and context transfer for street scenes labeling, IEEE Transactions on Intelligent Transportation Systems 2017.
[26] G. Saon, H.K.J. Kuo, S. Rennie, M. Picheny, The IBM 2015 English conversational telephone speech recognition system, arXiv preprint arXiv:1505.05899; 2015.
[27] I. Sutskever, O. Vinyals, Q.V. Le, Sequence to sequence learning with neural networks, Advances in neural information processing systems. 2014:3104–3112.
[28] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, et al., Generative adversarial nets, Advances in neural information processing systems. 2014:2672–2680.
[29] D. Silver, A. Huang, C.J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, et al., Mastering the game of go with deep neural networks and tree search, Nature 2016;529(7587):484–489.
[30] M. Moravčík, M. Schmid, N. Burch, V. Lisỳ, D. Morrill, N. Bard, et al., DeepStack: expert-level artificial intelligence in no-limit poker, arXiv preprint arXiv:1701.01724; 2017.
[31] Y. LeCun, et al., LeNet-5, convolutional neural networks, URL: http://yann.lecun.com/exdb/lenet; 2015.
[32] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv preprint arXiv:1409.1556; 2014.
[33] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, et al., Going deeper with convolutions, Proceedings of the IEEE conference on computer vision and pattern recognition. 2015:1–9.
[34] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, Proceedings of the IEEE conference on computer vision and pattern recognition. 2016:770–778.
[35] F.A. Gers, J. Schmidhuber, F. Cummins, Learning to forget: continual prediction with LSTM, Neural Computation 2000;12(10):2451–2471.
[36] J. Chung, C. Gulcehre, K. Cho, Y. Bengio, Empirical evaluation of gated recurrent neural networks on sequence modeling, arXiv preprint arXiv:1412.3555; 2014.
[37] I. Goodfellow, Y. Bengio, A. Courville, Deep learning. MIT Press; 2016.
[38] Z. Wang, J. Yang, H. Zhang, Z. Wang, Y. Yang, D. Liu, et al., Sparse coding and its applications in computer vision. World Scientific; 2015.
[39] R.G. Baraniuk, Compressive sensing [lecture notes], IEEE Signal Processing Magazine 2007;24(4):118–121.
[40] J. Huang, T. Zhang, D. Metaxas, Learning with structured sparsity, Journal of Machine Learning Research Nov. 2011;12:3371–3412.
[41] H. Xu, J. Zheng, A. Alavi, R. Chellappa, Template regularized sparse coding for face verification, 23rd International conference on pattern recognition, ICPR 2016. Cancún, Mexico, December 4–8, 2016. 2016:1448–1454.
[42] H. Xu, J. Zheng, A. Alavi, R. Chellappa, Cross-domain visual recognition via domain adaptive dictionary learning, CoRR 2018. arXiv:1804.04687 [abs].
[43] H. Xu, J. Zheng, R. Chellappa, Bridging the domain shift by domain adaptive dictionary learning, Proceedings of the British machine vision conference 2015, BMVC 2015. Swansea, UK, September 7–10, 2015. 2015 p. 96.1–96.12.
[44] H. Xu, J. Zheng, A. Alavi, R. Chellappa, Learning a structured dictionary for video-based face recognition, 2016 IEEE winter conference on applications of computer vision, WACV 2016. Lake Placid, NY, USA, March 7–10, 2016. 2016:1–9.
[45] E.J. Candès, X. Li, Y. Ma, J. Wright, Robust principal component analysis? Journal of the ACM (JACM) 2011;58(3):11.
[46] Z. Wen, W. Yin, Y. Zhang, Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm, Mathematical Programming Computation 2012:1–29.
[47] Z. Wang, H. Li, Q. Ling, W. Li, Robust temporal-spatial decomposition and its applications in video processing, IEEE Transactions on Circuits and Systems for Video Technology