Restricted Boltzmann Machine: Fundamentals and Applications for Unlocking the Hidden Layers of Artificial Intelligence
By Fouad Sabry
About this ebook
What Is Restricted Boltzmann Machine
A restricted Boltzmann machine, often known as an RBM, is a generative, stochastic artificial neural network that can learn a probability distribution over its set of inputs.
How You Will Benefit
(I) Insights, and validations about the following topics:
Chapter 1: Restricted Boltzmann Machine
Chapter 2: Boltzmann Distribution
Chapter 3: Entropy (Information Theory)
Chapter 4: Unsupervised Learning
Chapter 5: Mutual Information
Chapter 6: Boltzmann Machine
Chapter 7: Cross Entropy
Chapter 8: Softmax Function
Chapter 9: Autoencoder
Chapter 10: Deep Belief Network
(II) Answering the public's top questions about restricted Boltzmann machines.
(III) Real-world examples of restricted Boltzmann machines in use across many fields.
Who This Book Is For
Professionals, undergraduate and graduate students, enthusiasts, hobbyists, and anyone who wants to go beyond basic knowledge of restricted Boltzmann machines.
What Is Artificial Intelligence Series
The Artificial Intelligence eBook series provides comprehensive coverage in over 200 topics. Each ebook covers a specific Artificial Intelligence topic in depth, written by experts in the field. The series aims to give readers a thorough understanding of the concepts, techniques, history and applications of artificial intelligence. Topics covered include machine learning, deep learning, neural networks, computer vision, natural language processing, robotics, ethics and more. The ebooks are written for professionals, students, and anyone interested in learning about the latest developments in this rapidly advancing field.
The Artificial Intelligence eBook series provides an in-depth yet accessible exploration, from the fundamental concepts to the state-of-the-art research. With over 200 volumes, readers gain a thorough grounding in all aspects of Artificial Intelligence. The ebooks are designed to build knowledge systematically, with later volumes building on the foundations laid by earlier ones. This comprehensive series is an indispensable resource for anyone seeking to develop expertise in artificial intelligence.
Book preview
Restricted Boltzmann Machine - Fouad Sabry
Chapter 1: Restricted Boltzmann machine
A restricted Boltzmann machine, also known as an RBM, is a type of artificial neural network that is stochastic and generative, and it is capable of learning a probability distribution over its set of inputs.
RBMs were initially invented under the name Harmonium by Paul Smolensky in 1986 and rose to prominence after Geoffrey Hinton and collaborators developed fast learning algorithms for them in the mid-2000s. RBMs have found applications in dimensionality reduction and even in many-body quantum mechanics, to name a couple. They can be trained in either supervised or unsupervised ways, depending on the task.
As their name implies, RBMs are a variant of Boltzmann machines, with the restriction that their neurons must form a bipartite graph: a pair of nodes from each of the two groups of units (commonly referred to as the visible and hidden units, respectively) may have a symmetric connection between them, and there are no connections between nodes within a group. By contrast, unrestricted Boltzmann machines may have connections between hidden units. This restriction allows for more efficient training algorithms than are available for the general class of Boltzmann machines, in particular the gradient-based contrastive divergence algorithm.
The typical implementation of an RBM uses binary-valued (Boolean) hidden and visible units and consists of a matrix of weights W of size m \times n .
Each weight element w_{i,j} of the matrix is associated with the connection between the visible (input) unit v_{i} and the hidden unit h_{j} .
In addition, there are bias weights (offsets) a_{i} for v_{i} and b_{j} for h_{j} .
Taking into account the various weights and biases, the energy of a configuration (pair of boolean vectors) (v,h) is defined as
E(v,h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_i \sum_j v_i w_{i,j} h_j

or, in matrix notation,

E(v,h) = -a^{\mathrm{T}} v - b^{\mathrm{T}} h - v^{\mathrm{T}} W h .

This energy function is analogous to that of a Hopfield network. As with general Boltzmann machines, the joint probability distribution for the visible and hidden vectors is defined in terms of the energy function as

P(v,h) = \frac{1}{Z} e^{-E(v,h)}
where Z is the partition function, defined as the sum of e^{-E(v,h)} over all possible configurations; it can be viewed as a normalizing constant that guarantees the probabilities sum to 1.
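As a concrete illustration of these definitions, the sketch below (with purely illustrative weights and toy sizes, none taken from the text) computes E(v, h) and evaluates the partition function Z by brute-force enumeration, which is tractable only for very small models:

```python
import numpy as np
from itertools import product

# Hypothetical tiny RBM: m = 3 visible units, n = 2 hidden units.
# W, a, b are illustrative values, not from the text.
W = np.array([[ 0.5, -0.2],
              [ 0.1,  0.3],
              [-0.4,  0.2]])      # weight matrix, shape (m, n)
a = np.array([0.1, -0.1, 0.0])    # visible biases
b = np.array([0.2, -0.3])         # hidden biases

def energy(v, h):
    """E(v, h) = -a^T v - b^T h - v^T W h."""
    return -(a @ v) - (b @ h) - (v @ W @ h)

# Brute-force partition function Z over all 2^m * 2^n configurations.
configs = [(np.array(v), np.array(h))
           for v in product([0, 1], repeat=3)
           for h in product([0, 1], repeat=2)]
Z = sum(np.exp(-energy(v, h)) for v, h in configs)

def joint_prob(v, h):
    """P(v, h) = exp(-E(v, h)) / Z."""
    return np.exp(-energy(v, h)) / Z

# Z normalizes the distribution: the joint probabilities sum to 1.
total = sum(joint_prob(v, h) for v, h in configs)
```

Summing P(v, h) over all 32 configurations confirms that Z acts as the normalizing constant described above.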
The marginal probability of a visible vector is the sum of P(v,h) over all possible hidden-layer configurations, P(v) = \frac{1}{Z}\sum_{\{h\}} e^{-E(v,h)} , and vice versa. Because the RBM's underlying graph structure is bipartite, with no connections within a layer, the hidden unit activations are mutually independent given the visible unit activations, and conversely the visible unit activations are mutually independent given the hidden unit activations. That is, for m visible units and n hidden units, the conditional probability of a configuration of the visible units v, given a configuration of the hidden units h, is
P(v|h) = \prod_{i=1}^m P(v_i|h) .
Conversely, the conditional probability of h given v is
P(h|v) = \prod_{j=1}^n P(h_j|v) .
The individual activation probabilities can be found by using the formula:
P(h_j = 1|v) = \sigma\left(b_j + \sum_{i=1}^m w_{i,j} v_i\right)

and

P(v_i = 1|h) = \sigma\left(a_i + \sum_{j=1}^n w_{i,j} h_j\right)

where \sigma denotes the logistic sigmoid.
Restricted Boltzmann machines may also use multinomial visible units while keeping Bernoulli hidden units. In that case, the softmax function replaces the logistic function for the visible units:
P(v_i^k = 1|h) = \frac{\exp\left(a_i^k + \sum_j W_{ij}^k h_j\right)}{\sum_{k'=1}^K \exp\left(a_i^{k'} + \sum_j W_{ij}^{k'} h_j\right)}

where K is the number of discrete values that the visible units can take. This variant finds use in topic modeling. Restricted Boltzmann machines are a special case of Boltzmann machines, which in turn are a type of Markov random field. Their graphical model corresponds to that of factor analysis.
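The softmax conditional above can be sketched for a single multinomial visible unit as follows; the biases, weights, and sizes are illustrative assumptions, not values from the text:

```python
import numpy as np

# Hypothetical multinomial visible unit i with K = 4 discrete values,
# connected to n = 2 Bernoulli hidden units; all values are illustrative.
a_i = np.array([0.1, -0.2, 0.3, 0.0])   # one bias a_i^k per discrete value k
W_i = np.array([[ 0.5, -0.1],
                [ 0.2,  0.4],
                [-0.3,  0.1],
                [ 0.0,  0.2]])          # shape (K, n): weight per (k, j)
h = np.array([1.0, 0.0])                # one hidden configuration

# Softmax over the K values replaces the logistic function:
# P(v_i^k = 1 | h) = exp(a_i^k + sum_j W_ij^k h_j) / sum_k' exp(...)
logits = a_i + W_i @ h
p = np.exp(logits) / np.exp(logits).sum()
```

The resulting vector p is a proper distribution over the K discrete values of the unit.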
Restricted Boltzmann machines are trained to maximize the product of probabilities assigned to some training set V (a matrix, each row of which is treated as a visible vector v ):

\arg\max_W \prod_{v \in V} P(v)
or equivalently, to maximize the expected log probability of a training sample v selected randomly from V :
\arg\max_W \mathbb{E}\left[\log P(v)\right]

The algorithm most often used to train RBMs, that is, to optimize the weight matrix W, is the contrastive divergence (CD) algorithm due to Hinton, originally developed to train PoE (product of experts) models.
The algorithm performs Gibbs sampling inside a gradient descent procedure (much as backpropagation is used inside such a procedure when training feedforward neural networks) to compute the weight updates.
The fundamental, one-step contrastive divergence (CD-1) process for a single sample may be described as follows:
Take a training sample v, compute the probabilities of the hidden units, and sample a hidden activation vector h from this probability distribution.
Compute the outer product of v and h and call this the positive gradient.
From h, sample a reconstruction v' of the visible units, then resample the hidden activations h' from v'. (This is the Gibbs sampling step.)
Compute the outer product of v' and h' and call this the negative gradient.
Let the update to the weight matrix W be the positive gradient minus the negative gradient, times some learning rate: \Delta W = \epsilon (v h^{\mathsf{T}} - v' h'^{\mathsf{T}}) .
Update the biases a and b analogously: \Delta a = \epsilon (v - v') , \Delta b = \epsilon (h - h') .
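The CD-1 steps above can be sketched in NumPy as follows; the helper name `cd1_update`, the learning rate, and the toy layer sizes are all illustrative assumptions rather than anything prescribed by the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v, W, a, b, lr=0.1, rng=rng):
    """One CD-1 update for a single binary training sample v.

    Follows the steps above: sample h from P(h|v), reconstruct v',
    resample h', then update W, a, b from the gradient difference.
    """
    # Positive phase: P(h_j = 1 | v) = sigma(b_j + sum_i w_ij v_i).
    p_h = sigmoid(b + v @ W)
    h = (rng.random(p_h.shape) < p_h).astype(float)
    # Gibbs step: reconstruct visible units, then resample hidden units.
    p_v = sigmoid(a + W @ h)
    v_prime = (rng.random(p_v.shape) < p_v).astype(float)
    p_h_prime = sigmoid(b + v_prime @ W)
    h_prime = (rng.random(p_h_prime.shape) < p_h_prime).astype(float)
    # Delta W = lr * (v h^T - v' h'^T); biases updated analogously.
    W += lr * (np.outer(v, h) - np.outer(v_prime, h_prime))
    a += lr * (v - v_prime)
    b += lr * (h - h_prime)
    return W, a, b

# Toy usage: m = 4 visible, n = 3 hidden units (sizes are illustrative).
W = rng.normal(0, 0.1, size=(4, 3))
a = np.zeros(4)
b = np.zeros(3)
v = np.array([1.0, 0.0, 1.0, 1.0])
W, a, b = cd1_update(v, W, a, b)
```

In practice, implementations often use the hidden probabilities rather than sampled states in the negative phase, and run updates over mini-batches; the single-sample version here mirrors the steps as listed.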
On his website, Hinton has posted a guide that he has written called A Practical Guide to Training RBMs.
Stacked restricted Boltzmann machines differ from a single RBM as follows. In an RBM, lateral connections within a layer are prohibited, which makes analysis tractable. A stacked Boltzmann machine, by contrast, combines an unsupervised three-layer network with symmetric weights and a supervised, fine-tuned top layer for recognizing three classes.
Stacked Boltzmann machines are used for natural language understanding, document retrieval, image generation, and classification; these functions are learned via unsupervised pre-training and/or supervised fine-tuning. Unlike the undirected, symmetric top layer, the RBM's connection layer is an asymmetric layer that can go in either direction: the restricted Boltzmann connection is three layers with asymmetric weights, and two networks are combined into one.
There are some parallels between the RBM and the stacked Boltzmann machine: in both, the neurons are stochastic binary Hopfield neurons.
The energy in both the stacked Boltzmann machine and the RBM is given by Gibbs's probability measure:

E = -\frac{1}{2}\sum_{i,j} w_{ij} s_i s_j + \sum_i \theta_i s_i .
The training process of the stacked Boltzmann machine is similar to that of the RBM. It trains one layer at a time, approximates the equilibrium state with a three-segment pass, and does not perform backpropagation. Supervised and unsupervised learning are used on different RBMs during pre-training for classification and recognition.
The training uses contrastive divergence with Gibbs sampling: \Delta w_{ij} = \epsilon (p_{ij} - p'_{ij}) .
The restricted Boltzmann machine's strength is that it performs a nonlinear transformation, so it is easy to extend, and it can yield a hierarchy of features. Its weakness is the complicated computation required for integer- and real-valued neurons. Contrastive divergence does not follow the gradient of any well-defined function, so its approximation to maximum likelihood is heuristic.
{End Chapter 1}
Chapter 2: Boltzmann distribution
In statistical mechanics and mathematics, a Boltzmann distribution, also known as a Gibbs distribution, is a probability distribution or probability measure that gives the probability that a system will be in a certain state as a function of that state's energy and the system's temperature. In other words, at a fixed temperature, lower-energy states are exponentially more probable than higher-energy ones. The distribution is expressed in the form:
p_i \propto \exp\left(-\frac{\varepsilon_i}{kT}\right)

where p_i is the probability of the system being in state i, exp is the exponential function, \varepsilon_i is the energy of that state, and kT is the product of the Boltzmann constant k and the thermodynamic temperature T.
The symbol \propto denotes proportionality; the proportionality constant is fixed by requiring the probabilities to sum to 1.
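A short numerical sketch of the distribution: for a hypothetical three-state system (the energies below are made-up values), normalizing the Boltzmann weights shows lower-energy states receiving higher probability:

```python
import numpy as np

# Hypothetical three-state system; the energies are illustrative values.
k = 1.380649e-23                               # Boltzmann constant, J/K
T = 300.0                                      # temperature, kelvin
energies = np.array([0.0, 1.0e-21, 2.0e-21])   # state energies, joules

# p_i is proportional to exp(-eps_i / kT); dividing by the sum
# fixes the proportionality constant so the probabilities sum to 1.
weights = np.exp(-energies / (k * T))
p = weights / weights.sum()
```

At fixed temperature, p decreases monotonically with energy, as the exponential form requires.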
In this context, the word "system" may refer to a variety of different things, ranging from a