
Restricted Boltzmann Machine: Fundamentals and Applications for Unlocking the Hidden Layers of Artificial Intelligence
Ebook · 145 pages · 1 hour


About this ebook

What Is a Restricted Boltzmann Machine


A restricted Boltzmann machine (RBM) is a stochastic, generative artificial neural network that can learn a probability distribution over its set of inputs.


How You Will Benefit


(I) Insights and validations about the following topics:


Chapter 1: Restricted Boltzmann Machine


Chapter 2: Boltzmann Distribution


Chapter 3: Entropy (Information Theory)


Chapter 4: Unsupervised Learning


Chapter 5: Mutual Information


Chapter 6: Boltzmann Machine


Chapter 7: Cross Entropy


Chapter 8: Softmax Function


Chapter 9: Autoencoder


Chapter 10: Deep Belief Network


(II) Answers to the public's top questions about restricted Boltzmann machines.


(III) Real-world examples of the use of restricted Boltzmann machines in many fields.


Who This Book Is For


Professionals, undergraduate and graduate students, enthusiasts, hobbyists, and anyone who wants to go beyond basic knowledge of restricted Boltzmann machines.


What Is the Artificial Intelligence Series


The Artificial Intelligence eBook series provides comprehensive coverage of over 200 topics. Each ebook covers a specific Artificial Intelligence topic in depth, written by experts in the field. The series aims to give readers a thorough understanding of the concepts, techniques, history, and applications of artificial intelligence. Topics covered include machine learning, deep learning, neural networks, computer vision, natural language processing, robotics, ethics, and more. The ebooks are written for professionals, students, and anyone interested in learning about the latest developments in this rapidly advancing field.
The Artificial Intelligence eBook series provides an in-depth yet accessible exploration, from the fundamental concepts to the state-of-the-art research. With over 200 volumes, readers gain a thorough grounding in all aspects of Artificial Intelligence. The ebooks are designed to build knowledge systematically, with later volumes building on the foundations laid by earlier ones. This comprehensive series is an indispensable resource for anyone seeking to develop expertise in artificial intelligence.

Language: English
Release date: Jun 23, 2023


    Book preview

    Restricted Boltzmann Machine - Fouad Sabry

    Chapter 1: Restricted Boltzmann machine

    A restricted Boltzmann machine, also known as an RBM, is a type of artificial neural network that is stochastic and generative, and it is capable of learning a probability distribution over its own set of inputs.

    RBMs were initially invented under the name Harmonium by Paul Smolensky in 1986, and rose to prominence after Geoffrey Hinton and collaborators developed fast learning algorithms for them in the mid-2000s. RBMs have found applications in areas ranging from dimensionality reduction to many-body quantum mechanics. They can be trained in either supervised or unsupervised ways, depending on the task.

    As their name implies, RBMs are a variant of Boltzmann machines, with the restriction that their neurons must form a bipartite graph: a pair of nodes from each of the two groups of units (commonly referred to as the visible and hidden units respectively) may have a symmetric connection between them; and there are no connections between nodes within a group. By contrast, unrestricted Boltzmann machines may have connections between hidden units. This restriction allows for more efficient training algorithms than are available for the general class of Boltzmann machines, in particular the gradient-based contrastive divergence algorithm.

    The standard type of RBM has binary-valued (Boolean) hidden and visible units, and consists of a matrix of weights W of size m \times n .

    Each weight element w_{i,j} of the matrix is associated with the connection between the visible (input) unit v_i and the hidden unit h_j .

    In addition, there are bias weights (offsets) a_{i} for v_{i} and b_{j} for h_{j} .

    Taking into account the various weights and biases, the energy of a configuration (pair of boolean vectors) (v,h) is defined as

    E(v,h) = -\sum_i a_i v_i - \sum_j b_j h_j -\sum_i \sum_j v_i w_{i,j} h_j

    or, in matrix notation,

    E(v,h) = -a^{\mathrm{T}} v - b^{\mathrm{T}} h - v^{\mathrm{T}} W h .

    This energy function is analogous to that of a Hopfield network. As with general Boltzmann machines, the joint probability distribution over the visible and hidden vectors is defined in terms of the energy function as P(v,h) = \frac{1}{Z} e^{-E(v,h)}

    where Z is the partition function, defined as the sum of e^{-E(v,h)} over all possible configurations; it can be viewed as a normalizing constant that guarantees the probabilities sum to 1.
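
    As a concrete illustration, the following NumPy sketch evaluates the energy E(v,h) and the joint probability P(v,h) for a toy binary RBM. The array names and sizes are hypothetical, not taken from the text, and the partition function Z is computed by brute force over every configuration, which is feasible only because the model is tiny.

    import itertools
    import numpy as np

    # Hypothetical toy sizes: m visible units, n hidden units.
    m, n = 4, 3
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(m, n))  # weight matrix W (m x n)
    a = np.zeros(m)                         # visible biases a_i
    b = np.zeros(n)                         # hidden biases b_j

    def energy(v, h):
        # E(v, h) = -a^T v - b^T h - v^T W h
        return -a @ v - b @ h - v @ W @ h

    # Brute-force partition function Z = sum over all (v, h) of exp(-E(v, h)).
    Z = sum(
        np.exp(-energy(np.array(v), np.array(h)))
        for v in itertools.product([0, 1], repeat=m)
        for h in itertools.product([0, 1], repeat=n)
    )

    v = np.array([1, 0, 1, 0])
    h = np.array([0, 1, 0])
    print("E(v,h) =", energy(v, h))
    print("P(v,h) =", np.exp(-energy(v, h)) / Z)

    For realistic numbers of units this brute-force sum over 2^(m+n) configurations is intractable, which is why training relies on approximations such as contrastive divergence rather than on evaluating Z directly.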

    The marginal probability of a visible vector is the sum of P(v,h) over all possible hidden-layer configurations, P(v) = \frac{1}{Z} \sum_{\{h\}} e^{-E(v,h)} , and vice versa. Because the RBM's underlying graph structure is bipartite, with no connections within a layer, the hidden unit activations are mutually independent given the visible unit activations, and conversely the visible unit activations are mutually independent given the hidden unit activations. That is, for m visible units and n hidden units, the conditional probability of a configuration of the visible units v, given a configuration of the hidden units h, is

    P(v|h) = \prod_{i=1}^m P(v_i|h) .

    Conversely, the conditional probability of h given v is

    P(h|v) = \prod_{j=1}^n P(h_j|v) .

    The individual activation probabilities can be found by using the formula:

    P(h_j = 1|v) = \sigma \left( b_j + \sum_{i=1}^{m} w_{i,j} v_i \right)

    and

    P(v_i = 1|h) = \sigma \left( a_i + \sum_{j=1}^{n} w_{i,j} h_j \right)

    where \sigma denotes the logistic sigmoid.
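
    A minimal sketch of these conditional probabilities, using hypothetical toy arrays (the names W, a, b and the sizes below are illustrative, not from the text). Because the units in one layer are conditionally independent given the other layer, the activations can be computed with one sigmoid per unit and sampled elementwise.

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 4, 3                              # hypothetical sizes
    W = rng.normal(scale=0.1, size=(m, n))   # weights w_ij
    a, b = np.zeros(m), np.zeros(n)          # visible / hidden biases

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def hidden_probs(v):
        # P(h_j = 1 | v) = sigma(b_j + sum_i w_ij v_i)
        return sigmoid(b + v @ W)

    def visible_probs(h):
        # P(v_i = 1 | h) = sigma(a_i + sum_j w_ij h_j)
        return sigmoid(a + W @ h)

    def sample_bernoulli(p):
        # Elementwise Bernoulli draws, valid because of conditional independence.
        return (rng.random(p.shape) < p).astype(float)

    v = np.array([1.0, 0.0, 1.0, 0.0])
    h = sample_bernoulli(hidden_probs(v))          # sample hidden given visible
    v_recon = sample_bernoulli(visible_probs(h))   # sample visible given hidden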

    Restricted Boltzmann machines may also use multinomial visible units while the hidden units remain Bernoulli. In that case, the softmax function takes the place of the logistic function for the visible units:

    P(v_i^k = 1|h) = \frac{\exp\left( a_i^k + \sum_j W_{ij}^k h_j \right)}{\sum_{k'=1}^{K} \exp\left( a_i^{k'} + \sum_j W_{ij}^{k'} h_j \right)}

    where K is the number of discrete values that the visible units can take. Such units are used, for example, in topic modeling. Restricted Boltzmann machines are a special case of Boltzmann machines and of Markov random fields, and their graphical model corresponds to that of factor analysis.
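
    For the multinomial case, a sketch of the softmax conditional might look as follows; the tensor shapes (m visible units with K categories each, n hidden units) and all names are assumptions made for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    m, n, K = 4, 3, 5                             # hypothetical sizes
    W = rng.normal(scale=0.1, size=(m, K, n))     # W[i, k, j] = W_ij^k
    a = np.zeros((m, K))                          # visible biases a_i^k
    h = rng.integers(0, 2, size=n).astype(float)  # a binary hidden vector

    def multinomial_visible_probs(h):
        # logits[i, k] = a_i^k + sum_j W_ij^k h_j
        logits = a + W @ h                            # (m, K, n) @ (n,) -> (m, K)
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        p = np.exp(logits)
        return p / p.sum(axis=1, keepdims=True)       # softmax over the K categories

    probs = multinomial_visible_probs(h)   # probs[i, k] = P(v_i^k = 1 | h)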

    Restricted Boltzmann machines are trained to maximize the product of probabilities assigned to some training set V (a matrix, each row of which is treated as a visible vector v ), \arg\max_W \prod_{v \in V} P(v)

    or equivalently, to maximize the expected log probability of a training sample v selected randomly from V :

    \arg\max_W \mathbb{E} \left[ \log P(v) \right]

    The algorithm most commonly used to train RBMs, that is, to optimize the weight matrix W, is the contrastive divergence (CD) algorithm developed by Hinton, originally for training PoE (product of experts) models.

    The algorithm performs Gibbs sampling and is used inside a gradient descent procedure (similar to the way backpropagation is used inside such a procedure when training feedforward neural nets) to compute the weight updates.

    The fundamental, one-step contrastive divergence (CD-1) process for a single sample may be described as follows:

    Take a training sample v, compute the probabilities of the hidden units, and sample a hidden activation vector h from this probability distribution.

    Compute the outer product of v and h and call this the positive gradient.

    From h, sample a reconstruction v' of the visible units, then resample the hidden activations h' from it. (Gibbs sampling step)

    Compute the outer product of v' and h' and call this the negative gradient.

    Let the update to the weight matrix W be the positive gradient minus the negative gradient, times some learning rate: \Delta W = \epsilon (v h^{\mathsf{T}} - v' h'^{\mathsf{T}}) .

    Update the biases a and b analogously: \Delta a = \epsilon (v - v') , \Delta b = \epsilon (h - h') .
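
    Putting the steps above together, a minimal CD-1 update for one training sample could be sketched as follows; the learning rate, array names, and sizes are hypothetical, and this is an illustrative sketch rather than a tuned implementation.

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 6, 4                                # hypothetical sizes
    W = rng.normal(scale=0.01, size=(m, n))    # weights
    a, b = np.zeros(m), np.zeros(n)            # visible / hidden biases
    epsilon = 0.1                              # learning rate

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sample(p):
        return (rng.random(p.shape) < p).astype(float)

    def cd1_step(v):
        # Positive phase: hidden probabilities, then a sampled hidden vector h.
        h = sample(sigmoid(b + v @ W))
        pos_grad = np.outer(v, h)                # positive gradient v h^T

        # Negative phase (Gibbs step): reconstruct v', then resample h'.
        v_prime = sample(sigmoid(a + W @ h))
        h_prime = sample(sigmoid(b + v_prime @ W))
        neg_grad = np.outer(v_prime, h_prime)    # negative gradient v' h'^T

        # Delta W = eps (v h^T - v' h'^T); Delta a = eps (v - v'); Delta b = eps (h - h').
        return epsilon * (pos_grad - neg_grad), epsilon * (v - v_prime), epsilon * (h - h_prime)

    v = rng.integers(0, 2, size=m).astype(float)   # a toy training sample
    dW, da, db = cd1_step(v)
    W, a, b = W + dW, a + da, b + db

    In practice these single-sample updates are typically averaged over mini-batches and repeated over many epochs, as discussed in the practical guide mentioned below.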

    Hinton has posted a guide he wrote, A Practical Guide to Training RBMs, on his website.

    Stacked restricted Boltzmann machines differ from the single RBM, in which lateral connections within a layer are prohibited to keep analysis tractable. The stacked Boltzmann machine, by contrast, combines an unsupervised three-layer network with symmetric weights and a supervised, fine-tuned top layer for recognizing three classes.

    Stacked Boltzmann machines are used for natural language understanding, document retrieval, image generation, and classification. These functions are learned through unsupervised pre-training and/or supervised fine-tuning. In contrast to the undirected, symmetric top layer, the RBM connection layer is an asymmetric layer that can go in either direction. The stacked Boltzmann's connections are three layers with asymmetric weights, and two networks are combined into one.

    There are some parallels between the RBM and the stacked Boltzmann machine: the neuron in the stacked Boltzmann machine is a stochastic binary Hopfield neuron, the same as in the restricted Boltzmann machine.

    The energy of both the stacked Boltzmann machine and the RBM is given by Gibbs' probability measure:

    E = -\frac{1}{2} \sum_{i,j} w_{ij} s_i s_j + \sum_i \theta_i s_i .
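
    A small sketch of this general Boltzmann energy over binary states s_i, with symmetric weights w_ij and thresholds theta_i (the arrays below are made-up toy values):

    import numpy as np

    rng = np.random.default_rng(0)
    N = 5
    w = rng.normal(scale=0.1, size=(N, N))
    w = (w + w.T) / 2                  # symmetric weights w_ij = w_ji
    np.fill_diagonal(w, 0.0)           # no self-connections
    theta = np.zeros(N)                # thresholds theta_i
    s = rng.integers(0, 2, size=N).astype(float)

    # E = -1/2 * sum_{i,j} w_ij s_i s_j + sum_i theta_i s_i
    E = -0.5 * s @ w @ s + theta @ s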

    The training process of the stacked Boltzmann machine is similar to that of the RBM.

    The stacked Boltzmann machine trains one layer at a time and approximates the equilibrium state with a 3-segment pass, without performing backpropagation.

    It uses both supervised and unsupervised learning on different RBMs during pre-training, for classification and recognition.

    The training uses contrastive divergence with Gibbs sampling: \Delta w_{ij} = \epsilon (p_{ij} - p'_{ij}) .

    The restricted Boltzmann machine's strength is that it performs a non-linear transformation, so it is easy to expand and can yield a hierarchical layer of features. Its weakness is that the calculations become complicated for integer- and real-valued neurons, and that contrastive divergence does not follow the gradient of any function, so its approximation to maximum likelihood is improvised.


    {End Chapter 1}

    Chapter 2: Boltzmann distribution

    In statistical mechanics and mathematics, a Boltzmann distribution (also known as a Gibbs distribution) is a probability distribution or probability measure that gives the probability that a system will be in a certain state as a function of that state's energy and the system's temperature. In other words, at a given temperature, states of lower energy are exponentially more probable than states of higher energy. The distribution is expressed in the form:

    p_i \propto \exp\left( -\frac{\varepsilon_i}{kT} \right)

    where pi is the probability of the system being in state i, exp is the exponential function, εi is the energy of that state, k is the Boltzmann constant, and T is the thermodynamic temperature; the product kT is the characteristic constant of the distribution.

    The symbol \propto denotes proportionality (see § The distribution for the proportionality constant).
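
    As a small numerical illustration of the formula (made-up energy values, with kT set to 1), the probabilities are obtained by exponentiating -εi / (kT) and normalizing so they sum to 1:

    import numpy as np

    energies = np.array([0.0, 0.5, 1.0, 2.0])   # hypothetical state energies epsilon_i
    kT = 1.0                                    # Boltzmann constant times temperature

    weights = np.exp(-energies / kT)            # unnormalized Boltzmann factors
    p = weights / weights.sum()                 # probabilities p_i, summing to 1

    # Lower-energy states receive higher probability; raising T flattens the distribution.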

    In this context, the word system may refer to a variety of different things, ranging from a
