Essentials of Deep Learning and AI: Experience Unsupervised Learning, Autoencoders, Feature Engineering, and Time Series Analysis with TensorFlow, Keras, and scikit-learn

Ebook · 703 pages · 9 hours


About this ebook

"Essentials of Deep Learning and AI" curates the essential knowledge needed to work with deep neural network techniques and advanced machine learning concepts. This book is for those who want to know more about how deep neural networks work and about advanced machine learning principles, illustrated with real-world examples.

This book includes implemented code snippets and step-by-step instructions on how to use them. You'll be amazed at how scikit-learn, Keras, and TensorFlow are used in AI applications to speed up the learning process and produce superior results. With the help of detailed examples and code templates, you'll be running your scripts in no time. You will practice constructing models and optimising performance while working in an AI environment.

Readers will be able to start writing their programmes with confidence and ease. Experts and newcomers alike will have access to advanced methodologies. For easier reading, concept explanations are presented straightforwardly, with all relevant facts included.
Language: English
Release date: Nov 25, 2021
ISBN: 9789391030025

    Essentials of Deep Learning and AI - Shashidhar Soppin

    CHAPTER 1

    Introduction

    Mankind's use of instruments and equipment to run day-to-day activities dates back to the Stone Age. Since then, tens of thousands of devices have evolved to cater to household activities, infrastructure, healthcare, entertainment, communication, transportation, and the like. While most of these devices are dumb, smart devices such as personal assistants have started to hit the market. The vision of such devices is not new; they have long found their place in stories and legends, and from the magic lamp of Aladdin to the robots of Isaac Asimov, that vision has inspired several inventions. Over a period of time, "dumb" devices have been infused with the intelligence required to understand their surroundings and act accordingly; for example, a fan switches on when the ambient temperature exceeds 35 degrees Celsius. Today, people's expectations have increased several fold. Devices are required to understand not only the surroundings but also the end user! They need to consider the static preferences of the user as well as the user's dynamic emotional state. This requirement led to large-scale research in data processing and related fields under the term Artificial Intelligence, and resulted in a massive proliferation of smart devices.

    Structure

    In this chapter, we will discuss the following topics:

    Artificial intelligence

    What is Artificial Intelligence?

    Definitions of AI

    Applications of AI

    Use cases of AI

    Broad classification of AI, ML, FL, and DL

    Machine learning

    History and definitions of ML

    ML and its applications

    Classification of ML algorithms

    Deep Learning

    Prerequisites to understand deep learning

    Difference between machine learning & deep learning

    Tools and frameworks for AI, ML, and DL

    Languages used for AI, ML, and DL

    Datasets for AI, ML, and DL development

    Objectives

    After studying this chapter, you should be able to:

    Understand the concepts of AI, ML, and DL and their components

    Recall the history of AI, ML, and DL

    Define AI, ML, and DL, along with some examples and applications

    Understand the overall benefits of using AI, ML, and DL

    Identify the programming languages used for ML and DL

    Identify the standard tools used for ML and DL

    Identify the reference datasets used for model development

    1.1 Artificial intelligence

    Making computers "think like humans" was the main motivation behind the development of Artificial Intelligence. With this goal in mind, many pioneers, described later in this chapter, have contributed to AI and its advancement. Many advancements in the field of computer science have also helped AI indirectly. All these put together, AI has prospered over the years.

    1.1.1 What is Artificial Intelligence?

    Artificial Intelligence (AI) is the interdisciplinary umbrella term that spans the tools and techniques required to incorporate human-like intelligence into a system. The system can be a piece of software such as a dialog system, a hardware component such as a smart Internet of Things (IoT) device, or a machine such as a robotic arm. Alan Turing defined the test criteria for a device or program to be called intelligent, indicating how closely it can imitate human beings.

    We are in the era of the "information age" (also sometimes called the Computer Age, Digital Age, or New Media Age), the golden age of modern human history. From taking selfies to storing files in the cloud, the growth of IoT-based devices, and social media usage, we deal with quintillions of bytes of data each day. The amount of data each individual now produces is mind-boggling; as of late 2019, it was estimated that human beings alone generated approximately 2.5 quintillion bytes of data each day.

    This sudden rise in data is quite challenging and cumbersome for the human brain to deal with. Most data analytics is now done by artificial-intelligence-based machines and systems; because of this huge data gathering, many insights are generated that assist humans with the predictive trends arising in business and the software industry.

    1.1.2 Definitions of Artificial Intelligence

    AI has evolved from the days of Alan Turing in the 1950s to date, and many definitions of AI have emerged over the years. To make things simpler and easier, the standard definitions of AI from the pioneers of this industry are captured here. The following table traces the evolution of AI, describing who coined or defined each term and in which year, along with reference links.

    Table 1.1

    When we say artificial intelligence, a question immediately comes to mind: "Can computers think like human beings?" Many of the above definitions try to answer this question briefly. Two more recent pioneers of the AI industry, Peter Norvig and Stuart J. Russell, made it simpler to understand than to define. What they say is that by combining "thinking like humans" and "thinking rationally", one can come to a conclusion about how AI works. They went on to say that:

    "A computer agent is able to act autonomously, perceive its environment, and persist for an extended period of time, adapt to change, and adopt goals."

    "A rational agent is an agent that is able to act to achieve the expected best outcome (e.g., to persist or reach a goal)."

    With the above definitions, it is clear that the computer not only works autonomously but also acts as a rational agent that achieves the best outcome.

    1.1.3 Applications of Artificial Intelligence

    Today machines can play brain-demanding games such as Pogo, Atari, and chess, and defeat the champions in those fields. After the Second World War, the changing geopolitical landscape and the available technologies resulted in a revolution in all kinds of industries, including healthcare and manufacturing. Large volumes of data started to pour in, resulting in the development of machines with large computing power, built with intelligence. Smart algorithms were developed in languages such as Prolog and Lisp to process the data for robots and satellites. AI systems took their origin there and continued the journey, undergoing a metamorphosis to reach today's Python- or Java-driven deep learning algorithms running over the cloud and supporting data from a mobile phone or a gaming box. The availability of computing power and the development of new and revolutionary algorithms have moved AI out of industry, space exploration, and the laboratory and brought it to the masses.

    AI systems are often modeled after human beings, both structurally and functionally. Information is organized into hierarchies with varying abstractions, clustered, classified, interpolated, and predicted, as happens in the brain. Several disciplines within AI, such as convolutional neural networks, neuromorphic chips, and Self-Organizing Maps (SOMs), have striking similarities with the human brain. The various functionalities of an AI system fall into the abstraction levels indicated in Figure 1.1:

    Figure 1.1: Abstraction levels of AI system

    AI spans several disciplines, including machine learning, robotics, cognitive computing, neuroscience, computer vision, natural language processing, feature engineering, data science, and so on. There is also a certain degree of overlap among these disciplines. Some of the technologies overlapping with AI are indicated in Figure 1.2:

    Figure 1.2: Technologies overlapping with AI

    While the anticipation is that machines will think (and act) on behalf of the user, this is still fiction. However, inventions in hardware architecture and algorithms are bringing it closer. Today, devices can operate based on the user's context, and "personal assistants" can provide timely advice based on the large chunks of data they are able to mine to extract information relevant to the user. For example, they can suggest an exit path when the user's car is stuck in a by-lane, or advise the user to take an umbrella when it is likely to rain in a few hours. One of the requirements to build an AI system is the availability of a large data corpus. Even so, models can overfit when they are overexposed to specific examples, and they then perform poorly when unseen data is ingested. To ensure that devices or applications learn from adequate data, the discipline of machine learning was developed within AI.
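
    As a simple illustration (not an example from this book), the following minimal scikit-learn sketch shows how overfitting is usually detected: an unconstrained decision tree is scored both on the data it was trained on and on held-out data it has never seen, and a large gap between the two scores indicates that the model has memorised specific examples instead of generalising. The dataset and classifier choices are assumptions made purely for illustration.

    # Minimal sketch of spotting overfitting: compare accuracy on the training
    # data with accuracy on data the model has never seen.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # An unconstrained decision tree can fit the training data almost perfectly.
    tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
    print("train accuracy:", tree.score(X_train, y_train))   # typically close to 1.0
    print("test accuracy: ", tree.score(X_test, y_test))     # noticeably lower: overfitting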

    Figure 1.3 shows the difference between AI, ML, FL (feature learning), and DL. This onion-layered diagram clearly explains what is what and how the layers depend on each other. The right side of the diagram briefly gives the definition of each:

    Figure 1.3: Difference between AI, ML, FL, and DL

    The above diagram explains how AI, ML, FL, and DL are differentiated. All these mechanisms depend on each other; that is, they are nested like the layers of an onion.

    1.1.4 Industry domains and sectors along with sample use cases

    The concepts introduced in the next sections are used in a variety of real-life solutions. They span domains such as finance, healthcare, automation, automotive, energy, consumer goods, and telecom, and technologies such as IoT, 5G, cloud computing, big data, chatbots, brain-computer interfaces, augmented reality, virtual reality, and drones.

    Applications can be categorized as real-time or offline, and this difference affects the execution platform. Real-time applications require the algorithms to execute and render results within a stipulated time. For example, in threat detection, the images from a surveillance camera are to be analyzed in real time, and based on the inference, the user needs to be alerted to potential threats. This is in contrast with, for example, credit card allocation based on user data: the model can work offline and return the decision to allocate or not allocate based on the data supplied.

    In the categorization based on data size, two types of applications exist: those that make heavy use of data and those that work with less data.

    For example, face detection or optical character recognition models require a large database for training and rigorous validation. Since such databases are available and accessible, AI models with a high degree of accuracy can be built.

    On the other hand, the (image) data available for identifying a particular bacterium is very limited. This is partly because only a single view of the bacterium is available and there is limited variation across the images.

    The applications are also classified based on the features required for inferencing.

    For example, in the staging of a malignant tumor, the region of interest (RoI) is well defined and the relevant features are searched within those boundaries.

    On the other hand, in the classification of an apple as good or bad, such boundaries on the apple are not known a priori. Also, the nature of the damage is not known in advance, making it difficult to develop the model.

    Applications are also grouped based on the nature of the data: sequential or parallel. The model architecture changes accordingly. For example, natural language processing problems increasingly make use of sequential models because of the interdependency between consecutive words and sentences. In contrast, image data is processed better by a model with a parallel architecture.
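
    As an illustration of this difference, the following minimal Keras sketch defines a sequential (LSTM-based) model for text alongside a convolutional model for images. The vocabulary size, sequence length, image size, and layer widths are illustrative assumptions, not values taken from this book.

    # Minimal sketch contrasting a sequential model (for text) with a
    # convolutional model (for images); all sizes are illustrative.
    from tensorflow.keras import layers, models

    # Sequential architecture: each token depends on the ones before it.
    text_model = models.Sequential([
        layers.Input(shape=(100,)),                 # sequences padded to 100 tokens (assumed)
        layers.Embedding(input_dim=10000, output_dim=64),
        layers.LSTM(32),                            # processes the tokens in order
        layers.Dense(1, activation="sigmoid"),      # e.g. positive/negative sentiment
    ])

    # Parallel architecture: convolution filters scan the whole image at once.
    image_model = models.Sequential([
        layers.Input(shape=(64, 64, 3)),            # 64x64 RGB images (assumed)
        layers.Conv2D(16, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(10, activation="softmax"),     # e.g. 10 image classes
    ])

    text_model.summary()
    image_model.summary()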

    Some applications require continuous learning from the environment. For example, in brain-teasing games such as Pogo, Atari, and chess, the model needs to learn the playing style of the opponent. Data from the game environment is used to fine-tune the model for better predictions. This is in contrast with fully-trained models, for example, those used to differentiate a cat from a dog.

    Emerging applications such as 5G communications require prediction of traffic patterns for effective and dynamic allocation of resources to meet the agreed Quality of Service (QoS). Here, the model needs to handle huge volumes of data that change dynamically as the topography of the network itself gets hazy. On the other hand, applications such as fingerprint identifiers work better with static data collected a priori. Some of the common applications used in different business domains are indicated in Figure 1.4. The different applications, and the models used to realize them, are detailed in the rest of the chapters of this book.

    Figure 1.4: Use Cases of AI system

    Figure 1.4 points out the various use cases of AI from a broader perspective. Domain-specific use cases are listed accordingly.

    1.1.5 Broad classification of AI, ML, FL, and DL

    Figure 1.4 explains in detail the applications of AI, ML, and DL across various domains and industry sectors.

    Following is a detailed explanation of each of these applications:

    Finance: AI/ML-based algorithms can be used for credit card fraud detection and while approving loans based on credit history, and so on. Also, the most advanced and latest ML/DL algorithms can be used for stock prediction, by feeding in lots of training data from the stock market.

    Telecom: AI/ML-based algorithms can be used for call quality and to predict customer churn.

    Automation: Many challenges, such as automated interviewing systems and track detection using robots, are addressed by AI/ML-based algorithms in automation.

    Similarly, healthcare, automotive, and retail industries are extensively using AI, ML, and DL-based algorithms for solving many challenging problems.

    Applications like machine translation and signature verification need advanced AI- and ML-based algorithms.

    As you peel the layers one by one, they open up: AI is the broader classification encompassing all these mechanisms put together.

    Since AI has a long history, it has seen what is called an "AI winter": a downward trend in AI applications and development during a certain period of time, followed after a number of years by an upward trend again. This is represented by the curve from 1950 onwards shown in Figure 1.5.

    "AI Winter cycle" as it was called, is downward and then a sudden spike in AI development/research activities over a period of 10 years cycle has been observed. This cycle clearly indicates the various phases of research and developmental activities that were seen in AI, ML, and DL over a period of last 60-70 years of time:

    Figure 1.5: AI Winter Cycle

    Note: There is a wafer-thin difference between the terminologies in Artificial Intelligence. The terms machine learning, deep learning, etc. share many things in common but differ fundamentally in the execution architecture.

    Deep Learning is a subset of the broader family of machine learning methods. It was introduced with the objective of moving machine learning closer to its main goal—that of artificial intelligence. Similarly, representation learning or feature learning is the subset of the machine learning mechanism that deals with extracting features or understanding the representation of a provided dataset.

    1.2 Machine learning

    Machine Learning as explained earlier is a subset of Artificial Intelligence.

    In 1959, Arthur Samuel defined machine learning as "the field of study that gives computers the ability to learn without being explicitly programmed." This opened up many studies for a few years, but progress later stalled.

    In recent times, ML is considered an integral part of data science and is used to build predictive models. In any machine learning mechanism or system, computers are taught to solve certain problems by providing them with massive datasets and models.

    Machine learning is also defined as a branch of computer science in which algorithms (running inside computers) learn from the data or datasets available to them. Through this learning mechanism, the outputs of these algorithms are used to build predictive models.

    1.2.1 History and definition of ML

    Historically, machine learning has been defined and developed by many industry experts. Various machine learning definitions and explanations by industry stalwarts are given in the following table:

    Table 1.2

    1.2.2 Machine learning and its applications

    Many times, machine learning is depicted as an intersection of math and statistics, hacking skills, and substantive expertise. Drew Conway popularly depicted this with a Venn diagram, known as "The Data Science Venn Diagram".

    The attached figure clearly shows how machine learning intersects with these various aspects, which is why it is challenging to put across a standard definition for machine learning. One can say that machine learning is an amalgamation of hacking skills with math and statistics, while data science, in a broader perspective, cuts across all three aspects: hacking skills, math and statistics, and domain expertise. So, there is only a narrow difference between ML and data science. The following figure explains this beautifully:

    Figure 1.6: Data Science and Machine Learning Venn Diagram

    Figure 1.6 is a Venn diagram that shows how data science and machine learning depend on statistics and hacking skills.

    One of the most common applications of machine-learning-based algorithms is spam email classification. Some of the terms commonly found in spam emails are used as labels to train the model to learn and classify emails as "spam" or "non-spam" (sometimes called ham). Commonly used indicators include, for example, subject lines in all capital letters, "Loan approved", "Won lottery", and so on. A large dataset built from such commonly found keywords is maintained in the UCI Machine Learning Repository and can be found at the following link.

    Ref: http://archive.ics.uci.edu/ml/datasets/spambase

    Creators: Mark Hopkins, Erik Reeber, George Forman, Jaap Suermondt

    Using this dataset, the email engine can be trained (keyword-based learning), and all emails received at an address are first fed into this engine, which classifies them as "spam" or "non-spam". If any new type of spam is found, it can be added to the dataset and the entire model can be retrained.

    Compare this with traditional programming, where only a set of predefined rules is used: the rules are static and the rule engine works only on this set of instructions. Whenever there is any change in the keywords, changing the rules is cumbersome and challenging; it takes a lot of effort, and maintaining the rules becomes difficult.

    In the case of machine learning models, only the dataset has to be updated; there is no need to change any rules unless a huge change has been observed.
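
    A minimal scikit-learn sketch of this workflow is shown below. It assumes the spambase.data file has been downloaded locally from the UCI link above (57 numeric feature columns followed by a final 0/1 spam label, with no header row); the choice of a Naive Bayes classifier is an illustrative assumption.

    # Minimal sketch: training a spam classifier on the UCI Spambase data.
    # Assumes "spambase.data" has been downloaded locally from the UCI repository.
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.metrics import accuracy_score

    data = pd.read_csv("spambase.data", header=None)
    X, y = data.iloc[:, :-1], data.iloc[:, -1]        # 57 keyword/character features, 0/1 label

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    model = MultinomialNB()                           # simple frequency-based classifier
    model.fit(X_train, y_train)                       # learn from labeled spam/non-spam examples

    print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))

    # When new kinds of spam appear, append the newly labeled examples to the
    # dataset and simply re-run fit() -- no hand-written rules to maintain.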

    Similarly, machine learning methods or mechanisms are used for various applications in and across various industry domains. Some of the examples are as follows:

    Speech recognition

    Language translation

    Financial services and applications

    Trading algorithms

    Management of portfolios

    Fraud detection and alerting

    Transport services and applications

    Driverless vehicles

    Safety monitoring and alerting

    Air-traffic controlling

    Smart navigation systems

    Healthcare

    Discovery of new drugs

    Analysis and diagnosis of diseases

    Advanced robotic precision surgeries

    E-commerce OR Retail shopping assistants

    IoT

    Prediction and correction of sensor behavior

    Prediction and correction of gateways and network servers

    Helps in building smart homes and smart cities

    Social media

    Sentiment analysis

    Fake news classification

    Anomaly detection

    There is no limit to machine learning applications. They are found in every industry domain and sector, and with collaboration between academia and industry they are evolving day by day. It is hard to find an area where their application is limited; in the last decade alone, ML models have come to be used across the industry.

    1.2.3 Classification of ML algorithms

    The machine learning systems (or algorithms) can be broadly classified into many categories, based on various factors and methods. The most popular of these are as follows:

    Supervised learning

    Unsupervised learning

    Semi-supervised learning

    Reinforcement learning

    Online/batch learning

    Each of these types of machine learning systems is explained in detail in the upcoming chapters, but a lighter definition and explanation is provided here to make sure that the reader has basic knowledge of them.

    Supervised Learning: These algorithms are typically trained under human "supervision", which is why they are called supervised learning algorithms. Most of the training data used in supervised learning algorithms is labeled data; one can say that a "teacher" is involved in training the model.

    "Labeled data" here means which have been tagged with one more name associated with it. For example, an image is labeled as it is a cat's photo and it's a dog's photo based on the character and features of that photo.

    This later helps in classifying "unlabeled data" and in further classification and categorization.

    Based on certain features, these labels are predicted. Once predicted, the sample group is tagged with that label. By continuously providing inputs, outputs are generated; these outputs are nothing but "labels". The training involves various types of datasets and model building, which will be explained in detail in later chapters.
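
    A minimal sketch of this idea with scikit-learn is shown below; it uses the bundled Iris dataset, whose species labels play the role of the "teacher", and a k-nearest neighbors classifier chosen purely for illustration.

    # Minimal sketch of supervised learning: the model learns from labeled examples.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)                 # X: features, y: labels from the "teacher"
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = KNeighborsClassifier(n_neighbors=5)
    clf.fit(X_train, y_train)                         # train on labeled data
    print("Predicted labels:", clf.predict(X_test[:5]))
    print("Accuracy:", clf.score(X_test, y_test))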

    Some of the well-known and popularly used supervised learning algorithms are as follows:

    Regression (continuous target variables)

    Linear Regression

    Logistic Regression

    Ensemble Methods

    Decision Trees and Random Forests

    Support Vector Regression (SVR)

    Gaussian Process Regression (GPR)

    Classification (categorical target variables)

    Support Vector Machines (SVMs)

    Naïve Bayes

    K-nearest Neighbors (KNN)

    Common neural networks

    Multi-Layer Perceptron (MLP)

    Unsupervised Learning: These algorithms, as the name suggests, are not supervised by humans. This means that the training data is "unlabeled".

    In unsupervised learning, there is no teacher involved at all. What the system or computer learns are the different patterns and structures in the data. Based on these patterns, the data can be grouped, with each pattern becoming an output category. From these patterns one can also derive a set of rules, which can later be used to extract meaningful insights. These algorithms can also be used to build descriptive models.

    Unlike supervised algorithms, these algorithms do not predict or find anything specific about the dataset.
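
    The following minimal scikit-learn sketch illustrates the idea: the same Iris measurements are used, but the labels are deliberately ignored, and the algorithms only look for structure, grouping samples with K-means and compressing the features with PCA. The dataset and algorithm choices are illustrative assumptions.

    # Minimal sketch of unsupervised learning: no labels, only patterns in the data.
    from sklearn.datasets import load_iris
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)                 # labels are deliberately ignored

    # Group the samples into 3 clusters purely from the feature patterns.
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
    print("Cluster assignments:", kmeans.labels_[:10])

    # Dimensionality reduction (PCA) is another unsupervised task: compress the
    # 4 features into 2 components while keeping most of the variance.
    X_2d = PCA(n_components=2).fit_transform(X)
    print("Reduced shape:", X_2d.shape)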

    Some of the well-known and popularly used unsupervised learning algorithms are as follows:

    Clustering (continuous) and dimensionality reduction

    K-means

    K-medoids

    Hierarchical Clustering Analysis (HCA)

    Expectation Maximization

    Hidden Markov Model

    Neural Networks

    PCA (Principal Component Analysis)

    SVD

    Kernel PCA

    t-Distributed Stochastic Neighbour Embedding (t-SNE)

    Locally Linear Embedding (LLE)

    Association Analysis (categorical)

    Apriori

    Eclat

    FP-Growth

    Semi-Supervised Learning: Frankly speaking, this is not a separate type of algorithm; rather, the term is used to categorize algorithms that combine supervised and unsupervised learning. One can put it like this: labels are present for only some of the observations in the dataset, and missing for the rest.

    In cases where data labeling is a costly affair and a combination of partially labeled and unlabeled data is needed, these algorithms can be used.
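
    A minimal sketch of this idea with scikit-learn's SelfTrainingClassifier is shown below; it artificially hides most of the labels of a small dataset (unlabeled samples are marked with -1, as scikit-learn expects). The base classifier and the 80% hidden fraction are illustrative assumptions.

    # Minimal sketch of semi-supervised learning: only a few samples keep labels,
    # the rest are marked -1 ("unlabeled") and the model labels them itself.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.semi_supervised import SelfTrainingClassifier
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    rng = np.random.RandomState(0)

    y_partial = y.copy()
    hidden = rng.rand(len(y)) < 0.8                   # pretend 80% of labels are too costly to obtain
    y_partial[hidden] = -1                            # -1 marks an unlabeled sample

    model = SelfTrainingClassifier(SVC(probability=True, gamma="auto"))
    model.fit(X, y_partial)                           # trains on the few labels, then labels the rest

    print("Accuracy on the fully labeled set:", model.score(X, y))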

    Some of the well-known and popularly used semi-supervised learning tools are as follows:

    Uclassify

    GATE

    Reinforcement Learning: This is nowadays one of the most popular learning mechanisms. Many complex models are being built using a combination of supervised or unsupervised learning along with reinforcement learning algorithms.

    Reinforcement learning is a category in itself, in which a so-called agent observes the environment, selects and performs various actions, and receives rewards or penalties accordingly.

    Q-Learning: Q-Learning is a model-free RL algorithm. The Q in Q-Learning stands for quality, which indicates how useful a given action is in gaining some future reward (a minimal sketch is given after this list).

    Temporal Difference (TD): TD is an approach to learning how to predict a quantity that may be dependent on future values of a given signal.

    Deep Adversarial Networks: Deep adversarial belief networks, also sometimes called DBNs, are highly parallelizable numerical algorithms.

    Markov Decision Process (MDP): MDP works using a methodology involving an agent, an environment, states, and actions, each depending on the other.

    Monte Carlo Tree Search: This is a heuristic search algorithm used for decision-making processes.

    Asynchronous Actor-Critic Agents (AAAC): This algorithm is beneficial in experiments that involve optimizing a global network across different environments in parallel, for generalization purposes.
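
    The following minimal tabular Q-learning sketch (referred to in the Q-Learning item above) uses a toy five-state "corridor" environment invented purely for illustration; the agent earns a reward only on reaching the last state, and the table is updated with the standard temporal-difference rule.

    # Minimal tabular Q-learning sketch on a toy 5-state corridor environment.
    import numpy as np

    n_states, n_actions = 5, 2              # actions: 0 = move left, 1 = move right
    goal = n_states - 1                     # reward of +1 only in the last state
    Q = np.zeros((n_states, n_actions))     # Q[s, a] = learned "quality" of action a in state s
    alpha, gamma, epsilon = 0.1, 0.9, 0.3   # learning rate, discount, exploration rate

    rng = np.random.default_rng(0)
    for episode in range(500):
        s = 0
        for step in range(100):             # cap the episode length
            # epsilon-greedy: usually exploit the best known action, sometimes explore
            a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[s]))
            s_next = min(goal, s + 1) if a == 1 else max(0, s - 1)
            reward = 1.0 if s_next == goal else 0.0
            # temporal-difference update toward reward + discounted future value
            Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])
            s = s_next
            if s == goal:
                break

    print(np.round(Q, 2))                   # "move right" (column 1) should dominate every row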

    1.3 Deep Learning

    Before jumping to a definition of deep learning, let us first look at how the human brain works, and how the human central nervous system is wired together, with billions of neurons communicating with each other.

    There is an analogy between the human nervous system and deep learning networks.

    The human brain is made up of billions of nerve cells, or neurons. These neurons have a great ability to transmit electrochemical signals through "synapses", the connections found between any two neurons. These are also sometimes thought of as analogous to the gates and wires of the computer world.

    Each of our daily experiences, senses, and normal functions triggers a lot of neuron activity, which is then converted into reactions over these communication channels.

    Santiago Ramon y Cajal (1852-1934) extracted the first-ever neuron (nerve cell) from the human brain. Peering through the microscope, he saw neurons as individual cells, and he went on to establish that neurons communicate using what is now well known as the synapse. He was awarded the Nobel Prize in Physiology or Medicine in 1906 for this discovery.

    Ref: - https://www.nobelprize.org/prizes/medicine/1906/cajal/facts/

    Please refer to the figure in the reference link, which shows the first-ever drawings of neurons by Santiago Ramon y Cajal. This picture represents one of the greatest discoveries, as it led medical science onto a new path.

    Ref: - https://www.quantamagazine.org/why-the-first-drawings-of-neurons-were-defaced-20170928/

    Santiago Ramon y Cajal also extracted individual neurons in his experiments and produced the first-ever drawing of an individual neuron, along with explanations of the functions of all the subparts of the neuron cell. The figure in the reference link is the first neuron cell extracted and drawn by him. The neuron cell consists of a cell body, nucleus, axon, and so on; these are explained in the more recent diagram.

    Ref: - https://www.brainfacts.org/brain-anatomy-and-function/cells-and-circuits/2018/santiago-ramon-y-cajal-the-first-modern-neuroscientist

    This discovery opened up nearly a century of research on the human brain, cognitive abilities, human perception, reaction to the senses, memory and its types, thinking, and so on.

    For the last several years, extensive research has focused on the human brain and its internal neural network. This research, along with advancements in the computer industry, has helped the development of AI and ML technologies. Many researchers in both the computer industry and academia have long dreamed of building intelligent machines similar to the human brain; hopefully this will be achieved one day.

    Figure 1.7 shows a single neuron: how it looks, and its various parts and functions:

    Figure 1.7: Single Neuron Cell

    In 1958, Frank Rosenblatt, an American psychologist, attempted to build "a machine which senses, recognizes, remembers, and responds like the human mind". He went on to call this machine a Perceptron.
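
    A minimal sketch of the perceptron learning rule is shown below; it learns the logical AND function, with the weights, learning rate, and number of epochs chosen purely for illustration.

    # Minimal sketch of Rosenblatt's perceptron learning rule on logical AND.
    import numpy as np

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 0, 0, 1])                       # AND of the two inputs

    w = np.zeros(2)      # weights, one per input "synapse"
    b = 0.0              # bias
    lr = 0.1             # learning rate

    for epoch in range(20):
        for xi, target in zip(X, y):
            output = 1 if np.dot(w, xi) + b > 0 else 0   # step activation, like a neuron firing
            error = target - output
            w += lr * error * xi                         # strengthen or weaken the "synapses"
            b += lr * error

    print("weights:", w, "bias:", b)
    print("predictions:", [1 if np.dot(w, xi) + b > 0 else 0 for xi in X])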

    During the same period, around 1956, Hubel and Wiesel carried out what is popularly known as the "Cat Experiment". In this experiment, they
