Continuous Machine Learning with Kubeflow: Performing Reliable MLOps with Capabilities of TFX, Sagemaker and Kubernetes (English Edition)
Ebook · 494 pages · 4 hours


About this ebook

'Continuous Machine Learning with Kubeflow' introduces you to modern machine learning infrastructure, including Kubernetes and the Kubeflow architecture. This book explains the fundamentals of deploying various AI/ML use cases with TensorFlow training and serving on Kubernetes, and how Kubernetes can help with specific projects from start to finish.

This book demonstrates how to use Kubeflow components, deploy them in GCP, and serve them in production using real-time data prediction. With Kubeflow KFServing, we'll look at serving techniques, build a computer-vision-based user interface in Streamlit, and then deploy it to the Google Cloud Platform, Kubernetes, and Heroku. Next, we also explore how to build explainable AI for determining fairness and bias with the What-If Tool. Backed by various use cases, we will learn how to put machine learning into production, including training and serving.

After reading this book, you will be able to build your ML projects in the cloud using Kubeflow and the latest technology. In addition, you will gain a solid knowledge of DevOps and MLOps, which will open doors to various job roles in companies.
Language: English
Publisher: BPB Online LLP
Release date: Nov 20, 2021
ISBN: 9789389898514

    Book preview

    Continuous Machine Learning with Kubeflow - Aniruddha Choudhury

    CHAPTER 1

    Introduction to Kubeflow & Kubernetes Cloud Architecture

    In this chapter, we will learn about the complete feature set of Kubeflow, how it works, and why we need it. We will also learn about the building blocks of the Kubernetes architecture, such as Services, Pods, and Ingress, as well as how to build a Docker image and how it works. We will look at the Kubeflow components and their advantages, which we will be using in the upcoming chapters. Then, we will proceed to the complete setup of Kubeflow on the Google Cloud Platform, along with the Jupyter notebook setup. As an optional item, we will see how to create a Persistent Volume Claim and attach it to Filestore to save your code and data.

    Structure

    In this chapter, we will cover the following topics:

    Understanding Docker

    Kubernetes concepts and architecture

    Kubernetes components

    Introduction to Kubeflow orchestration for ML deployment

    Components of Kubeflow

    Setting up Kubeflow in GCP

    Jupyter Notebook setup

    Optional: PVC setup for Jupyter Notebook

    Objectives

    This chapter will help you learn the following:

    A core understanding of Docker and Kubernetes, and how they are applied in the cloud.

    The Kubernetes and Kubeflow architectures, along with their functionality and advantages.

    The components of Kubeflow, and how to set up a Kubeflow IAP cluster in the Google Cloud Platform.

    Using a CPU Docker image to set up the Jupyter notebook, alongside the PVC setup in the cloud.

    1.1 Understanding Docker

    Docker is a platform for developers and system admins to build, run, and share applications in containers. Using containers to deploy applications is called containerization, and it makes deployment easier and more flexible.

    Containerization is becoming increasingly popular because containers have the following characteristics:

    Flexible: Even the most complex applications can be containerized.

    Scalable: Container replicas can be distributed automatically across a data center.

    Lightweight: Containers share and leverage the host kernel, making them far more efficient than virtual machines.

    Portable: Due to their portable nature, we can build containers locally, deploy them to the cloud, and run them anywhere.

    Loosely coupled: Containers are highly self-sufficient and encapsulated, which allows us to replace or upgrade one without disrupting the others.

    Secure: Containers apply aggressive isolation and constraints to processes, without any configuration required on the part of the user.

    Images and containers:

    Logically, a container is a running process with added encapsulation features that keep it isolated from the host and from other containers. The most important aspect is that each container interacts with its own private file system, which is known as container isolation; the Docker image provides this file system. An image therefore includes everything needed to run an application – the code, runtimes, dependencies, and any other file system objects required.

    Figure 1.1: Docker Architecture

    Docker shares two concepts with virtual machines: the image and the container. An image is the definition of what will be executed, just like an operating system image, and a container is a running instance of a given image.
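    To make the image/container distinction concrete, here is a minimal sketch using the Docker CLI; my-app:v1 and my-app-instance are placeholder names, not taken from the book:

        # List the images available locally (the definitions)
        docker images

        # Start a container (a running instance) from an image, detached
        docker run -d --name my-app-instance my-app:v1

        # List the containers currently running
        docker ps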

    1.1.1 Dockerfile

    To get our Python (or any other language) code running in a container, we wrap it in a package as a Docker image and then run a container based on it. The steps are sketched as follows:

    Figure 1.2: Docker file process

    Next, to generate a Docker image, we need to create a Dockerfile, which contains the set of instructions needed to build the image. The Dockerfile is then processed by the Docker builder, which generates the Docker image. Finally, with a simple docker run command, we can create and run a container with the Python service.

    Figure 1.3: Dockerfile code

    Let's walk through this file line by line (a sketch of such a Dockerfile follows this list):

    It uses the Python base image with the tag python:3.8, which pins a specific version of Python.

    Then, it creates a working directory and copies our local files into it; here, we have created an app folder and copied the requirements file, which lists the Python libraries.

    Then, it installs all the Python libraries with the pip command.

    Next, we use CMD, a command that runs the Python file whenever the container starts.
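    Since Figure 1.3 is not reproduced in this preview, here is a minimal Dockerfile sketch matching the description above; the file name main.py is an assumption:

        # Base image pinned to a specific Python version
        FROM python:3.8
        # Create the working directory and copy the dependency list into it
        WORKDIR /app
        COPY requirements.txt .
        # Install the Python libraries listed in requirements.txt
        RUN pip install -r requirements.txt
        # Copy the application code into the image
        COPY . .
        # Run the Python file whenever the container starts
        CMD ["python", "main.py"]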

    Figure 1.4: Dockerfile command

    Now, run the preceding commands in the terminal to push the image you have built locally to the cloud (GCP/AWS/Azure) or Docker Hub.
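    As Figure 1.4 is not reproduced here either, the following is a minimal sketch of the build-and-push flow for GCP's Container Registry; my-project and my-app are placeholder names:

        # Build the image from the Dockerfile in the current directory
        docker build -t my-app:v1 .

        # Tag the image for Google Container Registry (my-project is a placeholder project id)
        docker tag my-app:v1 gcr.io/my-project/my-app:v1

        # Push the tagged image to the registry
        docker push gcr.io/my-project/my-app:v1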

    1.2 Kubernetes Architecture

    In this section, we will see how Kubernetes works and learn about its architecture.

    1.2.1 What is Kubernetes?

    Kubernetes is an open-source container management system used in large-scale enterprises in several dynamic industries to perform a mission-critical task or any orchestration task. Some of its capabilities include the following:

    It manages the containers inside a cluster.

    It provides tools for deploying applications.

    It scales applications as required.

    It manages changes to existing containerized applications.

    It optimizes the use of the hardware underneath our containers.

    It enables an application component to restart and move across systems as needed.

    1.2.2 Why do we need Kubernetes?

    We need Kubernetes to manage containers when we run production-grade environments using a microservice pattern with many containers. We need to track features such as health checks, version control, scaling, and rollback mechanisms, among other things. Making sure all of these run correctly can be quite challenging and frustrating. Kubernetes gives us the orchestration and management capabilities required to deploy containers at scale: building application services with Kubernetes orchestration allows us to span multiple containers, schedule those containers across a cluster, scale them down when they are not in use, and manage their health over time. In a nutshell, Kubernetes is like a manager with many subordinates (the containers); the manager's job is to decide what the subordinates need to do.

    1.2.3 What are the Advantages of Kubernetes?

    The following are the advantages of Kubernetes:

    Portable and Open-Source:

    Kubernetes can run the containers on one or more public cloud environments, virtual machines, or bare metal, which means it can be deployed on any infrastructure. Moreover, Kubernetes is compatible across multiple platforms, making a multi-cloud strategy a highly flexible and usable component.

    Workload Scalability:

    Kubernetes offers the following useful features for scaling purposes (a kubectl sketch follows this list):

    Horizontal Infrastructure Scaling: Operates at the individual server level to implement horizontal scaling. New servers can be added or removed easily.

    Auto-Scaling: We can alter the number of containers running based on CPU usage or other application metrics.

    Manual Scaling: The number of running containers can be scaled manually through a command or the interface.

    Replication Controller: The replication controller makes sure that the cluster has the specified number of equivalent pods running. If there are too many pods, it removes the extra ones; if there are too few, it starts more.
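    A minimal sketch of manual scaling and auto-scaling with kubectl; my-app is a placeholder deployment name:

        # Manually scale a deployment to five replicas
        kubectl scale deployment my-app --replicas=5

        # Let Kubernetes adjust the replica count based on CPU usage
        kubectl autoscale deployment my-app --cpu-percent=70 --min=2 --max=10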

    High Availability:

    Kubernetes can handle the availability of both the applications and the infrastructure. It tackles the following:

    Health Checks: Kubernetes keeps the application from failing by constantly checking the health of nodes and containers. It offers self-healing and auto-replacement if a pod crashes due to an error.

    Traffic Routing and Load Balancing: Kubernetes' load balancer distributes traffic across multiple pods, enabling us to balance resources quickly during incidental traffic spikes or batch processing.

    Designed for Deployment:

    Containerization can speed up the process of building, testing, and releasing software; the useful features include the following:

    Automated Rollouts and Rollbacks: Kubernetes can roll out a new version and update our app without any downtime, while we monitor the app's health during the roll-out. If any failure occurs during the process, it can automatically roll back to the previous version (a kubectl sketch follows this list).

    Canary Deployments: The new deployment and the previous version can be tested in production in parallel, that is, before scaling up the new deployment and scaling down the previous one.

    Programming Language and Framework Support: Kubernetes supports most programming languages and frameworks, such as Java and Python. If an application can run in a container, it can run in Kubernetes as well.
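    A minimal sketch of the rollout commands involved; my-app, my-project, and the image tags are placeholders:

        # Roll out a new image version without downtime
        kubectl set image deployment/my-app my-app=gcr.io/my-project/my-app:v2

        # Watch the health of the roll-out as it progresses
        kubectl rollout status deployment/my-app

        # Roll back to the previous version if something goes wrong
        kubectl rollout undo deployment/my-app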

    Kubernetes and Stateful Containers:

    Kubernetes' StatefulSets provide resources like volumes, stable network IDs, ordinal indexes from 0 to N, and so on, to deal with stateful containers. Volumes are one key feature that enables us to run stateful applications. The two main types of volume supported are as follows (a manifest sketch follows this list):

    Ephemeral Storage Volume: Ephemeral data storage works differently in Kubernetes than in Docker. In Kubernetes, the volume is available to any container running within the pod, and the data survives container restarts. But if the pod is killed, the volume is automatically removed.

    Persistent Storage: The data remains for its full lifetime. When the pod dies or is moved to another node, the data still remains until the user deletes it. Hence, the data is stored remotely.
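    A minimal sketch of a PersistentVolumeClaim manifest of the kind used for the optional PVC setup later in this chapter; the name and size are placeholder assumptions:

        # pvc.yaml - request 10Gi of persistent storage for a pod to mount
        apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          name: notebook-data
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi

    Applying it with kubectl apply -f pvc.yaml creates the claim; Kubernetes then binds it to a matching persistent volume.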

    1.2.4 How does Kubernetes work?

    A cluster is the foundation of Google Kubernetes Engine (GKE); the Kubernetes objects that represent your containerized applications all run on top of a cluster. In GKE, a cluster consists of at least one control plane and multiple worker machines, called nodes. These control plane and node machines run the Kubernetes cluster orchestration system.
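    As the book's setup targets GKE, here is a minimal sketch of creating such a cluster with the gcloud CLI; the cluster name and zone are placeholders:

        # Create a three-node GKE cluster
        gcloud container clusters create my-cluster --zone us-central1-a --num-nodes 3

        # Fetch credentials so kubectl can talk to the new cluster
        gcloud container clusters get-credentials my-cluster --zone us-central1-a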

    Figure 1.5: Kubernetes Architecture

    Master:

    The master is the controlling element of the cluster. The master has the following three parts:

    API Server: The application that serves Kubernetes’ functionality through a RESTful interface and stores the state of the cluster.

    Scheduler: The scheduler watches the API server for new Pod requests. It communicates with the nodes to create new pods and to assign work to them, allocating resources and imposing constraints.

    Controller Manager: The component on the master that runs the controllers, including the Node Controller, Endpoint Controller, Namespace Controller, and so on.

    Slave (Nodes):

    These machines perform the requested, assigned tasks; the Kubernetes master controls them. The following four components run inside the nodes:

    Pod: All containers run in a pod. Pods abstract the network and storage away from the underlying containers. Your app will run inside a pod.
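    A minimal sketch of a Pod manifest illustrating this; the names and image reference reuse the placeholders from earlier:

        # pod.yaml - the smallest deployable unit: one container wrapped in a pod
        apiVersion: v1
        kind: Pod
        metadata:
          name: my-app-pod
        spec:
          containers:
            - name: my-app
              image: gcr.io/my-project/my-app:v1
              ports:
                - containerPort: 8080   # port the service is assumed to listen on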
