Applications of Big Data in Healthcare: Theory and Practice
Ebook, 536 pages, 8 hours

About this ebook

Applications of Big Data in Healthcare: Theory and Practice begins with the basics of Big Data analysis and introduces the tools, processes and procedures associated with Big Data analytics. The book unites healthcare with Big Data analysis and uses the advantages of the latter to solve the problems faced by the former. The authors present the challenges faced by the healthcare industry, including capturing, storing, searching, sharing and analyzing data. This book illustrates the challenges in the applications of Big Data and suggests ways to overcome them, with a primary emphasis on data repositories, challenges, and concepts for data scientists, engineers and clinicians.

The applications of Big Data have grown tremendously within the past few years, and this growth can be attributed not only to its capacity to handle large data streams but also to its ability to find insights in complex, noisy, heterogeneous, longitudinal and voluminous data. The main objective of Big Data in the healthcare sector is to provide personalized healthcare to patients by taking into account the enormous amounts of already existing data.

  • Provides case studies that illustrate the business processes underlying the use of big data and deep learning health analytics to improve health care delivery
  • Supplies readers with a foundation for further specialized study in clinical analysis and data management
  • Includes links to websites, videos, articles and other online content to expand and support the primary learning objectives for each major section of the book
Language: English
Release date: Mar 10, 2021
ISBN: 9780128204511

    Book preview

    Applications of Big Data in Healthcare - Ashish Khanna

    IEEE.

    Preface

    This book begins with the basics of Big Data analysis and introduces the associated tools, processes, and procedures. It unites healthcare with a leading technology, Big Data analysis, and uses the advantages of the latter to solve the problems faced by the former. The book starts with the basics of Big Data and progresses toward the challenges faced by the healthcare industry, including capturing, storing, searching, sharing, and analyzing data. The book highlights the reasons for the growing abundance and complexity of data in this sector. The applications of Big Data have grown tremendously within the past few years, and this growth can be attributed not only to its capacity to handle large data sizes but also to its ability to find insights in complex, noisy, heterogeneous, longitudinal, and voluminous data. This helps Big Data to answer previously unanswered questions, and this is precisely what makes it applicable in the healthcare industry. Big Data is now a requirement of almost all technologies and applications, and there is a separate and special need to address its association with healthcare. The main objective of Big Data in this sector is to provide personalized healthcare to patients by taking into account the enormous amount of already existing data. The book further illustrates the possible challenges in its applications and suggests ways to overcome them. The topic is vast, and hence not every technique or solution can be discussed in detail. The primary emphasis of this book is to introduce healthcare data repositories, challenges, and concepts to data scientists, students, and academicians at large.

    Objective of the book

    The main aim of this book is to provide a detailed understanding of Big Data and focus on its applications in the field of healthcare. The ultimate goal is to bridge data mining and medical informatics communities to foster interdisciplinary works that can be put to good use.

    Organization of the book

    The book is organized in 11 chapters, a brief description of which is given in the following:

    1. Big Data Classification: Techniques and Tools

    An enormous volume of data, known as Big Data, of varied properties, is continuously being generated from several sources. For efficient and consequential use of this huge amount of data, automated and correct categorization is very important. This chapter attempts to discuss various technicalities of Big Data classification, comprehensively.

    2. Big Data Analytics for Healthcare: Theory and Applications

    In this chapter, the procedure of big data analytics in the healthcare sector with some practical applications along with its challenges has been discussed. The work has been concluded with a discussion on potential opportunities for analytics in the healthcare sector.

    3. Application of Tools and Techniques for Big Data Analytics of Healthcare System

    In the past, various data-analysis tools and methods have been adopted to improve the services provided in a plethora of areas. This chapter highlights the improvements in terms of the effectiveness of predictions and inferences drawn so that future usage may be eased.

    4. Healthcare and Medical Big Data Analytics

    This chapter discusses effective data analysis, suitable classification and standardization of big data in medicine and healthcare, as well as excellent design and implementation of healthcare information systems.

    5. Big Data Analytics in Medical Imaging

    This chapter discusses the various medical image-processing tools and frameworks, including Hadoop, MapReduce, Yarn, Spark, and Hive. Machine-learning and deep-learning techniques are used extensively for the required analytics, and genetic algorithms and association rule learning techniques are also widely used.

    6. Big Data Analytics and Artificial Intelligence in Mental Healthcare

    In this chapter, the authors discuss the major opportunities, limitations, and techniques used for improving mental healthcare through AI and big data. They explore the computational, clinical, and ethical considerations and best practices, and lay out the major research directions for the near future.

    7. Big Data Based Breast Cancer Prediction Using Kernel Support Vector Machine With the Gray Wolf Optimization Algorithm

    Today, big data in healthcare is often used to predict disease. Breast cancer is one of the primary cancers affecting women. If the disease is recognized at an early stage, there is a greater chance of recovery. In this chapter, optimal features are selected using Oppositional Grasshopper Optimization (OGHO); these features are then processed in the training phase using the kernel support vector machine with the Gray Wolf Optimization algorithm (KSVMGWO) to predict breast cancer.

    8. Big Data Based Medical Data Classification Using Oppositional Gray Wolf Optimization With Kernel Ridge Regression

    The classification of medical data is an important data mining issue that has been discussed for nearly a decade and has attracted numerous researchers around the world. Selection procedures provide the pathologist with valuable information for diagnosing and treating diseases. In this chapter, the authors aim to develop machine-learning algorithms to effectively predict the outbreak of chronic disease in general communities.

    9. An Analytical Hierarchical Process Evaluation on Parameters Apps-Based Big Data Analytics for Healthcare Services

    Any healthcare management system can be studied in terms of access, integration, privacy and security, confidentiality, sharing, assurance/relevancy, reliability, and cost involvement for the data/documents in the system. It can be a concern to the healthcare centers. Accessibility is a complex concept, and at least four aspects—that is, availability, utilization, relevance, and equity of access—require evaluation. These parameters are discussed and evaluated in this chapter.

    10. Firefly-Binary Cuckoo Search Technique Based Heart Disease Prediction in Big Data Analytics

    Nowadays, big data analysis is given more attention in complex healthcare settings. Fetal growth curves, the classic case of big health data, are used to predict coronary heart disease (CAD). This work aims to predict the risk of CAD using machine-learning algorithms such as Firefly-Binary Cuckoo Search (FFBCS). The authors also suggest a preliminary analysis of the performance of the framework.

    11. Hybrid Technique for Heart Diseases Diagnosis Based on Convolution Neural Network and Long Short-Term Memory

    In this chapter, a hybrid deep neural network takes a dataset with 14 features as input and is trained using hybrid convolution neural network (CNN) and long short-term memory (LSTM) algorithms to predict the presence or absence of disease in patients, with the highest accuracy reaching 0.937. The results of the study showed that the CNN-LSTM hybrid model achieved the best accuracy, recall, precision, F1 score, and AUC compared to other techniques.

    1

    Big Data classification: techniques and tools

    Pijush Kanti Dutta Pramanik¹, Saurabh Pal², Moutan Mukhopadhyay² and Simar Preet Singh³,    ¹National Institute of Technology, Durgapur, India,    ²Bengal Institute of Technology, Kolkata, India,    ³Chandigarh Engineering College, Landran, India

    Abstract

    An enormous volume of data, known as Big Data, of varied properties, is continuously being generated from several sources. For efficient and consequential use of this huge amount of data, automated and correct categorization is very important. Precise categorization can reveal correlations, hidden patterns, and other valuable insights. The process of categorizing mixed heterogeneous data is known as data classification and is done based on some predefined features. Various algorithms and techniques have been proposed for Big Data classification. This chapter attempts to discuss the technicalities of Big Data classification comprehensively. To start with, the basics of Big Data classification, such as its need, types, patterns, phases, and approaches, are explained. Different classification techniques, including traditional, evolutionary, and advanced machine learning techniques, are discussed with suitable examples, along with their advantages and disadvantages. Finally, a survey of various open-source and commercial libraries, platforms, and tools for Big Data classification is presented.

    Keywords

    Big Data; machine learning; classification techniques; evolutionary algorithms; classification tools; Big Data platforms

    1.1 Introduction

    The modern digitized and smart world is continuously generating data of enormous volume from various sources, such as smart healthcare [1,2], smart cities [3,4], smart agriculture [5,6], smart buildings [7,8], smart learning [9,10], modern industries [11,12], social media [13,14], autonomous vehicles [15,16], cognitive systems [17,18], and so on. These data are generated and transmitted in various forms, volumes, and speeds to different sinks. To make actionable use of these data, they must be mined to extract information and knowledge.

    In traditional data mining, we generally use approaches like clustering, classification, and association rules. In Big Data, too, we have to mine useful and valuable information, but from extremely large data sets. The two most used techniques in the case of Big Data are Big Data classification [19] and Big Data clustering [20]. Classification is supervised learning, in which the classification algorithm needs training in order to classify accurately. Clustering, in contrast, is unsupervised learning and does not need prior training.
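
The contrast between the two techniques can be illustrated with a minimal sketch. It is not taken from the chapter: the toy measurements, the nearest-centroid classifier, and the plain k-means routine are assumptions chosen only to show that the supervised step consumes labels while the unsupervised step does not.

```python
from statistics import mean

# Toy 1-D measurements (hypothetical values, for illustration only).
train_x = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
train_y = ["low", "low", "low", "high", "high", "high"]  # used by the supervised step only


def fit_centroids(xs, ys):
    """Supervised: learn one centroid per known class label."""
    return {label: mean(x for x, y in zip(xs, ys) if y == label) for label in set(ys)}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def kmeans_1d(xs, k=2, iters=10):
    """Unsupervised: group points without labels using plain k-means (k >= 2)."""
    ordered = sorted(xs)
    # Deterministic initialization: k evenly spaced points of the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda i: abs(centers[i] - x))
            groups[nearest].append(x)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return [min(range(k), key=lambda i: abs(centers[i] - x)) for x in xs]


centroids = fit_centroids(train_x, train_y)  # training needs the labels
clusters = kmeans_1d(train_x)                # clustering never sees them
```

The classifier returns a named class for a new value, whereas the clustering routine returns only anonymous group indices that still have to be interpreted and named afterwards.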

    Classification groups information by common attributes and allows the groups to be compared by their similarities and differences. It helps in identifying and segregating data, which allows appropriate data tagging and thus makes the data easy to locate and retrieve, that is, searchable. This helps in separating relevant data from irrelevant data and in identifying duplicates. Tagging the data makes searching fast and data access easy [21]. We can say that classification techniques mold scattered data into shape. This shaped data provides grounds for quality and confidence in the outcome [22]. In short, classification helps the user in gathering knowledge (knowledge discovery) and in future planning. Without proper classification, Big Data analysis is bound to fail at drawing any valuable inferences.

    Big Data classification has found many significant real-life applications such as predicting epidemic outbreak, drug discovery, providing and managing healthcare services [23,24], weather forecasting, product and service recommendation [25], sentiment analysis and opinion mining [14], user profiling, predicting the next possible crime in the city, network traffic classification and intrusion prediction, etc.

    Since classification involves supervised learning, a classification model is built under supervision, using a training data set. By comprehending the data patterns of the training set, the model infers how to assign similar unknown data to a class. Specific strategies need to be worked out for using the vast data. Typically, before the data can be used, they are preprocessed [26,27]. In preprocessing, the data are cleaned of problems such as null values, missing values, and inconsistent data. Not all features in the data set are useful, so the required and suitable features are extracted using appropriate algorithms. The data are generally ready to be used after preprocessing and feature extraction and selection [28]. The accuracy of the model increases if the model is trained with good data. In classification, different techniques such as probabilistic models [29], decision trees [30], neural networks [31], support vector machines (SVMs) [32], etc. are used. Most of these techniques are also used in conventional data mining. However, since the volume of data in Big Data is huge, older data mining techniques need to be adapted to work for Big Data.
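
The preprocessing and feature selection steps described above can be sketched as follows. The records, the column names, and the variance threshold are hypothetical, and dropping incomplete records and near-constant columns is only one simple strategy among many; the chapter does not prescribe a specific algorithm here.

```python
from statistics import pvariance

# Hypothetical raw records: (age, resting_heart_rate, constant_flag, label).
raw = [
    (54, 72, 1, "healthy"),
    (61, None, 1, "at_risk"),  # contains a missing value
    (47, 68, 1, "healthy"),
    (70, 95, 1, "at_risk"),
    (66, 90, 1, "at_risk"),
]


def clean(records):
    """Preprocessing: drop records that contain missing (None) values."""
    return [r for r in records if all(v is not None for v in r)]


def select_features(rows, min_variance=1e-9):
    """Feature selection: keep the indices of feature columns that actually vary.

    The last column of each row is the class label and is never selected.
    """
    n_features = len(rows[0]) - 1
    return [j for j in range(n_features)
            if pvariance([r[j] for r in rows]) > min_variance]


cleaned = clean(raw)             # 4 of the 5 records survive
kept = select_features(cleaned)  # the constant column carries no information
X = [[r[j] for j in kept] for r in cleaned]  # feature matrix ready for training
y = [r[-1] for r in cleaned]                 # class labels
```

After these two steps, `X` and `y` are in the shape a supervised classifier expects. On real Big Data the same logic would typically run on a distributed platform rather than on in-memory lists.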

    This chapter presents a comprehensive discussion of different techniques and tools used in Big Data classification. The rest of the chapter is organized as follows. Section 1.2 presents the preliminaries of Big Data classification, including its need, challenges, types, patterns, phases, and approaches. Various algorithms and techniques of Big Data classification are discussed in detail in Section 1.3. The recent and popular tools, libraries, and platforms for Big Data classification are discussed in Section 1.4. Section 1.5 summarizes the chapter.

    1.2 Big Data classification

    Since the term Big Data refers to data sets that are too large or complex, classifying these data sets is important for a deeper understanding of the data [33]. Using Big Data classification, different analyses can be performed on the data for accurate prediction.

    This section covers what classification is, why it is required, and its various types. The phases of classification, classification patterns, and the challenges are also covered.

    1.2.1 Definition of classification

    The process of classifying data into different categories on the basis of some attributes or features is known as classification [34]. In data mining, it is a data analysis process that involves predicting the class of newly observed data through modeling [35]. Classification is a crucial function of Big Data processing and is directly involved in knowledge discovery. Fig. 1.1 shows the different phases of Big Data processing, including Big Data classification.

    Figure 1.1 Phases of Big Data processing.

    1.2.2 Need for classification in Big Data

    Data classification helps in knowledge discovery and intelligent decision making [35]. Thus, it plays a vital role in Big Data. Big Data is so huge as well as complex that without proper classification, finding information from Big Data will be like finding a needle in a haystack. Classification is required to systematically analyze this massive amount of data by organizing them into suitable classes. This helps in developing a precise model or description for each defined class using the data features of that class. The different aspects of the need for classification of Big Data are shown in Fig. 1.2.

    Figure 1.2 Different aspects of the need for classification of Big Data.

    1.2.3 Challenges in Big Data classification

    Big Data involves complexity, thanks to its multifaceted properties. Hence, obviously, handling Big Data will not be trivial. The same applies to Big Data classification, which is hugely challenging due to several factors, most of which are inherent to Big Data. Fig. 1.3 lists some of the crucial challenges in Big Data classification.

    Figure 1.3 Big Data classification challenges.

    1.2.4 Types of classification

    Classification can be categorized into the following three types:

    1. Binary classification: Binary classification is a classification method in which new data are classified into one of two possible classes; that is, it categorizes items into two groups [36]. An example of binary classification is gender classification, a classification task with two possible outcomes: male or female. Similarly, it can be used to classify the state of a machine as faulty or good. Other application areas where binary classification can be used are medical diagnosis, spam detection, etc.

    2. Multiclass classification: As the name suggests, multiclass classification has more than two classes as possible outcomes. Multiclass or multinomial classification is a technique of assigning items to one of N classes, where N is greater than two. Examples of multiclass classification are segregating emails into the appropriate folder, gene expression categorization, etc. In this classification, one target label is assigned to each sample, and a sample cannot have two or more labels at the same time [36]. For example, an animal can be a dog or a cat, not both at the same time [37].

    3. Multilabel classification: The multilabel classification approach covers classification tasks in which each sample of the data set is mapped to a set of target labels, that is, to more than one class [37]. An example of multilabel classification is a news article, which can describe sports, a location, and a person at the same time [38].
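
The three types differ mainly in the shape of the target attached to each sample, which the following minimal sketch illustrates. The label names and the helper functions `task_type` and `to_indicator` are hypothetical and serve only as illustration; they are not part of the chapter or of any particular library.

```python
# Binary: every sample carries exactly one of two labels.
binary_targets = ["spam", "not_spam", "spam"]

# Multiclass: every sample carries exactly one label out of N > 2 classes.
multiclass_targets = ["dog", "cat", "bird"]

# Multilabel: every sample maps to a set of labels (possibly more than one).
multilabel_targets = [
    {"sports", "person"},
    {"location"},
    {"sports", "location", "person"},
]


def task_type(targets):
    """Guess the classification type from the shape of the targets."""
    if any(isinstance(t, (set, frozenset)) for t in targets):
        return "multilabel"
    return "binary" if len(set(targets)) <= 2 else "multiclass"


def to_indicator(label_sets, classes):
    """Encode multilabel targets as 0/1 rows with one column per class."""
    return [[1 if c in labels else 0 for c in classes] for labels in label_sets]


classes = ["location", "person", "sports"]
indicator = to_indicator(multilabel_targets, classes)
```

The indicator encoding turns a multilabel problem into one yes/no decision per class, which is a common way to feed such targets to a classifier.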

    1.2.5 Big Data classification approaches

    Typically, the following two approaches are followed for Big Data classification:

    1. Supervised: The supervised classification approach calls for learning classification logic under directed supervision by understanding the patterns in a known labeled data set. This approach takes a large volume of data, called the training data set, as input. To learn the classification rules for categorizing data into the given classes, data analysis is performed on the data set. The classification rules learnt in this process are based on the features or attributes of each element of a class [39]. Since the set of possible classes is known in advance in this approach, it is also known as directed or predictive classification.

    2. Unsupervised: The unsupervised classification approach calls for learning classification logic from a set of unlabeled data; that is, the class labels are unknown, and the training sets do not come with predefined class labels. In other words, the set of possible classes to which a datum will be assigned is not known in advance; the classes are given names after classification. Hence, this approach is often known as clustering. The classification is carried out by comparing the features of the data. This approach is useful when features or attributes are unknown, and therefore, unsupervised classification techniques are often said to be descriptive or undirected. An example of this technique is arranging a bucket full of fruits where all fruits are in jumbled order, and the aim is to classify the fruits into groups
