Applications of Big Data in Healthcare: Theory and Practice
By Deepak Gupta and Nilanjan Dey
About this ebook
Applications of Big Data in Healthcare: Theory and Practice begins with the basics of Big Data analysis and introduces the tools, processes and procedures associated with Big Data analytics. The book unites healthcare with Big Data analysis and uses the advantages of the latter to solve the problems faced by the former. The authors present the challenges faced by the healthcare industry, including capturing, storing, searching, sharing and analyzing data. This book illustrates the challenges in the applications of Big Data and suggests ways to overcome them, with a primary emphasis on data repositories, challenges, and concepts for data scientists, engineers and clinicians.
The applications of Big Data have grown tremendously within the past few years, and this growth can be attributed not only to its capacity to handle large data streams but also to its ability to find insights in complex, noisy, heterogeneous, longitudinal and voluminous data. The main objective of Big Data in the healthcare sector is to find ways to provide personalized healthcare to patients by taking into account the enormous amount of already existing data.
- Provides case studies that illustrate the business processes underlying the use of big data and deep learning health analytics to improve health care delivery
- Supplies readers with a foundation for further specialized study in clinical analysis and data management
- Includes links to websites, videos, articles and other online content to expand and support the primary learning objectives for each major section of the book
Book preview
Applications of Big Data in Healthcare - Ashish Khanna
Preface
This book begins with the basics of Big Data analysis and introduces the associated tools, processes, and procedures. It unites healthcare with a leading technology, Big Data analysis, and uses the advantages of the latter to solve the problems faced by the former. The book starts with the basics of Big Data and progresses toward the challenges faced by the healthcare industry, including capturing, storing, searching, sharing, and analyzing data. The book highlights the reasons for the growing abundance and complexity of data in this sector. The applications of Big Data have grown tremendously within the past few years, and this growth can be attributed not only to its capacity to handle large data sizes but also to its ability to find insights in complex, noisy, heterogeneous, longitudinal, and voluminous data. This helps Big Data answer previously unanswered questions, and this is precisely what helps it find applications in the healthcare industry. Big Data is nowadays a requirement of almost all technologies and applications, and there is a separate and special need to address its association with healthcare. The main objective of Big Data in this sector is to find ways to provide personalized healthcare to patients by taking into account the enormous amount of already existing data. The book further illustrates the possible challenges in its applications and suggests ways to overcome them. The topic is vast; hence, not every technique or solution can be discussed in detail. The primary emphasis of this book is to introduce healthcare data repositories, challenges, and concepts to data scientists, students, and academicians at large.
Objective of the book
The main aim of this book is to provide a detailed understanding of Big Data and focus on its applications in the field of healthcare. The ultimate goal is to bridge data mining and medical informatics communities to foster interdisciplinary works that can be put to good use.
Organization of the book
The book is organized in 11 chapters, a brief description of which is given in the following:
1. Big Data Classification: Techniques and Tools
An enormous volume of data, known as Big Data, of varied properties, is continuously being generated from several sources. For efficient and consequential use of this huge amount of data, automated and correct categorization is very important. This chapter attempts to discuss various technicalities of Big Data classification, comprehensively.
2. Big Data Analytics for Healthcare: Theory and Applications
In this chapter, the procedure of big data analytics in the healthcare sector with some practical applications along with its challenges has been discussed. The work has been concluded with a discussion on potential opportunities for analytics in the healthcare sector.
3. Application of Tools and Techniques for Big Data Analytics of Healthcare System
In the past, various data-analysis tools and methods have been adopted to improve the services provided in a plethora of areas. This chapter highlights the improvements in terms of the effectiveness of predictions and inferences drawn so that future usage may be eased.
4. Healthcare and Medical Big Data Analytics
This chapter discusses effective data analysis, suitable classification and standardization of big data in medicine and healthcare, as well as excellent design and implementation of healthcare information systems.
5. Big Data Analytics in Medical Imaging
This chapter discusses the various medical image-processing tools and frameworks used for this purpose, such as Hadoop, MapReduce, YARN, Spark, and Hive. Machine-learning and deep-learning techniques are extensively used for carrying out the required analytics. Genetic algorithms and association rule learning techniques are also widely used.
6. Big Data Analytics and Artificial Intelligence in Mental Healthcare
In this chapter, the authors discuss the major opportunities, limitations, and techniques used for improving mental healthcare through AI and big data. They explore the computational, clinical, and ethical considerations and best practices, and lay out the major research directions for the near future.
7. Big Data Based Breast Cancer Prediction Using Kernel Support Vector Machine With the Gray Wolf Optimization Algorithm
Today, big data in healthcare is often used to predict disease. Breast cancer is one of the primary cancers affecting women. If the disease is recognized at an early stage, there is a greater chance of recovery. In this chapter, optimal features are selected using Oppositional Grasshopper Optimization (OGHO); these features are then processed in the training phase using the kernel support vector machine with the Gray Wolf Optimization algorithm (KSVMGWO) to predict breast cancer.
8. Big Data Based Medical Data Classification Using Oppositional Gray Wolf Optimization With Kernel Ridge Regression
The classification of medical data is an important data mining issue that has been discussed for nearly a decade and has attracted numerous researchers around the world. Selection procedures provide the pathologist with valuable information for diagnosing and treating diseases. In this chapter, the authors aim to develop machine-learning algorithms to effectively predict the outbreak of chronic disease in general communities.
9. An Analytical Hierarchical Process Evaluation on Parameters Apps-Based Big Data Analytics for Healthcare Services
Any healthcare management system can be studied in terms of access, integration, privacy and security, confidentiality, sharing, assurance/relevancy, reliability, and cost involvement for the data/documents in the system. It can be a concern to the healthcare centers. Accessibility is a complex concept, and at least four aspects—that is, availability, utilization, relevance, and equity of access—require evaluation. These parameters are discussed and evaluated in this chapter.
10. Firefly—Binary Cuckoo Search Technique Based Heart Disease Prediction in Big Data Analytics
Nowadays, big data analysis is given more attention in complex healthcare settings. Fetal growth curves, the classic case of big health data, are used to predict coronary heart disease (CAD). This work aims to predict the risk of CAD using machine-learning algorithms such as Firefly—Binary Cuckoo Search (FFBCS). The authors also suggest a preliminary analysis of the performance of the framework.
11. Hybrid Technique for Heart Diseases Diagnosis Based on Convolution Neural Network and Long Short-Term Memory
In this chapter, a hybrid deep neural network is trained on a dataset with 14 input features, combining the convolution neural network (CNN) and long short-term memory (LSTM) algorithms to predict the presence or absence of disease in patients, with the highest accuracy reaching 0.937. The results of the study showed that the CNN-LSTM hybrid model had the best accuracy, recall, precision, F1 score, and AUC compared to other techniques.
1
Big Data classification: techniques and tools
Pijush Kanti Dutta Pramanik¹, Saurabh Pal², Moutan Mukhopadhyay² and Simar Preet Singh³, ¹National Institute of Technology, Durgapur, India, ²Bengal Institute of Technology, Kolkata, India, ³Chandigarh Engineering College, Landran, India
Abstract
An enormous volume of data, known as Big Data, of varied properties, is continuously being generated from several sources. For efficient and consequential use of this huge amount of data, automated and correct categorization is very important. Precise categorization can reveal correlations, hidden patterns, and other valuable insights. The process of categorizing mixed heterogeneous data is known as data classification and is done based on some predefined features. Various algorithms and techniques have been proposed for Big Data classification. This chapter attempts to discuss various technicalities of Big Data classification comprehensively. To start with, the basics of Big Data classification, such as need, types, patterns, phases, and approaches, are explained. Different classification techniques, including traditional, evolutionary, and advanced machine learning techniques, are discussed with suitable examples, along with their advantages and disadvantages. Finally, a survey of various open-source and commercial libraries, platforms, and tools for Big Data classification is presented.
Keywords
Big Data; machine learning; classification techniques; evolutionary algorithms; classification tools; Big Data platforms
1.1 Introduction
The modern digitized and smart world is continuously generating data of enormous volume from various sources, such as smart healthcare [1,2], smart cities [3,4], smart agriculture [5,6], smart buildings [7,8], smart learning [9,10], modern industries [11,12], social media [13,14], autonomous vehicles [15,16], cognitive systems [17,18], and so on. These data are generated and transmitted in various forms, volumes, and speeds to different sinks. To be put to actionable use, these data must be mined to extract information and knowledge.
In traditional data mining, we generally use approaches like clustering, classification, and association rules. In Big Data, too, we have to mine useful and valuable information, but from extremely large data sets. The two most used techniques in the case of Big Data are Big Data classification [19] and Big Data clustering [20]. Classification is supervised learning, in which the classification algorithm needs training in order to classify accurately. Clustering, by contrast, is unsupervised learning and does not need prior training.
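The contrast between the two can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the nearest-centroid classifier and the two-cluster k-means routine are stand-ins chosen for brevity, the toy data points are invented, and the helper names (`fit_nearest_centroid`, `predict`, `two_means`) are hypothetical rather than taken from the chapter.

```python
from statistics import mean

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(mean(axis) for axis in zip(*points))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_nearest_centroid(samples, labels):
    """Supervised: the known labels are used to compute one centroid per class."""
    by_class = {}
    for point, label in zip(samples, labels):
        by_class.setdefault(label, []).append(point)
    return {label: centroid(pts) for label, pts in by_class.items()}

def predict(model, point):
    """Assign the class whose centroid is closest to the point."""
    return min(model, key=lambda label: sq_dist(model[label], point))

def two_means(samples, iterations=10):
    """Unsupervised: group unlabeled points into two clusters (basic k-means, k=2)."""
    centers = [samples[0], samples[-1]]  # naive initialisation (an assumption)
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in samples:
            idx = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            groups[idx].append(p)
        # Keep the old center if a cluster happens to be empty.
        centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
    return groups

# Invented toy data: two well-separated groups of 2-D points.
samples = [(1.0, 1.2), (0.8, 1.0), (5.0, 5.1), (5.2, 4.9)]
labels = ["low", "low", "high", "high"]

model = fit_nearest_centroid(samples, labels)   # needs labels
predicted = predict(model, (4.8, 5.0))          # classifies an unseen point

clusters = two_means(samples)                   # needs no labels
```

The classifier returns a named class because names were supplied during training, whereas the clustering routine only returns anonymous groups that must be interpreted afterwards.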
Classification allows grouping information by common attributes and comparing the groups by their similarities and differences. It helps in identifying and segregating data, which allows appropriate data tagging and thus makes the data easy to locate and retrieve, that is, searchable. This helps in separating relevant data from irrelevant data and in identifying duplicates. Tagging the data makes searching fast and data access easy [21]. We can say that classification techniques mold scattered data into shape, and this shaped data provides the basis for quality and confidence in the outcome [22]. In short, classification helps the user in gathering knowledge (knowledge discovery) and in future planning. Without proper classification, Big Data is bound to fail at yielding any valuable inferences.
Big Data classification has found many significant real-life applications such as predicting epidemic outbreak, drug discovery, providing and managing healthcare services [23,24], weather forecasting, product and service recommendation [25], sentiment analysis and opinion mining [14], user profiling, predicting the next possible crime in the city, network traffic classification and intrusion prediction, etc.
Since classification involves supervised learning, a classification model is built under supervision, using a training data set. By comprehending the data patterns of the training set, the model infers how to assign similar unknown data to a class. Specific strategies need to be worked out for how to use the vast data. Typically, before the data can be used, they are preprocessed [26,27]. During preprocessing, the data are cleaned of null values, missing values, and inconsistent data. Not all features in the data set are useful, so the required and suitable features are extracted using appropriate algorithms. The data are generally ready to be used after preprocessing and feature extraction and selection [28]. The accuracy of the model increases if the model is trained with good data. In classification, different techniques such as probabilistic models [29], decision trees [30], neural networks [31], and the support vector machine (SVM) [32] are used. Most of these techniques are also used in conventional data mining. However, since the volume of data is huge in Big Data, older data mining techniques need to be adapted to work for Big Data.
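The cleaning and feature-selection steps described above might look as follows in a minimal Python sketch. The records, field names, and the variance-based selection rule are assumptions made for illustration; dropping constant features is only one simple selection strategy and is not claimed to be the one the chapter's cited works use.

```python
from statistics import pvariance

def clean(records):
    """Preprocessing: drop records that contain a missing (None) value."""
    return [r for r in records if all(v is not None for v in r.values())]

def select_features(records, min_variance=0.0):
    """Feature selection: keep features whose variance exceeds the threshold.

    A feature that never varies cannot help separate the classes.
    """
    names = [k for k in records[0] if k != "label"]
    return [n for n in names if pvariance([r[n] for r in records]) > min_variance]

# Invented toy records; "site" is constant and "bp" has one missing value.
records = [
    {"age": 63, "bp": 145, "site": 1, "label": "sick"},
    {"age": 41, "bp": None, "site": 1, "label": "healthy"},
    {"age": 37, "bp": 120, "site": 1, "label": "healthy"},
    {"age": 58, "bp": 150, "site": 1, "label": "sick"},
]

cleaned = clean(records)              # the record with bp=None is removed
features = select_features(cleaned)   # the constant "site" feature is dropped
```

Only after these two steps would the remaining records and features be passed to a classifier for training.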
This chapter presents a comprehensive discussion of different techniques and tools used in Big Data classification. The rest of the chapter is organized as follows. Section 1.2 presents the preliminary of Big Data classification, including the need, challenges, types, pattern, phases, and approaches of Big Data classification. Various algorithms and techniques of Big Data classification are elaboratively discussed in Section 1.3. The recent and popular tools, libraries, and platforms for Big Data classification are discussed in Section 1.4. Section 1.5 summarizes the chapter.
1.2 Big Data classification
Since the term Big Data denotes data sets that are too large or complex, the classification of these data sets is important for a deeper understanding of the data [33]. Using Big Data classification, different analyses can be performed on the data for accurate prediction.
In this section, we discuss what classification is, why it is required, and the various types of classification. Phases of classification, classification patterns, and the challenges are also covered subsequently.
1.2.1 Definition of classification
The process of classifying data into different categories on the basis of some attributes or features is known as classification [34]. In data mining, it is a data analysis process that involves predicting the class of newly observed data through modeling [35]. Classification is a crucial function of Big Data processing and directly contributes to knowledge discovery. Fig. 1.1 shows the different phases of Big Data processing, including Big Data classification.
Figure 1.1 Phases of Big Data processing.
1.2.2 Need for classification in Big Data
Data classification helps in knowledge discovery and intelligent decision making [35]. Thus, it plays a vital role in Big Data. Big Data is so huge as well as complex that without proper classification, finding information from Big Data will be like finding a needle in a haystack. Classification is required to systematically analyze this massive amount of data by organizing them into suitable classes. This helps in developing a precise model or description for each defined class using the data features of that class. The different aspects of the need for classification of Big Data are shown in Fig. 1.2.
Figure 1.2 Different aspects of the need for classification of Big Data.
1.2.3 Challenges in Big Data classification
Big Data involves complexity, thanks to its multifaceted properties. Hence, obviously, handling Big Data will not be trivial. The same applies to Big Data classification, which is hugely challenging due to several factors, most of which are inherent to Big Data. Fig. 1.3 lists some of the crucial challenges in Big Data classification.
Figure 1.3 Big Data classification challenges.
1.2.4 Types of classification
Classification can be categorized into the following three types:
1. Binary classification: Binary classification is a classification method in which new data are assigned to one of two possible classes, that is, it categorizes items into two groups [36]. An example is classifying the state of a machine as either faulty or good, a task with exactly two possible outcomes. Other application areas where binary classification can be used are medical diagnosis, spam detection, etc.
2. Multiclass classification: As the name suggests, multiclass classification has more than two classes as possible outcomes. Multiclass or multinomial classification is a technique of assigning items to one of N classes, where N is greater than two. Examples of multiclass classification are segregating emails into the appropriate folder, gene expression categorization, etc. In this classification, one target label is assigned to each sample, and a sample cannot have two or more labels at the same time [36]. For example, an animal can be a dog or a cat, but not both at the same time [37].
3. Multilabel classification: In multilabel classification, each sample of the data set is mapped to a set of target labels, that is, it can belong to more than one class [37]. An example of multilabel classification is a news article, which can be about sports, a location, and a person at the same time [38].
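The three types differ mainly in the shape of the target each sample carries, which a short Python sketch can make concrete. The helper names `task_type` and `to_indicator`, and the example targets, are hypothetical and introduced only for illustration.

```python
def task_type(targets):
    """Guess the classification type from the shape of the targets.

    A collection of labels per sample means multilabel; otherwise the
    number of distinct single labels decides binary versus multiclass.
    """
    if all(isinstance(t, (set, frozenset, list, tuple)) for t in targets):
        return "multilabel"
    return "binary" if len(set(targets)) <= 2 else "multiclass"

def to_indicator(targets, classes):
    """Encode multilabel targets as rows of 0/1 flags, one column per class."""
    return [[1 if c in t else 0 for c in classes] for t in targets]

# One label per sample, two possible values.
binary_targets = ["spam", "ham", "spam"]
# One label per sample, more than two possible values.
multiclass_targets = ["dog", "cat", "bird"]
# A set of labels per sample, as with the news-article example.
multilabel_targets = [{"sports", "person"}, {"location"}, {"sports", "location", "person"}]

classes = ["location", "person", "sports"]
indicator = to_indicator(multilabel_targets, classes)
```

The indicator matrix is a common way to hand multilabel targets to a learning algorithm, since each column can then be treated as its own yes/no question.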
1.2.5 Big Data classification approaches
Typically, the following two approaches are followed for Big Data classification:
1. Supervised: The supervised classification approach learns classification logic under directed supervision, by understanding the patterns in a known, labeled data set. This approach takes a large volume of data, called the training data set, as input. Data analysis is performed on this data set to learn the classification rules for categorizing data into the given classes. The rules learnt in this process are based on the features or attributes of each element of a class [39]. Since the set of possible classes is known in advance in this approach, it is also known as directed or predictive classification.
2. Unsupervised: The unsupervised classification approach learns classification logic from a set of unlabeled data; that is, the class labels are unknown, and the training sets do not come with predefined class labels. In other words, the set of possible classes to which a datum will be assigned is not known a priori; the classes are given names after classification. Hence, this approach is often known as clustering. The classification is carried out by comparing the features of the data. This approach is useful when features or attributes are unknown, and therefore, unsupervised classification techniques are often described as descriptive or undirected. An example of this technique is arranging a bucket full of fruits where all fruits are in jumbled order, and the aim is to classify the fruits into groups