Disease Prediction using Machine Learning, Deep Learning and Data Analytics
Ebook, 389 pages, 3 hours

About this ebook

This book is a comprehensive review of technologies and data in healthcare services. It features a compilation of 10 chapters that inform readers about the recent research and developments in this field. Each chapter focuses on a specific aspect of healthcare services, highlighting the potential impact of technology on enhancing practices and outcomes.
The main features of the book include 1) referenced contributions from healthcare and data analytics experts, 2) a broad range of topics that cover healthcare services, and 3) demonstration of deep learning techniques for specific diseases.

Key topics:
- Federated learning in analysis of sensitive healthcare data while preserving privacy and security.
- Artificial intelligence for 3-D bone image reconstruction.
- Detection of disease severity and creation of personalized treatment plans using machine learning and software tools
- Case studies of disease detection methods for different diseases and conditions, including dementia, asthma, and eye diseases
- Brain-computer interfaces
- Data mining for standardized electronic health records
- Data collection, management, and analysis in epidemiological research

The book is a resource for learners and professionals in healthcare service training programs and health administration departments.

Readership
Learners and professionals in healthcare service training programs and health administration departments.

Language: English
Release date: Mar 7, 2024
ISBN: 9789815179125

    Book preview

    Disease Prediction using Machine Learning, Deep Learning and Data Analytics - Geeta Rani

    Role of Federated Learning in Healthcare: A Review

    Geeta Rani³, Meet Oza¹, Heta Patel¹, Vijaypal Singh Dhaka³, *, Sushma Hans²

    ¹ Manipal University Jaipur, Jaipur, India

    ² Amity University, Dubai Campus, Dubai, United Arab Emirates

    ³ Department of Computer and Communication Engineering, Manipal University Jaipur, Jaipur, India

    Abstract

    In the modern era, there is a boom in automating medical diagnosis by adopting emerging technologies and advanced applications of artificial intelligence. These technologies require a huge amount of data for training the models and precisely predicting the disease or disorder. Multiple organizations can contribute data for such systems, but maintaining data privacy while sharing the data is a major challenge. Provisioning a large data corpus to improve the performance of machine learning and deep learning models in the healthcare domain, while keeping patients' medical confidentiality intact, is a further concern. Thus, there is a strong need to preserve the privacy of medical data. This calls for technologies that remove the need to share raw data entirely, while each organization receives a tailored infrastructure for processing its data. A cross-silo federated learning model is based on collecting model weights, rather than data, from multiple decentralized clients; the weights are then aggregated on a central server, so the raw data stays private. In this manuscript, the authors provide a detailed comparative study of different deep learning-based models in federated learning and of how efficiently they can classify lung X-ray images into three classes: Covid-19, Pneumonia, and Normal. This study can provide a benchmark for researchers looking to apply deep learning-based models in cross-silo federated learning in healthcare.

    Keywords: Covid-19, Diagnosis, Deep learning, Federated, Medical, Machine learning, Segmentation, X-Ray.


    * Corresponding author Vijaypal Singh Dhaka: Department of Computer and Communication Engineering, Manipal University Jaipur, Jaipur, India; E-mail: vijaypalsingh.dhaka@jaipur.manipal.edu

    INTRODUCTION

    When the Covid-19 pandemic hit the world, it became very important to find ways to detect the novel virus apart from the usual RT-PCR tests. Scientists and researchers around the globe have studied, developed, and presented numerous ways to detect it. Many researchers have also shown how to detect Covid-19 and pneumonia from CT scans and X-rays of the lungs using machine learning techniques. However, accurate prediction with these machine learning and deep learning models requires a large amount of training data. In real scenarios, such large datasets are either infeasible due to system constraints or compromise the patient's medical confidentiality. The feasibility issues persist even though many image annotation tools are available on the market, because such tools and services are very expensive and require expert supervision and proficiency, especially when they are used for disease diagnosis. Along with monetary hindrances, the feasibility issue also covers the communication overhead of transmitting a large dataset for centralization [1]. This issue exists in the healthcare domain and in every other domain where models must learn from multiple clients' data without invading the users' privacy.

    The application of traditional machine learning and deep learning in healthcare has already been studied by various researchers for tumor prediction [2], COVID-19 screening [3, 4], disease prediction [5], cardiovascular disease prediction [6], coronary artery disease prediction [7], diabetes prediction [8], glaucoma detection [9], etc. A similar concern about users' privacy was encountered by Google in 2016, when the team coined the term 'Federated Learning' while advocating a novel approach that uses distributed data on mobile devices for training. In this approach, a central model is updated using only the aggregate of the parameters from the local mobile devices [10]. Federated learning inherently trains the central model based only on the parameters passed on by the local machine learning models. In addition, the parameters are encrypted before being passed on, which increases data privacy. Federated learning differs from distributed learning in that its main objective is training on a large dataset held by different clients without transferring raw data, whereas distributed learning focuses on distributing the computing resources across clients [11]. Combining federated learning with cross-silo transferred learning opens the door to more accurate innovations, because a huge data corpus becomes available for training without the data actually being exchanged or transferred.

    In cross-silo federated learning, data is segregated into silos, i.e., multiple confined data sources, and a model is trained and aggregated centrally using only the trained weights and parameters passed out by each client. For collaboration between institutions in healthcare, finance, etc., user data is extremely sensitive, and open alliances might expose such data to various vulnerabilities. Cross-silo federated learning permits only the transfer of weights and parameters, and hence the data stays fenced within the silos. It therefore serves as a superior alternative to the traditional centralized machine learning approaches. Federated learning has so far been applied to various healthcare applications. For example, distributed learning has been used to solve a problem related to hospitalizations due to cardiac cases [12], and machine learning in a federated setting has been leveraged to predict fatality and duration of hospital stay using electronic medical records [13].
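
    The round structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the chapter's implementation: the model is a hypothetical one-dimensional linear regressor, the silo data is synthetic, and the names `local_train`, `federated_round`, and `make_silo` are invented for the example. Only the parameter pair `(w, b)` leaves a silo; the `(x, y)` records never do. Encryption of the parameters is omitted.

```python
import random


def local_train(weights, data, lr=0.1, epochs=10):
    """Run a few epochs of gradient descent on one silo's private data.

    Hypothetical model: y = w * x + b. Only the updated (w, b) pair is
    returned to the server; the raw (x, y) records stay inside the silo.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error, computed before either update.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_round(global_weights, silos):
    """One round: broadcast the model, train locally, average the results."""
    updates = [local_train(global_weights, data) for data in silos]
    k = len(updates)
    # Plain mean is used here because all silos are the same size.
    return (sum(u[0] for u in updates) / k, sum(u[1] for u in updates) / k)


def make_silo(seed, n=50):
    """Synthetic private data for one silo, drawn from y = 3x + 1 plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3 * x + 1 + rng.gauss(0, 0.1)) for x in xs]


silos = [make_silo(seed) for seed in range(3)]
weights = (0.0, 0.0)
for _ in range(30):
    weights = federated_round(weights, silos)
print(weights)  # should land close to the true parameters (3, 1)
```

    The server in this sketch sees three parameter pairs per round and nothing else, which is the property that distinguishes the federated setting from centralizing the data.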

    Federated learning systems seem promising, but because of their dispersed nature they face certain challenges too, such as communication cost, resource cost, and security of communication. High communication cost refers to the overhead incurred by the many transmissions of parameters during training. Parameters must be transferred frequently between silos to produce good results, and this communication overhead hinders federated learning, especially when the connection is slow and a considerable number of devices or organizations are involved. Also, since FL works on distributed systems, each participating system might have different computational power, storage capacity, and bandwidth. A single slightly less efficient system can be the weak link of the entire process; on the other hand, providing all the participating organizations with full-fledged resources might increase the system cost. Apart from these overheads, privacy concerns also remain in federated learning. Although federated learning is known as a mechanism for preserving user data privacy, it is not widely appreciated that FL by itself does not fully protect data privacy. Recent studies [14] have revealed that, as models communicate constantly to transfer parameters, the process leaks some information. For example, a study [15] showed that even a small portion of the original gradients may be enough to reveal local data. Moreover, since the parameters are obtained via model training at the local level, vulnerabilities such as model inversion or attacks on model parameters can corrupt aggregate inference.
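
    A rough back-of-the-envelope calculation makes the communication overhead and the weak-link effect concrete. The sketch below rests on simplifying assumptions that are not from the chapter: every client downloads and uploads the full model once per round, parameters are 4-byte floats, rounds are synchronous, and local compute time is ignored. The model size, client count, and bandwidths are hypothetical.

```python
def round_traffic_bytes(num_params, num_clients, bytes_per_param=4):
    """Total bytes moved in one round: one download plus one upload per client."""
    return 2 * num_clients * num_params * bytes_per_param


def round_duration_s(num_params, bandwidths_mbps, bytes_per_param=4):
    """Transfer time of a synchronous round, set by the slowest client link."""
    payload_bits = 2 * num_params * bytes_per_param * 8
    return max(payload_bits / (mbps * 1_000_000) for mbps in bandwidths_mbps)


# Hypothetical setup: a 25-million-parameter model shared by 5 hospitals,
# four on 100 Mbit/s links and one on a 10 Mbit/s link.
params = 25_000_000
traffic_gb = round_traffic_bytes(params, 5) / 1e9
duration = round_duration_s(params, [100, 100, 100, 100, 10])
print(traffic_gb)  # gigabytes exchanged per round
print(duration)    # seconds per round, dominated by the 10 Mbit/s client
```

    Under these assumptions each round moves 1 GB in total, and the single slow link stretches the round from 16 seconds to 160 seconds for everyone.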

    Even after weighing the opportunities and hindrances that federated learning presents in healthcare, we can conclude that it remains well ahead of traditional machine learning practices, whose user data confidentiality issues directly make patients' data susceptible. In this research, we propose a federated learning model to classify chest X-rays as COVID-19, Pneumonia, and normal lungs.

    LITERATURE REVIEW

    This section reviews previous work on federated learning for detecting lung abnormalities. Many researchers have experimented with federated learning to detect the COVID-19 virus in human lungs, and most of these studies use CT scans as input data. For example, Qi Dou et al. [16] presented a deep convolutional neural network-based artificial intelligence model that incorporated federated learning to detect COVID-19 lung abnormalities in CT scans. The main dataset was compiled from 3 hospitals in Hong Kong with 75 confirmed COVID-19 patients. For external validation, the researchers used 4 other datasets, which included 22 patients from China and 35 patients from Germany in total. The effectiveness of federated learning was checked on full CT slices, i.e., without prior knowledge of whether tissue damage was present. For every client in this experiment, the central aggregating server was given the size of the dataset used as input on that local device. The server was also provided with the weighted average of the local models for updating the global model. The model trained on an internal dataset showed the highest AUC score of 95.40%, mostly because it had the single largest dataset. However, the joint model trained with all the internal datasets achieved an AUC score of 92.97%. Along with the federated learning model, the researchers applied ensemble learning over three models corresponding to the three datasets. The comparison clearly showed that federated learning outperformed the model ensemble method across all metrics. The finding of this research was that transfer learning from an extensive dataset is more suitable for curtailing false-positive predictions. Although multiple sources of data were considered for evaluation, they still covered only 132 patients, which might induce model bias.
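
    The size-weighted aggregation mentioned above can be illustrated as follows. This is a sketch of the general idea (a weighted average of local parameters, with each client's dataset size as its weight), not the code used in [16]; the parameter vectors and dataset sizes are made-up values, and `weighted_average` is a name chosen for the example.

```python
def weighted_average(client_weights, client_sizes):
    """Average parameter vectors, weighting each client by its dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical local models (two parameters each) and local dataset sizes.
local_models = [[1.0, 0.0], [2.0, 2.0], [4.0, 6.0]]
sizes = [10, 30, 60]

global_model = weighted_average(local_models, sizes)
print(global_model)  # approximately [3.1, 4.2]
```

    The client holding 60 of the 100 samples pulls the global model toward its own parameters, which is why the server needs each client's dataset size in addition to its weights.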

    Similarly, Dong Yang et al. [17] experimented with combining federated learning and semi-supervised learning. The COVID-19 data included 736 CT scans of 700 patients from China, 496 scans of 244 patients from Japan, and 472 scans of 147 patients from Italy. Other data included 38 CT scans of 38 patients from the National Institutes of Health. The CT scans of these patients, who had known non-COVID-19 types of pneumonia caused by bacteria, fungi, and non-COVID viruses, were included as other pneumonia. The dataset also included 101 CT scans of 101 patients, men diagnosed with prostate cancer, and 474 CT scans of 474 patients from the LIDC public dataset. The study shows that the framework is efficient at extracting valuable information only if unlabelled data is input by the clients, and it also demonstrates that a lower learning rate on unsupervised clients generally benefits all clients involved in the federated learning process. The researchers showed that, for identifying COVID-19-infected regions, data from patients with a completely COVID-19-free background also turned out to be useful. This was achieved via 'false alarm rejection'. However, the model fails to classify other types of lung abnormalities such as pneumonia or cancer, and because the dataset has high variation in demographics, the accuracy metrics range approximately between 50% and 60%.

    Further, Rajesh Kumar et al. [18] proposed detecting COVID-19 from CT scan images using a combination of federated learning, blockchain, and deep learning models. The researchers presented Capsule Network-based segmentation and classification for detecting COVID-19 patterns in lung CT scans. Their method uses blockchain to fetch only the trained model parameters, preserving the client's privacy because the raw dataset is stored by the organization itself and is not transmitted further. In essence, a blockchain network receives the locally and individually trained parameters of each client. This local model is then fed into federated learning to merge it with the central model. The newly introduced dataset is assembled from hospitals and CT scanner machines, and hence a data normalization technique is applied. The data, collected from 3 hospitals, consists of 34,006 CT scans of 89 patients, of whom 68 were identified as COVID-19 positive. The data normalization has two parts: spatial normalization and signal normalization. Segmentation is performed to obtain 2D CT scan slices, and COVID-19 classification uses a capsule network that is very much like Hinton's Capsule Network. The Capsule Network is made up of four layers: the convolutional layer, hidden layer, PrimaryCaps layer, and DigitCaps layer. The study demonstrated that the Capsule Network is superior to the standard Artificial Neural Network (ANN): capsules take the place of neurons, and the output is a vector representation of each component instead of a scalar value as in the ANN. Third-party datasets were also used to validate the federated model. The deep learning models used for comparative experiments are VGG16, AlexNet, Inception V3, ResNet (50 to 152 layers), MobileNet, and DenseNet. The proposed Capsule Network achieved the highest accuracy, 98.68%, on the COVID-19 images. The Capsule Network also has the highest sensitivity/recall, while ResNet has the best specificity. The study also shows that increasing the number of clients in turn improves the performance of the central model.
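
    To illustrate what a vector-valued capsule output means in practice, the sketch below implements the "squash" nonlinearity from the original capsule network formulation by Hinton and co-authors. Whether the network in [18] uses exactly this function is an assumption; the text only says it is very much like Hinton's Capsule Network. The function names and input vectors are chosen for the example.

```python
import math


def squash(s, eps=1e-9):
    """Shrink a capsule's raw output vector to a length in [0, 1).

    The direction of s is kept; its length becomes |s|^2 / (1 + |s|^2),
    so the length can be read as the probability that a feature is present.
    """
    norm_sq = sum(x * x for x in s)
    norm = math.sqrt(norm_sq)
    scale = norm_sq / (1.0 + norm_sq) / (norm + eps)  # eps avoids 0/0
    return [scale * x for x in s]


def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


strong = squash([3.0, 4.0])  # long input vector: length close to 1
weak = squash([0.1, 0.0])    # short input vector: length close to 0
print(length(strong), length(weak))
```

    A scalar neuron would report only the equivalent of the length; the capsule additionally keeps the direction of the vector, which can encode properties of the detected component.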
