Introduction to Algorithms for Data Mining and Machine Learning
By Xin-She Yang
()
About this ebook
Introduction to Algorithms for Data Mining and Machine Learning introduces the essential ideas behind all key algorithms and techniques for data mining and machine learning, along with optimization techniques. Its strong formal mathematical approach, well selected examples, and practical software recommendations help readers develop confidence in their data modeling skills so they can process and interpret data for classification, clustering, curve-fitting and predictions. Masterfully balancing theory and practice, it is especially useful for those who need relevant, well explained, but not rigorous (proofs based) background theory and clear guidelines for working with big data.
- Presents an informal, theorem-free approach with concise, compact coverage of all fundamental topics
- Includes worked examples that help users increase confidence in their understanding of key algorithms, thus encouraging self-study
- Provides algorithms and techniques that can be implemented in any programming language, with each chapter including notes about relevant software packages
Xin-She Yang
Xin-She Yang obtained his DPhil in Applied Mathematics from the University of Oxford. He then worked at Cambridge University and National Physical Laboratory (UK) as a Senior Research Scientist. He is currently a Reader in Modelling and Simulation at Middlesex University London, Fellow of the Institute of Mathematics and its Application (IMA) and a Book Series Co-Editor of the Springer Tracts in Nature-Inspired Computing. He has published more than 25 books and more than 400 peer-reviewed research publications with over 82000 citations, and he has been on the prestigious list of highly cited researchers (Web of Sciences) for seven consecutive years (2016-2022).
Read more from Xin She Yang
Engineering Mathematics with Examples and Applications Rating: 4 out of 5 stars4/5Engineering Optimization: An Introduction with Metaheuristic Applications Rating: 0 out of 5 stars0 ratingsOptimization Techniques and Applications with Examples Rating: 0 out of 5 stars0 ratingsNature-Inspired Optimization Algorithms Rating: 0 out of 5 stars0 ratingsEngineering Simulation and its Applications: Algorithms and Numerical Methods Rating: 0 out of 5 stars0 ratingsBio-Inspired Computation and Applications in Image Processing Rating: 0 out of 5 stars0 ratingsBio-Inspired Computation in Telecommunications Rating: 0 out of 5 stars0 ratings
Related to Introduction to Algorithms for Data Mining and Machine Learning
Related ebooks
Practical Machine Learning for Data Analysis Using Python Rating: 0 out of 5 stars0 ratingsDeep Learning in Bioinformatics: Techniques and Applications in Practice Rating: 0 out of 5 stars0 ratingsComputational Learning Approaches to Data Analytics in Biomedical Applications Rating: 5 out of 5 stars5/5Designing Machine Learning Systems with Python Rating: 0 out of 5 stars0 ratingsData Science: Concepts and Practice Rating: 3 out of 5 stars3/5Deep Learning through Sparse and Low-Rank Modeling Rating: 0 out of 5 stars0 ratingsNeural Data Science: A Primer with MATLAB® and Python™ Rating: 5 out of 5 stars5/5Pattern Recognition Rating: 4 out of 5 stars4/5Pattern Recognition and Machine Learning Rating: 0 out of 5 stars0 ratingsTensorFlow A Complete Guide - 2019 Edition Rating: 0 out of 5 stars0 ratingsDeep Learning for Data Analytics: Foundations, Biomedical Applications, and Challenges Rating: 0 out of 5 stars0 ratingsConvolutional Neural Networks in Python: Beginner's Guide to Convolutional Neural Networks in Python Rating: 0 out of 5 stars0 ratingsA Greater Foundation for Machine Learning Engineering: The Hallmarks of the Great Beyond in Pytorch, R, Tensorflow, and Python Rating: 0 out of 5 stars0 ratingsPractical Machine Learning Cookbook Rating: 0 out of 5 stars0 ratingsTrends in Deep Learning Methodologies: Algorithms, Applications, and Systems Rating: 0 out of 5 stars0 ratingsMachine Learning in Action Rating: 0 out of 5 stars0 ratingsBuilding Machine Learning Systems with Python Rating: 4 out of 5 stars4/5Introduction to Statistical Machine Learning Rating: 4 out of 5 stars4/5Machine Learning: A Bayesian and Optimization Perspective Rating: 3 out of 5 stars3/5Machine Learning and Data Mining Rating: 3 out of 5 stars3/5Data Mining: Practical Machine Learning Tools and Techniques Rating: 4 out of 5 stars4/5Computer Vision and Applications: A Guide for Students and Practitioners,Concise Edition Rating: 5 out of 5 stars5/5Deep Learning for Medical Image Analysis Rating: 4 out of 5 stars4/5TensorFlow Machine Learning Cookbook Rating: 4 out of 5 stars4/5Python Deep Learning Rating: 5 out of 5 stars5/5
Mathematics For You
Calculus Made Easy Rating: 4 out of 5 stars4/5Real Estate by the Numbers: A Complete Reference Guide to Deal Analysis Rating: 0 out of 5 stars0 ratingsThe Little Book of Mathematical Principles, Theories & Things Rating: 3 out of 5 stars3/5My Best Mathematical and Logic Puzzles Rating: 5 out of 5 stars5/5Quantum Physics for Beginners Rating: 4 out of 5 stars4/5Standard Deviations: Flawed Assumptions, Tortured Data, and Other Ways to Lie with Statistics Rating: 4 out of 5 stars4/5The Math Book: From Pythagoras to the 57th Dimension, 250 Milestones in the History of Mathematics Rating: 3 out of 5 stars3/5Flatland Rating: 4 out of 5 stars4/5Algebra - The Very Basics Rating: 5 out of 5 stars5/5The Everything Guide to Algebra: A Step-by-Step Guide to the Basics of Algebra - in Plain English! Rating: 4 out of 5 stars4/5Mental Math Secrets - How To Be a Human Calculator Rating: 5 out of 5 stars5/5The Thirteen Books of the Elements, Vol. 1 Rating: 0 out of 5 stars0 ratingsSummary of The Black Swan: by Nassim Nicholas Taleb | Includes Analysis Rating: 5 out of 5 stars5/5Basic Math & Pre-Algebra For Dummies Rating: 4 out of 5 stars4/5Game Theory: A Simple Introduction Rating: 4 out of 5 stars4/5The Everything Everyday Math Book: From Tipping to Taxes, All the Real-World, Everyday Math Skills You Need Rating: 5 out of 5 stars5/5The Math of Life and Death: 7 Mathematical Principles That Shape Our Lives Rating: 4 out of 5 stars4/5Algebra I Workbook For Dummies Rating: 3 out of 5 stars3/5Geometry For Dummies Rating: 5 out of 5 stars5/5The Golden Ratio: The Divine Beauty of Mathematics Rating: 5 out of 5 stars5/5Relativity: The special and the general theory Rating: 5 out of 5 stars5/5A Mind for Numbers | Summary Rating: 4 out of 5 stars4/5Introducing Game Theory: A Graphic Guide Rating: 4 out of 5 stars4/5Is God a Mathematician? Rating: 4 out of 5 stars4/5
Reviews for Introduction to Algorithms for Data Mining and Machine Learning
0 ratings0 reviews
Book preview
Introduction to Algorithms for Data Mining and Machine Learning - Xin-She Yang
Introduction to Algorithms for Data Mining and Machine Learning
First edition
Xin-She Yang
Middlesex University, School of Science and Technology, London, United Kingdom
Table of Contents
Cover image
Title page
Copyright
About the author
Preface
Acknowledgments
1: Introduction to optimization
Abstract
1.1. Algorithms
1.2. Optimization
1.3. Unconstrained optimization
1.4. Nonlinear constrained optimization
1.5. Notes on software
Bibliography
2: Mathematical foundations
Abstract
2.1. Convexity
2.2. Computational complexity
2.3. Norms and regularization
2.4. Probability distributions
2.5. Bayesian network and Markov models
2.6. Monte Carlo sampling
2.7. Entropy, cross entropy, and KL divergence
2.8. Fuzzy rules
2.9. Data mining and machine learning
2.10. Notes on software
Bibliography
3: Optimization algorithms
Abstract
3.1. Gradient-based methods
3.2. Variants of gradient-based methods
3.3. Optimizers in deep learning
3.4. Gradient-free methods
3.5. Evolutionary algorithms and swarm intelligence
3.6. Notes on software
Bibliography
4: Data fitting and regression
Abstract
4.1. Sample mean and variance
4.2. Regression analysis
4.3. Nonlinear least squares
4.4. Overfitting and information criteria
4.5. Regularization and Lasso method
4.6. Notes on software
Bibliography
5: Logistic regression, PCA, LDA, and ICA
Abstract
5.1. Logistic regression
5.2. Softmax regression
5.3. Principal component analysis
5.4. Linear discriminant analysis
5.5. Singular value decomposition
5.6. Independent component analysis
5.7. Notes on software
Bibliography
6: Data mining techniques
Abstract
6.1. Introduction
6.2. Hierarchy clustering
6.3. k-Nearest-neighbor algorithm
6.4. k-Means algorithm
6.5. Decision trees and random forests
6.6. Bayesian classifiers
6.7. Data mining for big data
6.8. Notes on software
Bibliography
7: Support vector machine and regression
Abstract
7.1. Statistical learning theory
7.2. Linear support vector machine
7.3. Kernel functions and nonlinear SVM
7.4. Support vector regression
7.5. Notes on software
Bibliography
8: Neural networks and deep learning
Abstract
8.1. Learning
8.2. Artificial neural networks
8.3. Back propagation algorithm
8.4. Loss functions in ANN
8.5. Optimizers and choice of optimizers
8.6. Network architecture
8.7. Deep learning
8.8. Tuning of hyperparameters
8.9. Notes on software
Bibliography
Bibliography
Index
Copyright
Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
Copyright © 2019 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
ISBN: 978-0-12-817216-2
For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals
Publisher: Candice Janco
Acquisition Editor: J. Scott Bentley
Editorial Project Manager: Michael Lutz
Production Project Manager: Nilesh Kumar Shah
Designer: Miles Hitchen
Typeset by VTeX
About the author
Xin-She Yang obtained his PhD in Applied Mathematics from the University of Oxford. He then worked at Cambridge University and National Physical Laboratory (UK) as a Senior Research Scientist. Now he is Reader at Middlesex University London, and an elected Bye-Fellow at Cambridge University.
He is also the IEEE Computer Intelligence Society (CIS) Chair for the Task Force on Business Intelligence and Knowledge Management, Director of the International Consortium for Optimization and Modelling in Science and Industry (iCOMSI), and an Editor of Springer's Book Series Springer Tracts in Nature-Inspired Computing (STNIC).
With more than 20 years of research and teaching experience, he has authored 10 books and edited more than 15 books. He published more than 200 research papers in international peer-reviewed journals and conference proceedings with more than 36 800 citations. He has been on the prestigious lists of Clarivate Analytics and Web of Science highly cited researchers in 2016, 2017, and 2018. He serves on the Editorial Boards of many international journals including International Journal of Bio-Inspired Computation, Elsevier's Journal of Computational Science (JoCS), International Journal of Parallel, Emergent and Distributed Systems, and International Journal of Computer Mathematics. He is also the Editor-in-Chief of the International Journal of Mathematical Modelling and Numerical Optimisation.
Preface
Xin-She Yang
Both data mining and machine learning are becoming popular subjects for university courses and industrial applications. This popularity is partly driven by the Internet and social media because they generate a huge amount of data every day, and the understanding of such big data requires sophisticated data mining techniques. In addition, many applications such as facial recognition and robotics have extensively used machine learning algorithms, leading to the increasing popularity of artificial intelligence. From a more general perspective, both data mining and machine learning are closely related to optimization. After all, in many applications, we have to minimize costs, errors, energy consumption, and environment impact and to maximize sustainability, productivity, and efficiency. Many problems in data mining and machine learning are usually formulated as optimization problems so that they can be solved by optimization algorithms. Therefore, optimization techniques are closely related to many techniques in data mining and machine learning.
Courses on data mining, machine learning, and optimization are often compulsory for students, studying computer science, management science, engineering design, operations research, data science, finance, and economics. All students have to develop a certain level of data modeling skills so that they can process and interpret data for classification, clustering, curve-fitting, and predictions. They should also be familiar with machine learning techniques that are closely related to data mining so as to carry out problem solving in many real-world applications. This book provides an introduction to all the major topics for such courses, covering the essential ideas of all key algorithms and techniques for data mining, machine learning, and optimization.
Though there are over a dozen good books on such topics, most of these books are either too specialized with specific readership or too lengthy (often over 500 pages). This book fills in the gap with a compact and concise approach by focusing on the key concepts, algorithms, and techniques at an introductory level. The main approach of this book is informal, theorem-free, and practical. By using an informal approach all fundamental topics required for data mining and machine learning are covered, and the readers can gain such basic knowledge of all important algorithms with a focus on their key ideas, without worrying about any tedious, rigorous mathematical proofs. In addition, the practical approach provides about 30 worked examples in this book so that the readers can see how each step of the algorithms and techniques works. Thus, the readers can build their understanding and confidence gradually and in a step-by-step manner. Furthermore, with the minimal requirements of basic high school mathematics and some basic calculus, such an informal and practical style can also enable the readers to learn the contents by self-study and at their own pace.
This book is suitable for undergraduates and graduates to rapidly develop all the fundamental knowledge of data mining, machine learning, and optimization. It can also be used by students and researchers as a reference to review and refresh their knowledge in data mining, machine learning, optimization, computer science, and data science.
January 2019 in London
Acknowledgments
Xin-She Yang
I would like to thank all my students and colleagues who have given valuable feedback and comments on some of the contents and examples of this book. I also would like to thank my editors, J. Scott Bentley and Michael Lutz, and the staff at Elsevier for their professionalism. Last but not least, I thank my family for all the help and support.
January 2019
1
Introduction to optimization
Abstract
This chapter introduces the fundamentals of algorithms in the context of data mining, optimization, and machine learning, including the feasibility, constraints, optimality, Lagrange multipliers, KKT conditions, and gradient-based techniques.
Keywords
Algorithm; data mining; gradient; machine learning; optimization
Chapter Outline
1.1 Algorithms
1.1.1 Essence of an algorithm
1.1.2 Issues with algorithms
1.1.3 Types of algorithms
1.2 Optimization
1.2.1 A simple example
1.2.2 General formulation of optimization
1.2.3 Feasible solution
1.2.4 Optimality criteria
1.3 Unconstrained optimization
1.3.1 Univariate functions
1.3.2 Multivariate functions
1.4 Nonlinear constrained optimization
1.4.1 Penalty method
1.4.2 Lagrange multipliers
1.4.3 Karush–Kuhn–Tucker conditions
1.5 Notes on software
This book introduces the most fundamentals and algorithms related to optimization, data mining, and machine learning. The main requirement is some understanding of high-school mathematics and basic calculus; however, we will review and introduce some of the mathematical foundations in the first two chapters.
1.1 Algorithms
An algorithm is an iterative, step-by-step procedure for computation. The detailed procedure can be a simple description, an equation, or a series of descriptions in combination with equations. Finding the roots of a polynomial, checking if a natural number is a prime number, and generating random numbers are all algorithms.
1.1.1 Essence of an algorithm
, we can use the following iterative equation:
(1.1)
where k .
Example 1
, then we have
(1.2)
Similarly, we have
(1.3)
(1.4)
. The accuracy of this iterative formula or algorithm is high because it achieves the accuracy of five decimal places after four iterations.
due to division by zero.
is equivalent to solving the equation
(1.5)
. We know that Newton's root-finding algorithm can be written as
(1.6)
. Thus, Newton's formula becomes
(1.7)
which can be written as
(1.8)
This is exactly what we have in Eq. (1.1).
Newton's method has rigorous mathematical foundations, which has a guaranteed convergence under certain conditions. However, in general, Eq. .
1.1.2 Issues with algorithms
The advantage of the algorithm given in Eq. , how can we find the other root −2 in addition to +2?
, not −2.
, we have
(1.9)
(1.10)
, not +2.
This highlights a key issue here: the final solution seems to depend on the initial starting point for this algorithm, which is true for many algorithms.
Now the relevant question is: how do we know where to start to get a particular solution? The general short answer is we do not know
. Thus, some knowledge of the problem under consideration or an educated guess may be useful to find the final solution.
In fact, most algorithms may depend on the initial configuration, and such algorithms are often carrying out search moves locally. Thus, this type of algorithm is often referred to as local search. A good algorithm should be able to forget
its initial configuration though such algorithms may not exist at all for most types of problems.
What we need in general is the global search, which attempts to find final solutions that are less sensitive to the initial starting point(s).
is necessary for some algorithms such as Newton's method given in Eq. . Some modifications are needed.
There are other issues related to algorithms such as the setting of parameters, the slow rate of convergence, condition numbers, and iteration structures. All these make algorithm designs and usage somehow challenging, and we will discuss these issues in more detail later in this book.
1.1.3 Types of algorithms
An algorithm can only do a specific computation task (at most a class of computational tasks), and no algorithms can do all the tasks. Thus, algorithms can be classified due to their purposes. An algorithm to find roots of a polynomial belongs to root-finding algorithms, whereas an algorithm for ranking a set of numbers belongs to sorting algorithms. There are many classes of algorithms for different purposes. Even for the same purpose such as sorting, there are many different algorithms such as the merge sort, bubble sort, quicksort, and others.
using Eq. (1.1), random initial values (both positive and negative) can allow the algorithm to find both roots. In fact, a major trend in the modern metaheuristics is using some randomization to suit different purposes.
For algorithms to be introduced in this book, we are mainly concerned with