R and Data Mining: Examples and Case Studies
About this ebook
- Presents an introduction to using R for data mining applications, covering the most popular data mining techniques
- Provides code examples and data so that readers can easily learn the techniques
- Features case studies in real-world applications to help readers apply the techniques in their work
Yanchang Zhao
Yanchang Zhao has been a Senior Data Mining Analyst in the Australian Government since 2009. Before joining the public sector, he was an Australian Postdoctoral Fellow (Industry) in the Faculty of Engineering & Information Technology at the University of Technology, Sydney, Australia. His research interests include clustering, association rules, time series, outlier detection, and data mining applications, and he has published over forty papers in journals and conference proceedings. He is a member of the IEEE and of the Institute of Analytics Professionals of Australia, and has served as a program committee member for more than thirty international conferences.
Chapter 1
Introduction
This book introduces the use of R for data mining. It presents many examples of data mining functionality in R, along with three case studies of real-world applications. The intended audience is postgraduate students, researchers, and data miners who are interested in using R for their data mining research and projects. We assume that readers already have a basic understanding of data mining and some basic experience with R. We hope that this book will encourage more people to use R for data mining work in their research and applications.
This chapter introduces basic concepts and techniques for data mining, including a data mining process and popular data mining techniques. It also presents R and its packages, functions, and task views for data mining. Finally, some datasets used in this book are described.
1.1 Data Mining
Data mining is the process of discovering interesting knowledge from large amounts of data (Han and Kamber, 2000). It is an interdisciplinary field with contributions from many areas, such as statistics, machine learning, information retrieval, pattern recognition, and bioinformatics. Data mining is widely used in many domains, such as retail, finance, telecommunication, and social media.
The main techniques for data mining include classification and prediction, clustering, outlier detection, association rules, sequence analysis, time series analysis, and text mining, as well as some newer techniques such as social network analysis and sentiment analysis. Detailed introductions to data mining techniques can be found in textbooks on data mining (Han and Kamber, 2000; Hand et al., 2001; Witten and Frank, 2005). In real-world applications, a data mining process can be broken into six major phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment, as defined by CRISP-DM (Cross Industry Standard Process for Data Mining).¹ This book focuses on the modeling phase, with data exploration and model evaluation involved in some chapters. Readers who want more information on data mining are referred to the online resources in Chapter 15.
1.2 R
R² (R Development Core Team, 2012) is a free software environment for statistical computing and graphics. It provides a wide variety of statistical and graphical techniques. R can be extended easily via packages. There were around 4000 packages available in the CRAN package repository³ as of August 1, 2012. More details about R are available in An Introduction to R⁴ (Venables et al., 2010) and R Language Definition⁵ (R Development Core Team, 2010b) at the CRAN website. R is widely used in both academia and industry.
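As a rough illustration of how packages extend R, the sketch below lists the packages installed locally and loads one, installing it from CRAN first if it is missing. The helper name `load_or_install` is hypothetical and does not come from this book; only standard base R functions are used, and the install step assumes network access and a configured CRAN mirror.

```r
# Names of all packages installed in the local libraries (no network needed).
installed <- rownames(installed.packages())
length(installed)

# Hypothetical helper: load a package, installing it from CRAN if it is missing.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # assumes network access and a configured CRAN mirror
  }
  library(pkg, character.only = TRUE)
}

# "stats" ships with R, so nothing is downloaded in this call.
load_or_install("stats")
```

The same helper could be called with the name of any CRAN package, in which case the package would be downloaded on first use.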
The CRAN Task Views⁶ are a good guide for users deciding which R packages to use. They provide collections of packages for different tasks. Some task views related to data mining