R and Data Mining: Examples and Case Studies
Ebook · 383 pages · 40 hours

Rating: 3 out of 5 stars

About this ebook

R and Data Mining introduces researchers, post-graduate students, and analysts to data mining using R, a free software environment for statistical computing and graphics. The book provides practical methods for using R in applications from academia to industry to extract knowledge from vast amounts of data. Readers will find this book a valuable guide to the use of R in tasks such as classification and prediction, clustering, outlier detection, association rules, sequence analysis, text mining, social network analysis, sentiment analysis, and more.

Data mining techniques are growing in popularity in a broad range of areas, from banking to insurance, retail, telecom, medicine, research, and government. This book focuses on the modeling phase of the data mining process, also addressing data exploration and model evaluation.

With three in-depth case studies, a quick reference guide, bibliography, and links to a wealth of online resources, R and Data Mining is a valuable, practical guide to a powerful method of analysis.
  • Introduces the use of R for data mining applications, covering the most popular data mining techniques
  • Provides code examples and data so that readers can easily learn the techniques
  • Features case studies in real-world applications to help readers apply the techniques in their work
Language: English
Release date: Dec 31, 2012
ISBN: 9780123972712
Author

Yanchang Zhao

Yanchang Zhao has been a Senior Data Mining Analyst in the Australian Government since 2009. Before joining the public sector, he was an Australian Postdoctoral Fellow (Industry) in the Faculty of Engineering & Information Technology at the University of Technology, Sydney, Australia. His research interests include clustering, association rules, time series, outlier detection, and data mining applications, and he has published over forty papers in journals and conference proceedings. He is a member of the IEEE and of the Institute of Analytics Professionals of Australia, and has served as a program committee member for more than thirty international conferences.

    Book preview

    R and Data Mining - Yanchang Zhao

    Chapter 1

    Introduction

    This book introduces the use of R for data mining. It presents many examples of data mining functionality in R, along with three case studies of real-world applications. The intended audience is postgraduate students, researchers, and data miners who are interested in using R for their data mining research and projects. We assume that readers already have a basic understanding of data mining and some experience with R. We hope that this book will encourage more people to use R for data mining work in their research and applications.

    This chapter introduces basic concepts and techniques for data mining, including the data mining process and popular data mining techniques. It also presents R and its packages, functions, and task views for data mining. Finally, it describes some of the datasets used in this book.

    1.1 Data Mining

    Data mining is the process of discovering interesting knowledge from large amounts of data (Han and Kamber, 2000). It is an interdisciplinary field with contributions from many areas, such as statistics, machine learning, information retrieval, pattern recognition, and bioinformatics. Data mining is widely used in many domains, such as retail, finance, telecommunication, and social media.

    The main techniques for data mining include classification and prediction, clustering, outlier detection, association rules, sequence analysis, time series analysis, and text mining, as well as newer techniques such as social network analysis and sentiment analysis. Detailed introductions to data mining techniques can be found in textbooks on data mining (Han and Kamber, 2000; Hand et al., 2001; Witten and Frank, 2005). In real-world applications, a data mining process can be broken into six major phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment, as defined by CRISP-DM (Cross Industry Standard Process for Data Mining).¹ This book focuses on the modeling phase, with data exploration and model evaluation covered in some chapters. Readers who want more information on data mining are referred to the online resources in Chapter 15.

    1.2 R

    R² (R Development Core Team, 2012) is a free software environment for statistical computing and graphics. It provides a wide variety of statistical and graphical techniques, and it can be extended easily via packages. As of August 1, 2012, around 4000 packages were available in the CRAN package repository.³ More details about R are available in An Introduction to R⁴ (Venables et al., 2010) and R Language Definition⁵ (R Development Core Team, 2010b) at the CRAN website. R is widely used in both academia and industry.
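
    As a rough illustration of how R is extended via packages, the sketch below loads a package and installs it from CRAN first if it is missing. The helper name load_or_install is hypothetical (it is not part of R or of this book), and rpart is used here only as an example of a package that normally ships with standard R distributions.

```r
# Hypothetical helper (not part of base R): attach a package,
# installing it from CRAN first if it is not yet available.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Only reached when the package is missing; this step needs
    # network access and a configured CRAN mirror.
    install.packages(pkg)
  }
  # character.only = TRUE lets library() accept the name as a string
  library(pkg, character.only = TRUE)
}

# Assumption: rpart (decision trees) is present as a recommended package,
# so no download should be needed on a standard installation.
load_or_install("rpart")

# Check that the package is now attached to the search path
print("package:rpart" %in% search())
```

    Checking with requireNamespace() before calling install.packages() avoids re-downloading a package that is already installed.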

    The CRAN Task Views⁶ offer good guidance on which R packages to use. They provide collections of packages for different tasks. Some task views related to data mining
