Mastering RStudio – Develop, Communicate, and Collaborate with R
About This Book
- Discover the multi-functional use of RStudio to support your daily work with R code
- Learn to create stunning, meaningful, and interactive graphs and embed them into easily communicable reports using multiple R packages
- Develop your own R packages and Shiny web apps to share your knowledge and collaborate with others
Who This Book Is For
This book is aimed at R developers and analysts who wish to do R statistical development while taking advantage of RStudio’s functionality to ease their development efforts. R programming experience is assumed, as is familiarity with R’s basic structures and a number of its functions.
What You Will Learn
- Discover the RStudio IDE and details about the user interface
- Communicate your insights with R Markdown in static and interactive ways
- Learn how to use different graphic systems to visualize your data
- Build interactive web applications with the Shiny framework to present and share your results
- Understand the process of package development and assemble your own R packages
- Easily collaborate with other people on your projects by using Git and GitHub
- Manage the R environment for your organization with RStudio and Shiny server
- Apply the knowledge you have gained about RStudio and R development to create a real-world dashboard solution
In Detail
RStudio helps you to manage small to large projects by giving you a multi-functional integrated development environment, combined with the power and flexibility of the R programming language, which is becoming the bridge language of data science for developers and analysts worldwide. Mastering the use of RStudio will help you to solve real-world data problems.
This book begins by guiding you through the installation of RStudio and explaining the user interface step by step. From there, the next logical step is to use this knowledge to improve your data analysis workflow. We will do this by building up our toolbox to create interactive reports and graphs or even web applications with Shiny. To collaborate with others, we will explore how to use Git and GitHub with RStudio and how to build your own packages to ensure top-quality results. Finally, we put it all together in an interactive dashboard written with R.
Style and approach
An easy-to-follow guide full of hands-on examples to master RStudio.
Starting with the basics, each topic is explained in detail, feature by feature.
Mastering RStudio – Develop, Communicate, and Collaborate with R - Hillebrand Julian
Table of Contents
Mastering RStudio – Develop, Communicate, and Collaborate with R
Credits
About the Authors
About the Reviewer
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Errata
Piracy
Questions
1. The RStudio IDE – an Overview
Downloading and installing RStudio
Installing R
For Ubuntu
Using RStudio with different versions of R
Windows
Ubuntu
Updating RStudio
Getting to know the RStudio interface
The four main panes
The Source editor pane
Syntax highlighting
Code completion
Executing R Code from the source pane
Code folding
Debugging code
The Environment and History panes
History pane
Console pane
The Files, Plots, Packages, Help, and Viewer panes
The Files pane
The Plot pane
The Packages pane
The Help pane
The Viewer pane
Customizing RStudio
Using keyboard shortcuts
Working with RStudio and projects
Creating a project with RStudio
Locating your project
Using RStudio with Dropbox
Preventing Dropbox synchronization conflicts
Creating your first project
Organizing your folders
Saving the data
Analyzing the data
Correcting the path for report exporting
Exporting your analysis as a report
Summary
2. Communicating Your Work with R Markdown
The concept of reproducible research
Doing reproducible research with R Markdown
What is Markdown?
What is literate programming?
A brief side note on Sweave
Dynamic report generation with knitr
What is R Markdown?
A side note about LaTeX
Configuring R Markdown
Getting started with R Markdown in RStudio
Creating your first R Markdown document
The R Markdown interface
Inspecting the R Markdown panes
Explaining the R Markdown File pane settings
File tab arrows
Saving current document
Spell check
Find/replace
Question mark
Knit HTML
Gear icon
Output Format – HTML
Output Format – PDF
Output Format – Word
Run and re-run icons
Chunks
Jump to menu
Viewer pane options
Advanced R Markdown documents
Getting to know R code chunks
Customizing R code chunks
Chunk options
Avoiding errors, warnings, and other messages
Hiding distracting lines of code
Embedding R code inline
Labeling code chunks
Pandoc and knitr options
Output formats
Changing the look of the output
Using a custom CSS style sheet
Using R Markdown templates
Package vignette
The Tufte handout
Compiling R Notebooks
Generating R Markdown presentations
ioslides
Slidy
Beamer
Summary
3. R Lesson I – Graphics System
The graphic system in R
An introduction to the graphic devices
The R graphics package—base
Creating base plots
Using the base graphics
Base graphics parameters
Annotating with base plotting functions
Introducing the lattice package
Creating lattice plots
Getting to know the lattice plot types
The lattice panel functions
Lattice key points summary
Introducing ggplot2
Looking at the history of ggplot2
The Grammar of Graphics
Applying The Grammar of Graphics with ggplot2
Using ggplot2
Installing the ggplot2 package
qplot() and ggplot()
Creating your first graph with ggplot2
Modifying ggplot objects with the plus operator
Setting the aesthetics parameter
Adding layers using geoms
Choosing the right geom
Modifying parameters
Changing the color of your plot
Changing the shape
Changing the size
Saving ggplot objects in variables
Using stats layers
Saving ggplot graphs
Customizing your charts
Subsetting your data
Setting titles
Changing the axis labels
Swapping the X and Y axes
Improving the look of ggplot2 charts
Creating graphs with the economist theme
Creating graphs with the wall street journal theme
Interactive plotting systems
Introducing ggvis
Our first ggvis graphic
Interactive ggvis graphs
A look at the rCharts package
Using googleVis
HTML widgets
dygraphs
Leaflet
rbokeh
Summary
4. Shiny – a Web-app Framework for R
Introducing Shiny – the app framework
Creating a new Shiny web app with RStudio
Creating your first Shiny application
Sketching the final app
Constructing the user interface for your app
Creating the server file
The final application
Deconstructing the final app into its components
The components of the user interface
The server file in detail
The connection between the server and the ui file
The concept of reactivity
The source and endpoint structure
The purpose of the reactive conductor
Discovering the scope of the Shiny user interface
Exploring the Shiny interface layouts
The sidebar layout
The grid layout
The tabset panel layout
The navlist panel layout
The navbar page as the page layout
Adding widgets to your application
Shiny input elements
A brief overview of the output elements
Individualizing your app even further with Shiny tags
Creating dynamic user interface elements
Using conditionalPanel
Taking advantage of the renderUI function
Sharing your Shiny application with others
Offering a download of your Shiny app
Gist
GitHub
Zip file
Package
Deploying your app to the web
Shinyapps.io
Setting up a self-hosted Shiny server
Diving into the Shiny ecosystem
Creating apps with more files
Expanding the Shiny package
Summary
5. Interactive Documents with R Markdown
Creating interactive documents with R Markdown
Using R Markdown and Shiny
Shiny Document
Shiny Presentation
Disassembling a Shiny R Markdown document
Embedding interactive charts into R Markdown
Using ggvis for interactive R Markdown documents
rCharts
googleVis
HTML widgets
dygraphs
Three.js and R
networkD3
metricsgraphics
Publishing interactive R Markdown documents
Summary
6. Creating Professional Dashboards with R and Shiny
Explaining the concept of dashboards
Introducing the shinydashboard package
Installing shinydashboard
Explaining the structure of shinydashboard
Showing the elements of shinydashboard
Header elements
Sidebar elements
Body elements
Boxes
FluidRows
InfoBox and valueBox
Building your own KPI dashboard
Creating our data architecture
Sketching the look of our dashboard
Transferring our plan into R code
Considering a file and folder structure
Accessing our data sources
MySQL – the customer data
Dropbox – our data storage system
Google Analytics – the website data
Twitter – the social data
Google Sheets – the inventory data
Putting it all together
Creating the Twitter engagement box
Summary
7. Package Development in RStudio
Understanding R packages
Understanding the package structure
Installing devtools
Building packages with RStudio
Creating a new package project with RStudio
Looking at the created files
Using Packrat with a project
Writing the documentation for a package
Creating Rd documentation files
Looking at an example documentation file
Adding examples
dontrun
dontshow
Editing the DESCRIPTION file
General information
Dependencies
License
Understanding the namespaces of a package
Building and checking a package
Checking a package
Customizing the package build options
Using roxygen2 for package documentation
Installing the roxygen2 package
Generating Rd Files
Testing a package
Using testthat in a package
Adding a dataset to a package
Creating .rda files
Using LazyData with a package
Writing a package vignette with R markdown
Creating vignette files
References for further information
Summary
8. Collaborating with Git and GitHub
Introducing version control
Installing Git
Installing Git on Windows
Installing Git on Linux
Configuring Git
Explaining the basic terminology
Repository
Commit
Diff
Branch
Merge
Fetch
Pull
Push
Using Git via shell
Using the shell from RStudio
Using Git with RStudio
Using RStudio and GitHub via SSH
Creating a new project with Git
Explaining the gitignore file
Keeping track of changes
Recording changes
Introducing the Git drop-down menu
Undoing a mistake
Pushing to a remote repository on github.com
Using an existing GitHub project with RStudio
Using branches
Making a pull request
Reviewing and merging pull requests
Further resources
Summary
9. R for your Organization – Managing the RStudio Server
Managing the RStudio Server
Using Amazon Web Services as the server platform
Creating an AWS account
Using S3 to store our data
Creating our bucket
Uploading a dataset to the bucket
Launching our EC2 instance
Choosing an Amazon Machine Image
Choosing an instance type
Configuring instance details
Creating a new IAM role
Adding storage
Tagging an instance
Configuring a security group
Reviewing
Creating a key pair
Launching the instance
Connecting with the new EC2 instance
What is SSH?
Bringing it all together
Setting up R, RStudio, and the Shiny Server
Choosing your RStudio version
Installing base R
Installing RStudio and the Shiny Server
RStudio and the Shiny Server in your browser
Administrating your RStudio server environment
Getting rid of the R memory problem
Connecting our S3 bucket with RStudio
Basic RStudio server management
Managing the Shiny Server
Basic commands for the Shiny Server
Summary
10. Extending RStudio and Your Knowledge of R
Extending RStudio, finding answers, and more
RStudio environment customizations
Customizing the Rprofile
Where to find your Rprofile
Adding custom functions
The first and last functions
More ideas for your Rprofile
R help is on the way
Getting questions and answers
Stack Overflow (Stack Exchange)
Data Science (Stack Exchange)
Cross Validated (Stack Exchange)
Open Data (Stack Exchange)
R mailing lists – R-help
How to ask questions correctly
Learning more about packages, functions, and more
R FAQs
R and CRAN documentations
R search engines
RStudio cheat sheets
Sharing your R code
Improving your R knowledge
Learning R interactively
Try R
DataCamp
Leada
Swirl
Attending online courses
Coursera
Johns Hopkins University – Data Science Specialization
Johns Hopkins University – Genomic Data Science
Udacity
Other MOOC courses, related platforms, and programs
Staying up to date in the R world
R-Bloggers
The R Journal
Summary
Index
Mastering RStudio – Develop, Communicate, and Collaborate with R
Copyright © 2015 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: November 2015
Production reference: 1251115
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78398-254-7
www.packtpub.com
Credits
Authors
Julian Hillebrand
Maximilian H. Nierhoff
Reviewer
Nicholas A. Yager
Commissioning Editor
Kartikey Pandey
Acquisition Editor
Tushar Gupta
Content Development Editor
Anish Dhurat
Technical Editor
Mohita Vyas
Copy Editor
Angad Singh
Project Coordinator
Harshal Ved
Proofreader
Safis Editing
Indexer
Rekha Nair
Graphics
Abhinash Sahu
Production Coordinator
Melwyn Dsa
Cover Work
Melwyn Dsa
About the Authors
Julian Hillebrand studied international business marketing management at the Cologne Business School in Germany. His interest in the current questions of the business world showed him the importance of data-driven decision-making. Because of the growing size of available inputs, he soon realized the great potential of R for analyzing and visualizing data. This fascination made him start a blog project about using data science, especially for social media data analysis, which can be found at http://thinktostart.com/. He managed to combine his hands-on tutorials with his marketing and business knowledge.
Julian is always looking for new technological opportunities and is also interested in the emerging field of machine learning. He completed several digital learning offerings to take his data science capabilities to the next level.
Maximilian H. Nierhoff is an analyst for online marketing with more than half a decade of experience in managing online marketing channels and digital analytics. After studying economics, cultural activities, and creative industries, he started building online marketing departments and quickly realized that future marketing forces should also have programming knowledge. He has always been passionate about everything related to data, marketing, and customer journey analysis. He has therefore specialized in R, which is his first-choice language for programming, data science, and analysis. He considers himself a lifelong learner and is an avid user of MOOCs on R and digital analytics.
About the Reviewer
Nicholas A. Yager is a biostatistician and software developer researching statistical genomics, image analysis, and infectious disease epidemiology. With an education in biochemistry and biostatistics, his experience in analyzing cutting-edge genomics data and simulating complex biological systems has given him an in-depth understanding of scientific computing and data analysis. Currently, Nicholas works for a personalized medicine company, designing medical informatics systems for next-generation personalized cancer tests. Aside from this book, Nicholas has reviewed Unsupervised Learning with R, Packt Publishing.
I would like to thank my friends, Lauren and Matt, and my mentor, Dr. Gregg Hartvigsen, for their help in reviewing this book.
www.PacktPub.com
Support files, eBooks, discount offers, and more
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Free access for Packt account holders
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.
Preface
For a long time, data analysis, visualization, and the handling of complex statistical issues were reserved for universities and a very few organizations. An easy-to-use, free environment that could bring data analysis to a broader audience simply did not exist.
But in the early nineties, R saw the light of day, and since then, it has been on a meteoric rise. R has shaped the landscape of data science in recent years like no other programming language. Because of its open source nature, it became widely known and is often referred to as the lingua franca of data analysis. Another reason for this huge success is the availability of a sophisticated Integrated Development Environment (IDE) named RStudio.
The development of RStudio started in 2010, and now, it is the de facto, go-to IDE for everybody working with R. The mission statement of RStudio is "to provide the most widely used open source and enterprise-ready professional software for the R statistical computing environment."
But RStudio offers more than just a handy way to create R scripts; it has grown into a real ecosystem, providing functionality for creating packages, applications, interactive reports, and more. Along the way, RStudio has managed to bring data analysis to a broader audience. And because of its continuous drive to innovate R and its possibilities, it can be seen as a further development of the R language. RStudio combines the strong statistical power of R, its community, and its open source spirit with cutting-edge technologies of user interface development.
This made RStudio more than just a tool for statisticians; it became the platform for everybody who wants to generate insights from data and share them with others.
In this book, we will therefore guide you to develop, communicate, and collaborate with R by mastering RStudio.
What this book covers
Chapter 1, The RStudio IDE – an Overview, describes how to install RStudio, and gives a general overview of its user interface.
Chapter 2, Communicating Your Work with R Markdown, shows how to create R Markdown documents and presentations, building on the concept of reproducible research.
Chapter 3, R Lesson I – Graphics System, gives an introduction to the landscape of plotting packages in R, the basic process of plot creation, and different packages for interactive graphs.
Chapter 4, Shiny – a Web-app Framework for R, describes how to create web applications with the Shiny framework by explaining the basic concept of reactive programming.
Chapter 5, Interactive Documents with R Markdown, explains how to create interactive R Markdown documents with the Shiny framework and other R packages.
Chapter 6, Creating Professional Dashboards with R and Shiny, introduces the concept of dashboards, and how to build a professional dashboard with the shinydashboard package.
Chapter 7, Package Development in RStudio, describes the basic process of package development in R, and how to create R packages with RStudio.
Chapter 8, Collaborating with Git and GitHub, shows the fundamentals of Git and GitHub, and how to use them with RStudio.
Chapter 9, R for your Organization – Managing the RStudio Server, describes how to install R, RStudio, and the Shiny Server on a cloud server to create a fully flexible programming environment.
Chapter 10, Extending RStudio and Your Knowledge of R, explains where you can find additional resources to improve your work with R and RStudio.
What you need for this book
To fully apply the knowledge learned in this book, you will need a computer with access to the Internet, and the ability to install the R environment as well as the RStudio IDE. The first chapter will guide you through this process.
Who this book is for
This book is aimed at R developers and analysts who wish to work on R statistical development while taking advantage of RStudio's functionality to ease their development efforts. Experience with R programming is assumed, as is familiarity with R's basic structures and a number of its functions.
Conventions
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "You can also export the analysis.R script as a report in the HTML, PDF, or MS Word format, and you will then find the report in your code folder."
A block of code is set as follows:
gaToken <- GoogleApiCreds(
  userName = "your@email.com",