Advanced R 4 Data Programming and the Cloud: Using PostgreSQL, AWS, and Shiny

Ebook, 611 pages, 5 hours

Author: Matt Wiley
ISBN: 9781484259733

By Matt Wiley and Joshua F. Wiley

About this ebook

Program for data analysis using R and learn practical skills to make your work more efficient. This revised book explores how to automate running code and the creation of reports to share your results, as well as writing functions and packages. It includes key R 4 features such as a new color palette for charts, an enhanced reference counting system, and normalization of matrix and array types where matrix objects now formally inherit from the array class, eliminating inconsistencies.
Advanced R 4 Data Programming and the Cloud is not designed to teach advanced R programming nor to teach the theory behind statistical procedures. Rather, it is designed to be a practical guide moving beyond merely using R; it shows you how to program in R to automate tasks.
This book will teach you how to manipulate data in modern R structures and includes connecting R to databases such as PostgreSQL, cloud services such as Amazon Web Services (AWS), and digital dashboards such as Shiny. Each chapter also includes a detailed bibliography with references to research articles and other resources that cover relevant conceptual and theoretical topics.
What You Will Learn

Write and document R functions using R 4
Make an R package and share it via GitHub or privately
Add tests to R code to ensure it works as intended
Use R to talk directly to databases and do complex data management
Run R in the Amazon cloud
Deploy a Shiny digital dashboard
Generate presentation-ready tables and reports using R

Who This Book Is For

Working professionals, researchers, and students who are familiar with R and basic statistical techniques such as linear regression and who want to learn how to take their R coding and programming to the next level.

Language: English

Publisher: Apress

Release date: Jul 16, 2020


Related categories

Skip carousel

Reviews for Advanced R 4 Data Programming and the Cloud

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Advanced R 4 Data Programming and the Cloud - Matt Wiley

M. Wiley, J. F. Wiley, Advanced R 4 Data Programming and the Cloud, https://doi.org/10.1007/978-1-4842-5973-3_1

1. Programming Basics

Matt Wiley¹ and Joshua F. Wiley²

(1) Victoria College, Victoria, TX, USA

(2) Monash University, Melbourne, VIC, Australia

As with most languages, becoming a power user requires extra understanding of the underlying structure or rules. Data science through R is powerful, and this chapter discusses such programming basics including objects, operators, and functions.

Before we dig too deeply into R, some general principles to follow may well be in order. First, experimentation is good. It is much more powerful to learn hands-on than it is simply to read. Download the source files that come with this text, and try new things!

Second, it can help quite a bit to become familiar with the ? function. Simply type ? immediately followed by text in your R console to call up help of some kind—for example, ?sum. We cover more on functions later, but this is too useful to ignore until that time. Using your favorite search engine is also wise (such as this search string: R sum na). While memorizing some things may be helpful, much of programming is gaining skill with effective search.

Finally, just before we dive into the real reason you bought this book, a word of caution: this is an applied text. Our goal is to get you up and running as quickly as possible toward some useful skills. A rigorous treatment of most of these topics—even or especially the ideas in this chapter—is worthwhile, yet beyond the scope of this book.

1.1 Software Choices and Reproducibility

This book is written for more experienced users of the R language, and we suppose readers are familiar with installing R. For completeness, we list the primary software used throughout this book in Table 1-1. Individual R packages will be introduced inside any chapter where their use is indicated. Specifics of setting up an Amazon cloud compute instance will be walked through in the relevant cloud computing chapter as well.

Table 1-1

Advanced R Tech Stack

For a complete walk-through of how to install R and RStudio on Windows or Macintosh, please see our Beginning R [37] book.

1.2 Reproducing Results

One useful feature of R is the abundance of packages written by experts worldwide. This is also potentially the Achilles’ heel of using R: from the version of R itself to the version of particular packages, lots of code specifics are in flux. Your code has the potential to not work from day to day, let alone our code written weeks or months before this book was published.

All code used in the following chapters will be hosted on GitHub. Code there may well be more recent than printed in this text. Should code in the text not work due to package changes or base R changes, please visit this book’s GitHub site:

options(

width = 70,

stringsAsFactors = FALSE,

digits = 2)

1.3 Types of Objects

First of all, we need things to build our language, and in R, these are called objects. We start with five very common types of objects.

Logical objects take on just two values: TRUE or FALSE. Computers are binary machines, and data often may be recorded and modeled in an all-or-nothing world. These logical values can be helpful, where TRUE has a value of 1 and FALSE has a value of 0.

As a reminder, # (i.e., the pound sign or hashtag) is an indicator of a code comment. The words that follow the # are not processed by R and are meant to help the reader:

TRUE ## logical

## [1] TRUE

FALSE ## logical

## [1] FALSE
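Because TRUE behaves as 1 and FALSE as 0 in arithmetic, counts and proportions fall out of sum() and mean(). The following is a small sketch of that idea; the vector name passed is our own invention for illustration:

```r
## hypothetical vector: did each of five students pass?
passed <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

sum(passed)   ## counts the TRUE values
## [1] 3

mean(passed)  ## proportion of TRUE values
## [1] 0.6

TRUE + TRUE   ## logicals are promoted to integers in arithmetic
## [1] 2
```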

As you may remember from some quickly muttered comments of your college algebra professor, there are many types of numbers. Whole numbers, which include zero as well as negative values, are called integers. In set notation, {…, -2, -1, 0, 1, 2, …}, these numbers are useful for headcounts or other indexes. In R, integers have a capital L suffix. If decimal numbers are needed, then double numeric objects are in order. These are the numbers suited for ratio data types. Complex numbers have useful properties as well and are understood precisely as you might expect, with an i suffix on the imaginary portion. R is quite friendly in using all of these numbers, and you simply type in the desired numbers (remember to add the L or i suffix as needed):

42L ## integer

## [1] 42

1.5 ## double numeric

## [1] 1.5

2+3i ## complex number

## [1] 2+3i

Nominal-level data may be stored via the character class and is designated with quotation marks:

"a" ## character

## [1] "a"

Of course, numerical data may have missing values. These missing values are of the type that the rest of the data in that set would be (we discuss data storage shortly). Nevertheless, it can be helpful to know how to hand-code logical, integer, double, complex, or character missing values:

NA ## logical

## [1] NA

NA_integer_ ## integer

## [1] NA

NA_real_ ## double / numeric

## [1] NA

NA_character_ ## character

## [1] NA

NA_complex_ ## complex

## [1] NA
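Although every one of these prints as NA, each carries its own type. As a quick check, typeof() reports the storage type of each missing value, and a bare NA takes on the type of the vector it joins:

```r
typeof(NA)             ## "logical"
typeof(NA_integer_)    ## "integer"
typeof(NA_real_)       ## "double"
typeof(NA_character_)  ## "character"
typeof(NA_complex_)    ## "complex"

## a bare NA is coerced to match the rest of the vector
typeof(c(1L, NA))      ## "integer"
typeof(c("a", NA))     ## "character"
```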

Factors are a special kind of object, not so useful for general programming, but used a fair amount in statistics. A factor variable indicates that a variable should be treated discretely. Factors are stored as integers, with labels to indicate the original value:

factor(1:3)

## [1] 1 2 3

## Levels: 1 2 3

factor(c("alice", "bob", "charlie"))

## [1] alice bob charlie

## Levels: alice bob charlie

factor(letters[1:3])

## [1] a b c

## Levels: a b c
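To see the integer storage behind the labels, it can help to pull a factor apart. This is a minimal sketch with made-up values (sizes and f are hypothetical names); note the common trap in the second half, where numeric-looking labels are not the same thing as the integer codes:

```r
## hypothetical factor for illustration
sizes <- factor(c("small", "large", "small", "medium"))

levels(sizes)      ## the labels, sorted alphabetically by default
## [1] "large"  "medium" "small"

as.integer(sizes)  ## the underlying integer codes
## [1] 3 1 3 2

## caution: numeric-looking labels are not the codes
f <- factor(c("10", "20", "10"))

as.integer(f)                ## the codes, not the values
## [1] 1 2 1

as.numeric(as.character(f))  ## recovers the values
## [1] 10 20 10
```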

We turn now to data structures, which can store objects of the types we have discussed (and of course more). A vector is a relatively simple data storage object. A simple way to create a vector is with the concatenate function c():

## vector

c(1, 2, 3)

## [1] 1 2 3

Just as in mathematics, a scalar is a vector of just length 1. Toward the opposite end of the continuum, a matrix is a vector with dimensions for both rows and columns. Notice the way the matrix is populated with the numbers 1–6, counting down each column:

## scalar is just a vector of length one

c(1)

## [1] 1

## matrix is a vector with dimensions

matrix(c(1:6), nrow = 3, ncol = 2)

## [,1] [,2]

## [1,] 1 4

## [2,] 2 5

## [3,] 3 6
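The claim that a matrix is a vector with dimensions can be taken quite literally. As a sketch (v and m are names of our own choosing), assigning a dim attribute to a plain vector produces the same object matrix() builds, and the byrow argument changes the fill order:

```r
v <- 1:6
dim(v) <- c(3, 2)  ## adding dimensions turns the vector into a matrix

is.matrix(v)
## [1] TRUE

identical(v, matrix(1:6, nrow = 3, ncol = 2))
## [1] TRUE

## byrow = TRUE fills across each row instead of down each column
m <- matrix(1:6, nrow = 3, ncol = 2, byrow = TRUE)

m[1, ]
## [1] 1 2
```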

All vectors, be they scalar, vector, or matrix, can have only one data type (e.g., integer, logical, or complex). If more than one type of data is needed, it may make sense to store the data in a list. A list is a vector of objects, in which each element of the list may be a different type. In the following example, we build a list that has character, vector, and matrix elements:

## vectors and matrices can only have one type of data (e.g., integer, logical, etc.)

## list is a vector of objects

## lists can have different type of objects in each element

list(

c("a"),

c(1, 2, 3),

matrix(c(1:6), nrow = 3, ncol = 2)

)

## [[1]]

## [1] "a"

## [[2]]

## [1] 1 2 3

## [[3]]

## [,1] [,2]

## [1,] 1 4

## [2,] 2 5

## [3,] 3 6
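The one-type rule for vectors is easiest to appreciate by mixing types on purpose. In this sketch (mixedVector and mixedList are hypothetical names), c() silently coerces everything to a common type, whereas list() leaves each element alone:

```r
## c() must coerce everything to one common type
mixedVector <- c(1, "a", TRUE)

mixedVector
## [1] "1"    "a"    "TRUE"

## list() keeps each element's own type
mixedList <- list(1, "a", TRUE)

sapply(mixedList, class)
## [1] "numeric"   "character" "logical"
```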

A particular type of list is the data frame, in which each element of the list is identical in length (although not necessarily in object type). With the underlying building blocks of the simpler objects, more complex structures evolve. Take a look at the following instructive examples with output:

## data frames are special type of lists

## where each element of the list is identical in length

data.frame(

1:3,

4:6)

## X1.3 X4.6

## 1 1 4

## 2 2 5

## 3 3 6

## using non equal length objects causes problems

data.frame(

1:3,

4:5)

## Error in data.frame(1:3, 4:5): arguments imply differing number of rows: 3, 2

data.frame( 1:3, letters[1:3])

## X1.3 letters.1.3.

## 1 1 a

## 2 2 b

## 3 3 c

Because of their superior computational speed, in this text we primarily use data table objects from the data.table package [9]. Data tables are similar to data frames, yet are designed to be more memory efficient and faster (mostly due to more underlying C code). Even though we recommend data tables, we show some examples with data frames as well: much historical R code uses data frames, and data tables inherit many methods from data frames (notice that the last line of the code that follows shows TRUE):

## if not yet installed, uncomment and run the line below

#install.packages("data.table")

library(data.table)

## data.table 1.12.8 using 6 threads (see ?getDTthreads). Latest news: r-datatable.com

dataTable <- data.table( 1:3, 4:6)

dataTable

## V1 V2

## 1: 1 4

## 2: 2 5

## 3: 3 6

is.data.frame(dataTable)

## [1] TRUE

It is worth mentioning at this stage a little bit about the data structure wars. Historically, data frames were the predominant way to structure the row/column (tabular) data many researchers use. As data grew in column width and row length, this base R structure no longer met everyone’s needs. Grown out of the same SQL data control mindset as some of the largest databases, the data.table package is (these days) suited to multiple computer cores and efficient memory operations and uses faster-than-R languages under the hood. For the largest data sets and for those who have any background in SQL or other programming languages, data tables are hugely effective and intuitive.

Not all folks first coming to R have a programming background (and indeed that is a very good thing). A competing data structure, the tibble, is part of what is called the tidyverse (a portmanteau of tidy and universe). Tibbles, like data tables, are also data frames at heart, yet they are improved. In the authors’ opinion, while not yet quite as fast as data tables, tibbles have a more new-user-friendly syntax. They’re beloved by a large part of the R community and are an important part of modern R. In practice, data tables are still faster and can often achieve the tasks your authors find most common in fewer lines of code. Both these newer structures have their strengths, and both have their place in the R universe (indeed, Chapters 7 and 8 focus on data tables, yet time is given to tibbles in Chapter 9). All the same, this text will primarily use data tables.

Having explored several types of objects, we turn our attention to ways of manipulating those objects with operators and functions.

1.4 Base Operators and Functions

Objects are not enough for a language; while nouns are nice, actions are required. Operators and functions are the verbs of the programming world. We start with assignment, which can be done in two ways. Much like written languages, more elegant turns of phrase can be more helpful than simpler prose. So although both = and <- are assignment operators and do the same thing, because = is used within functions to set arguments, we recommend for clarity’s sake to use <- for general assignment. We nevertheless demonstrate both assignment techniques. Assignments allow objects to be given sensible names; this can significantly enhance code readability (for your future self as well as for other users).

In addition to assigning names to variables, you can check specifics by using functions. Functions in R take the general format of function name, followed by parentheses, with input inside the parentheses, and then R provides output. Here are examples:

x <- 5

y = 3

x

## [1] 5

y

## [1] 3

is.integer(x)

## [1] FALSE

is.double(y)

## [1] TRUE

is.vector(x)

## [1] TRUE

It can help to be able to pronounce and speak these lines of code (and indeed that idea is at the heart of our preference for <-, which reads “is assigned”). The preceding code might well be read “the variable x is assigned the value 5.” Contrastingly, while precisely the same under-the-hood operation is occurring in the next line with y, saying “y equals 3” is perhaps less clear as to whether we are discussing an innate property of y vs. performing an assignment.
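The recommendation to keep = for arguments and <- for assignment has a practical side, which the following sketch tries to show. The helper showTwice and the names myValue and otherValue are hypothetical, invented only for this illustration. Inside a function call, = merely names an argument, whereas <- creates a variable in the calling environment:

```r
## hypothetical helper for illustration
showTwice <- function(myValue) c(myValue, myValue)

## inside a call, = only names the argument
showTwice(myValue = 5)
## [1] 5 5

exists("myValue")  ## no variable called myValue was created
## [1] FALSE

## inside a call, <- assigns first and then passes the value along
showTwice(otherValue <- 5)
## [1] 5 5

otherValue
## [1] 5
```

The second form is legal but easy to misread, which is one more reason to keep the two operators in their separate roles.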

Once an object is assigned, you can access specific object elements by using brackets. Most computer languages start their indexing at either 0 or 1. R starts indexing at 1. Also, note you can readily change old assignments with little trouble and no warning; it is wise to watch names cautiously and comment code carefully:

x <- c("a", "b", "c")

x[1]

## [1] "a"

is.vector(x)

## [1] TRUE

is.vector(x[1])

## [1] TRUE

is.character(x[1])

## [1] TRUE

What do we mean by watching names carefully? We called the preceding vector x, which is not a very informative name; it is tough to remember that x holds the first three letters of the alphabet. Instead, we might choose a better variable name, swapping the tough-to-recall x for passingLetterGrades. Even better, our makes-sense-when-spoken variable can easily be improved later, such as if we wanted to add the sometimes-passing letter grade D or maybe the pass/fail passing grade S:

passingLetterGrades <- c("A", "B", "C")

passingLetterGrades[2]

## [1] "B"

While a vector may take only a single index, more complex structures require more indices. For the matrix you met earlier, the first index is the row, and the second is for column position. Notice that after building a matrix and assigning it, there are many ways to access various combinations of elements. This process of accessing just some of the elements is sometimes called subsetting:

x2 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2)

x2 ## print to see full matrix

## [,1] [,2]

## [1,] 1 4

## [2,] 2 5

## [3,] 3 6

x2[1, 2] ## row 1, column 2

## [1] 4

x2[1, ] ## all row 1

## [1] 1 4

x2[, 1] ## all column 1

## [1] 1 2 3

## can also grab several at once

x2[c(1, 2), ] ## rows 1 and 2

## [,1] [,2]

## [1,] 1 4

## [2,] 2 5

x2[c(1, 3), ] ## rows 1 and 3

## [,1] [,2]

## [1,] 1 4

## [2,] 3 6

## can drop one element using negative values

x[-2] ## drop element two

## [1] "a" "c"

x2[, -2] ## drop column two

## [1] 1 2 3

x2[-1, ] ## drop row 1

## [,1] [,2]

## [1,] 2 5

## [2,] 3 6

is.vector(x2)

## [1] FALSE

is.matrix(x2)

## [1] TRUE
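One detail in the output above deserves a second look: dropping column two returned a plain vector rather than a one-column matrix. As a brief sketch, the drop argument of the bracket operator controls this behavior (the matrix is rebuilt here so the example stands alone):

```r
x2 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2)

is.matrix(x2[, -2])                ## a single column collapses to a vector
## [1] FALSE

is.matrix(x2[, -2, drop = FALSE])  ## drop = FALSE keeps the dimensions
## [1] TRUE

dim(x2[, -2, drop = FALSE])
## [1] 3 1
```

Keeping the dimensions matters whenever later code expects a matrix, such as the matrix multiplication shown later in this chapter.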

Accessing and subsetting lists is perhaps a trifle more complex, yet all the more essential to learn and master for later techniques. A single index in a single bracket returns the entire element at that spot (recall that for a list, each element may be a vector or just a single object). Using double brackets returns the object within that element of the list—nothing more.

Thus, the following code returns, in fact, a list (itself a kind of vector) with the character element "a" inside. Again, using the data-type-checking functions can be helpful in learning how to interpret various pieces of code:

## for lists using a single bracket

## returns a list with one element

y <- list(

c("a"),

c(1:3))

y[1]

## [[1]]

## [1] "a"

is.vector(y[1])

## [1] TRUE

is.list(y[1])

## [1] TRUE

is.character(y[1])

## [1] FALSE

Contrast that with this code, which returns simply the character vector "a":

## using double bracket returns the object within that

## element of the list, nothing more

y[[1]]

## [1] "a"

is.vector(y[[1]])

## [1] TRUE

is.list(y[[1]])

## [1] FALSE

is.character(y[[1]])

## [1] TRUE

You can, in fact, chain brackets together, so the second element of the list (a vector with the numbers 1–3) can be accessed, and then, within that vector, the third element can be accessed:

## can chain brackets together

y[[2]][3] ## second element of the list, third element of the vector

## [1] 3

Brackets almost always work, depending on the type of object, but there may be additional ways to access components. Named data frames and lists can use the $ operator. Notice in the following code how the bracket or dollar sign ends up being equivalent:

x3 <- data.frame(

A = 1:3,

B = 4:6)

y2 <- list(

C = c("a"),

D = c(1, 2, 3))

x3$A

## [1] 1 2 3

y2$C

## [1] "a"

## these are equivalent to

x3[["A"]]

## [1] 1 2 3

y2[["C"]]

## [1] "a"

Notice that although both data frames and lists are lists, neither is a matrix:

is.list(x3)

## [1] TRUE

is.list(y2)

## [1] TRUE

is.matrix(x3)

## [1] FALSE

is.matrix(y2)

## [1] FALSE

Moreover, despite not being matrices, because of their special nature (i.e., all elements have equal length), data frames and data tables can be indexed similarly to matrices:

x3[1, 1]

## [1] 1

x3[1, ]

## A B

## 1 1 4

x3[, 1]

## [1] 1 2 3

Any named object can be indexed by using the names rather than the positional numbers, provided those names have been set:

x3[1, "A"]

## [1] 1

x3[, "A"]

## [1] 1 2 3

For data frames, this applies to both column and row names, and these names can be established after building the data frame:

rownames(x3) <- c("first", "second", "third")

x3["second", "B"]

## [1] 5

Data tables use a slightly different approach. While we will devote two later chapters to using data tables, for now, we mention a few facts. Selecting rows works almost identically, but selecting columns does not require quotes. Additionally, you can select multiple columns by name without quotes by using the .() operator. Should you need to use quotes, the data table can be accessed by using the option with = FALSE, as follows:

x4 <- data.table(

A = 1:3,

B = 4:6)

x4[1, ]

## A B

## 1: 1 4

x4[, A] #no quote needed for a column in data.table

## [1] 1 2 3

x4[1, A]

## [1] 1

x4[1:2, .(A, B)]

## A B

## 1: 1 4

## 2: 2 5

x4[1, "A", with = FALSE]

## A

## 1: 1

Remember, we said that everything in R is either an object or a function. Those are the two building blocks. So, technically, the bracket operators are functions. Although they’re not used as functions with the telltale parens (), they can be. Most functions are named, but the brackets are a particular case and require using single quotes in the regular function format, as in the following example:

'['(x, 1)

## [1] "a"

'['(x3, "second", "A")

## [1] 2

'[['(y, 2)

## [1] 1 2 3

In practice, of course, the brackets are almost never used this way. It only needs saying so that you understand you can code up your own meaning for any function (we devote a chapter to writing your own functions). In fact, this is conceptually how data table gets away with not using quotes for the column names: it changed the way the bracket function works when is.data.table() returns TRUE.
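One place where the function form of the bracket does earn its keep is as an argument to another function. The sketch below, with a made-up list named scores, passes the bracket to sapply() so that the second element of every vector is extracted in one call:

```r
## hypothetical list of vectors for illustration
scores <- list(c(90, 85, 77), c(60, 72, 88), c(95, 91, 99))

## apply the bracket function to each vector, with 2 as the index
sapply(scores, '[', 2)
## [1] 85 72 91

## equivalent to indexing each vector by hand
c(scores[[1]][2], scores[[2]][2], scores[[3]][2])
## [1] 85 72 91
```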

Although we have been using the is.datatype() family of functions to better illustrate what an object is, you can do more. Specifically, you can check whether a value is missing by using the is.na() function:

NA == NA ## does not work

## [1] NA

is.na(NA) ## works

## [1] TRUE
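Since is.na() is vectorized, it is most often pointed at a whole vector. As a small sketch with invented data (the name heights is hypothetical), it can count, locate, and filter out missing values:

```r
heights <- c(1.7, NA, 1.6, NA, 1.8)  ## hypothetical data

is.na(heights)
## [1] FALSE  TRUE FALSE  TRUE FALSE

sum(is.na(heights))       ## how many values are missing
## [1] 2

which(is.na(heights))     ## where the missing values are
## [1] 2 4

heights[!is.na(heights)]  ## keep only the observed values
## [1] 1.7 1.6 1.8
```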

Of course, the preceding code snippet usually has a vector or matrix element argument whose populated status is up for debate. Our last (for now) exploratory function is the inherits() function. It is helpful when no is.class() function exists, which can occur when specific classes outside the core ones you have seen presented so far are developed:

inherits(x3, "data.frame")

## [1] TRUE

inherits(x2, "matrix")

## [1] TRUE
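The inherits() function is also a convenient way to see one of the R 4 changes mentioned in this book's description, namely that matrix objects formally inherit from the array class. The following sketch assumes R 4.0.0 or later; on older versions, the class would be reported as "matrix" alone:

```r
x2 <- matrix(1:6, nrow = 3, ncol = 2)

class(x2)  ## assumes R 4.0.0 or later
## [1] "matrix" "array"

inherits(x2, "array")
## [1] TRUE

inherits(x2, "data.frame")
## [1] FALSE
```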

You can also force lower types into higher types. This coercion can be helpful but may have unintended consequences. It can be particularly risky if you have a more advanced data object being coerced to a lesser type (pay close attention to the attempt to coerce an integer):

as.integer(3.8)

## [1] 3

as.character(3)

## [1] "3"

as.numeric(3)

## [1] 3

as.complex(3)

## [1] 3+0i

as.factor(3)

## [1] 3

## Levels: 3

as.matrix(3)

## [,1]

## [1,] 3

as.data.frame(3)

## 3

## 1 3

as.list(3)

## [[1]]

## [1] 3

as.logical("a") ## NA no warning

## [1] NA

as.logical(3) ## TRUE, no warning

## [1] TRUE

as.numeric("a") ## NA with a warning

## Warning: NAs introduced by coercion

## [1] NA

Coercion can be helpful. All the same, it must be used cautiously. Before you move on from this section, if any of this is new, be sure to experiment with different inputs than the ones we tried in the preceding example! Experimenting never hurts, and it can be a powerful way to learn.
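In that spirit of experimenting, here is a short sketch of two coercion surprises worth trying. The first is that as.integer() truncates toward zero rather than rounding; the second is that a round trip through a higher type does not always come back as the original:

```r
as.integer(3.8)         ## truncates toward zero, does not round
## [1] 3

as.integer(-3.8)
## [1] -3

as.integer(round(3.8))  ## round first if the nearest integer is wanted
## [1] 4

## going up from logical to numeric to character works quietly
as.character(as.numeric(TRUE))
## [1] "1"

## coming back down does not recover the logical value
as.logical("1")
## [1] NA
```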

Let’s turn our attention now to mathematical and logical operators and functions.

1.5 Mathematical Operators and Functions

Several operators can be used for comparison. These will be helpful later, once we get into loops and building our own functions. Equally useful are symbolic logic forms. We start with some basic comparisons and admit to a strange predilection for the number 4:

####################MATH################################

###### Comparisons and logicals

4 > 4

## [1] FALSE

4 >= 4

## [1] TRUE

4 < 4

## [1] FALSE

4 <= 4

## [1] TRUE

4 == 4

## [1] TRUE

4 != 4

## [1] FALSE

It is sensible now to mention that although the preceding code may be helpful, numbers often differ from one another only slightly, particularly in a programming environment, which relies on finite floating-point representations of real numbers. Therefore, we often check that things are close within a tolerance:

all.equal(1, 1.00000002, tolerance = .00001)

## [1] TRUE
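A classic case where the tolerance matters is the sum 0.1 + 0.2, which is not stored as exactly 0.3. As a sketch, note also that all.equal() returns a character description rather than FALSE when values differ, so wrapping it in isTRUE() is the safer habit inside tests:

```r
0.1 + 0.2 == 0.3                   ## exact comparison fails
## [1] FALSE

isTRUE(all.equal(0.1 + 0.2, 0.3))  ## comparison within a tolerance
## [1] TRUE

## when values differ, all.equal() returns a message, not FALSE
is.character(all.equal(1, 1.1))
## [1] TRUE

isTRUE(all.equal(1, 1.1))
## [1] FALSE
```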

In symbolic logic, “and” as well as “or” are useful comparisons between two objects. In R, we use & for “and” and | for “or.” Complex logic tests can be constructed from these simple structures:

TRUE | FALSE

## [1] TRUE

FALSE | TRUE

## [1] TRUE

TRUE & TRUE

## [1] TRUE

TRUE & FALSE

## [1] FALSE

All of the logic tests mentioned so far apply just as well to vectors as they apply to single objects:

1:3 >= 3:1

## [1] FALSE TRUE TRUE

c(TRUE, TRUE) | c(TRUE, FALSE)

## [1] TRUE TRUE

c(TRUE, TRUE) & c(TRUE, FALSE)

## [1] TRUE FALSE

If you want only a single response, such as for if/else flow control, you can use && or ||, which stop evaluating as soon as they have determined the final result. Work through the following code and output carefully:

## for cases where you only want a single response

## such as for if else flow control

## can use && or ||, which stop evaluating after they confirm what it is

## for example

W

## Error in eval(expr, envir, enclos): object 'W' not found

TRUE | W

## Error in eval(expr, envir, enclos): object 'W' not found

## BUT

TRUE || W

## [1] TRUE

W || TRUE

## Error in eval(expr, envir, enclos): object 'W' not found

FALSE & W

## Error in eval(expr, envir, enclos): object 'W' not found

FALSE && W

## [1] FALSE
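Short-circuit evaluation is more than a curiosity; it allows a cheap safety check to guard a test that would otherwise fail. The following sketch uses a hypothetical helper, describeValue, written only for this illustration (writing functions is covered properly in a later chapter):

```r
## hypothetical helper for illustration
describeValue <- function(value) {
  ## && evaluates value > 0 only when value is not NULL
  if (!is.null(value) && value > 0) {
    "positive"
  } else {
    "not positive"
  }
}

describeValue(5)
## [1] "positive"

describeValue(NULL)
## [1] "not positive"
```

Had the condition used a single &, the NULL case would stop with an error, because NULL > 0 has length zero and if() requires a condition of length one.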

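Note that the double operators are not, in fact, vectorized. In the R 4.0 release used for this book, they simply use the first element of any vectors; more recent releases are stricter, and from R 4.3.0 onward, supplying a vector of length greater than one to && or || is an error: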

c(TRUE, TRUE) || c(TRUE, FALSE)

## [1] TRUE

c(TRUE, TRUE) && c(TRUE, FALSE)

## [1] TRUE

The any() and all() functions are helpful as well in these contexts for similar reasons:

## two additional useful functions are

any(c(TRUE, FALSE, FALSE))

## [1] TRUE

all(c(TRUE, FALSE, TRUE))

## [1] FALSE

all(c(TRUE, TRUE, TRUE))

## [1] TRUE
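In practice, any() and all() usually wrap a vectorized comparison, collapsing many logical values into the single value that if() expects. A minimal sketch, with invented readings in a vector we call temperatures:

```r
temperatures <- c(18, 22, 31, 25)  ## hypothetical readings

any(temperatures > 30)
## [1] TRUE

all(temperatures > 20)
## [1] FALSE

## the single logical is then safe to hand to if()
if (any(temperatures > 30)) {
  status <- "at least one hot day"
} else {
  status <- "no hot days"
}

status
## [1] "at least one hot day"
```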

We turn our attention now to mathematical, rather than logical, operators. R is powerful mathematically and can perform most mathematical calculations. So although we introduce some functions, we are leaving many out of the mix. For more details, ?Arithmetic can be your friend. It is (as always) important to be aware of the way computers perform mathematical calculations. Being able to code bespoke solutions directly is powerful, yet with the freedom to customize comes a corresponding amount of responsibility. Take a careful look at the following mathematical operations (which can behave differently than expected because of implementation choices):

3 + 3

## [1] 6

3 - 3

## [1] 0

3 * 3

## [1] 9

3 / 3

## [1] 1

(-27) ^ (1/3)

## [1] NaN

4 %/% .7

## [1] 5

4 %% .3

## [1] 0.1
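Two of the results above reward a closer look. Raising a negative number to a fractional power returns NaN even when a real root exists, and the printed remainder 0.1 is a floating-point value best compared within a tolerance. The sketch below offers one possible workaround for the cube root; cubeRoot is a hypothetical helper of our own, not a base R function:

```r
## hypothetical helper: real cube root that also handles negatives
cubeRoot <- function(value) sign(value) * abs(value)^(1/3)

cubeRoot(-27)
## [1] -3

is.nan((-27)^(1/3))  ## the direct calculation gives NaN
## [1] TRUE

## compare remainders within a tolerance rather than with ==
isTRUE(all.equal(4 %% .3, 0.1))
## [1] TRUE
```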

R also has some common functions that have straightforward names:

sqrt(3)

## [1] 1.7

abs(-3)

## [1] 3

exp(1)

## [1] 2.7

log(2.71)

## [1] 1

Trigonometric functions also have their part, and ?Trig can bring up a nice list of these. We show cosine’s function call cos() for brevity. Note the slight inaccuracy again on the cosine function’s output:

cos(3.1415) ## cosine

## [1] -1

?Trig

We close this section and this chapter with a brief selection of matrix operations. Scalar operations use the basic arithmetic operators. To perform matrix multiplication, we use %*%:

x2

## [,1] [,2]

## [1,] 1 4

## [2,] 2 5

## [3,] 3 6

x2 * 3

## [,1] [,2]

## [1,] 3 12

## [2,] 6 15

## [3,] 9 18

x2 + 3

## [,1] [,2]

## [1,] 4 7

## [2,] 5 8

## [3,] 6 9

x2 %*% matrix(c(1, 1), 2)

## [,1]

## [1,] 5

## [2,] 7

## [3,] 9

Matrices have a few other fairly common operations that are helpful in linear algebra. For some of the modeling applications we cover later on, we discuss an appropriate amount of mathematics as needed in the following chapters. Still, this seems a good place to show how the transpose, cross product, and transpose cross product might be coded. We show both the raw code to make the cross product and transpose cross product occur and easier function calls that may be used. This is a relatively common occurrence in R, incidentally. Through packages, quite a few techniques are implemented in fairly clear function calls. Here are the examples:

## transpose

t(x2)

## [,1] [,2] [,3]

## [1,] 1 2 3

## [2,] 4 5 6

## cross product

t(x2) %*% x2

## [,1] [,2]

## [1,] 14 32

## [2,] 32 77

## easier cross product

crossprod(x2)

## [,1] [,2]

## [1,] 14 32

## [2,] 32 77

## transpose cross product

x2 %*% t(x2)

## [,1] [,2] [,3]

## [1,] 17 22 27

## [2,] 22 29 36

## [3,] 27 36 45

## easier transpose cross product

tcrossprod(x2)

## [,1] [,2] [,3]

## [1,] 17 22 27

## [2,] 22 29 36

## [3,] 27 36 45
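A quick way to convince yourself that the convenience functions match the raw code is to compare the two directly. This sketch rebuilds x2 so that it stands alone, and also checks the dimensions and symmetry one would expect from the linear algebra:

```r
x2 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2)

all.equal(crossprod(x2), t(x2) %*% x2)
## [1] TRUE

all.equal(tcrossprod(x2), x2 %*% t(x2))
## [1] TRUE

dim(crossprod(x2))   ## columns by columns
## [1] 2 2

dim(tcrossprod(x2))  ## rows by rows
## [1] 3 3

isSymmetric(crossprod(x2))
## [1] TRUE
```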

As you have just seen, it is common in R for someone else to have done the heavy lifting by making a function that outputs the desired outcome. Of course, these friendly programmers’ work is subjected to only the underlying constraints of R itself as well as the ability to acquire a free GitHub account. User, beware (at least in some cases)! Thus, it can be helpful to understand the base commands and operators that make R work.

Next, let’s focus on understanding implementation nuances as well as quickly getting data in and out of R.

1.6 Summary

We conclude each chapter with a summary table (here, Table 1-2) of any R concepts of major import. These will generally be functions, although in this chapter some objects are worth discussing too.

Table 1-2

Chapter 1 summary

M. Wiley, J. F. Wiley, Advanced R 4 Data Programming and the Cloud, https://doi.org/10.1007/978-1-4842-5973-3_2

2. Programming Utilities

Matt Wiley¹ and Joshua F. Wiley²

(1) Victoria College, Victoria, TX, USA

(2) Monash University, Melbourne, VIC, Australia

One of the powerful features of R is the highly skilled, kindly community of enthusiasts, developers, and package authors. In particular, to extend the functionality of base R, one can find and add packages which in turn allow one to use new functions.

As a reminder, in R, functions tend to be actions our code takes to create an output or result based on one or more inputs (also called formals). While we save a discussion for how to code your own functions for another chapter, using functions created and shared in the R community provides highly helpful additions to what R can do.

In particular, we will focus in this chapter on functions for learning more about functions, operating system environment and file management, and data input and output to and from R:

options(width = 70, digits = 2)

2.1 Installing and Using Packages

Packages are hosted on CRAN [1] which is built into the base R environment (well, technically into a package named utils which is preloaded with base R).
