
R High Performance Programming

Ebook, 339 pages


About this ebook

About This Book
  • Benchmark and profile R programs to solve performance bottlenecks
  • Combine the ease of use and flexibility of R with the power of big data tools
  • Filled with practical techniques and useful code examples to process large data sets more efficiently
Who This Book Is For

This book is for programmers and developers who want to improve the performance of their R programs by making them run faster with large data sets or who are trying to solve a pesky performance problem.

LanguageEnglish
Release dateJan 29, 2015
ISBN9781783989270


    R High Performance Programming - Aloysius Lim

    Table of Contents

    R High Performance Programming

    Credits

    About the Authors

    About the Reviewers

    www.PacktPub.com

    Support files, eBooks, discount offers, and more

    Why subscribe?

    Free access for Packt account holders

    Preface

    What this book covers

    What you need for this book

    Who this book is for

    Conventions

    Reader feedback

    Customer support

    Downloading the example code

    Errata

    Piracy

    Questions

    1. Understanding R's Performance – Why Are R Programs Sometimes Slow?

    Three constraints on computing performance – CPU, RAM, and disk I/O

    R is interpreted on the fly

    R is single-threaded

    R requires all data to be loaded into memory

    Algorithm design affects time and space complexity

    Summary

    2. Profiling – Measuring Code's Performance

    Measuring total execution time

    Measuring execution time with system.time()

    Repeating time measurements with rbenchmark

    Measuring distribution of execution time with microbenchmark

    Profiling the execution time

    Profiling a function with Rprof()

    The profiling results

    Profiling memory utilization

    Monitoring memory utilization, CPU utilization, and disk I/O using OS tools

    Identifying and resolving bottlenecks

    Summary

    3. Simple Tweaks to Make R Run Faster

    Vectorization

    Use of built-in functions

    Preallocating memory

    Use of simpler data structures

    Use of hash tables for frequent lookups on large data

    Seeking fast alternative packages in CRAN

    Summary

    4. Using Compiled Code for Greater Speed

    Compiling R code before execution

    Compiling functions

    Just-in-time (JIT) compilation of R code

    Using compiled languages in R

    Prerequisites

    Including compiled code inline

    Calling external compiled code

    Considerations for using compiled code

    R APIs

    R data types versus native data types

    Creating R objects and garbage collection

    Allocating memory for non-R objects

    Summary

    5. Using GPUs to Run R Even Faster

    General purpose computing on GPUs

    R and GPUs

    Installing gputools

    Fast statistical modeling in R with gputools

    Summary

    6. Simple Tweaks to Use Less RAM

    Reusing objects without taking up more memory

    Removing intermediate data when it is no longer needed

    Calculating values on the fly instead of storing them persistently

    Swapping active and nonactive data

    Summary

    7. Processing Large Datasets with Limited RAM

    Using memory-efficient data structures

    Smaller data types

    Sparse matrices

    Symmetric matrices

    Bit vectors

    Using memory-mapped files and processing data in chunks

    The bigmemory package

    The ff package

    Summary

    8. Multiplying Performance with Parallel Computing

    Data parallelism versus task parallelism

    Implementing data parallel algorithms

    Implementing task parallel algorithms

    Running the same task on workers in a cluster

    Running different tasks on workers in a cluster

    Executing tasks in parallel on a cluster of computers

    Shared memory versus distributed memory parallelism

    Optimizing parallel performance

    Summary

    9. Offloading Data Processing to Database Systems

    Extracting data into R versus processing data in a database

    Preprocessing data in a relational database using SQL

    Converting R expressions to SQL

    Using dplyr

    Using PivotalR

    Running statistical and machine learning algorithms in a database

    Using columnar databases for improved performance

    Using array databases for maximum scientific-computing performance

    Summary

    10. R and Big Data

    Understanding Hadoop

    Setting up Hadoop on Amazon Web Services

    Processing large datasets in batches using Hadoop

    Uploading data to HDFS

    Analyzing HDFS data with RHadoop

    Other Hadoop packages for R

    Summary

    Index

    R High Performance Programming

    Copyright © 2015 Packt Publishing

    All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

    Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing and its dealers and distributors, will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

    Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

    First published: January 2015

    Production reference: 1230115

    Published by Packt Publishing Ltd.

    Livery Place

    35 Livery Street

    Birmingham B3 2PB, UK.

    ISBN 978-1-78398-926-3

    www.packtpub.com

    Credits

    Authors

    Aloysius Lim

    William Tjhi

    Reviewers

    Richard Cotton

    Kirill Müller

    John Silberholz

    Commissioning Editor

    Kunal Parikh

    Acquisition Editor

    Richard Brookes-Bland

    Content Development Editor

    Susmita Sabat

    Technical Editor

    Shiny Poojary

    Copy Editor

    Neha Vyas

    Project Coordinator

    Milton Dsouza

    Proofreaders

    Ameesha Green

    Clyde Jenkins

    Jonathan Todd

    Indexer

    Tejal Soni

    Graphics

    Sheetal Aute

    Valentina D'silva

    Production Coordinator

    Komal Ramchandani

    Cover Work

    Komal Ramchandani

    About the Authors

    Aloysius Lim has a knack for translating complex data and models into easy-to-understand insights. As cofounder of About People, a data science and design consultancy, he loves solving problems and helping others to find practical solutions to business challenges using data. His breadth of experience—7 years in the government, education, and retail industries—equips him with unique perspectives to find creative solutions.

    My deepest thanks go to God for the opportunity to write this book and share the knowledge that I have been given. My lovely wife, Bethany, has been a tremendous source of support and encouragement throughout this project. Thank you dear, for all your love. Many thanks to my partner William for his wonderful friendship. He has been a source of inspiration and insights throughout this journey.

    William Tjhi is a data scientist with years of experience working in academia, government, and industry. He began his data science journey as a PhD candidate researching new algorithms to improve the robustness of high-dimensional data clustering. Upon receiving his doctorate, he moved from basic to applied research, solving problems in fields such as molecular biology and epidemiology using machine learning. He published some of his research in peer-reviewed journals and conferences. With the rise of Big Data, William left academia for industry, where he started practicing data science in both business and public sector settings. William is passionate about R and has been using it as his primary analysis tool since his research days. He was once part of Revolution Analytics, where he contributed to making R more suitable for Big Data.

    I would like to thank my coauthor, Aloysius. Your hard work, patience, and determination made this book possible.

    About the Reviewers

    Richard Cotton is a data scientist with a mixed background in proteomics, debt collection, and chemical health and safety, and he has worked extensively on tools to give nontechnical users access to statistical models. He is the author of the book Learning R, O'Reilly, and has created a number of popular R packages, including assertive, regex, pathological, and sig. He works for Weill Cornell Medical College in Qatar.

    Kirill Müller holds a diploma in computer science and currently works as a research assistant at the Institute for Transport Planning and Systems of the Swiss Federal Institute of Technology (ETHZ) in Zurich. He is an avid R user and has contributed to several R packages.

    John Silberholz is a fourth year PhD student at the MIT Operations Research Center, working under advisor Dimitris Bertsimas. His thesis research focuses on data-driven approaches to design novel chemotherapy regimens for advanced cancer and approaches to identify effective population screening strategies for cancer. His research interests also include analytical applications in the fields of bibliometrics and heuristic evaluation. John codeveloped 15.071x: The Analytics Edge, a massive open online course (MOOC), which teaches machine learning and optimization using R and spreadsheet solvers.

    Before coming to MIT, John completed his BS degree in mathematics and computer science from the University of Maryland. He completed internships as a software developer at Microsoft and Google, and he cofounded Enertaq, an electricity grid reliability start-up.

    www.PacktPub.com

    Support files, eBooks, discount offers, and more

    For support files and downloads related to your book, please visit www.PacktPub.com.

    Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at for more details.

    At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

    https://www2.packtpub.com/books/subscription/packtlib

    Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

    Why subscribe?

    Fully searchable across every book published by Packt

    Copy and paste, print, and bookmark content

    On demand and accessible via a web browser

    Free access for Packt account holders

    If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.

    Preface

    In a world where data is becoming increasingly important, business people and scientists need tools to analyze and process large volumes of data efficiently. R is one of the tools that have become increasingly popular in recent years for data processing, statistical analysis, and data science. While R has its roots in academia, it is now used by organizations across a wide range of industries and geographical areas.

    But the design of R imposes some inherent limits on the size of the data and the complexity of computations that it can manage efficiently. This can be a huge obstacle for R users who need to process the ever-growing volume of data in their organizations.

    This book, R High Performance Programming, will help you understand the situations that often pose performance difficulties in R, such as memory and computational limits. It will also show you a range of techniques to overcome these performance limits. You can choose to use these techniques alone, or in various combinations that best fit your needs and your computing environment.

    This book is designed to be a practical guide on how to improve the performance of R programs, with just enough explanation of why, so that you understand the reasoning behind each solution. As such, we will provide code examples for every technique that we cover in this book, along with performance profiling results that we generated on our machines to demonstrate the performance improvements. We encourage you to follow along by entering and running the code in your own environment to see the performance improvements for yourself.

    If you would like to understand how R is designed and why it has performance limitations, the R Internals documentation (http://cran.r-project.org/doc/manuals/r-release/R-ints.html) will provide helpful clues.

    This book is written based on open source R because it is the most widely used version of R and is freely available to anybody. If you are using a commercial version of R, check with your software vendor to see what performance improvements they might have made available to you.

    The R community has created many new packages to improve the performance of R, which are available on the Comprehensive R Archive Network (CRAN) (http://cran.r-project.org/). We cannot analyze every package on CRAN—there are thousands of them—to see if they provide performance enhancements for specific operations. Instead, this book focuses on the most common tasks for R programmers and introduces techniques that you can use on any R project.

    What this book covers

    Chapter 1, Understanding R's Performance – Why Are R Programs Sometimes Slow?, kicks off our journey by taking a peek under R's hood to explore the various ways in which R programs can hit performance limits. We will look at how R's design sometimes creates performance bottlenecks in R programs in terms of computation (CPU), memory (RAM), and disk input/output (I/O).

    Chapter 2, Profiling – Measuring Code's Performance, introduces a few techniques that we will use throughout the book to measure the performance of R code, so that we can understand the nature of our performance problems.
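To illustrate the kind of measurement the chapter covers, here is a minimal sketch of timing an expression with base R's system.time(); the vector size and variable names are arbitrary choices of ours:

```r
# Time the summing of square roots of one million random numbers
x <- runif(1e6)
timing <- system.time(result <- sum(sqrt(x)))

# system.time() returns user, system, and elapsed times, in seconds
print(timing["elapsed"])
```

The chapter builds on this with rbenchmark and microbenchmark for repeated measurements, and Rprof() for line-level profiling.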

    Chapter 3, Simple Tweaks to Make R Run Faster, describes how to improve the computational speed of R code. These are basic techniques that you can use in any R program.
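As a preview of one such tweak, vectorization replaces an explicit R loop with a single vectorized operation that runs in compiled code. A small sketch (the function name is ours):

```r
# Loop version: squares each element one at a time in interpreted R code
square_loop <- function(v) {
  out <- numeric(length(v))
  for (i in seq_along(v)) {
    out[i] <- v[i]^2
  }
  out
}

v <- as.numeric(1:100000)

# Vectorized version: a single operation, evaluated in compiled code
square_vec <- v^2

# Both produce the same result; the vectorized form is typically much faster
stopifnot(all.equal(square_loop(v), square_vec))
```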

    Chapter 4, Using Compiled Code for Greater Speed, explores the use of compiled code in another programming language such as C to maximize the performance of our computations. We will see how compiled code can perform faster than R, and look at how to integrate compiled code into our R programs.
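The first technique in the chapter, byte-compiling R functions, can be sketched with the base compiler package; the example function is a toy of ours:

```r
library(compiler)

# A deliberately loop-heavy function
sum_to <- function(n) {
  s <- 0
  for (i in 1:n) s <- s + i
  s
}

# cmpfun() byte-compiles the function, which often speeds up loops
sum_to_compiled <- cmpfun(sum_to)

sum_to_compiled(100)  # same result as sum_to(100): 5050
```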

    Chapter 5, Using GPUs to Run R Even Faster, brings us to the realm of modern accelerators by leveraging Graphics Processing Units (GPUs) to run complex computations at high speed.

    Chapter 6, Simple Tweaks to Use Less RAM, describes the basic techniques to manage and optimize RAM utilization of your R programs to allow you to process larger datasets.
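One of those techniques, removing intermediate data when it is no longer needed, looks roughly like this (object names are ours):

```r
# Create a large object and a derived result
raw_data  <- runif(1e6)
processed <- raw_data * 2

# Remove the intermediate object once it is no longer needed...
rm(raw_data)
# ...and trigger garbage collection to release the memory
invisible(gc())

# object.size() reports how much memory an object occupies
print(object.size(processed), units = "MB")
```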

    Chapter 7, Processing Large Datasets with Limited RAM, explains how to process datasets that are larger than the available RAM using memory-efficient data structures and disk resident data formats.
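The simplest of these ideas, choosing smaller data types, can be seen with base R alone: an integer vector needs 4 bytes per element, where a double needs 8:

```r
ints    <- integer(1e6)  # 4 bytes per element
doubles <- numeric(1e6)  # 8 bytes per element

# The integer vector takes roughly half the memory of the double vector
print(object.size(ints))
print(object.size(doubles))
```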

    Chapter 8, Multiplying Performance with Parallel Computing, introduces parallelism in R. We will explore how to run code in parallel in R on a single machine and on multiple machines. We will also look at the factors that need to be considered in the design of our parallel code.
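For a taste of data parallelism with the base parallel package: makeCluster() starts worker processes and parLapply() distributes work among them (the toy task is ours):

```r
library(parallel)

# Start a cluster of two worker processes (portable across platforms)
cl <- makeCluster(2)

# Apply the same function to different pieces of data on the workers
res <- parLapply(cl, 1:4, function(i) i^2)

# Always shut the cluster down when done
stopCluster(cl)

unlist(res)  # 1 4 9 16
```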

    Chapter 9, Offloading Data Processing to Database Systems, describes how certain computations can be offloaded to an external database system. This is useful to minimize Big Data movements in and out of the database, and especially when you already have access to a powerful database system with computational power and speed for you to leverage.
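As a sketch of the dplyr approach covered here, assuming the DBI, dbplyr, and RSQLite packages are installed (we use an in-memory SQLite database purely for illustration; the same verbs work against other backends):

```r
library(dplyr)

# Connect to an in-memory SQLite database and load a sample table
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, mtcars, "mtcars")

# filter() and summarise() are translated to SQL and run in the database;
# collect() pulls only the final result into R
result <- tbl(con, "mtcars") %>%
  filter(cyl == 4) %>%
  summarise(avg_mpg = mean(mpg, na.rm = TRUE)) %>%
  collect()

DBI::dbDisconnect(con)
print(result)
```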

    Chapter 10, R and Big Data, concludes the book by exploring the use of Big Data technologies to take R's performance to the limit.

    If you are in a hurry, we recommend that you read the following chapters first, then supplement your reading with other chapters that are relevant for your situation:

    Chapter 1, Understanding R's Performance – Why Are R Programs Sometimes Slow?

    Chapter 2, Profiling – Measuring Code's Performance

    Chapter 3, Simple Tweaks to Make R Run Faster

    Chapter 6, Simple Tweaks to Use Less RAM

    What you need for this book

    All the code in this book was developed in R 3.1.1 64-bit on Mac OS X 10.9. Wherever possible, it has also been tested on Ubuntu desktop 14.04 LTS and Windows 8.1. All code examples can be downloaded from https://github.com/r-high-performance-programming/rhpp-2015.

    To follow along with the code examples, we recommend installing R 3.1.1 64-bit or a later version in your environment.

    We also recommend running R in a Unix environment (this includes Linux and Mac OS X). While R runs on Windows, some packages that we will use, such as bigmemory, run only in a Unix environment. Whenever there
