Discovering Partial Least Squares with JMP

Ebook, 507 pages, 3 hours

By Ian Cox and Marie Gaudard

About this ebook

Partial Least Squares (PLS) is a flexible statistical modeling technique that applies to data of any shape. It models relationships between inputs and outputs even when there are more predictors than observations. Using JMP statistical discovery software from SAS, Discovering Partial Least Squares with JMP explores PLS and positions it within the more general context of multivariate analysis.

Ian Cox and Marie Gaudard use a “learning through doing” style. This approach, coupled with the interactivity that JMP itself provides, allows you to actively engage with the content. Four complete case studies are presented, accompanied by data tables that are available for download. The detailed “how to” steps, together with the interpretation of the results, help to make this book unique.

Discovering Partial Least Squares with JMP is of interest to professionals engaged in continuing development, as well as to students and instructors in a formal academic setting. The content aligns well with topics covered in introductory courses on psychometrics, customer relationship management, market research, consumer research, environmental studies, and chemometrics. The book can also function as a supplement to courses in multivariate statistics and to courses on statistical methods in biology, ecology, chemistry, and genomics.

While the book is helpful and instructive to those who are using JMP, a knowledge of JMP is not required, and little or no prior statistical knowledge is necessary. By working through the introductory chapters and the case studies, you gain a deeper understanding of PLS and learn how to use JMP to perform PLS analyses in real-world situations.

This book motivates current and potential users of JMP to extend their analytical repertoire by embracing PLS. Dynamically interacting with JMP, you will develop confidence as you explore underlying concepts and work through the examples. The authors provide background and guidance to support and empower you on this journey.

This book is part of the SAS Press program.

Language: English

Publisher: SAS Institute

Release date: Oct 1, 2013

ISBN: 9781612908298

Author

Ian Cox

Ian Cox currently works in the JMP Division of SAS. Before joining SAS in 1999, he worked for Digital, Motorola, and BBN Software Solutions Ltd. and has been a consultant for many companies on data analysis, process control, and experimental design. A Six Sigma Black Belt, he was a Visiting Fellow at Cranfield University and is a Fellow of the Royal Statistical Society in the United Kingdom. Cox holds a Ph.D. in theoretical physics.

Skip carousel

The Three Cornerstones of a Smart Business
Rotman Management
Article
The Three Cornerstones of a Smart Business
Jan 1, 2019
Adaptable Products. Algorithms cannot iterate without the products—the online consumer interface that delivers customer experience directly while gathering consumer feedback to adjust algorithm models. Google’s search bar is a classic example of prod
1 min read
The Verdict
Linux Format
Article
The Verdict
Dec 15, 2020
2 min read
Machine-learning On Your Android Phone?
APC
Article
Machine-learning On Your Android Phone?
Dec 30, 2019
4 min read
An Expert Speaks Up on What You Should Know About Programming Languages
Entrepreneur
Article
An Expert Speaks Up on What You Should Know About Programming Languages
Oct 1, 2015
1 min read
Data Fabric
PC Pro Magazine
Article
Data Fabric
Aug 13, 2020
3 min read
Arq 7 Backup: Uniquely Versatile Local And Online Backup
PCWorld
Article
Arq 7 Backup: Uniquely Versatile Local And Online Backup
Aug 1, 2023
4 min read
Make AI Work For You
Linux Format
Article
Make AI Work For You
Apr 2, 2024
8 min read
A Continuously Improving Workplace
Artichoke
Article
A Continuously Improving Workplace
Aug 27, 2017
3 min read
Mastering Chatgpt
PC Pro Magazine
Article
Mastering Chatgpt
Jan 4, 2024
5 min read
Benchmark your SSD
APC
Article
Benchmark your SSD
Nov 2, 2020
4 min read
In Conversation With portrait Motorsport Images Rob Smedley
GP Racing UK
Article
In Conversation With portrait Motorsport Images Rob Smedley
Jul 8, 2021
3 min read
New Tools for Using the Sherwood Tables for Transceiver Selection
CQ Amateur Radio
Article
New Tools for Using the Sherwood Tables for Transceiver Selection
Jan 1, 2023
Receive performance has been one of the top criteria for transceiver selection by hams for decades. As the well-worn phrase goes, “if you can’t hear ‘em, you can’t work ‘em.” Rob Sherwood has been conducting bench tests on the receive performance of
10 min read
End Of The Line!
Linux Format
Article
End Of The Line!
Nov 15, 2022
"October 2023 may seem like a long time away, but that’s when MySQL 5.7 will hit end-of-life (EOL) status. This normally means no more updates or security patches will be released. For companies running this database in their applications, it is time
1 min read
Inside APC
APC
Article
Inside APC
Mar 20, 2023
APC is Australia’s oldest consumer technology magazine – having been consistently in print for over forty years, since our first issue way back in May 1980 – and we take that heritage and responsibility very seriously. While our focus is obviously on
2 min read
Taming Complexity With Intelligence: A Movement To Help Businesses Along The SAP S/4HANA Journey
The European Business Review
Article
Taming Complexity With Intelligence: A Movement To Help Businesses Along The SAP S/4HANA Journey
Jan 31, 2020
6 min read
Inside APC
APC
Article
Inside APC
Jun 19, 2023
APC is Australia’s oldest consumer technology magazine – having been consistently in print for over forty years, since our first issue way back in May 1980 – and we take that heritage and responsibility very seriously. While our focus is obviously on
2 min read
Inside APC
APC
Article
Inside APC
Apr 20, 2023
APC is Australia’s oldest consumer technology magazine – having been consistently in print for over forty years, since our first issue way back in May 1980 – and we take that heritage and responsibility very seriously. While our focus is obviously on
2 min read
Inside APC
APC
Article
Inside APC
May 22, 2023
2 min read
Inside APC
APC
Article
Inside APC
Feb 20, 2023
APC is Australia’s oldest consumer technology magazine – having been consistently in print for over forty years, since our first issue way back in May 1980 – and we take that heritage and responsibility very seriously. While our focus is obviously on
2 min read
Inside APC
APC
Article
Inside APC
Feb 20, 2023
APC is Australia’s oldest consumer technology magazine – having been consistently in print for over forty years, since our first issue way back in May 1980 – and we take that heritage and responsibility very seriously. While our focus is obviously on
2 min read
Inside APC
APC
Article
Inside APC
Sep 6, 2021
2 min read
Inside APC
APC
Article
Inside APC
Jan 24, 2022
2 min read
Inside APC
APC
Article
Inside APC
Nov 1, 2021
2 min read
Inside APC
APC
Article
Inside APC
Nov 29, 2021
2 min read
Inside APC
APC
Article
Inside APC
Dec 27, 2021
2 min read
Inside APC
APC
Article
Inside APC
Feb 21, 2022
2 min read
Inside APC
APC
Article
Inside APC
May 16, 2022
2 min read
Inside APC
APC
Article
Inside APC
Mar 21, 2022
2 min read
Cancel Your software Subscriptions
Computeractive
Article
Cancel Your software Subscriptions
Aug 3, 2022
Many software companies now allow users to pay for their products by monthly subscription. The benefits are that you can cancel at any time without having to fork out a big sum upfront, and get updated with new features and bug fixes as soon as they
4 min read
20 Computer Performance Tips
Music Tech Focus
Article
20 Computer Performance Tips
Sep 1, 2016
7 min read

Related categories

Skip carousel

Reviews for Discovering Partial Least Squares with JMP

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Discovering Partial Least Squares with JMP - Ian Cox

Discovering Partial Least Squares with JMP®

Ian Cox and Marie Gaudard

support.sas.com/bookstore

The correct bibliographic citation for this manual is as follows: Cox, Ian and Gaudard, Marie. 2013. Discovering Partial Least Squares with JMP®. Cary, NC: SAS Institute Inc.

Discovering Partial Least Squares with JMP®

ISBN 978-1-61290-829-8 (electronic book)

For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.

For a web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.

The scanning, uploading, and distribution of this book via the Internet or any other means without the permission of the publisher is illegal and punishable by law. Please purchase only authorized electronic editions and do not participate in or encourage electronic piracy of copyrighted materials. Your support of others’ rights is appreciated.

U.S. Government License Rights; Restricted Rights: The Software and its documentation is commercial computer software developed at private expense and is provided with RESTRICTED RIGHTS to the United States Government. Use, duplication or disclosure of the Software by the United States Government is subject to the license terms of this Agreement pursuant to, as applicable, FAR 12.212, DFAR 227.7202-1(a), DFAR 227.7202-3(a) and DFAR 227.7202-4 and, to the extent required under U.S. federal law, the minimum restricted rights as set out in FAR 52.227-19 (DEC 2007). If FAR 52.227-19 is applicable, this provision serves as notice under clause (c) thereof and no other notice is required to be affixed to the Software or documentation. The Government's rights in Software and documentation shall be only those set forth in this Agreement.

SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513-2414.

October 2013

SAS provides a complete selection of books and electronic products to help customers use SAS® software to its fullest potential. For more information about our offerings, visit support.sas.com/bookstore or call 1-800-727-3228.

SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Other brand and product names are trademarks of their respective companies.

Preface

A Word to the Practitioner

The Organization of the Book

Required Software

Accessing the Supplementary Content

Chapter 1 Introducing Partial Least Squares

Modeling in General

Partial Least Squares in Today’s World

Transforming, and Centering and Scaling Data

An Example of a PLS Analysis

The Data and the Goal

The Analysis

Testing the Model

Chapter 2 A Review of Multiple Linear Regression

The Cars Example

Estimating the Coefficients

Underfitting and Overfitting: A Simulation

The Effect of Correlation among Predictors: A Simulation

Chapter 3 Principal Components Analysis: A Brief Visit

Principal Components Analysis

Centering and Scaling: An Example

The Importance of Exploratory Data Analysis in Multivariate Studies

Dimensionality Reduction via PCA

Chapter 4 A Deeper Understanding of PLS

Centering and Scaling in PLS

PLS as a Multivariate Technique

Why Use PLS?

How Does PLS Work?

PLS versus PCA

PLS Scores and Loadings

Some Technical Background

An Example Exploring Prediction

One-Factor NIPALS Model

Two-Factor NIPALS Model

Variable Selection

SIMPLS Fits

Choosing the Number of Factors

Cross Validation

Types of Cross Validation

A Simulation of K-Fold Cross Validation

Validation in the PLS Platform

The NIPALS and SIMPLS Algorithms

Useful Things to Remember About PLS

Chapter 5 Predicting Biological Activity

Background

The Data

Data Table Description

Initial Data Visualization

A First PLS Model

Our Plan

Performing the Analysis

The Partial Least Squares Report

The SIMPLS Fit Report

Other Options

A Pruned PLS Model

Model Fit

Diagnostics

Performance on Data from Second Study

Comparing Predicted Values for the Second Study to Actual Values

Comparing Residuals for Both Studies

Obtaining Additional Insight

Conclusion

Chapter 6 Predicting the Octane Rating of Gasoline

Background

The Data

Data Table Description

Creating a Test Set Indicator Column

Viewing the Data

Octane and the Test Set

Creating a Stacked Data Table

Constructing Plots of the Individual Spectra

Individual Spectra

Combined Spectra

A First PLS Model

Excluding the Test Set

Fitting the Model

The Initial Report

A Second PLS Model

Fitting the Model

High-Level Overview

Diagnostics

Score Scatterplot Matrices

Loading Plots

VIPs

Model Assessment Using Test Set

A Pruned Model

Chapter 7 Water Quality in the Savannah River Basin

Background

The Data

Data Table Description

Initial Data Visualization

Missing Response Values

Impute Missing Data

Distributions

Transforming AGPT

Differences by Ecoregion

Conclusions from Visual Analysis and Implications

A First PLS Model for the Savannah River Basin

Our Plan

Performing the Analysis

The Partial Least Squares Report

The NIPALS Fit Report

Defining a Pruned Model

A Pruned PLS Model for the Savannah River Basin

Model Fit

Diagnostics

Saving the Prediction Formulas

Comparing Actual Values to Predicted Values for the Test Set

A First PLS Model for the Blue Ridge Ecoregion

Making the Subset

Reviewing the Data

Performing the Analysis

The NIPALS Fit Report

A Pruned PLS Model for the Blue Ridge Ecoregion

Model Fit

Comparing Actual Values to Predicted Values for the Test Set

Conclusion

Chapter 8 Baking Bread That People Like

Background

The Data

Data Table Description

Missing Data Check

The First Stage Model

Visual Exploration of Overall Liking and Consumer Xs

The Plan for the First Stage Model

Stage One PLS Model

Stage One Pruned PLS Model

Stage One MLR Model

Comparing the Stage One Models

Visual Exploration of Ys and Xs

Stage Two PLS Model

Stage Two MLR Model

The Combined Model for Overall Liking

Constructing the Prediction Formula

Viewing the Profiler

Conclusion

Appendix 1: Technical Details

Ground Rules

The Singular Value Decomposition of a Matrix

Definition

Relationship to Spectral Decomposition

Other Useful Facts

Principal Components Regression

The Idea behind PLS Algorithms

NIPALS

The NIPALS Algorithm

Computational Results

Properties of the NIPALS Algorithm

SIMPLS

Optimization Criterion

Implications for the Algorithm

The SIMPLS Algorithm

More on VIPs

The Standardize X Option

Determining the Number of Factors

Cross Validation: How JMP Does It

Appendix 2: Simulation Studies

Introduction

The Bias-Variance Tradeoff in PLS

Introduction

Two Simple Examples

Motivation

The Simulation Study

Results and Discussion

Conclusion

Using PLS for Variable Selection

Introduction

Structure of the Study

The Simulation

Computation of Result Measures

Results

Conclusion

References

Index

Preface

A Word to the Practitioner

Welcome to Discovering Partial Least Squares with JMP. This book introduces you to the exciting area of partial least squares. Partial least squares is a multivariate modeling technique based on the idea of projection, which is also the inspiration for the book’s cover design. You will obtain background understanding and see the technique applied in a number of examples. The book is built around the intuitive and powerful JMP statistical software, which will help you understand and internalize this new topic in a way that reading alone cannot.

Since our goal is to help you apply partial least squares in your own setting, the textual material exists only to build your understanding and confidence as you progress through the worked examples. Although we endeavor to provide the salient details, the area of partial least squares is very broad and this book is necessarily incomplete. To the extent that we cannot cover certain topics fully, we provide references for your further study.

The Organization of the Book

We open with a number of introductory chapters that describe the concepts behind partial least squares and help position it in the wider world of statistical methodology and application. The meat of the book is found in Chapters 5 through 8, which contain four examples. Working through these examples using JMP prepares you to apply partial least squares to your own data. The book also contains two appendixes that provide further statistical details and the results of some simulation studies. Depending on your level and area of interest, you might find these useful.

Required Software

Although a user of standard JMP 11 or later will find this book useful, many examples require JMP Pro 11 or later. Compared to the standard version of JMP, the Pro version is intended for those who require deeper analytical capabilities. In JMP Pro, the implementation of partial least squares is quite complete.

The book uses JMP Pro 11.0 in screenshots, instructions, and discussions. Even though JMP’s PLS capabilities will continue to be developed, the major features and design shown here will persist. However, in future versions, you may notice very slight differences from the specific instruction sequences and screenshots presented in this book.

Ideally, you will have JMP Pro 11 available as you work through this book. A fully functional version of JMP Pro 11 that runs for 30 days can be requested at http://www.jmp.com/webforms/jmp_pro_eval.shtml.

The standard version of JMP enables you to run some partial least squares analyses through a simplified interface. Using this version you will be able to work through some, but not all, of the examples, and many of the scripts linked to in the book will not function correctly. But the book should still help your understanding of partial least squares, and help you decide if you need the Pro version of JMP.

Accessing the Supplementary Content

The data tables and scripts associated with the book can be accessed at either http://support.sas.com/cox or http://support.sas.com/gaudard. Either address provides a single ZIP file. Once you have downloaded the file, you can unzip its contents to a convenient location on your hard disk. This process creates a master JMP journal file, Discovering Partial Least Squares with JMP.jrn, along with a folder for each chapter containing scripts. Data tables are created by running these scripts using the links in the master journal. The master journal file provides a convenient way to access all of the supplementary content, and the instructions in the text assume that you will use it.

The data tables themselves contain saved scripts that are referred to in the chapters. Often, when working through an example, we show the steps that you can follow to generate a report in JMP. In addition, either parenthetically or directly, we give the name of a script that has been saved to the data table and that generates that same analysis. This way, if you want to see the report without stepping through the selections to create it, you can simply run that script.

The scripts are used to illustrate concepts and to help you develop understanding. Because many of the scripts have an element of randomness built in, it is usually worth running the same script more than once to see the effect over various random choices. Also, be aware that the scripts have been encrypted. If you open one of these scripts directly rather than via the journal file mentioned earlier, you see what appears to be gibberish. Nevertheless, you can right-click within the script window and select Run Script.

1

Introducing Partial Least Squares

Modeling in General

Partial Least Squares in Today’s World

Transforming, and Centering and Scaling Data

An Example of a PLS Analysis

The Data and the Goal

The Analysis

Testing the Model

Modeling in General

Applied statistics can be thought of as a body of knowledge, or even a technology, that supports learning about the real world in the face of uncertainty. The theme of learning is ubiquitous in more or less every context that can be imagined, and along with this comes the idea of a (statistical) model that tries to codify or encapsulate our current understanding.

Many statistical models can be thought of as relating one or more inputs (which we call collectively X) to one or more outputs (collectively Y). These quantities are measured on the items or units of interest, and models are constructed from these observations. Such observations yield quantitative data that can be expressed numerically or coded in numerical form.

By the standards of fundamental physics, chemistry, and biology, at least, statistical models are generally useful when current knowledge is moderately low and the underlying mechanisms that link the values in X and Y are obscure. So although one of the perennial challenges of any modeling activity is to take proper account of whatever is already known, the fact remains that statistical models are generally empirical in nature. This is not in any sense a failing, since there are many situations in research, engineering, the natural sciences, the physical sciences, life science, behavioral science, and other areas in which such empirical knowledge has practical utility or opens new, useful lines of inquiry.

However, along with this diversity of contexts comes a diversity of data. No matter what its intrinsic beauty, a useful model must be flexible enough to adequately support the more specific objectives of prediction from or explanation of the data presented to it. As we shall see, one of the appealing aspects of partial least squares as a modeling approach is that, unlike some more traditional approaches that might be familiar to you, it is able to encompass much of this diversity within a single framework.

A final comment on modeling in general—all data is contextual. Only you can determine the plausibility and relevance of the data that you have, and you overlook this simple fact at your peril. Although statistical modeling can be invaluable, just looking at the data in the right way can and should illuminate and guide the specifics of building empirical statistical models of any kind (Chatfield 1995).

Partial Least Squares in Today’s World

Increasingly, we are finding data everywhere. This data explosion, supported by innovative and convergent technologies, has arguably made data exploration (e-Science) a fourth learning paradigm, joining theory, experimentation, and simulation as a way to drive new understanding (Microsoft Research 2009).

In simple retail businesses, sellers and buyers are wrestling for more leverage over the selling/buying process, and are attempting to make better use of data in this struggle. Laboratories, production lines, and even cars are increasingly equipped with relatively low-cost instrumentation routinely producing data of a volume and complexity that was difficult to foresee even thirty years ago. This book shows you how partial least squares, with its appealing flexibility, fits into this exciting picture.

This abundance of data, supported by the widespread use of automated test equipment, results in data sets with a large number of columns, or variables, v and/or a large number of observations, or rows, n. Often, but not always, it is cheap to increase v and expensive to increase n.

When the interpretation of the data permits a natural separation of variables into predictors and responses, partial least squares, or PLS for short, is a flexible approach to building statistical models for prediction. PLS can deal effectively with the following:

• Wide data (when v >> n, and v is large or very large)

• Tall data (when n >> v, and n is large or very large)

• Square data (when n ~ v, and n is large or very large)

• Collinear variables, namely, variables that convey the same, or nearly the same, information

• Noisy data

Just to whet your appetite, we point out that PLS routinely finds application in the following disciplines as a way of taming multivariate data:

• Psychology

• Education

• Economics

• Political science

• Environmental science

• Marketing

• Engineering

• Chemistry (organic, analytical, medical, and computational)

• Bioinformatics

• Ecology

• Biology

• Manufacturing

Transforming, and Centering and Scaling Data

Data should always be screened for outliers and anomalies prior to any formal analysis, and PLS is no exception. In fact, PLS works best when the variables involved have somewhat symmetric distributions. For that reason, for example, highly skewed variables are often logarithmically transformed prior to any analysis.
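As a rough illustration of why a logarithmic transformation helps, the following Python sketch computes a simple moment-based skewness before and after taking logs. The data values, the function name skewness, and the choice of base-10 logs are assumptions made for illustration only; they do not come from the book or from JMP.

```python
import math
from statistics import mean, pstdev

def skewness(values):
    """Moment-based skewness: the mean cubed z-score (0 for symmetric data)."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation
    return mean([((v - m) / s) ** 3 for v in values])

# Hypothetical, strictly positive, right-skewed measurements (not from the book).
raw = [1, 2, 2, 3, 4, 5, 8, 20, 100]
logged = [math.log10(v) for v in raw]

print(f"skewness before the log transform: {skewness(raw):.2f}")
print(f"skewness after the log transform:  {skewness(logged):.2f}")
```

The transformed values are still right-skewed, but much less so, which is the sense in which the transformation makes the distribution "somewhat symmetric." A log transformation of this kind applies only to strictly positive values.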

Also, the data are usually centered and scaled prior to conducting the PLS analysis. By centering, we mean that, for each variable, the mean of all its observations is subtracted from each observation. By scaling, we mean that each observation is divided by the variable’s standard deviation. Centering and scaling each variable results in a working data table where each variable has mean 0 and standard deviation 1.

Centering and scaling are important because the weights that form the basis for the PLS model are very sensitive to the measurement units of the variables. Without centering and scaling, variables with higher variance have more influence on the model. The process of centering and scaling puts all variables on an equal footing. If certain variables in X are indeed more important than others, and you want them to have higher influence, you can accomplish this by assigning them a higher scaling weight (Eriksson et al. 2006). As you will see, JMP makes centering and scaling easy.
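To make the arithmetic of centering and scaling concrete, here is a minimal Python sketch. It is not how JMP implements the operation. The function name center_and_scale, the weight argument, the use of the sample standard deviation, and the data values are all assumptions made for illustration.

```python
from statistics import mean, stdev

def center_and_scale(column, weight=1.0):
    """Center a column to mean 0, then scale it to standard deviation `weight`."""
    m = mean(column)
    s = stdev(column)  # sample standard deviation (n - 1 denominator), an assumption
    if s == 0:
        raise ValueError("a constant column cannot be scaled")
    return [weight * (x - m) / s for x in column]

# Hypothetical measurements of one property, recorded in two different units.
grams = [1.2, 0.8, 1.5, 1.1, 0.9]
milligrams = [1000 * g for g in grams]

z_g = center_and_scale(grams)
z_mg = center_and_scale(milligrams)

# After centering and scaling, the choice of units no longer matters.
print([round(z, 3) for z in z_g])
print([round(z, 3) for z in z_mg])

# One simple way to give a variable more influence: a scaling weight above 1.
emphasized = center_and_scale(grams, weight=2.0)
print(round(stdev(emphasized), 3))
```

The two printed lists agree, which shows that the standardized values do not depend on the measurement units. Multiplying a standardized column by a weight is only one plausible reading of a "scaling weight"; consult Eriksson et al. (2006) for the authors' intended definition.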

Later we discuss how PLS relates to other modeling and multivariate methods. But for now, let’s dive into an example so that we can compare and contrast PLS with the more familiar multiple linear regression (MLR).

An Example of a PLS Analysis

The Data and the Goal

The data table Spearheads.jmp contains data relating to the chemical composition of spearheads known to originate from one of two African tribes (Figure 1.1). You can open this table by clicking on the correct link in the master journal. A total of 19 spearheads of known origin were studied. The Tribe of origin is recorded in the first column (Tribe A or Tribe B). Chemical measurements of 10 properties were made. These are given in the subsequent columns and are represented in the Columns panel in a column group called Xs. There is a final column called Set, indicating whether an observation will be used in building our model (Training) or in assessing that model (Test).

Figure 1.1: The Spearheads.jmp Data Table


Our goal is to build a model that uses the chemical measurements to help us decide whether other spearheads collected in the vicinity were made by Tribe A or Tribe B. Note that there are 10 columns in X (the chemical compositions) and only one column in Y (the attribution of the tribe).

The model will be built using the training set, rows 1–9. The test set, rows 10–19, enables us to assess the ability of the model to predict the tribe of origin for newly discovered spearheads. The column Tribe actually contains the numerical values +1 and –1, with –1 representing Tribe A and +1 representing Tribe B. The Tribe column displays Value Labels for these numerical values. It is the numerical values that the model actually predicts from the chemical measurements.
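The row split and the numerical coding described above can be sketched in a few lines of Python. This is an illustration only: the names LABEL_TO_CODE, split_rows, and classify are hypothetical, and the rule of assigning a tribe by comparing a predicted value with a cutoff of 0 is an assumption rather than something stated at this point in the text.

```python
# Numerical coding of the response, as described in the text.
LABEL_TO_CODE = {"Tribe A": -1, "Tribe B": +1}

def split_rows(n_rows=19, n_train=9):
    """Return 1-based row numbers for the training set and the test set."""
    rows = list(range(1, n_rows + 1))
    return rows[:n_train], rows[n_train:]

def classify(predicted_value, cutoff=0.0):
    """Map a numerical prediction back to a tribe label (assumed cutoff at 0)."""
    return "Tribe B" if predicted_value > cutoff else "Tribe A"

train_rows, test_rows = split_rows()
print("Training rows:", train_rows)  # rows 1 through 9
print("Test rows:", test_rows)       # rows 10 through 19

# Hypothetical predicted values, used only to show the mapping back to labels.
for value in (0.7, -0.2):
    print(value, "->", classify(value))
```

Because the model predicts a number rather than a label, some rule of this kind is needed to turn a prediction such as 0.7 into a tribe. The midpoint between the two codes is a natural cutoff, but it is a choice made here and not part of the data table.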

The table Spearheads.jmp also contains four scripts that help us perform the PLS analysis quickly. In the later chapters containing examples, we walk through the menu options that enable you to conduct such an analysis. But, for now, the scripts expedite the analysis, permitting us to focus on the concepts underlying a PLS analysis.

The Analysis

The first script, Fit Model Launch Window, located in the upper left of the data table as shown in Figure 1.2, enables us to set up the analysis we want. From the red-triangle menu, shown in Figure 1.2, select Run Script. This script only runs if you are using JMP Pro since it uses the Fit Model partial least squares personality. If you are using JMP, you can select Analyze > Multivariate Methods > Partial Least Squares from the JMP menu bar. You will be able to follow the text, but with minor modifications.

Figure 1.2: Running the Script Fit Model Launch Window


This script produces a populated Fit Model launch window (Figure 1.3). The column Tribe is entered as a response, Y, while the 10 columns representing metal composition measurements are entered as Model Effects. Note that the Personality is set to Partial Least Squares. In JMP Pro, you can access this launch window directly by selecting Analyze > Fit Model from the JMP menu bar.

Below the Personality drop-down menu, shown in Figure
