Targeting Uplift: An Introduction to Net Scores

About this ebook

This book explores all relevant aspects of net scoring, also known as uplift modeling: a data mining approach used to analyze and predict the effects of a given treatment on a desired target variable for an individual observation. After discussing modern net score modeling methods, data preparation, and the assessment of uplift models, the book investigates software implementations and real-world scenarios. Focusing on the application of theoretical results and on practical issues of uplift modeling, it also includes a dedicated chapter on software solutions in SAS, R, Spectrum Miner, and KNIME, which compares the respective tools. This book also presents the applications of net scoring in various contexts, e.g. medical treatment, with a special emphasis on direct marketing and corresponding business cases. The target audience primarily includes data scientists, especially researchers and practitioners in predictive modeling and scoring, mainly, but not exclusively, in the marketing context. 


Language: English
Publisher: Springer
Release date: Sep 9, 2019
ISBN: 9783030226251

    Targeting Uplift - René Michel

    © Springer Nature Switzerland AG 2019

    R. Michel et al., Targeting Uplift, https://doi.org/10.1007/978-3-030-22625-1_1

    1. Introduction

    René Michel¹, Igor Schnakenburg² and Tobias von Martens¹

    (1) Deutsche Bank AG, Frankfurt am Main, Germany

    (2) DeTeCon International GmbH, Berlin, Germany

    1.1 Problem Statement

    In various areas of application, treatments are commonly used in order to affect behavior. In recent decades, the systematic collection and analysis of behavioral data by means of advanced statistical methods has allowed for the identification of behavioral patterns that may previously have been hidden. Applying analytics to estimate the impact of treatments on behavior (i.e., uplift) is not just a natural extension of this but a challenge worth taking on, since the effective and efficient control of behavior may be crucial for competitive advantage (or a non-monetary equivalent). This book focuses on uplift analytics and shows how they can be applied.

    The following exemplary use cases underline the diversity of application areas in which treatments are used to affect behavior:

    Direct marketing tries to convince customers to purchase a product or service.

    Churn prevention campaigns strengthen or win back customers’ loyalty.

    Medical treatments are applied to help patients to recover from a disease or ease pain.

    Fertilizers are used to increase yields in agriculture.

    Pre-emptive maintenance is used to avoid machine malfunctioning.

    Police forces are used pre-emptively to avoid crimes, especially break-ins.

    Some of these treatments—specifically when trying to influence human behavior—may be characterized as nudges as presented in [14], i.e., treatments that gently steer people towards decisions in their own interest without taking away their freedom to choose. One goal of the methods presented in this book is to make the effect of such nudges both measurable and predictable.

    Most often, the magnitude of the effect that the treatment exerts on behavior is only assumed but not exactly known in advance. However, a subsequent estimation of the effect is possible in most cases:

    If any additional influencing factors can be excluded (e.g., by means of an experimental design), behavior before and after the treatment might be compared.

    If there is a structurally identical group of observations not exposed to the treatment (i.e., control group), the behavior of the group of observations that received the treatment (i.e., target group) can be compared to the behavior of the control group. This is assumed as the standard approach in this book.
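    Under this standard approach, the treatment effect can be estimated as the difference between the response rates of the target and control groups. A minimal sketch of this calculation follows; the group sizes and response counts are purely hypothetical:

    ```python
    # Minimal sketch: estimating the average treatment effect from a target/control comparison.
    # All counts are hypothetical and only illustrate the arithmetic.
    target_size, target_responses = 10_000, 820      # observations exposed to the treatment
    control_size, control_responses = 10_000, 650    # structurally identical, untreated observations

    target_rate = target_responses / target_size     # response rate with treatment
    control_rate = control_responses / control_size  # response rate without treatment
    uplift = target_rate - control_rate              # estimated net effect of the treatment

    print(f"target: {target_rate:.2%}, control: {control_rate:.2%}, uplift: {uplift:+.2%}")
    ```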

    Decisions on the utilization of treatments, however, require an estimation of their assumed effect beforehand. Therefore, the development of statistical models for the forecast of treatment impacts on an observation is a challenge that research has dealt with for several years. This forecast, known as uplift modeling, net scoring, incremental response modeling, causal modeling, average causal effect (ACE) modeling, or personalized medicine (the latter two in the medical context), is based on the characteristics of an observation and its environment. In order to support decisions on whether or not to apply the treatment, behavioral changes due to the treatment have to be forecasted and evaluated:

    Customers may or may not purchase the product or service after having been targeted by a direct marketing campaign.

    Customers may extend or quit their telecommunication contract after they have been addressed by a churn prevention campaign of their provider.

    Patients recover or do not recover after they have been treated with a specific medication.

    Fields provide or do not provide greater yields after they have been fertilized.

    Machines may or may not fail less often after they have been maintained according to a specific process.

    The number of crimes committed in a certain area may or may not reduce after more police forces have been sent out.

    All use cases presume that exerting the treatment on all observations is not possible, mostly because of limited (financial) resources, or even not recommended, since the individual behavior of some observations may not be affected at all or may even be affected negatively. Hence, a decision on which observations should receive the treatment is required. Just as Lo (see [6]) emphasizes, uplift modeling is capable of identifying those observations whose response will be positively influenced by the treatment. Whereas gross scoring, the classical scoring approach, predicts the probability of a certain behavior given (but not necessarily induced by) a treatment, net scoring predicts the difference in behavior given a treatment compared to no treatment (see [9]).
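    Written out, with R denoting the target event and X = x the explanatory variables as formalized in Chap. 2, and with an additional treatment indicator T (T = 1: treated, T = 0: untreated) that is used here purely for illustration:

    $$ \text{gross score: } P(R=1 \mid T=1, {\boldsymbol X}={\boldsymbol x}), \qquad \text{net score: } P(R=1 \mid T=1, {\boldsymbol X}={\boldsymbol x}) - P(R=1 \mid T=0, {\boldsymbol X}={\boldsymbol x}) $$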

    As stated in [2], instead of telling what happens (descriptive analytics) or what will happen (predictive analytics), prescriptive analytics aims at telling what should be done to make things happen. Uplift modeling is a core approach in the field of prescriptive analytics.

    Despite the fact that some methods for the development of appropriate statistical models have been suggested, the problem is often simplified in practice to the forecast of behavior without taking a (previous) treatment into account.

    In direct marketing, for example, a common approach is to address with campaigns only those customers who have an above-average affinity towards a product or service. The underlying assumption is that the relative effect of a direct marketing campaign (i.e., the treatment) is approximately equal for all customers, i.e., customers with a high affinity are uplifted more by a campaign than customers with a low affinity. Clearly, the mathematical challenge of considering the effect of treatments is avoided in this case.

    However, forecasting behavior while ignoring the effect of treatments can lead to misinterpretations and a wrong allocation of (financial) resources:

    Some customers are targeted by a direct marketing campaign although they would have decided in favor of a product or service anyway (Sure Things). Other customers whose behavior could be positively affected by a direct marketing campaign (Persuadables) are not considered, since they have a low basic affinity towards this product or service. Customers without any inclination to purchase, no matter whether the treatment is applied or not, are usually referred to as Lost Causes.

    Some telecommunication or insurance customers (Sleeping Dogs) may quit their contract because a churn prevention campaign made them actively think about ending their contract and searching for alternatives offered by competitors.

    Some patients get a specific medication although they would have recovered without it, too. Other patients do not get that medication although they would have benefited from it.

    Fields are fertilized that would have provided good yields anyway, while other fields remain unfertilized but would provide higher yields if fertilized.

    Maintenance is focused on machines that fail often (failures that cannot be prevented anyway), while other machines are not maintained, although predictive maintenance could have prevented their rare failures.

    Police resources are sent to areas where crimes cannot be prevented. Intelligent usage of police forces could (also) send them to areas where their presence could reduce crimes.
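    The customer segments mentioned above can be summarized by the pair of responses with and without treatment, a pair that can never be observed jointly for the same individual. The following sketch uses the segment names as commonly defined in the uplift literature; the mapping is illustrative, not a formula taken from this book:

    ```python
    # Illustrative mapping of the four customer segments by their hypothetical
    # behavior with and without treatment (the two outcomes can never be observed
    # for the same individual at the same time).
    def segment(responds_if_treated: bool, responds_if_untreated: bool) -> str:
        if responds_if_treated and responds_if_untreated:
            return "Sure Thing"    # responds anyway; the treatment budget is wasted
        if responds_if_treated and not responds_if_untreated:
            return "Persuadable"   # only the treatment triggers the desired response
        if not responds_if_treated and responds_if_untreated:
            return "Sleeping Dog"  # the treatment backfires, e.g., provokes churn
        return "Lost Cause"        # never responds, with or without treatment

    print(segment(True, False))    # -> "Persuadable", the segment uplift models aim to find
    ```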

    From a methodological perspective, considering treatment effects increases complexity, e.g., the problem that an observation cannot be treated and not treated at the same time has to be overcome. The target criterion, e.g., product purchase vs. no product purchase or recovery vs. illness, according to which groups should be discriminated is no longer sufficient on its own. In addition, the treatment has to be regarded as an additional dimension in the modeling algorithm itself. Moreover, historical data rarely allows for the analysis of one group’s behavior after receiving a specific treatment compared to the behavior of another group that received a different (or no) treatment. Therefore, research on this subject not only has to consider the development of modeling algorithms for treatment data but also the availability of such data.

    It is possibly due to these mathematical and practical challenges and the increased demands regarding the available data that the uplift approach has not received appropriate attention from research in the past (at least compared to the vast amount of research on classical scoring methods) and, hence, has not gained the broad application it deserves with respect to its positive impact on effectiveness and efficiency. However, in recent years, the number of publications on uplift modeling has increased. Most of those publications consider a specific issue of uplift modeling, while a general summary of the methods, their comparison, their applications, and practical use cases seems to be missing in the literature. This book intends to fill this gap.

    1.2 State-of-the-Art

    Most of the contributions to uplift modeling can be found in the data mining literature. Typically, they consider the problem in the context of direct marketing or medical treatments, since these can be regarded as primary areas of application. The suggested approaches range from the enhancement of classical methods to the development of new methods, e.g.:

    Lo (see [6]) points out that the classical methodology is not directly designed to solve the appropriate business objective (i.e., identifying the most responsive customers) and suggests a new scoring method based on logistic regression.

    Hansotia and Rukstales (see [4]) describe tree- and regression-based approaches to develop incremental decision rules and justify marketing investments.

    Radcliffe and Simpson (see [8]) illustrate how retention campaigns based on conventional scoring methods may even provoke some customers to leave. They suggest Qini graphs and the Qini coefficient as generalizations of gains charts and the Gini coefficient, respectively, for the measurement of the discriminatory power of uplift models.

    Radcliffe and Surry (see [10]) document the then state-of-the-art in uplift modeling. They propose quality measures and success criteria of uplift modeling and suggest significance-based uplift trees as an appropriate scoring method.

    Austin (see [1]) applies ensemble methods in the uplift context, i.e., unifying several models into one common, superior model, and demonstrates their effectiveness with simulated data.

    Rzepakowski and Jaroszewicz (see [11]) present tree-based classifiers in order to decide which action out of a set of potential treatments should be used in order to maximize (incremental) profit.

    Jaskowski and Jaroszewicz (see [5]) extend standard probabilistic classification models, such as logistic regression, for uplift modeling on clinical trial data. To that end, they apply either class variable transformation or treatment and control classifiers in logistic regression analysis.

    Guelman et al. (see [3]) introduce a new, statistically advanced way of uplift modeling based on random forests together with an implementation package in the common statistical software R.

    Michel et al. (see [7]) introduce $$\chi ^2_{{\mathrm{net}}}$$ as a modification of the classical χ² statistic for uplift modeling and show detailed net scoring scenarios for marketing.

    Devriendt et al. (see [2]) give an overview of the relevant literature regarding uplift modeling. They also raise a number of open questions, such as the influence of sample sizes and other factors on net scoring performance and model stability as well as suitable business cases to validate the economic impact of net scoring. These aspects will be addressed in this book.

    Initial summaries also exist as chapters of introductory books on data mining and predictive analytics, such as [12] and [13].

    The contributions mentioned above illustrate that the relevance of uplift modeling has been acknowledged in recent years. All authors share the perception that the classical scoring methods currently used in practice are not designed to serve the primary objective, i.e., identifying the most responsive customers (or patients), in order to support decisions on the utilization of treatments. Furthermore, the research contributions at hand demonstrate, by means of simulated or real-world data, that uplift modeling outperforms classical scoring methods with regard to treatment effectiveness. Hence, this book comprises the recent state of the art of research and follows the tradition of [10] and [2].

    1.3 Structure of the Book

    This book aims at examining uplift modeling in all of its facets. It contributes to research, since the state-of-the-art of uplift modeling is summarized and enhanced where research gaps have been identified. The book also contributes to practical experience by addressing the application of uplift modeling and corresponding challenges comprehensively. The scoring methods found in the literature and the methods proposed by the authors are compared to each other both conceptually and by means of simulation studies with current software implementations. Furthermore, topics that have received minor attention so far, such as suitable sample sizes, a closed-loop approach to uplift modeling in practice as well as a systematic identification of potential areas of application, are described.

    The book is structured in the following way:

    At first, both scoring approaches, i.e., the classical scoring (also referred to as gross scoring in this book) and uplift modeling (also referred to as net scoring), are presented and compared to each other with regard to the problem statement, available methods, and the assessment of modeling results.

    After that, main challenges of uplift modeling, such as the assessment of net scoring models as well as variable preselection for modeling, are explored.

    Next, focusing on the application of uplift modeling in practice, currently available software implementations are presented and compared by their functionality and performance on a given dataset.

    Another important practical aspect, namely the kind of data that has to be available, is investigated and appropriate sample sizes are suggested.

    Finally, potential areas of application for uplift modeling are identified. For the marketing use case, a framework for an alignment with the business strategy is proposed. Moreover, a process model for implementing uplift modeling is suggested.

    References

    1. P. Austin. Using ensemble-based methods for directly estimating causal effects: An investigation of tree-based g-computation. Multivariate Behavioral Research, 47:115–135, 2012.

    2. F. Devriendt, D. Moldovan, and W. Verbeke. A literature survey and experimental evaluation of the state-of-the-art in uplift modeling: A stepping stone toward the development of prescriptive analytics. Big Data, 6(1):13–41, 2018. https://doi.org/10.1089/big.2017.0104.

    3. L. Guelman, M. Guillén, and A.M. Perez-Marin. Optimal personalized treatment rules for marketing interventions: A review of methods, a new proposal, and an insurance case study. UB Riskcenter Working Paper Series, 2014(06), 2014.

    4. B. Hansotia and B. Rukstales. Incremental value modeling. Journal of Interactive Marketing, 16(3):35–46, 2002.

    5. M. Jaskowski and S. Jaroszewicz. Uplift modeling for clinical trial data. ICML 2012 Workshop on Clinical Data Analysis, 2012.

    6. V. Lo. The true lift model: A novel data mining approach to response modeling in database marketing. SIGKDD Explorations, 4(2):78–86, 2002.

    7. R. Michel, I. Schnakenburg, and T. von Martens. Effiziente Ressourcenallokation für Vertriebskampagnen durch Nettoscores. Betriebswirtschaftliche Forschung und Praxis, 67(6):665–677, 2015.

    8. N.J. Radcliffe and R. Simpson. Identifying who can be saved and who will be driven away by retention activity. Journal of Telecommunications Management, 1(2):168–176, 2008.

    9. N.J. Radcliffe and P.D. Surry. Quality measures for uplift models. Working paper, 2011. http://stochasticsolutions.com/pdf/kdd2011late.pdf.

    10. N.J. Radcliffe and P.D. Surry. Real-world uplift modeling with significance-based uplift trees. Technical Report, Stochastic Solutions, 2011.

    11. P. Rzepakowski and S. Jaroszewicz. Decision trees for uplift modeling with single and multiple treatments. Knowledge and Information Systems, 32:303–327, 2012.

    12. E. Siegel. Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die. John Wiley & Sons, 2015.

    13. J. Strickland. Predictive Analytics Using R. Lulu Press, 2015.

    14. R. Thaler and C. Sunstein. Nudge: Improving Decisions About Health, Wealth and Happiness. Penguin, 2009.

    © Springer Nature Switzerland AG 2019

    R. Michel et al., Targeting Uplift, https://doi.org/10.1007/978-3-030-22625-1_2

    2. The Traditional Approach: Gross Scoring

    René Michel¹, Igor Schnakenburg² and Tobias von Martens¹

    (1) Deutsche Bank AG, Frankfurt am Main, Germany

    (2) DeTeCon International GmbH, Berlin, Germany

    Model building and scoring as a statistical methodology have been known for decades, and there is a wide variety of literature available, e.g., [4] or [11]. It is not the intention of this chapter to give a complete overview of model building and scoring. Instead, typical methods of model building and scoring are presented which are required to understand the change in paradigm with the introduction of net scoring and its methods later in Chap. 3. In order to distinguish between both approaches, the classical approach will be referred to as gross scoring, whereas the new approach will be referred to as net scoring or uplift modeling interchangeably. At first, we explain and formalize the problem to be solved. Section 2.2 will introduce common methods for scoring like decision trees or (logistic) regression, always with the generalization to net scoring in mind. Section 2.3 contains an introduction to well-known quality measures for scoring models. This introduction also serves as a preparation to generalize those quality indicators to the net scoring setup in Chap. 4.

    Although the facts presented in this chapter may be known to many readers, it is nevertheless recommended to study this chapter in order to get familiar with the way scoring methods are presented and described in this book. This will help to understand the net approaches that will be described later on.

    2.1 Problem Statement

    To put it simply: The problem in the classical prediction case is to calculate the probability of an event occurring in the future. In reality, the event either does happen or it does not, but this is not known at the moment of calculation. The precise context of this general setup can take different forms, which are, however, not important for the mathematical considerations. Some examples where calculated probabilities may trigger an action are the following:

    A company aims at predicting product purchases for all registered customers (i.e., customers and their corresponding data are known to this company). The customer may or may not purchase a specific product.

    An enterprise aims at predicting the failure of parts of a machine it produces (or uses) in order to have the relevant spare parts available in due time. The respective part may or may not fail.

    The police aim at predicting crimes in order to be present and prevent them. The crimes may or may not be committed.

    A bank aims at predicting credit default rates on a customer-individual level in order to take appropriate precautions. The credits may or may not default.

    A doctor aims at predicting whether a patient can recover from a current disease. The patient may or may not recover.

    All of the examples above have the following structural elements in common:

    a set of observations, such as customers, patients, or machines

    information on the observations in the form of explanatory variables, such as age, blood pressure, or type of machine

    a target event, such as a product purchase, malfunction, sickness, or recovery

    The target variable in the simplest case (which will be the focus in this chapter and most parts of the book) is a binary variable, where 1 describes the occurrence of an event and 0 describes its non-occurrence. The goal is to predict the occurrence of the desired event for every individual observation based on its attributes. This information is then, for example, used to implement some business strategy like approving or rejecting a credit request, repairing a machine, or targeting a customer. Non-binary variables are also possible as targets. A target which can (at least theoretically) assume any numerical value will be referred to as interval-scaled, metric, or continuous. These designations will be used as synonyms throughout the text.

    In order to generalize scoring to the net case, a formalization of the setup of the classical gross scoring is useful.

    Let X be a random vector of explanatory variables and x a realization of that random vector. In order to ease notations, it is assumed without loss of generality that $${\boldsymbol {x}}\in \mathbb R^s$$ , i.e., any categorical variables in the data at hand are modeled as numbers. Further on, let R be a binary random variable describing the target (occurrence of the desired event) for each observation.

    Then, P(R = 1|X = x) denotes the probability of a target event for an observation with the explanatory vector x. This is the conditional probability of the event R = 1 given X = x. For didactical reasons, it is neglected that P(X = x) may be 0, in which case the conditional probability is not well defined and has to be defined instead with the help of suitable limit considerations. Thus, in order to ease notation, assume that P(X = x) > 0, although this may not always be the case in the probability model.

    The central goal of gross scoring is to find estimators $$\hat p_{\boldsymbol {x}}$$ that give reasonable empirical approximations for P(R = 1|X = x), based on the explanatory variables. These estimators will usually be based on n independent and identically distributed copies of the random tuple (R_i, X_i), i = 1, …, n, as observable, for example, from n customers, n patients, or n machine parts.
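    A minimal sketch of such a gross-score estimator, assuming a flat table with numeric explanatory variables and a binary target; the file name, the column names, and the use of logistic regression from scikit-learn are illustrative assumptions, not the book's implementation:

    ```python
    # Minimal sketch: estimating p_x = P(R = 1 | X = x) with logistic regression.
    # File name and column names are hypothetical.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    data = pd.read_csv("observations.csv")            # flat file: one row per observation
    X = data[["age", "income", "prior_purchases"]]    # explanatory variables (already numeric)
    r = data["target"]                                # binary target: 1 = event occurred, 0 = not

    X_train, X_test, r_train, r_test = train_test_split(X, r, test_size=0.3, random_state=42)

    model = LogisticRegression(max_iter=1000).fit(X_train, r_train)
    p_hat = model.predict_proba(X_test)[:, 1]         # estimated gross scores for unseen observations
    ```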

    When implementing the classical approach to scoring, one usually puts all (potentially explanatory) data about the observations under consideration into one flat file. Additionally, information about the desired event has to be included. This means that historical data is used that contains cases where the event occurred and where it did not. It is important that the explanatory variables are recorded at a point in time before the event occurred. Such a flat file usually starts with an ID for each observation. The observation’s properties can be distinguished into several classes:

    Explanatory variables —these may be segmented into the following common categories:

    General observation data —e.g., name, date of birth, gender, or ID. If the observation is a customer, then general customer data includes address or contact details. In the context of asset maintenance, general information may include material, texture, or production line. In other contexts, general information is meant to distinguish the specific observation from other observations (if not from all other observations); however, general information is also meant to be stable.

    Development data —this information, too, is more or less specific for the observation, but it may vary with time and is not fixed from the beginning. For customers, this may be transactional data, product usage, or (online) behavioral data,¹ while for assets, this may be usage, position, velocity, accelerations, etc. Development data contains historical data as well as recent data from several points in time.

    Context data —rather than specifying the observation itself, contextual data describes the environment or the neighbors of the observation. In marketing, this may be the network of people the customer has contact with, or the specific customer behavior in the geographical vicinity (communication impacts); for assets, it may be crucial at which part in a framework they are located (central or boundary) or what happened to surrounding assets (root cause analysis).

    Treatment data—information about how the observation has been treated in the past; for assets, this may be the number of repairs in the past, while for customers, it could be a treatment or campaigning history that those individuals have been exposed to (rather than initiating it by themselves).

    Finally, there is derived information like all kinds of statistical measures: maximum, minimum, average, regression slope and intercept, deviations, but also certain ratios (e.g., wallet shares, amount of credit volume per postal code, average maintenance cost in a region, above- or below-average pressure levels, etc.). There is no limit to bringing in additional data and deriving variables from existing data as long as it can be assigned to a specific observation. Clearly, this calls for subject matter expertise, as not every piece of new information will be correlated with the target variable.

    Target variable—in this chapter, a binary target variable is considered, e.g., a customer has purchased or churned, an asset has failed or recovered, etc.
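    A minimal sketch of how such a flat file might look, with one small block of columns per variable class; all column names and values are illustrative assumptions:

    ```python
    # Illustrative flat-file layout: one row per observation, columns grouped by variable class.
    import pandas as pd

    flat = pd.DataFrame({
        "id":              [1, 2, 3],                  # observation ID
        "year_of_birth":   [1980, 1992, 1975],         # general observation data
        "purchases_12m":   [4, 0, 7],                  # development data (behavior over time)
        "balance_avg_3m":  [1200.0, 80.0, 5300.0],     # development data
        "region_response": [0.05, 0.02, 0.08],         # context data (environment of the observation)
        "contacts_12m":    [2, 0, 5],                  # treatment data (past campaign contacts)
        "target":          [1, 0, 1],                  # binary target variable
    })

    # derived information, e.g., a simple ratio per observation
    flat["purchases_per_contact"] = flat["purchases_12m"] / flat["contacts_12m"].clip(lower=1)
    ```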

    After data collection, a suitable number of observations where the event occurred and a suitable number of observations where it could have occurred,² but did not, are used for the modeling process . Modeling process means that a statistical algorithm is deployed in order to identify connections between the predictor variables and the target variable. Which connections among data are useful, though? Following [4], there are several criteria:

    easily understandable

    valid on new data

    potentially useful

    novel

    Many different classical algorithms are available to gain understandable, valid, useful, and novel information. Some of the most common ones will be presented in this chapter. We restrict ourselves to methods which will be important for the generalization to net scoring in the next chapter, so this is not thought to be a complete overview. The presented methods are

    Decision trees

    (Logistic) regression

    Neural networks

    Nearest neighbor

    Bayesian classifiers

    With the exception of neural networks, a direct uplift generalization of the methods above will be given in Chap. 3. Neural networks are presented because some net scoring methods use a combination of arbitrary gross scoring methods, and because neural networks have been very popular among data scientists in recent years.
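    For orientation, gross-scoring implementations of these method families are available in standard libraries; the following sketch shows how they might be instantiated in scikit-learn, which is an illustrative choice and not a tool prescribed by this book:

    ```python
    # Gross-scoring counterparts of the listed method families in scikit-learn.
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.neural_network import MLPClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.naive_bayes import GaussianNB

    models = {
        "decision tree":       DecisionTreeClassifier(max_depth=5),
        "logistic regression": LogisticRegression(max_iter=1000),
        "neural network":      MLPClassifier(hidden_layer_sizes=(16,)),
        "nearest neighbor":    KNeighborsClassifier(n_neighbors=25),
        "Bayesian classifier": GaussianNB(),
    }
    # each model exposes fit(X, r) and predict_proba(X), i.e., gross-score estimation
    ```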

    Using these or other algorithms, rules or formulae can be found and afterwards deployed on new observations in order to estimate the requested target. The prediction for each observation can then be taken as a basis for decision-making or further calculations.

    2.2 Methods

    The standard process of data mining is, for example, described by CRISP DM (CRoss Industry Standard Process for Data Mining, see [7]). This is an iterative or circular-shaped process which indicates that a task in data mining may never be considered as accomplished in total.

    Data mining generally starts with problem understanding. It is helpful to write down explicitly what needs to be mined: firstly, because the nomenclature of precisely stated problems may differ from real-world vocabulary, and secondly, in order to align expectations about what is doable, feasible, and in scope or out of scope. Understanding the business problem also includes agreeing on what the results will be used for, how often they need to be produced, and what the ideal structure of the resulting data looks like. In most real-world scenarios, the data mining methods are already decided upon, as is the tool to be used. However, in principle, the data miner should be free to choose the deployed algorithm and the analytical software.

    Once the goals have been defined, the next step is typically to understand the data. What kind of data is available and can be used for the question at hand? Is the data readily accessible from a data warehouse (DWH), from the cloud, or from a data lake? What kind of transformations or preprocessing steps are required, or—in the extreme case—is a study required that generates the data to be used for analysis?

    When this understanding is accomplished, the analyst starts with data preparation. This includes, in particular, the actual retrieval of data. This step is usually very tedious in practice and takes a lot of time.

    Once the data is present (usually in some electronic form), the analyst starts examining the data. Data checking comprises several dimensions like the following:

    Metadata: The variables that are available, their formats with respect to dimensions (date, currency, alphanumeric, numeric), the formats they come in (csv-files, database tables, txt-files, stream data, video data, voice data), their latency, their aggregation, the filter they come through, history of data

    Quality of data: Missing values, corrupted values, precision, logical structure (e.g., fact tables, dimension tables), consistency, outliers

    Simple statistics: Frequency counts, minima, maxima, averages, standard deviations of each variable. This is important as certain values may hardly occur which may have a direct impact on the methods and algorithms to be deployed.

    Visual explorations: Simple data explorations by means of graphics, such as bar charts or line plots, help to get a feeling for the data.

    Simple connections: Correlations, frequency tables, distributions, and various two- or multidimensional plots
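    A few of the checks listed above, sketched with pandas; the file name and column references are illustrative assumptions:

    ```python
    # Illustrative data checks during data understanding.
    import pandas as pd

    data = pd.read_csv("observations.csv")     # hypothetical flat file

    print(data.dtypes)                         # metadata: variables and their formats
    print(data.isna().mean())                  # data quality: share of missing values per variable
    print(data.describe())                     # simple statistics: min, max, mean, std per variable
    print(data["target"].value_counts())       # frequency counts of the target variable
    print(data.corr(numeric_only=True))        # simple connections: pairwise correlations
    ```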

    The connection between problem understanding and data understanding will not only make it possible to estimate the effort of cleansing the data and putting it into the right shape for data mining (typically a flat table), but it will also allow the construction of new derived variables, e.g., trends, densities, ratios. At this stage, it may even turn out that without further information, the task at hand may not be (sufficiently) solvable.

    The next task for the analyst is model building, which is the focus of this section, i.e., the application of statistical methods like decision trees or regressions, usually in order to make prognoses about the future behavior of the observations. The results then need to be validated in order to see whether they answer the question of the business problem sufficiently. At this point, the results typically show that some of the earlier steps need to be improved, and the next iterations begin. This is continued until the results are satisfactory or until there is convincing evidence that, given current resources, a better result is not possible. The final step then is the deployment of the results in some productive system in order to solve the original problem. However, even after deployment, a regular validation of the models is important.

    The idea behind CRISP DM is an iterative approach not only as a whole but also for certain parts. It is often necessary to get back to the previous step when new information comes up. For example, when unexpectedly a data field is not available during the data preparation step, a return to data understanding (or even problem understanding) is required to integrate this new knowledge. Furthermore, results from modeling might give indications on how to improve data preparation. A graphical representation of the CRISP DM process is shown in Fig. 2.1.

    Fig. 2.1 Structured overview of the CRISP DM process. The iterative or back-and-forth nature is indicated by the corresponding arrows

    Another way of organizing the workflows typical of data science is known as SEMMA (an abbreviation for Sample, Explore, Modify, Model, Assess). It has been introduced by the statistical software company SAS, mainly for the functional organization of its main data mining software SAS Enterprise Miner, and is, for example, described in Chapter 1 of [10]. The SAS Enterprise Miner is also capable of net scoring, and this will be shown in Sect. 7.2. SEMMA assumes problem understanding and deployment as prerequisites but does not mention them. Just like CRISP DM, it emphasizes the importance of examining the available data.

    During data exploration, it may turn out that too few relevant data or observations are available which may require additional effort to correct these shortcomings. If only very few observations are available, more observations can be produced artificially by sampling with replacement. In times of Big Data , this method may not seem to be required very often, but it is still used for good reasons. If, on the contrary, more data is available than processable or meaningful, then sampling seems a promising solution, i.e., taking only a random part of the data. If sampling is stratified with respect to the target variable, it is called over- or undersampling depending on whether more or fewer observations than the original fraction of the corresponding target value are selected for the sample.
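    A minimal sketch of sampling with replacement and of stratified over-/undersampling with respect to a binary target; the column name, sample size, and target share are illustrative assumptions:

    ```python
    # Illustrative stratified resampling with respect to a binary target column.
    import pandas as pd

    def resample_by_target(df: pd.DataFrame, target: str, positive_share: float,
                           n: int, random_state: int = 42) -> pd.DataFrame:
        """Draw a sample of size n in which the share of target == 1 equals positive_share."""
        n_pos = int(round(n * positive_share))
        pos = df[df[target] == 1].sample(n_pos, replace=True, random_state=random_state)
        neg = df[df[target] == 0].sample(n - n_pos, replace=True, random_state=random_state)
        return pd.concat([pos, neg]).sample(frac=1, random_state=random_state)  # shuffle rows

    # oversampling: choose positive_share above the original fraction of target == 1;
    # undersampling: choose positive_share below it. replace=True also covers the case of
    # artificially producing more observations than are available (sampling with replacement).
    ```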

    Finally, the combination of data preparation and model evaluation typically includes the separation of the prepared data into several hold-out samples, for
