Package 'tools4uplift' reference manual

Title:	Tools for Uplift Modeling
Description:	Uplift modeling aims at predicting the causal effect of an action, such as a marketing campaign, on a particular individual. To simplify the task for practitioners in uplift modeling, we propose a combination of tools that can be separated into the following ingredients: i) quantization, ii) visualization, iii) variable selection, iv) parameter estimation, and v) model validation. For more details, please read "Belbahri, Murua, Gandouet, Partovi Nia - Uplift Regression : The R Package tools4uplift".
Authors:	Mouloud Belbahri, Olivier Gandouet, Alejandro Murua, Vahid Partovi Nia
Maintainer:	Mouloud Belbahri <[email protected]>
License:	GPL-2 | GPL-3
Version:	1.0.1
Built:	2025-03-05 04:02:02 UTC
Source:	https://github.com/belbahrim/tools4uplift

Uplift barplot

Description

Barplot of observed uplift with respect to predicted uplift sorted from the highest to the lowest.

Usage


## S3 method for class 'PerformanceUplift'
barplot(height, ...)

Arguments

`height`	a table that must be the output of `PerformanceUplift` function.
`...`	additional barplot arguments.

Value

a barplot and the associated Kendall's uplift rank correlation
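
The quantities behind this plot can be sketched in base R. The following is an illustration only, not the package's implementation: the table `perf_toy`, its column names and its counts are made up, and Kendall's rank correlation from `cor()` is used here as a stand-in for the uplift rank correlation reported by the method.

```r
# Toy performance table: 5 groups ranked from highest (1) to lowest (5)
# predicted uplift. All counts are made up for illustration.
perf_toy <- data.frame(
  group     = 1:5,
  n_treat   = c(100, 100, 100, 100, 100),
  n_control = c(100, 100, 100, 100, 100),
  y_treat   = c(40, 35, 30, 22, 20),  # responders among treated
  y_control = c(20, 20, 22, 21, 24)   # responders among controls
)

# Observed uplift per group: treated response rate minus control response rate.
perf_toy$uplift <- perf_toy$y_treat / perf_toy$n_treat -
  perf_toy$y_control / perf_toy$n_control

# A well-ranked model shows observed uplift decreasing with the group rank,
# so the reversed rank is correlated with the observed uplift.
tau <- cor(rev(perf_toy$group), perf_toy$uplift, method = "kendall")
tau
```

Passing `perf_toy$uplift` to base `barplot()` with `names.arg = perf_toy$group` would draw the corresponding bars.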

Author(s)

Mouloud Belbahri

References

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

model <- BinUplift2d(SimUplift, "X1", "X2", "treat", "y")

#performance of the heat map uplift estimation on the training dataset
perf <- PerformanceUplift(data = model, treat = "treat", 
                  outcome = "y", prediction = "Uplift_X1_X2", 
                  equal.intervals = TRUE, nb.group = 5)

barplot(perf)

Qini-based feature selection

Description

Qini-based Uplift Regression in order to select the features that maximize the Qini coefficient.

Usage

BestFeatures(data, treat, outcome, predictors, rank.precision = 2, 
                              equal.intervals = FALSE, nb.group = 10, 
                              validation = TRUE, p = 0.3) 

Arguments

`data`	a data frame containing the treatment, the outcome and the predictors.
`treat`	name of a binary (numeric) vector representing the treatment assignment (coded as 0/1).
`outcome`	name of a binary response (numeric) vector (coded as 0/1).
`predictors`	a vector of names representing the predictors to consider in the model.
`rank.precision`	precision for the ranking quantiles to compute the Qini coefficient. Must be 1 or 2. If 1, the ranking quantiles will be rounded to the first decimal. If 2, to the second decimal.
`equal.intervals`	flag for using equal intervals (with equal number of observations) or the true ranking quantiles which result in an unequal number of observations in each group to compute the Qini coefficient.
`nb.group`	the number of groups for computing the Qini coefficient if equal.intervals is TRUE - Default is 10.
`validation`	if TRUE, the best features are selected based on cross-validation - Default is TRUE.
`p`	if validation is TRUE, the desired proportion for the validation set. p is a value between 0 and 1, expressed as a decimal; it is set to be proportional to the number of observations per group - Default is 0.3.

Details

The regularization parameter is chosen based on the interaction uplift model that maximizes the Qini coefficient. Using the LASSO penalty, some predictors have coefficients set to zero.

Value

a vector of names representing the selected best features from the penalized logistic regression.

Author(s)

Mouloud Belbahri

References

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2020) Qini-based Uplift Regression, <https://arxiv.org/pdf/1911.12474.pdf>

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

features <- BestFeatures(data = SimUplift, treat = "treat", outcome = "y", 
                         predictors = colnames(SimUplift[,3:7]),
                         equal.intervals = TRUE, nb.group = 5, 
                         validation = FALSE)
features

Univariate quantization

Description

Univariate optimal partitioning for uplift models. The algorithm quantizes a single variable into bins with significantly different observed uplift.

Usage

BinUplift(data, treat, outcome, x, n.split = 10, alpha = 0.05, n.min = 30)

Arguments

`data`	a data frame containing the treatment, the outcome and the predictor to quantize.
`treat`	name of a binary (numeric) vector representing the treatment assignment (coded as 0/1).
`outcome`	name of a binary response (numeric) vector (coded as 0/1).
`x`	name of the explanatory variable to quantize.
`n.split`	number of splits to test at each node. For continuous explanatory variables only (must be > 0). If n.split = 10, the test will be executed at each decile of the variable.
`alpha`	significance level of the statistical test (must be between 0 and 1).
`n.min`	minimum number of observations per child node.

Value

`out.tree`	descriptive statistics for the different nodes of the tree
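
To make the role of `n.split` concrete, here is a rough base R sketch of how candidate cut points at the deciles can be compared. It is not the package's algorithm: the simulated data frame `toy`, the helper `observed_uplift()` and the table `split_table` are hypothetical, and the sketch only reports the difference in observed uplift on each side of a cut instead of running the significance test controlled by `alpha` and `n.min`.

```r
set.seed(1)
n <- 2000
toy <- data.frame(
  x     = runif(n),
  treat = rbinom(n, 1, 0.5)
)
# Toy assumption: the treatment only helps when x > 0.5.
toy$y <- rbinom(n, 1, 0.2 + 0.3 * toy$treat * (toy$x > 0.5))

# Observed uplift: response rate of treated minus response rate of controls.
observed_uplift <- function(d) {
  mean(d$y[d$treat == 1]) - mean(d$y[d$treat == 0])
}

# With n.split = 10, the candidate cut points are the interior deciles of x.
n.split <- 10
cuts <- quantile(toy$x, probs = seq_len(n.split - 1) / n.split)

# For each candidate, compare the observed uplift on both sides of the cut.
split_table <- data.frame(
  cut          = unname(cuts),
  uplift_left  = sapply(cuts, function(cp) observed_uplift(toy[toy$x <= cp, ])),
  uplift_right = sapply(cuts, function(cp) observed_uplift(toy[toy$x > cp, ]))
)
split_table$abs_diff <- abs(split_table$uplift_right - split_table$uplift_left)
split_table
```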

Author(s)

Mouloud Belbahri

References

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

binX1 <- BinUplift(data = SimUplift, treat = "treat", outcome = "y", x = "X1", 
                  n.split = 100, alpha = 0.01, n.min = 30)

Bivariate quantization

Description

A non-parametric heat map representing the observed uplift in rectangles that explore a bivariate dimension space. The function also returns the individual uplift based on the heatmap.

Usage

BinUplift2d(data, var1, var2, treat, outcome, valid = NULL, 
            n.split = 3, n.min = 30, plotit = FALSE, nb.col = 20)

Arguments

`data`	a data frame containing uplift models variables.
`var1`	x-axis variable name. Represents the first dimension of interest.
`var2`	y-axis variable name. Represents the second dimension of interest.
`treat`	name of a binary (numeric) vector representing the treatment assignment (coded as 0/1).
`outcome`	name of a binary response (numeric) vector (coded as 0/1).
`valid`	a validation data frame containing uplift models variables.
`n.split`	the number of intervals to consider per explanatory variable. Must be an integer > 1.
`n.min`	minimum number of observations per group (treatment and control) within each rectangle. Must be an integer > 0.
`plotit`	if TRUE, a heatmap of observed uplift per rectangle is plotted.
`nb.col`	number of colors for the heatmap. Default is 20. Must be an integer and should be greater than `n.split` for better visualization.

Value

returns an augmented dataset with an Uplift_var1_var2 variable representing the predicted uplift for each observation, based on the rectangle it belongs to. If `plotit` is TRUE, the function also plots a heat map of observed uplifts.
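
The idea of assigning each observation the uplift of its rectangle can be sketched in base R as follows. This is a simplified illustration under stated assumptions: the data frame `toy`, the helper `cut_q()` and the column `Uplift_x1_x2` are hypothetical, the intervals are built from empirical quantiles (the package may build them differently), and the `n.min` constraint is ignored.

```r
set.seed(2)
n <- 3000
toy <- data.frame(x1 = runif(n), x2 = runif(n), treat = rbinom(n, 1, 0.5))
# Toy assumption: the treatment only helps when x1 > 0.5.
toy$y <- rbinom(n, 1, 0.2 + 0.2 * toy$treat * (toy$x1 > 0.5))

n.split <- 3
# Cut a variable into k intervals at its empirical quantiles.
cut_q <- function(v, k) {
  cut(v, breaks = quantile(v, probs = seq(0, 1, length.out = k + 1)),
      include.lowest = TRUE)
}
# Each rectangle is one combination of an x1 interval and an x2 interval.
toy$cell <- interaction(cut_q(toy$x1, n.split), cut_q(toy$x2, n.split),
                        drop = TRUE)

# Observed uplift per rectangle: treated rate minus control rate.
rate1 <- tapply(toy$y[toy$treat == 1], toy$cell[toy$treat == 1], mean)
rate0 <- tapply(toy$y[toy$treat == 0], toy$cell[toy$treat == 0], mean)
cell_uplift <- rate1 - rate0

# Each observation inherits the uplift of the rectangle it falls in.
toy$Uplift_x1_x2 <- as.numeric(cell_uplift[as.character(toy$cell)])
head(toy)
```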

Author(s)

Mouloud Belbahri

References

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

heatmap <- BinUplift2d(SimUplift, "X1", "X2", "treat", "y")

Two-model estimator

Description

Fit the two-model uplift model estimator.

Usage


## S3 method for class 'formula'
DualUplift(formula, treat, data, ...)

## Default S3 method:
DualUplift(data, treat, outcome, predictors, ...)

Arguments

`data`, `formula`	a data frame containing the treatment, the outcome and the predictors or a formula describing the model to be fitted.
`treat`	name of a binary (numeric) vector representing the treatment assignment (coded as 0/1).
`outcome`	name of a binary response (numeric) vector (coded as 0/1).
`predictors`	a vector of names representing the explanatory variables to include in the model.
`...`	additional arguments (other than `formula`, `family`, and `data`) to be passed to `glm` function for each sub-model.

Value

`model0`	Fitted model for control group
`model1`	Fitted model for treatment group
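
For readers unfamiliar with the two-model approach, the following base R sketch shows the general technique with plain `glm()` calls. It is an illustration on simulated data, not the internals of `DualUplift`: the data frame `toy` and its coefficients are made up, and the names `model0` and `model1` are reused only to mirror the returned components.

```r
set.seed(3)
n <- 2000
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), treat = rbinom(n, 1, 0.5))
# Toy assumption: the treatment effect grows with x2.
toy$y <- rbinom(n, 1,
                plogis(-1 + 0.5 * toy$x1 + toy$treat * (0.3 + 0.8 * toy$x2)))

# One logistic regression per arm, each fitted on that arm only.
model0 <- glm(y ~ x1 + x2, family = binomial, data = toy[toy$treat == 0, ])
model1 <- glm(y ~ x1 + x2, family = binomial, data = toy[toy$treat == 1, ])

# Predicted uplift: P(y = 1 | x, treated) - P(y = 1 | x, control).
pred_uplift <- predict(model1, newdata = toy, type = "response") -
  predict(model0, newdata = toy, type = "response")
summary(pred_uplift)
```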

Author(s)

Mouloud Belbahri

References

Hansotia, B., J., and Rukstales B. (2001) Direct marketing for multichannel retailers: Issues, challenges and solutions. Journal of Database Marketing and Customer Strategy Management, Vol. 9(3), 259-266.

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

fit <- DualUplift(SimUplift, "treat", "y", predictors = colnames(SimUplift[, 3:12]))

print(fit)
summary(fit)

Interaction estimator

Description

Fit the interaction uplift model estimator.

Usage


## S3 method for class 'formula'
InterUplift(formula, treat, data, ...)

## Default S3 method:
InterUplift(data, treat, outcome, predictors, input = "all", ...)

Arguments

`data`, `formula`	a data frame containing the treatment, the outcome and the predictors or a formula describing the model to be fitted.
`treat`	name of a binary (numeric) vector representing the treatment assignment (coded as 0/1).
`outcome`	name of a binary response (numeric) vector (coded as 0/1).
`predictors`	a vector of names representing the explanatory variables to include in the model.
`input`	an option for the `predictors` argument. If `"all"` (default), the function creates the interaction of all variables with `treat`. If `"best"`, the function assumes that the `predictors` vector is the output of the `BestFeatures` function.
`...`	additional arguments (other than `formula`, `family`, and `data`) to be passed to `glm` function for the interaction model.

Value

an interaction model
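
The interaction approach (Lo, 2002) can be sketched with a single `glm()` call in base R. This is an illustration on simulated data rather than the package's code: the data frame `toy`, its coefficients, and the helper objects `as_treated` and `as_control` are hypothetical.

```r
set.seed(4)
n <- 2000
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), treat = rbinom(n, 1, 0.5))
# Toy assumption: the treatment effect grows with x2.
toy$y <- rbinom(n, 1,
                plogis(-1 + 0.5 * toy$x1 + toy$treat * (0.3 + 0.8 * toy$x2)))

# A single logistic regression with main effects and treatment interactions.
fit <- glm(y ~ treat * (x1 + x2), family = binomial, data = toy)

# Score every observation twice: as if treated and as if not treated.
as_treated <- transform(toy, treat = 1)
as_control <- transform(toy, treat = 0)
pred_uplift <- predict(fit, newdata = as_treated, type = "response") -
  predict(fit, newdata = as_control, type = "response")
summary(pred_uplift)
```

Compared with fitting one model per arm, this formulation shares the main effects across arms and isolates the treatment effect in the `treat` and `treat:x` terms.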

Author(s)

Mouloud Belbahri

References

Lo, V., S., Y. (2002) The true lift model: a novel data mining approach to response modeling in database marketing. ACM SIGKDD Explorations Newsletter, Vol. 4(2), 78-86.

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

fit <- InterUplift(SimUplift, "treat", "y", colnames(SimUplift[, 3:12]))

LASSO path for the penalized logistic regression

Description

Fit an interaction uplift model via penalized maximum likelihood. The regularization path is computed for the lasso penalty at a grid of values for the regularization constant.

Usage

LassoPath(data, formula)

Arguments

`data`	a data frame containing the treatment, the outcome and the predictors.
`formula`	an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted.

Value

a data frame containing the coefficient values and the number of nonzero coefficients for different values of lambda.

Author(s)

Mouloud Belbahri

References

Friedman, J., Hastie, T. and Tibshirani, R. (2010) Regularization Paths for Generalized Linear Models via Coordinate Descent, Journal of Statistical Software, Vol. 33(1), 1-22

Examples

#See glmnet() from library("glmnet") for more information

Qini curve

Description

Curve of the function Qini, the incremental observed uplift with respect to predicted uplift sorted from the highest to the lowest.

Usage


## S3 method for class 'PerformanceUplift'
lines(x, ...)

Arguments

`x`	a table that must be the output of `PerformanceUplift` function.
`...`	additional plot arguments.

Value

a Qini curve and the associated Qini coefficient

Author(s)

Mouloud Belbahri

References

Radcliffe, N. (2007). Using control groups to target on predicted lift: Building and assessing uplift models. Direct Marketing Analytics Journal, An Annual Publication from the Direct Marketing Association Analytics Council, pages 14-21.

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

model1 <- BinUplift2d(SimUplift, "X1", "X2", "treat", "y")
perf1 <- PerformanceUplift(data = model1, treat = "treat", 
                  outcome = "y", prediction = "Uplift_X1_X2", 
                  equal.intervals = TRUE, nb.group = 3)
                  
model2 <- BinUplift2d(SimUplift, "X3", "X4", "treat", "y")
perf2 <- PerformanceUplift(data = model2, treat = "treat", 
                  outcome = "y", prediction = "Uplift_X3_X4", 
                  equal.intervals = TRUE, nb.group = 3)
                  
plot(perf1, type='b')
lines(perf2, type='b', col='red')

Performance of an uplift estimator

Description

Table of performance of an uplift model. This table is used to visualize the performance of an uplift model and to compute the Qini coefficient.

Usage

PerformanceUplift(data, treat, outcome, prediction, nb.group = 10, 
                  equal.intervals = TRUE, rank.precision = 2)

Arguments

`data`	a data frame containing the response, the treatment and predicted uplift.
`treat`	name of a binary (numeric) vector representing the treatment assignment (coded as 0/1).
`outcome`	name of a binary response (numeric) vector (coded as 0/1).
`prediction`	name of a predicted uplift (numeric) vector used to sort the observations from highest to lowest uplift.
`nb.group`	if `equal.intervals` is TRUE, the number of groups of equal size into which the data set is partitioned to show results.
`equal.intervals`	flag for using equal intervals (with equal number of observations) or the true ranking quantiles which result in an unequal number of observations in each group.
`rank.precision`	precision for the ranking quantiles. Must be 1 or 2. If 1, the ranking quantiles will be rounded to the first decimal. If 2, to the second decimal.

Value

a table with descriptive statistics related to an uplift model estimator.
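
As a rough idea of what such a table contains, the base R sketch below sorts simulated observations by predicted uplift, forms groups of equal size and summarizes each group. Everything here is hypothetical (the data frame `toy`, the column names of `perf`, the simulated outcome); the actual columns returned by `PerformanceUplift` may differ.

```r
set.seed(5)
n <- 1000
toy <- data.frame(treat = rbinom(n, 1, 0.5), pred = runif(n, -0.2, 0.4))
# Toy assumption: the true uplift equals the prediction.
toy$y <- rbinom(n, 1, 0.3 + toy$treat * toy$pred)

nb.group <- 5
# Sort from highest to lowest predicted uplift, then form equal-sized groups.
toy <- toy[order(toy$pred, decreasing = TRUE), ]
toy$group <- ceiling(seq_len(n) * nb.group / n)

# Counts of observations and responders per group and arm.
perf <- do.call(rbind, lapply(split(toy, toy$group), function(g) {
  data.frame(
    group     = g$group[1],
    n_treat   = sum(g$treat == 1),
    n_control = sum(g$treat == 0),
    y_treat   = sum(g$y[g$treat == 1]),
    y_control = sum(g$y[g$treat == 0])
  )
}))

# Observed uplift within each group.
perf$uplift <- perf$y_treat / perf$n_treat - perf$y_control / perf$n_control

# Cumulative incremental responders: treated responders minus control
# responders rescaled to the cumulative treated sample size.
perf$cum_incremental <- cumsum(perf$y_treat) -
  cumsum(perf$y_control) * cumsum(perf$n_treat) / cumsum(perf$n_control)
perf
```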

Author(s)

Mouloud Belbahri

References

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

model1 <- BinUplift2d(SimUplift, "X1", "X2", "treat", "y")
perf1 <- PerformanceUplift(data = model1, treat = "treat", 
                  outcome = "y", prediction = "Uplift_X1_X2", 
                  equal.intervals = TRUE, nb.group = 3)
                  
                  
print(perf1)

Qini curve

Description

Curve of the function Qini, the incremental observed uplift with respect to predicted uplift sorted from the highest to the lowest.

Usage


## S3 method for class 'PerformanceUplift'
plot(x, ...)

Arguments

`x`	a table that must be the output of `PerformanceUplift` function.
`...`	additional plot arguments.

Value

a Qini curve and the associated Qini coefficient

Author(s)

Mouloud Belbahri

References

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

model1 <- BinUplift2d(SimUplift, "X1", "X2", "treat", "y")
perf1 <- PerformanceUplift(data = model1, treat = "treat", 
                  outcome = "y", prediction = "Uplift_X1_X2", 
                  equal.intervals = TRUE, nb.group = 3)
                  
                  
plot(perf1, type='b')

Prediction from univariate quantization

Description

Predictions from the univariate quantization method, i.e. this function transforms a continuous variable into a categorical one.

Usage


## S3 method for class 'BinUplift'
predict(object, newdata, ...)

Arguments

`object`	an object of class `BinUplift`, as that created by the function `BinUplift`.
`newdata`	the variable that was quantized in `object`.
`...`	additional arguments to be passed to `cut` function.

Value

a quantized variable

Author(s)

Mouloud Belbahri

References

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

binX1 <- BinUplift(data = SimUplift, treat = "treat", outcome = "y", x = "X1", 
                  n.split = 100, alpha = 0.01, n.min = 30)

quantizedX1 <- predict(binX1, SimUplift$X1)

Predictions from a two-model estimator

Description

Predictions from the two-model uplift model estimator with associated model performance.

Usage


## S3 method for class 'DualUplift'
predict(object, newdata, ...)

Arguments

`object`	an object of class `DualUplift`, as that created by the function `DualUplift`.
`newdata`	a data frame containing the treatment, the outcome and the predictors of observations at which predictions are required.
`...`	additional arguments to be passed to `predict.glm` function for each sub-model.

Value

a vector of predicted uplift

Author(s)

Mouloud Belbahri

References

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

fit <- DualUplift(SimUplift, "treat", "y", predictors = colnames(SimUplift[, 3:12]))

pred <- predict(fit, SimUplift)

Predictions from an interaction estimator

Description

Predictions from the interaction uplift model estimator with associated model performance.

Usage


## S3 method for class 'InterUplift'
predict(object, newdata, treat, ...)

Arguments

`object`	an object of class `InterUplift`, as that created by the function `InterUplift`.
`newdata`	a data frame containing the treatment, the outcome and the predictors of observations at which predictions are required.
`treat`	name of a binary (numeric) vector representing the treatment assignment (coded as 0/1).
`...`	additional arguments to be passed to `predict.glm` function for the interaction model.

Value

a vector of predicted uplift

Author(s)

Mouloud Belbahri

References

Lo, V., S., Y. (2002) The true lift model: a novel data mining approach to response modeling in database marketing. ACM SIGKDD Explorations Newsletter, Vol. 4(2), 78-86.

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

fit <- InterUplift(SimUplift, "treat", "y", colnames(SimUplift[, 3:12]))

pred <- predict(fit, SimUplift, "treat")

Qini coefficient

Description

Computes the area under the Qini curve.

Usage

## S3 method for class 'PerformanceUplift'
QiniArea(x, adjusted=FALSE, ...)

Arguments

`x`	a table that must be the output of `PerformanceUplift` function.
`adjusted`	if TRUE, returns the Qini coefficient adjusted by the Kendall's uplift rank correlation.
`...`	Generic S3 Method argument.

Value

the Qini or the adjusted Qini coefficient
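
One common way to turn a Qini curve into a single number is the area between the curve and the straight line that represents random targeting, computed with the trapezoidal rule. The base R sketch below follows that convention on made-up values; the vectors `frac_targeted` and `qini_curve` and the helper `trapezoid_area()` are hypothetical, and whether `QiniArea` scales or adjusts the area differently is not covered here.

```r
# Hypothetical Qini curve: cumulative incremental uplift after targeting
# the top 0%, 20%, ..., 100% of observations ranked by predicted uplift.
frac_targeted <- c(0, 0.2, 0.4, 0.6, 0.8, 1.0)
qini_curve    <- c(0, 0.04, 0.07, 0.09, 0.10, 0.10)

# Area under a piecewise-linear curve by the trapezoidal rule.
trapezoid_area <- function(x, y) {
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Random targeting: straight line from (0, 0) to the end point of the curve.
random_line <- frac_targeted * tail(qini_curve, 1)

area_model  <- trapezoid_area(frac_targeted, qini_curve)
area_random <- trapezoid_area(frac_targeted, random_line)
qini_coef   <- area_model - area_random
qini_coef
```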

Author(s)

Mouloud Belbahri

References

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2020) Qini-based Uplift Regression, <https://arxiv.org/pdf/1911.12474.pdf>

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

model <- BinUplift2d(SimUplift, "X1", "X2", "treat", "y")

#performance of the heat map uplift estimation on the training dataset
perf <- PerformanceUplift(data = model, treat = "treat", 
                  outcome = "y", prediction = "Uplift_X1_X2", 
                  equal.intervals = TRUE, nb.group = 5)
QiniArea(perf)

Qini-based uplift regression

Description

A Qini-based LHS (Latin Hypercube Sampling) uplift model.

Usage

qLHS(data, treat, outcome, predictors, 
     lhs_points = 50, lhs_range = 1, 
     adjusted = TRUE, rank.precision = 2, equal.intervals = FALSE, 
     nb.group = 10, validation = TRUE, p = 0.3)

Arguments

`data`	a data frame containing the treatment, the outcome and the predictors.
`treat`	name of a binary (numeric) vector representing the treatment assignment (coded as 0/1).
`outcome`	name of a binary response (numeric) vector (coded as 0/1).
`predictors`	a vector of names representing the predictors to consider in the model.
`lhs_points`	number of LHS points to sample for each regularization constant.
`lhs_range`	a multiplicative scalar that controls the variance of the LHS search - Default is 1, the LHS procedure samples points uniformly with variance equal to the variance of the maximum likelihood estimator.
`adjusted`	if TRUE, the adjusted Qini coefficient is used instead of the Qini coefficient.
`rank.precision`	precision for the ranking quantiles to compute the Qini coefficient. Must be 1 or 2. If 1, the ranking quantiles will be rounded to the first decimal. If 2, to the second decimal.
`equal.intervals`	flag for using equal intervals (with equal number of observations) or the true ranking quantiles which result in an unequal number of observations in each group to compute the Qini coefficient.
`nb.group`	the number of groups for computing the Qini coefficient if equal.intervals is TRUE - Default is 10.
`validation`	if TRUE, the best LHS model is selected based on cross-validation - Default is TRUE.
`p`	if validation is TRUE, the desired proportion for the validation set. p is a value between 0 and 1, expressed as a decimal; it is set to be proportional to the number of observations per group - Default is 0.3.

Details

The regularization parameter is chosen based on the interaction uplift model that maximizes the Qini coefficient of the LHS search.
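
For intuition about the sampling step, the base R sketch below draws a small Latin hypercube sample and rescales it to a box around some coefficient estimates. It is not the package's procedure: the helper `latin_hypercube()`, the values in `beta_hat` and `se_hat`, and the choice of a box with half-width `lhs_range * se_hat` are all assumptions made for illustration.

```r
set.seed(6)
# Draw n points in d dimensions on the unit cube so that, in every dimension,
# each of the n equal-width strata contains exactly one point.
latin_hypercube <- function(n, d) {
  sapply(seq_len(d), function(j) (sample(n) - runif(n)) / n)
}

lhs_points <- 5
n_coef     <- 3
u <- latin_hypercube(lhs_points, n_coef)

# Rescale to a search box around hypothetical coefficient estimates.
beta_hat  <- c(-1.0, 0.5, 0.8)    # made-up estimates
se_hat    <- c(0.10, 0.20, 0.15)  # made-up standard errors
lhs_range <- 1
candidates <- sweep(sweep(2 * u - 1, 2, lhs_range * se_hat, `*`),
                    2, beta_hat, `+`)
candidates
```

Each row of `candidates` is one coefficient vector that could then be scored, for instance by the Qini coefficient of the uplift predictions it produces.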

Value

the models with LHS coefficients of class InterUplift.

Author(s)

Mouloud Belbahri

References

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2020) Qini-based Uplift Regression, <https://arxiv.org/pdf/1911.12474.pdf>

Examples


library(tools4uplift)
data("SimUplift")

upliftLHS <- qLHS(data = SimUplift, treat = "treat", outcome = "y", 
                  predictors = colnames(SimUplift[,3:7]), lhs_points = 5,
                  lhs_range = 1, adjusted = TRUE, equal.intervals = TRUE, 
                  nb.group = 5, validation = FALSE)

Synthetic data for uplift modeling

Description

The synthetic data contains 20 predictors, a treatment allocation variable and an outcome binary variable. This dataset is used in the package examples.

Usage

data("SimUplift")

Format

A data frame with 1000 observations on the following 22 variables.

y: a binary response vector
treat: a binary treatment allocation vector
X1: a numeric vector
X2: a numeric vector
X3: a numeric vector
X4: a numeric vector
X5: a numeric vector
X6: a numeric vector
X7: a numeric vector
X8: a numeric vector
X9: a numeric vector
X10: a numeric vector
X11: a numeric vector
X12: a numeric vector
X13: a numeric vector
X14: a numeric vector
X15: a numeric vector
X16: a numeric vector
X17: a numeric vector
X18: a numeric vector
X19: a numeric vector
X20: a numeric vector

Examples

data("SimUplift")

Split data with respect to uplift distribution

Description

Split a dataset into training and validation subsets with respect to the uplift sample distribution.

Usage

SplitUplift(data, p, group)

Arguments

`data`	a data frame of interest that contains at least the response and the treatment variables.
`p`	the desired sample size. p is a value between 0 and 1, expressed as a decimal; it is set to be proportional to the number of observations per group.
`group`	Your grouping variables. Generally, for uplift modelling, this should be a vector of treatment and response variables names, e.g. c("treat", "y").

Value

`train`	a training data frame containing a proportion p of the observations
`valid`	a validation data frame containing the remaining proportion 1-p of the observations
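
Splitting "with respect to the uplift distribution" amounts to a stratified split on the treatment-by-response groups. A minimal base R sketch of that idea follows; the simulated data frame `toy` and the objects `strata` and `train_idx` are hypothetical, and the package may handle rounding within groups differently.

```r
set.seed(7)
n <- 1000
toy <- data.frame(treat = rbinom(n, 1, 0.5), y = rbinom(n, 1, 0.3),
                  x = rnorm(n))

p <- 0.8
# Strata are the four combinations of treatment and response.
strata <- interaction(toy$treat, toy$y, drop = TRUE)

# Within each stratum, draw a proportion p of the row indices for training.
train_idx <- unlist(lapply(split(seq_len(n), strata), function(idx) {
  idx[sample.int(length(idx), size = round(p * length(idx)))]
}))

train <- toy[train_idx, ]
valid <- toy[-train_idx, ]

# The treatment-by-response distribution is nearly the same in both parts.
round(prop.table(table(train$treat, train$y)), 3)
round(prop.table(table(valid$treat, valid$y)), 3)
```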

Author(s)

Mouloud Belbahri

References

Belbahri, M., Murua, A., Gandouet, O., and Partovi Nia, V. (2021) Uplift Regression : The R Package tools4uplift, <https://arxiv.org/pdf/1901.10867.pdf>

Examples


library(tools4uplift)
data("SimUplift")

split <- SplitUplift(SimUplift, 0.8, c("treat", "y"))
train <- split[[1]]
valid <- split[[2]]

Uplift barplot for categorical variables

Description

Computes the observed uplift per category and creates a barplot.

Usage

UpliftPerCat(data, treat, outcome, x, ...)

Arguments

`data`	a data frame containing the treatment, the outcome and the variable of interest.
`treat`	name of a binary (numeric) vector representing the treatment assignment (coded as 0/1).
`outcome`	name of a binary response (numeric) vector (coded as 0/1).
`x`	name of the explanatory variable of interest.
`...`	extra parameters for the barplot.

Value

returns a barplot representing the uplift per category.
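
The observed uplift per category is simply the treated response rate minus the control response rate within each level. The base R sketch below computes it on simulated data; the data frame `toy`, the segment labels and the vector `effect` are made up for illustration and do not come from the package.

```r
set.seed(8)
n <- 3000
toy <- data.frame(
  treat = rbinom(n, 1, 0.5),
  seg   = factor(sample(c("A", "B", "C"), n, replace = TRUE))
)
# Toy assumption: the treatment effect differs by segment.
effect <- c(A = 0.25, B = 0.05, C = -0.10)
toy$y <- rbinom(n, 1, 0.3 + toy$treat * effect[as.character(toy$seg)])

# Response rate by category (rows) and arm (columns "0" and "1").
rates <- tapply(toy$y, list(toy$seg, toy$treat), mean)

# Observed uplift per category: treated rate minus control rate.
uplift_per_cat <- rates[, "1"] - rates[, "0"]
uplift_per_cat
```

Calling base `barplot(uplift_per_cat)` on the result would give a plot comparable in spirit to the one this function draws.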

Author(s)

Mouloud Belbahri

Examples


library(tools4uplift)
data("SimUplift")

binX1 <- BinUplift(data = SimUplift, treat = "treat", outcome = "y", x = "X1", 
                  n.split = 100, alpha = 0.01, n.min = 30)

SimUplift$quantizedX1 <- predict(binX1, SimUplift$X1)
UpliftPerCat(data = SimUplift, treat = "treat", outcome = "y", 
            x = "quantizedX1", xlab='Quantized X1', ylab='Uplift',
            ylim=c(-1,1))
