Package 'mob'

Title: Monotonic Optimal Binning
Description: Generate the monotonic binning and perform the woe (weight of evidence) transformation for the logistic regression used in the consumer credit scorecard development. The woe transformation is a piecewise transformation that is linear to the log odds. For a numeric variable, all of its monotonic functional transformations will converge to the same woe transformation.
Authors: WenSui Liu
Maintainer: WenSui Liu <[email protected]>
License: GPL (>= 2)
Version: 0.4.2
Built: 2025-02-19 02:57:46 UTC
Source: https://github.com/cran/mob

Help Index


Monotonic binning based on decision tree model

Description

The function arb_bin implements the monotonic binning based on the decision tree.

Usage

arb_bin(x, y)

Arguments

x

A numeric vector

y

A numeric vector with 0/1 binary values

Value

A list of binning outcomes, including a numeric vector with cut points and a dataframe with binning summary

Examples

data(hmeq)
arb_bin(hmeq$DEROG, hmeq$BAD)

Monotonic binning by quantile with cases Y = 1

Description

The function bad_bin implements the quantile-based monotonic binning by the iterative discretization based on cases with Y = 1.

Usage

bad_bin(x, y)

Arguments

x

A numeric vector

y

A numeric vector with 0/1 binary values

Value

A list of binning outcomes, including a numeric vector with cut points and a dataframe with binning summary

Examples

data(hmeq)
bad_bin(hmeq$DEROG, hmeq$BAD)

Apply monotonic binning to all vectors in dataframe

Description

The function batch_bin applies multiple binning algorithms in batch to each vector in the dataframe.

Usage

batch_bin(y, xs, method = 1)

Arguments

y

A numeric vector with 0/1 binary values.

xs

A dataframe with numeric vectors to discretize.

method

A integer from 1 to 7 referring to implementations below: 1. Implementation of iso_bin() 2. Implementation of qtl_bin() 3. Implementation of bad_bin() 4. Implementation of rng_bin() 5. Implementation of gbm_bin() 6. Implementation of kmn_bin() 7. Implementation of arb_bin()

Value

A list of binning outcomes with 2 dataframes: bin_sum: A dataframe of binning summary. bin_out: A list of binning output from binning functions, e.g. qtl_bin().

Examples

data(hmeq)
batch_bin(hmeq$BAD, hmeq[, c('DEROG', 'DELINQ')])

Apply WoE transformations to vectors in dataframe

Description

The function batch_woe applies WoE transformations to vectors in the dataframe.

Usage

batch_woe(xs, bin_out)

Arguments

xs

A dataframe with numeric vectors to discretize.

bin_out

A binning output from the function batch_bin().

Value

A dataframe with identical headers as the input xs. However, values of each variable have been transformed to WoE values.

Examples

data(hmeq)
bin_out <- batch_bin(hmeq$BAD, hmeq[, c('DEROG', 'DELINQ')])$bin_out
head(batch_woe(hmeq[, c('DEROG', 'DELINQ')], bin_out))

Perform WoE transformation of a numeric variable

Description

The function cal_woe applies the WoE transformation to a numeric vector based on the binning outcome from a binning function, e.g. qtl_bin() or iso_bin().

Usage

cal_woe(x, bin)

Arguments

x

A numeric vector that will be transformed to WoE values.

bin

A list with the binning outcome from the binning function, e.g. qtl_bin() or iso_bin()

Value

A numeric vector with WoE transformed values.

Examples

data(hmeq)
bin_out <- qtl_bin(hmeq$DEROG, hmeq$BAD)
cal_woe(hmeq$DEROG[1:10], bin_out)

Monotonic binning based on generalized boosted model

Description

The function gbm_bin implements the monotonic binning based on the generalized boosted model (GBM).

Usage

gbm_bin(x, y)

Arguments

x

A numeric vector

y

A numeric vector with 0/1 binary values

Value

A list of binning outcomes, including a numeric vector with cut points and a dataframe with binning summary

Examples

data(hmeq)
gbm_bin(hmeq$DEROG, hmeq$BAD)

Credit attributes of 5,960 home equity loans

Description

A dataset containing characteristics and delinquency information for 5,960 home equity loans.

Usage

hmeq

Format

A data frame with 5960 rows and 13 variables:

BAD

indicator of applicant defaulted on loan or seriously delinquent

LOAN

Amount of the loan request, in dollar

MORTDUE

Amount due on existing mortgage, in dollar

VALUE

Value of current property, in dollar

REASON

DebtCon = debt consolidation; HomeImp = home improvement

JOB

Occupational categories

YOJ

Years at present job

DEROG

Number of major derogatory reports

DELINQ

Number of delinquent credit lines

CLAGE

Age of oldest credit line in months

NINQ

Number of recent credit inquiries

CLNO

Number of credit lines

DEBTINC

Debt-to-income ratio

Source

http://www.creditriskanalytics.net/datasets-private2.html


Monotonic binning based on isotonic regression

Description

The function iso_bin implements the monotonic binning based on the isotonic regression.

Usage

iso_bin(x, y)

Arguments

x

A numeric vector

y

A numeric vector with 0/1 binary values

Value

A list of binning outcomes, including a numeric vector with cut points and a dataframe with binning summary

Examples

data(hmeq)
iso_bin(hmeq$DEROG, hmeq$BAD)

Monotonic binning based on k-means clustering

Description

The function kmn_bin implements the monotonic binning based on the k-means clustering

Usage

kmn_bin(x, y)

Arguments

x

A numeric vector

y

A numeric vector with 0/1 binary values

Value

A list of binning outcomes, including a numeric vector with cut points and a dataframe with binning summary

Examples

data(hmeq)
kmn_bin(hmeq$DEROG, hmeq$BAD)

Monotonic binning for the pool data

Description

The function pool_bin implements the monotonic binning for the pool data based on the generalized boosted model (GBM).

Usage

pool_bin(x, num, den, log = FALSE)

Arguments

x

A numeric vector

num

A numeric vector with integer values for numerators to calculate bad rates

den

A numeric vector with integer values for denominators to calculate bad rates

log

A logical constant either TRUE or FALSE. The default is FALSE

Value

A list of binning outcomes, including a numeric vector with cut points and a dataframe with binning summary

Examples

data(hmeq)
df <- rbind(Reduce(rbind, 
                   lapply(split(hmeq, floor(hmeq$CLAGE)),
                          function(d) data.frame(AGE = unique(floor(d$CLAGE)),
                                                 NUM = sum(d$BAD),
                                                 DEN = nrow(d)))),
            data.frame(AGE = NA, 
                       NUM = sum(hmeq[is.na(hmeq$CLAGE), ]$BAD),
                       DEN = nrow(hmeq[is.na(hmeq$CLAGE), ])))
pool_bin(df$AGE, df$NUM, df$DEN, log = TRUE)

Discretizing a numeric vector

Description

The function qcut discretizes a numeric vector into N pieces based on quantiles.

Usage

qcut(x, n)

Arguments

x

A numeric vector.

n

An integer indicating the number of categories to discretize.

Value

A numeric vector to divide the vector x into n categories.

Examples

x <- 1:10
# [1]  1  2  3  4  5  6  7  8  9 10
v <- qcut(1:10, 4)
# [1] 3 5 8
findInterval(x, sort(c(v, -Inf, Inf)), left.open = TRUE)
# [1] 1 1 1 2 2 3 3 3 4 4

Monotonic binning by quantile

Description

The function qtl_bin implements the quantile-based monotonic binning by the iterative discretization

Usage

qtl_bin(x, y)

Arguments

x

A numeric vector

y

A numeric vector with 0/1 binary values

Value

A list of binning outcomes, including a numeric vector with cut points and a dataframe with binning summary

Examples

data(hmeq)
qtl_bin(hmeq$DEROG, hmeq$BAD)

Monotonic binning by quantile based on value range

Description

The function rng_bin implements the quantile-based monotonic binning by the iterative discretization based on the equal-width range of values.

Usage

rng_bin(x, y)

Arguments

x

A numeric vector

y

A numeric vector with 0/1 binary values

Value

A list of binning outcomes, including a numeric vector with cut points and a dataframe with binning summary

Examples

data(hmeq)
rng_bin(hmeq$DEROG, hmeq$BAD)