`mosum`

Moving Sum Based Procedures for Changes in the Mean

Submodules

Package Contents

Functions

`mosum`(x, G[, G_right, var_est_method, ...])	MOSUM procedure for multiple change point estimation
`criticalValue`(n, G_left, G_right, alpha)	Computes the asymptotic critical value for the MOSUM test
`multiscale_bottomUp`(x[, G, threshold, alpha, ...])	Multiscale MOSUM algorithm with bottom-up merging
`multiscale_localPrune`(x[, G, max_unbalance, ...])	Multiscale MOSUM algorithm with localised pruning
`bandwidths_default`(n[, d_min, G_min, G_max])	Default choice for the set of multiple bandwidths
`testData`([model, lengths, means, sds, rand_gen, seed, ...])	Test data with piecewise constant mean
`persp3D_multiscaleMosum`(x[, mosum_args, threshold, ...])	3D Visualisation of multiscale MOSUM statistics

mosum.mosum(x, G, G_right=float('nan'), var_est_method=['mosum', 'mosum_min', 'mosum_max', 'custom'][0], var_custom=None, boundary_extension=True, threshold=['critical_value', 'custom'][0], alpha=0.1, threshold_custom=float('nan'), criterion=['eta', 'epsilon'][0], eta=0.4, epsilon=0.2, do_confint=False, level=0.05, N_reps=1000)

MOSUM procedure for multiple change point estimation

Computes the MOSUM detector, detects (multiple) change points and estimates their locations.
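To make the detector concrete, here is a minimal sketch of a symmetric MOSUM statistic with a pooled two-window variance estimate, written in plain Python. The helper name `mosum_statistic` and its exact scaling are assumptions for illustration; the package's own implementation (including boundary extension and asymmetric bandwidths) may differ.

```python
import math
import random


def mosum_statistic(x, G):
    """Symmetric MOSUM statistic with a local two-window variance estimate.

    Hypothetical helper for illustration only, not the package's code.
    Entry k-1 holds the statistic for time k; positions where a full
    window does not fit are left as None (no boundary extension).
    """
    n = len(x)
    stat = [None] * n
    for k in range(G, n - G + 1):
        left = x[k - G:k]      # the G observations up to time k
        right = x[k:k + G]     # the G observations after time k
        mean_l = sum(left) / G
        mean_r = sum(right) / G
        # pooled variance of both windows around their own means
        var = (sum((v - mean_l) ** 2 for v in left)
               + sum((v - mean_r) ** 2 for v in right)) / (2 * G)
        if var > 0:
            stat[k - 1] = abs(sum(right) - sum(left)) / math.sqrt(2 * G * var)
    return stat


# One mean shift of size 5 after observation 100.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(100)] + [rng.gauss(5, 1) for _ in range(100)]
stat = mosum_statistic(x, G=20)
peak_value, peak_index = max((s, i) for i, s in enumerate(stat) if s is not None)
k_hat = peak_index + 1  # 1-based location of the largest statistic
```

The statistic peaks close to the true change location, which is what the thresholding and the ‘eta’/‘epsilon’ criteria then exploit.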

Parameters:

x (list) – input data
G (int) – bandwidth; should be less than ‘len(x)/2’
G_right (int) – if ‘G_right != G’, the asymmetric bandwidth ‘(G, G_right)’ will be used; if ‘max(G, G_right)/min(G, G_right) > 4’, a warning message is generated
var_est_method (Str) – how the variance is estimated; possible values are ‘mosum’ : both-sided MOSUM variance estimator; ‘mosum_min’ : minimum of the sample variance estimates from the left and right summation windows; ‘mosum_max’ : maximum of the sample variance estimates from the left and right summation windows; ‘custom’ : a vector of length ‘len(x)’ is to be passed by the user via ‘var_custom’
var_custom (float) – vector (of the same length as ‘x’) containing local estimates of the variance or long run variance; use iff ‘var_est_method = “custom”’
boundary_extension (bool) – a logical value indicating whether the boundary values should be filled in with CUSUM values
threshold (Str) – indicates which threshold should be used to determine significance. By default, it is chosen from the asymptotic distribution at the given significance level ‘alpha’. Alternatively, a user-defined numerical value can be passed with ‘threshold_custom’.
alpha (float) – numeric value for the significance level with ‘0 <= alpha <= 1’; use iff ‘threshold = “critical_value”’
threshold_custom (float) – value greater than 0 for the threshold of significance; use iff ‘threshold = “custom”’
criterion (Str) – indicates how to determine whether each point ‘k’ at which the MOSUM statistic exceeds the threshold is a change point; possible values are ‘eta’ : there is no larger exceeding in an ‘eta*G’ environment of ‘k’; ‘epsilon’ : ‘k’ is the maximum of its local exceeding environment, which has at least size ‘epsilon*G’
eta (float) – a positive numeric value for the minimal mutual distance of changes, relative to moving sum bandwidth (iff ‘criterion = “eta”’)
epsilon (float) – a numeric value in (0,1] for the minimal size of exceeding environments, relative to moving sum bandwidth (iff ‘criterion = “epsilon”’)
do_confint (bool) – flag indicating whether to compute the confidence intervals for change points
level (float) – use iff ‘do_confint = True’; a numeric value (‘0 <= level <= 1’) with which ‘100(1-level)%’ confidence interval is generated
N_reps (int) – use iff ‘do_confint = True’; number of bootstrap replicates to be generated
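The ‘eta’ criterion described above can be sketched as a local-maximum search. The helper `eta_criterion` below is hypothetical and only illustrates the idea (a point is kept if it exceeds the threshold and nothing larger lies within ‘eta*G’ on either side); tie handling and boundary treatment in the package may differ.

```python
def eta_criterion(stat, threshold, G, eta=0.4):
    """Return 1-based locations k that exceed the threshold and are the
    largest value within eta*G on either side. Illustrative sketch only."""
    radius = int(eta * G)
    n = len(stat)
    cpts = []
    for i, s in enumerate(stat):
        if s is None or s <= threshold:
            continue  # not a significant exceedance
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        window = [v for v in stat[lo:hi] if v is not None]
        if s >= max(window):  # no larger exceeding nearby
            cpts.append(i + 1)
    return cpts


stat = [0, 1, 5, 1, 0, 0, 0, 6, 2, 0]
close = eta_criterion(stat, threshold=3, G=5)    # radius 2  -> [3, 8]
wide = eta_criterion(stat, threshold=3, G=15)    # radius 6  -> [8]
```

With the wider environment the smaller peak at position 3 is absorbed by the larger one at position 8, which is the sense in which ‘eta’ sets a minimal mutual distance between estimated changes.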

Returns:

mosum_obj object containing
x (list) – input data
G_left, G_right (int) – bandwidths
var_est_method, var_custom, boundary_extension – input
stat (list) – MOSUM statistics
rollsums (list) – MOSUM detector
var_estimation (list) – local variance estimates
threshold, alpha, threshold_custom – input
threshold_value (float) – threshold of MOSUM test
criterion, eta, epsilon – input
cpts (ndarray) – estimated change points
cpts_info (DataFrame) – information on change points, including detection bandwidths, asymptotic p-values, scaled jump sizes
do_confint (bool) – input
ci – confidence intervals

Examples

>>> import mosum
>>> xx = mosum.testData("blocks")["x"]
>>> xx_m  = mosum.mosum(xx, G = 50, criterion = "eta", boundary_extension = True)
>>> xx_m.summary()
>>> xx_m.print()

mosum.criticalValue(n, G_left, G_right, alpha)

Computes the asymptotic critical value for the MOSUM test
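As a rough illustration, the sketch below evaluates the Gumbel-type asymptotic critical value commonly given for MOSUM tests (the closed form used, to my knowledge, in the R package of the same name). The function name `critical_value_sketch` is hypothetical, and it is an assumption that `mosum.criticalValue` uses exactly this expression.

```python
import math


def critical_value_sketch(n, G_left, G_right, alpha):
    """Asymptotic MOSUM critical value (assumed closed form, see lead-in)."""
    G_min, G_max = min(G_left, G_right), max(G_left, G_right)
    K = G_min / G_max                      # bandwidth balance, 1 if symmetric
    log_ratio = math.log(n / G_min)
    a = math.sqrt(2 * log_ratio)           # scaling sequence
    b = (2 * log_ratio
         + 0.5 * math.log(log_ratio)
         + math.log((K ** 2 + K + 1) / (K + 1))
         - 0.5 * math.log(math.pi))        # centring sequence
    c_alpha = -math.log(math.log(1 / math.sqrt(1 - alpha)))  # Gumbel quantile
    return (b + c_alpha) / a


cv_10 = critical_value_sketch(1000, 50, 50, alpha=0.1)
cv_01 = critical_value_sketch(1000, 50, 50, alpha=0.01)
```

A smaller ‘alpha’ gives a larger critical value, so fewer exceedances of the MOSUM statistic are declared significant.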

mosum.multiscale_bottomUp(x, G=None, threshold=['critical_value', 'custom'][0], alpha=0.1, threshold_function=None, eta=0.4, do_confint=False, level=0.05, N_reps=1000)

Multiscale MOSUM algorithm with bottom-up merging

Parameters:

x (list) – input data
G (int) – vector of bandwidths; given as either integers less than len(x)/2, or numbers between 0 and 0.5 describing the moving sum bandwidths relative to len(x)
threshold (Str) – indicates which threshold should be used to determine significance. By default, it is chosen from the asymptotic distribution at the given significance level ‘alpha’. Alternatively, a user-defined function can be passed with ‘threshold_function’.
alpha (float) – numeric value for the significance level with ‘0 <= alpha <= 1’; use iff ‘threshold = “critical_value”’
threshold_function (function) – user-defined function computing the threshold of significance; use iff ‘threshold = “custom”’
eta (float) – a positive numeric value for the minimal mutual distance of changes, relative to moving sum bandwidth (iff ‘criterion = “eta”’)
do_confint (bool) – flag indicating whether to compute the confidence intervals for change points
level (float) – use iff ‘do_confint = True’; a numeric value (‘0 <= level <= 1’) with which ‘100(1-level)%’ confidence interval is generated
N_reps (int) – use iff ‘do_confint = True’; number of bootstrap replicates to be generated
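The two accepted forms of ‘G’ (absolute integers, or fractions of the series length) can be reconciled as in the sketch below. The helper `resolve_bandwidths` is hypothetical, and the rounding rule is an assumption; the package may floor or validate differently.

```python
def resolve_bandwidths(G, n):
    """Map absolute and relative bandwidths to sorted, distinct integers.

    Values strictly between 0 and 0.5 are read as fractions of n;
    everything else is read as an absolute bandwidth.
    """
    resolved = set()
    for g in G:
        g_abs = int(round(g * n)) if 0 < g < 0.5 else int(g)
        if not 1 <= g_abs < n / 2:
            raise ValueError(f"bandwidth {g!r} is not below len(x)/2")
        resolved.add(g_abs)
    return sorted(resolved)


bandwidths = resolve_bandwidths([0.05, 100, 0.2], n=1000)  # [50, 100, 200]
```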

Returns:

multiscale_cpts object containing
x (list) – input data
G (int) – bandwidth vector
threshold, alpha, threshold_function, eta – input
cpts (ndarray) – estimated change points
cpts_info (DataFrame) – information on change points, including detection bandwidths, asymptotic p-values, scaled jump sizes
pooled_cpts (ndarray) – change point candidates
do_confint (bool) – input
ci – confidence intervals

Examples

>>> import mosum
>>> xx = mosum.testData("blocks")["x"]
>>> xx_m  = mosum.multiscale_bottomUp(xx, G = [50,100])
>>> xx_m.summary()
>>> xx_m.print()
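For intuition, bottom-up merging can be sketched as follows: bandwidths are processed from smallest to largest, and a candidate found at bandwidth ‘G’ is kept only if no already accepted change point lies within ‘eta*G’ of it. The helper `bottom_up_merge` and its input layout are assumptions made for illustration, not the package's internals.

```python
def bottom_up_merge(candidates, eta=0.4):
    """Merge per-bandwidth candidates, smallest bandwidth first.

    candidates: dict mapping bandwidth G -> list of candidate locations.
    Returns a sorted list of (location, detecting bandwidth) pairs.
    """
    accepted = []
    for G in sorted(candidates):
        for k in candidates[G]:
            # keep k only if it is far enough from everything accepted so far
            if all(abs(k - k_acc) >= eta * G for k_acc, _ in accepted):
                accepted.append((k, G))
    return sorted(accepted)


merged = bottom_up_merge({50: [100, 300], 100: [105, 500]})
# -> [(100, 50), (300, 50), (500, 100)]
```

Here the candidate at 105 from bandwidth 100 is discarded because the finer bandwidth 50 already located a change at 100, while the candidate at 500 is new and is accepted.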

mosum.multiscale_localPrune(x, G=None, max_unbalance=4, threshold='critical_value', alpha=0.1, threshold_function=None, criterion='eta', eta=0.4, epsilon=0.2, rule='pval', penalty='log', pen_exp=1.01, do_confint=False, level=0.05, N_reps=1000)

Multiscale MOSUM algorithm with localised pruning

Parameters:

x (list) – input data
G (int) – vector of bandwidths; given as either integers less than len(x)/2, or numbers between 0 and 0.5 describing the moving sum bandwidths relative to len(x)
max_unbalance (float) – a numeric value for the maximal ratio between maximal and minimal bandwidths to be used for candidate generation, at least 1
threshold (Str) – indicates which threshold should be used to determine significance. By default, it is chosen from the asymptotic distribution at the given significance level ‘alpha’. Alternatively, a user-defined function can be passed with ‘threshold_function’.
alpha (float) – numeric value for the significance level with ‘0 <= alpha <= 1’; use iff ‘threshold = “critical_value”’
threshold_function (function) – user-defined function computing the threshold of significance; use iff ‘threshold = “custom”’
criterion (Str) – indicates how to determine whether each point ‘k’ at which the MOSUM statistic exceeds the threshold is a change point; possible values are ‘eta’ : there is no larger exceeding in an ‘eta*G’ environment of ‘k’; ‘epsilon’ : ‘k’ is the maximum of its local exceeding environment, which has at least size ‘epsilon*G’
eta (float) – a positive numeric value for the minimal mutual distance of changes, relative to moving sum bandwidth (iff ‘criterion = “eta”’)
epsilon (float) – a numeric value in (0,1] for the minimal size of exceeding environments, relative to moving sum bandwidth (iff ‘criterion = “epsilon”’)
rule (Str) – choice of sorting criterion for change point candidates in the merging step; possible values are ‘pval’ : smallest p-value; ‘jump’ : largest (rescaled) jump size
penalty (Str) – type of penalty term to be used in the Schwarz criterion; possible values are ‘log’ : use ‘penalty = log(len(x))**pen_exp’; ‘polynomial’ : use ‘penalty = len(x)**pen_exp’
pen_exp (float) – penalty exponent
do_confint (bool) – flag indicating whether to compute the confidence intervals for change points
level (float) – use iff ‘do_confint = True’; a numeric value (‘0 <= level <= 1’) with which ‘100(1-level)%’ confidence interval is generated
N_reps (int) – use iff ‘do_confint = True’; number of bootstrap replicates to be generated

Returns:

multiscale_cpts object containing
x (list) – input data
G (int) – bandwidth vector
threshold, alpha, threshold_function, eta – input
cpts (ndarray) – estimated change points
cpts_info (DataFrame) – information on change points, including detection bandwidths, asymptotic p-values, scaled jump sizes
pooled_cpts (ndarray) – change point candidates
do_confint (bool) – input
ci – confidence intervals

Examples

>>> import mosum
>>> xx = mosum.testData("mix")["x"]
>>> xx_m  = mosum.multiscale_localPrune(xx, G = [8,15,30,70])
>>> xx_m.summary()
>>> xx_m.print()
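The two ‘penalty’ options differ considerably in size. The sketch below simply evaluates the two expressions quoted in the parameter description; the helper name `sc_penalty` is hypothetical.

```python
import math


def sc_penalty(n, penalty="log", pen_exp=1.01):
    """Evaluate the penalty term for a series of length n."""
    if penalty == "log":
        return math.log(n) ** pen_exp
    if penalty == "polynomial":
        return n ** pen_exp
    raise ValueError("penalty must be 'log' or 'polynomial'")


log_pen = sc_penalty(1000)                          # log(1000)**1.01
poly_pen = sc_penalty(1000, penalty="polynomial")   # 1000**1.01
```

For the same ‘pen_exp’ the polynomial penalty is far larger than the logarithmic one, so it prunes candidates much more aggressively; in practice a smaller exponent would normally be paired with it.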

mosum.bandwidths_default(n, d_min=10, G_min=10, G_max=None) → int

Default choice for the set of multiple bandwidths

mosum.testData(model=['custom', 'blocks', 'fms', 'mix', 'stairs10', 'teeth10'][1], lengths=None, means=None, sds=None, rand_gen=np.random.normal, seed=None, rand_gen_args=[0, 1])

Test data with piecewise constant mean

Generate piecewise stationary time series with independent innovations and change points in the mean.

Parameters:

model (str) – custom or pre-defined signal
lengths (int) – vector of segment lengths (custom only)
means (int) – vector of segment means (custom only)
sds (int) – vector of segment standard deviations (custom only)
rand_gen (function) – innovation function
seed (int) – random seed
rand_gen_args (ndarray) – arguments for rand_gen

Returns:

x (ndarray) – simulated data series
mu (ndarray) – signal
sigma (float) – standard deviation
cpts (ndarray) – true change points

Examples

>>> mosum.testData()
>>> mosum.testData("custom", lengths = [100,100], means=[0,1], sds= [1,1])
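What the ‘custom’ model produces can be pictured with a small stand-alone generator. The sketch below uses only the standard library and Gaussian innovations; the function name `piecewise_constant_series`, the returned keys, and the convention that a change point is the last index of a segment are assumptions for illustration, not the package's implementation.

```python
import random


def piecewise_constant_series(lengths, means, sds, seed=None):
    """Simulate independent Gaussian noise around a piecewise constant mean."""
    rng = random.Random(seed)
    x, mu = [], []
    for length, mean, sd in zip(lengths, means, sds):
        mu.extend([mean] * length)                        # the signal
        x.extend(rng.gauss(mean, sd) for _ in range(length))
    # change points: 1-based index of the last point of each segment
    # except the final one (assumed convention)
    cpts, total = [], 0
    for length in lengths[:-1]:
        total += length
        cpts.append(total)
    return {"x": x, "mu": mu, "cpts": cpts}


data = piecewise_constant_series([100, 100], means=[0, 1], sds=[1, 1], seed=1)
```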

mosum.persp3D_multiscaleMosum(x, mosum_args=dict(), threshold=['critical_value', 'custom'][0], alpha=0.1, threshold_function=None, palette=cm.coolwarm, xlab='G', ylab='time', zlab='MOSUM')

3D Visualisation of multiscale MOSUM statistics

Parameters:

x (list) – input data
mosum_args (dict) – dictionary of keyword arguments to mosum
threshold (Str) – indicates which threshold should be used to determine significance. By default, it is chosen from the asymptotic distribution at the given significance level ‘alpha’. Alternatively, a user-defined function can be passed with ‘threshold_function’.
alpha (float) – numeric value for the significance level with ‘0 <= alpha <= 1’; use iff ‘threshold = “critical_value”’
threshold_function (function) – user-defined function computing the threshold of significance; use iff ‘threshold = “custom”’
palette (matplotlib.colors.LinearSegmentedColormap) – colour palette for plotting, accessible from matplotlib.cm
xlab (Str) – x-axis label for plot
ylab (Str) – y-axis label for plot
zlab (Str) – z-axis label for plot

Examples

>>> import mosum
>>> xx = mosum.testData("blocks")["x"]
>>> mosum.persp3D_multiscaleMosum(xx)
